Thursday, December 13, 2007
BlogScope is now indexing blogs from LiveJournal. This means BlogScope users can now search and analyze the more than 300 posts that LJ users create every minute. Frank the Goat is happy.
Labels: blogs, blogscope, livejournal
Saturday, December 08, 2007
LaTeX to OpenOffice/Word
Recently I had to convert a big LaTeX document full of math to Office format. Copying and pasting all the formulas by hand was not an option (even more so given the terrible state of the equation editing features in Microsoft Office 2007). After a lot of searching around, I found a way to automate a large chunk of the process using LyX and OpenOffice.
- Import the LaTeX document into LyX.
- Export the document from LyX to HTML. All the math formulas will be converted to images.
- Open the HTML document in Firefox.
- Start OpenOffice Writer, then copy and paste the whole HTML page into a new document. Save the document (I used the doc format because I wanted MS Word compatibility). The office document should look similar to the LaTeX one, with all the math equations and symbols as images. The images are referenced as links to external files, which means they will no longer be displayed if you remove the HTML files or copy the document to another machine. The images should therefore be embedded in the office document.
- To embed the images in OpenOffice Writer, open Edit menu -> Links. Select all the images and click Break Link. This removes the references to the externally linked images and stores the images in the document itself.
- The document is now ready. Some manual editing will still be required; in my case:
- Tables and footnotes were missing
- Images were missing (maybe because I was using EPS files)
- References to sections, figures, and equations were not always present
- Citations were hyperlinks to some random document
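Since the exported HTML only links to the formula images, it can help to check that every linked image actually exists on disk before pasting the page into Writer. The snippet below is a minimal sketch of such a check and is not part of LyX or OpenOffice; the function name `find_missing_images` and the file name `paper.html` are made up for illustration, and it assumes the export uses ordinary `<img src="...">` tags with local paths.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_missing_images(html_text, base_dir):
    """Return the src values whose local image file does not exist.

    Hypothetical helper: base_dir is the folder containing the HTML file,
    used to resolve relative image paths.
    """
    collector = ImageCollector()
    collector.feed(html_text)
    missing = []
    for src in collector.sources:
        parsed = urlparse(src)
        if parsed.scheme not in ("", "file"):
            continue  # skip remote images (http, https, ...)
        path = Path(unquote(parsed.path))
        if not path.is_absolute():
            path = Path(base_dir) / path
        if not path.exists():
            missing.append(src)
    return missing


# Example usage (paper.html is an assumed file name):
# html = Path("paper.html").read_text(errors="replace")
# print(find_missing_images(html, "."))
```

Any image the script reports as missing will also be missing after the paste into Writer, so it is worth fixing it (or converting the figure to a format the export handles) first.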
I was able to convert a 10-page LaTeX document to a Word document in an hour. The formatting was not very good: all the math equations were slightly displaced vertically and were underlined. But at least it was readable.
Just for reference, I am using LyX version 1.5.2 with OpenOffice 2.3 on openSUSE 10.3.
Labels: latex, open office, word