Saturday, December 08, 2007

 

LaTeX to OpenOffice/Word

Recently I had to convert a big LaTeX document full of maths to Office format. Copying and pasting all the formulas by hand was not an option (all the more so given the terrible state of the equation editing features in Microsoft Office 2007). After a lot of searching around, I found a way to automate a large chunk of the process using LyX and OpenOffice.

  • Import the LaTeX document into LyX.
  • Export the document from LyX to HTML. All the math formulas will be converted to images.
  • Open the HTML document in Firefox.
  • Start OpenOffice Writer, and copy and paste all of the HTML into a new document. Save the document (I used the doc format as I wanted MS Word compatibility). The office document should look similar to the LaTeX one, with all the math equations and symbols as images. The images are only linked, not stored in the document, which means they will no longer be displayed if you remove the HTML files or copy the document to another machine. The images should therefore be embedded in the office document.
  • To embed the images in OpenOffice Writer, open Edit menu -> Links. Select all the images and click on Break Links. This replaces every externally linked image with an embedded copy.
  • The document is now ready. Some manual editing will still be required, because in my case:
    • Tables and footnotes were missing
    • Images were missing (maybe because I was using eps files)
    • References to sections, figures, and equations were not always present
    • Citations were hyperlinks to some random document
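
Since the equation images are only linked at first, it can help to check that every image the exported HTML refers to is still on disk before pasting and breaking the links. The following is a minimal sketch of such a check, not part of the workflow I actually used. The function names `linked_images` and `missing_images` are made up for illustration, and it assumes the export uses ordinary `<img src="...">` tags with paths relative to the HTML file.

```python
from html.parser import HTMLParser
from pathlib import Path


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def linked_images(html_text):
    """Return the image paths referenced by the given HTML text, in order."""
    collector = ImageCollector()
    collector.feed(html_text)
    return collector.sources


def missing_images(html_path):
    """Return referenced images that do not exist next to the HTML file.

    Assumption: image paths are relative to the HTML file's folder.
    """
    html_path = Path(html_path)
    text = html_path.read_text(encoding="utf-8", errors="replace")
    return [src for src in linked_images(text)
            if not (html_path.parent / src).exists()]
```

Calling `missing_images` on the exported HTML file gives the list of image files that would show up as broken in Writer; an empty list means all linked images were found.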

I was able to convert a 10-page LaTeX document to a Word document in an hour. The formatting was not very good: all the math equations were slightly displaced vertically and were underlined. But at least it was readable.

Just for reference, I am using LyX version 1.5.2 with OpenOffice 2.3 on openSUSE 10.3.



Comments:
Hi. Any ideas on how to present math formulae in your blog itself so other people can see the equations?
 
@nileshbansal: thank you for sharing. This is fairly useful for me.
@Feanor: look at MathML and the LaTeX plugins for Movable Type and WordPress.
 
It's always a pain in the (neck) to convert LaTeX to Word and the other way around. The equations are not a big issue, since you can convert them easily with the new version of MathType (which accepts LaTeX commands), or with OpenOffice as you mentioned.

The annoying problem is *inline math*: if you put in MathType objects, the line spacing gets messed up (I've seen this often in papers). The proper way is to do the formatting manually (subscripts, superscripts, etc.). It can get complicated when you have a subscript of a subscript (you have to lower the text and make the font smaller), both a subscript and a superscript (use "Equation Fields"), or weird symbols (use the MathType font or, in the worst case, draw them with Visio).

The figures are time-consuming too. I use pstoedit, a plugin for GSView, to convert eps to emf (emf can be edited in Visio). If you use gnuplot for the experiments, set the output format directly to emf and create the figures again with a script.
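
For many figures, the eps-to-emf step can be scripted. Below is a rough sketch that assumes the `pstoedit` command-line program is on the PATH and that the installed build includes the `emf` output format (as far as I know, that depends on the platform and build). The helper names `pstoedit_command` and `convert_all` are made up for illustration.

```python
import subprocess
from pathlib import Path


def pstoedit_command(eps_path):
    """Build the pstoedit call that converts one eps file to emf.

    Assumption: the installed pstoedit supports "-f emf".
    """
    eps_path = Path(eps_path)
    emf_path = eps_path.with_suffix(".emf")
    return ["pstoedit", "-f", "emf", str(eps_path), str(emf_path)]


def convert_all(folder):
    """Convert every .eps file in `folder`, stopping at the first failure."""
    for eps in sorted(Path(folder).glob("*.eps")):
        subprocess.run(pstoedit_command(eps), check=True)
```

Running `convert_all("figures")` would write an emf file next to each eps file in a (hypothetical) `figures` folder.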

Citations and figure/table references take time too. The proper way is to use "cross-reference", but if you don't expect to edit the Word document often, simply write the numbers by hand.

Overall, one hour for a 10-page paper is ... very fast!
 
Thank you so much for posting that "break links" thing. I had a huge file with a bunch of linked graphics, and I spent 45 minutes and could not find that trick. Very helpful!
 
There is another way to convert LaTeX to OpenOffice that I think is better: use latex2rtf and then open the rtf file with OpenOffice Writer.

ciao
cla
 
LyX has this conversion automated, as described here: http://ubuntuforums.org/showthread.php?t=453721
 
Well, I found that the following way works:

I prepare the source document in OpenOffice and then export it to whatever I like. LaTeX export works really well with the writer2latex tool.

After that I do minor or major manual editing of my LaTeX file, depending on what I want. Conversion to doc, html or whatever works too, and I am happy.

The odf format is really amazing: if something annoys you in your document, you can even hand-edit the xml file and reassemble things manually. Gorgeous stuff!
It is really cool.
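
To illustrate the hand-editing idea: an odf text document (.odt) is a zip archive, and the body text lives in a file called content.xml inside it. Here is a minimal sketch of unpacking, editing and reassembling with Python's standard `zipfile` module. The function name `replace_in_odt` is made up, and the plain string replacement is a simplification that only works when the text you are looking for is not split across XML tags.

```python
import zipfile


def replace_in_odt(src, dst, old, new):
    """Copy an .odt file to `dst`, replacing `old` with `new` in content.xml."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Keep the original entry order: "mimetype" must stay first.
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "content.xml":
                # Naive text replacement on the raw XML (an assumption
                # that `old` appears as one unbroken string).
                data = data.decode("utf-8").replace(old, new).encode("utf-8")
            # Reuse each entry's compression type so that "mimetype"
            # stays stored uncompressed, as the format expects.
            zout.writestr(item, data, compress_type=item.compress_type)
```

For example, `replace_in_odt("paper.odt", "paper-fixed.odt", "teh", "the")` would write a corrected copy and leave the original file untouched (both file names are hypothetical).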
 