Document conversion with OpenOffice

Aug 28, 2005

Convert documents to HTML and PDF using OpenOffice.

Sample Image - oo2html.png

Summary

This is a document conversion application. It converts documents from OpenOffice format to HTML and PDF, and it could easily be modified to open any document type that OpenOffice supports. See the OpenOffice website for a list of supported formats.

Introduction

As project manager for a dev team, I needed a way to keep all design docs current on the intranet. We save all our project documentation in the CM system. So, I built a little app that scans the CM directory for OpenOffice files (text documents or spreadsheets), converts them to HTML and PDF, then copies the results to the web server along with an updated index.html file.

The code is quite simplistic, but solves a problem that people might run into regularly. So, I wanted to set it free.

This example works with both OpenOffice 1.1.4 and 1.9 beta.

Using the code

The code is quite easy, once you figure out all the tricks. I am just using the OpenOffice scripting (COM automation) interface to open a file, then save it in another format.

I noticed that there was some similar code for VB, so I figured it should probably work for VB.NET, and it did! I have no idea if C# has the same CreateObject functionality.

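' Late-bound COM object; OpenOffice registers this ProgID when it is installed.
Dim objServiceManager As Object
Try
    objServiceManager = CreateObject("com.sun.star.ServiceManager")
Catch ex As Exception
    ' Keep the original exception as InnerException instead of flattening it into the message.
    Throw New Exception("OpenOffice is not installed", ex)
End Try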

The ServiceManager is used to create a desktop. The desktop is used to open a document. Then the document is saved with a filter, which is similar to using the "Save as" feature in OpenOffice. I specify "HTML (StarWriter)" to save from OpenOffice format to HTML. Pretty easy, huh?
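The steps in the paragraph above can be sketched roughly as follows. This is a minimal sketch, not the code from the download: the module name and the helpers ToFileUrl, MakeProperty and ConvertDocument are names made up for illustration, and it assumes the OpenOffice COM bridge accepts a plain Object array of PropertyValue structs from late-bound VB.NET, which is worth verifying against your installed version.

```vbnet
Option Strict Off ' late binding on the COM objects requires this

Imports System
Imports Microsoft.VisualBasic

Module OoConvertSketch

    ' Hypothetical helper: turn a Windows path into a file URL for OpenOffice.
    ' Only backslashes and spaces are handled here; other characters are left as-is.
    Function ToFileUrl(ByVal windowsPath As String) As String
        Return "file:///" & windowsPath.Replace("\", "/").Replace(" ", "%20")
    End Function

    ' Hypothetical helper: build a com.sun.star.beans.PropertyValue through the COM bridge.
    Function MakeProperty(ByVal serviceManager As Object, ByVal name As String, ByVal value As Object) As Object
        Dim prop As Object = serviceManager.Bridge_GetStruct("com.sun.star.beans.PropertyValue")
        prop.Name = name
        prop.Value = value
        Return prop
    End Function

    ' ServiceManager -> Desktop -> load the document hidden -> store it with a filter.
    Sub ConvertDocument(ByVal sourcePath As String, ByVal targetPath As String, ByVal filterName As String)
        Dim serviceManager As Object = CreateObject("com.sun.star.ServiceManager")
        Dim desktop As Object = serviceManager.createInstance("com.sun.star.frame.Desktop")

        Dim loadArgs() As Object = New Object() {MakeProperty(serviceManager, "Hidden", True)}
        Dim document As Object = desktop.loadComponentFromURL(ToFileUrl(sourcePath), "_blank", 0, loadArgs)
        Try
            ' storeToURL exports a copy and leaves the loaded document untouched.
            Dim storeArgs() As Object = New Object() {MakeProperty(serviceManager, "FilterName", filterName)}
            document.storeToURL(ToFileUrl(targetPath), storeArgs)
        Finally
            document.close(True)
        End Try
    End Sub

End Module
```

A call such as ConvertDocument("c:\test\design.sxw", "c:\test\design.html", "HTML (StarWriter)") would then produce the HTML version. For PDF output from a text document, the filter name is "writer_pdf_Export".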

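Per the introduction, my app also copies the converted docs to the web server. So, it requires "from" and "to" arguments. The example in the picture above would be:

oo2html "c:\test" "C:\Program Files (x86)\xampp\htdocs\test"

Do not end a quoted path with a backslash: the command-line parser reads \" as an escaped quote, so the two arguments would run together.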
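As a rough illustration of how the "from" and "to" arguments might be used, the sketch below finds the candidate files in the source folder and works out where each converted file would go. The module and function names are hypothetical, and the .sxw and .sxc extensions are an assumption based on the OpenOffice 1.x formats; OpenOffice 1.9 and later would also need .odt and .ods.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module OoScanSketch

    ' Return the OpenOffice text (.sxw) and spreadsheet (.sxc) files in a folder, sorted by path.
    Function FindOpenOfficeFiles(ByVal fromDir As String) As List(Of String)
        Dim found As New List(Of String)
        For Each candidate As String In Directory.GetFiles(fromDir)
            Dim ext As String = Path.GetExtension(candidate).ToLowerInvariant()
            If ext = ".sxw" OrElse ext = ".sxc" Then found.Add(candidate)
        Next
        found.Sort()
        Return found
    End Function

    ' Map a source file to its output path in the destination folder,
    ' keeping the base name and swapping the extension (for example ".html" or ".pdf").
    Function TargetPath(ByVal sourceFile As String, ByVal toDir As String, ByVal newExtension As String) As String
        Return Path.Combine(toDir, Path.GetFileNameWithoutExtension(sourceFile) & newExtension)
    End Function

End Module
```

A Main routine could take args(0) as the "from" folder and args(1) as the "to" folder, loop over FindOpenOfficeFiles(args(0)), and write each converted file to TargetPath(file, args(1), ".html").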

Points of Interest

As mentioned above, this app could be modified to do any conversion that is possible with OpenOffice. For example, you could open all *.doc files instead of *.sxw, then output them to HTML. Or, it could open HTML files and output to doc. You get the picture.
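Switching the output format mostly comes down to choosing a different filter name when saving. The lookup below is a small sketch of that idea; PickFilter is a hypothetical helper, and the filter names are the ones OpenOffice uses as far as I know, so check them against the filter list for your installed version.

```vbnet
Imports System

Module OoFilterSketch

    ' Pick an export filter name from the target extension and the kind of document.
    Function PickFilter(ByVal targetExtension As String, ByVal isSpreadsheet As Boolean) As String
        Select Case targetExtension.ToLowerInvariant()
            Case ".html"
                If isSpreadsheet Then Return "HTML (StarCalc)"
                Return "HTML (StarWriter)"
            Case ".pdf"
                If isSpreadsheet Then Return "calc_pdf_Export"
                Return "writer_pdf_Export"
            Case ".doc"
                If Not isSpreadsheet Then Return "MS Word 97"
            Case ".xls"
                If isSpreadsheet Then Return "MS Excel 97"
        End Select
        ' Anything that falls through is a combination this sketch does not cover.
        Throw New ArgumentException("No filter known for " & targetExtension)
    End Function

End Module
```

The returned name would be passed as the FilterName property when the document is stored. Opening a different input type, such as *.doc, normally needs no filter at all, because OpenOffice detects the format on load.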

History

My original version did more, to solve my particular problem. It updated the CM tree and only copied files that followed a certain naming convention, among other things. I have trimmed it down considerably, in the hope of making it more straightforward.