
Document conversion with OpenOffice

By duwke, 28 Aug 2005
 

Sample Image - oo2html.png

Summary

This is a document conversion application. It converts docs from OpenOffice format to HTML and PDF. It could easily be modified to open any OpenOffice supported document. See the OpenOffice website for a list of supported formats.

Introduction

As project manager for a dev team, I needed a way to keep all design docs current on the intranet. We save all our project documentation in the CM system. So, I built a little app that scans the CM directory for OpenOffice files (either doc or spreadsheet), converts them to HTML and PDF, then copies them to the web server with an updated index.html file.

The code is quite simple, but it solves a problem that people might run into regularly, so I wanted to set it free.

This example works with both OpenOffice 1.1.4 and 1.9 beta.

Using the code

The code is quite easy once you figure out all the tricks. I am just using OpenOffice's scripting interface, which is exposed on Windows as a COM automation object, to open a file and then save it in another format.

I noticed that there was some similar code for VB, so I figured it should probably work for VB.NET, and it did! I have no idea if C# has the same CreateObject functionality.

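' Late-bound COM object: the entry point to OpenOffice's automation interface
Dim objServiceManager As Object
Try
    objServiceManager = CreateObject("com.sun.star.ServiceManager")
Catch ex As Exception
    ' Keep the original error as the inner exception so the root cause is not lost
    Throw New Exception("OpenOffice is not installed", ex)
End Try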
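
On the C# question: CreateObject is a Visual Basic runtime helper, but the .NET Framework offers the same late-bound COM activation to every language through Type.GetTypeFromProgID and Activator.CreateInstance. The sketch below stays in VB.NET for consistency; the module and function names are made up for illustration, and it assumes OpenOffice has registered the same ProgID used above.

```vbnet
Imports System

Module ServiceManagerFactory
    ' Hypothetical helper: creates the OpenOffice ServiceManager without
    ' relying on the VB-only CreateObject function.
    Function CreateServiceManager() As Object
        ' GetTypeFromProgID returns Nothing when the ProgID is not registered.
        Dim t As Type = Type.GetTypeFromProgID("com.sun.star.ServiceManager")
        If t Is Nothing Then
            Throw New InvalidOperationException("OpenOffice does not appear to be installed.")
        End If
        Return Activator.CreateInstance(t)
    End Function
End Module
```

The same two framework calls can be written in C#, although calling methods on the returned object there takes more work than VB's late binding.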

The ServiceManager is used to create a desktop. The desktop is used to open a document. The document is then saved with a filter, which is similar to using the "Save as" feature in OpenOffice. I specify the "HTML (StarWriter)" filter to save an OpenOffice document as HTML. Pretty easy, huh?
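
The steps just described can be sketched roughly as follows. This is not the code from the download, only a minimal outline of the same flow. The helper names OOoPropertyValue and ConvertToURL follow the names used in the source, but their bodies here are assumptions, as are the ConvertSketch module, the ConvertToHtml routine, and the choice to open the document hidden.

```vbnet
Option Strict Off ' the late-bound COM calls below need this

Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module ConvertSketch
    ' Assumed implementation: builds a com.sun.star.beans.PropertyValue
    ' struct through the automation bridge and fills it in.
    Function OOoPropertyValue(ByVal serviceManager As Object, ByVal propName As String, ByVal propValue As Object) As Object
        Dim prop As Object = serviceManager.Bridge_GetStruct("com.sun.star.beans.PropertyValue")
        prop.Name = propName
        prop.Value = propValue
        Return prop
    End Function

    ' Assumed implementation: turns a Windows path into the file:/// URL
    ' form that OpenOffice expects.
    Function ConvertToURL(ByVal filePath As String) As String
        Return New Uri(Path.GetFullPath(filePath)).AbsoluteUri
    End Function

    ' Hypothetical routine: opens one Writer document and saves it as HTML.
    Sub ConvertToHtml(ByVal sourceFile As String, ByVal htmlFile As String)
        Dim objServiceManager As Object = CreateObject("com.sun.star.ServiceManager")
        Dim objDesktop As Object = objServiceManager.createInstance("com.sun.star.frame.Desktop")

        ' Open the document without showing a window.
        Dim loadParams(0) As Object
        loadParams(0) = OOoPropertyValue(objServiceManager, "Hidden", True)
        Dim document As Object = objDesktop.loadComponentFromURL(ConvertToURL(sourceFile), "_blank", 0, loadParams)
        If document Is Nothing Then
            Throw New IOException("OpenOffice could not open " & sourceFile)
        End If

        Try
            ' Storing with a filter is the scripted equivalent of "Save as".
            Dim saveParams(0) As Object
            saveParams(0) = OOoPropertyValue(objServiceManager, "FilterName", "HTML (StarWriter)")
            document.storeToURL(ConvertToURL(htmlFile), saveParams)
        Finally
            document.close(True)
        End Try
    End Sub
End Module
```

Swapping the filter name for "writer_pdf_Export" and the target file for a .pdf path should give the PDF export in the same way.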

Per the introduction, my app also copies the docs to the web server. So, it requires "from" and "to" arguments. The example in the picture above would be:

oo2html "c:\test" "C:\Program Files (x86)\xampp\htdocs\test"
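
A minimal sketch of how the two arguments might be checked before any conversion starts. The ArgsSketch module, the messages, and the exit codes are illustrative assumptions, not the app's actual code. One thing worth knowing on Windows: a backslash directly before a closing quote escapes the quote, so quoted directory arguments are safest without a trailing backslash.

```vbnet
Imports System
Imports System.IO

Module ArgsSketch
    ' Hypothetical entry point: validates the "from" and "to" directories.
    Function Main(ByVal args As String()) As Integer
        If args.Length <> 2 Then
            Console.Error.WriteLine("Usage: oo2html <from directory> <to directory>")
            Return 1
        End If

        Dim fromDir As New DirectoryInfo(args(0))
        Dim toDir As New DirectoryInfo(args(1))

        If Not fromDir.Exists Then
            Console.Error.WriteLine("Source directory not found: " & fromDir.FullName)
            Return 1
        End If

        ' Create the web server target directory if it is missing.
        If Not toDir.Exists Then toDir.Create()

        ' List the Writer documents that would be converted.
        For Each fi As FileInfo In fromDir.GetFiles("*.sxw")
            Console.WriteLine("Would convert " & fi.FullName & " into " & toDir.FullName)
        Next
        Return 0
    End Function
End Module
```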

Points of Interest

As mentioned above, this app could be modified to do any conversion that is possible with OpenOffice. For example, you could open all *.doc files instead of *.sxw, then output them to HTML. Or, it could open HTML files and output to doc. You get the picture.
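
As a rough illustration, the choice of conversion comes down to which file pattern is scanned and which filter name is passed when saving. The sketch below maps a source extension and a target format to a filter name. The FilterSketch module and ExportFilterFor function are hypothetical, and the filter names other than "HTML (StarWriter)" are assumed to match your OpenOffice version, so check them against its filter list.

```vbnet
Imports System
Imports System.IO

Module FilterSketch
    ' Hypothetical helper: returns the value to pass as "FilterName", or
    ' Nothing when this sketch does not cover the combination.
    Function ExportFilterFor(ByVal sourceFile As String, ByVal target As String) As String
        Dim ext As String = Path.GetExtension(sourceFile).ToLowerInvariant()
        ' The filter depends on which application opens the file:
        ' .sxw and .doc load as Writer documents, .sxc and .xls as Calc.
        Dim isWriter As Boolean = (ext = ".sxw" OrElse ext = ".doc")
        Dim isCalc As Boolean = (ext = ".sxc" OrElse ext = ".xls")

        Select Case target.ToLowerInvariant()
            Case "html"
                If isWriter Then Return "HTML (StarWriter)"
                If isCalc Then Return "HTML (StarCalc)"
            Case "pdf"
                If isWriter Then Return "writer_pdf_Export"
                If isCalc Then Return "calc_pdf_Export"
            Case "doc"
                If isWriter Then Return "MS Word 97"
        End Select
        Return Nothing
    End Function
End Module
```

For example, ExportFilterFor("design.doc", "pdf") yields "writer_pdf_Export", because a Word file opens as a Writer document and is therefore exported with the Writer PDF filter.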

History

My original version was more robust to solve my particular problem. It updated the CM Tree, and only copied files with a certain naming convention, among other things. I have trimmed it down considerably, in hopes of making it more straightforward.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

duwke
Web Developer
United States


Comments and Discussions

 
General: Update (ggraham412, 20 Jun '08 - 8:11)
Thanks for taking the time - Works for me! - "5"
 
Before reading this I could never get the connection mechanism to work. But I also thought I'd let you know about bootstrapconnector, which is what I ended up using to connect to OpenOffice, but it has to be version 2.3 or greater. (Obviously way after your article came out ;-) )
 
http://user.services.openoffice.org/en/forum/viewtopic.php?f=44&t=2520

Question: "Can't Create ActiveX Object" - OpenOffice (Ailton Silva, 3 Oct '07 - 9:31)
Hello,
 
I was trying to run this sample application in Visual Studio .NET, and the line of code below fails.
 
CreateObject("com.sun.star.ServiceManager")
 
It always throws the exception "Can't Create ActiveX Object" after a couple of minutes.
 
I've installed OpenOffice 2.2.3. The same problem occurred with OO 1.1.5, too!
 
Does anyone know why? Can anybody help me?
 
Thanks a lot
 

Ailton Silva
Brazil

 
Ailton Silva
Brazilian IT Professional

Question: Problem when using open office in web application (RathiSarov, 10 Oct '06 - 11:04)
Hi,
 
This is Rathi. I have used OpenOffice to convert documents to PDF in my ASP.NET web application. It works fine when I run the application locally using localhost, but it does not work if I access the same application from the same machine using the machine's IP address.
 
It says OpenOffice is not installed (but it is installed, which is why it works locally).
Kindly let me know how to resolve this issue. An early response would be a great help, since the application has to be moved to the client and I get this error when I deploy it.
 
Please help me out.
 
Thanks,
Rathi

Answer: Re: Problem when using open office in web application (Member 3239661, 18 Mar '09 - 20:16)
Did you ever solve this? Please help me. I am also facing a similar problem.

Answer: Re: Problem when using open office in web application (KaNNaN.JC, 17 Feb '10 - 22:58)
Hi rathi..
 
Please send sample code for converting documents to PDF format.
Yours,
KaNNaN

-----------------------------------------------------------------
"Success is When Ur Signature Becomes An Autograph"
 

General: Problem using similar code in ASP.NET (ashutosh9910, 7 Sep '06 - 0:44)
Hi,
 
I was trying to create a similar application in ASP.NET, and the same line of code
 
CreateObject("com.sun.star.ServiceManager")
 
throws the exception "Can't Create ActiveX Object".
 
Can you help me out?
 
Thanks
 
hi

General: Re: Problem using similar code in ASP.NET (Member 3239661, 18 Mar '09 - 20:17)
Did you ever solve this? Please help me. I have been trying this for the past week without any success.

General: Help on create OpenOffice file (mpx200, 9 Aug '06 - 23:22)
Can you attach some code for creating or editing an existing OpenOffice Writer doc? I don't want to use M$Office COM, so if you can help I will be grateful.

General: Is it possible not to install full openoffice (lukethepunk, 4 Aug '06 - 4:12)
Hi,
 
I have used the info in this article to write an app we use for creating PDFs from RTF files produced by one of our systems, so first of all, thanks!
 
I have one question that you might be able to help with. I want to use this on a PC, but I don't want to install the whole OpenOffice suite. We have MS Office installed on some client machines, and the users tend to get confused if both applications are installed.
 
Do you have any idea whether I can just install a couple of OpenOffice DLLs on the client and still have this work, or do I always need the full install?
 
Cheers
Luke

General: Re: Is it possible not to install full openoffice (duwke, 4 Aug '06 - 4:17)
Hi Luke,
 
You are certainly welcome. Glad it helped.
 
Unfortunately, I don't know. There are a few links to the forums in the source code. Those guys will definitely know.
 
If you figure it out, please let me know.
 
Thanks!
 
-Darin

General: Re: Is it possible not to install full openoffice (Kacee Giger, 14 Nov '06 - 10:06)
I would also like to use this with the minimal install, so if you have found a solution, please post it here. Otherwise, I'll continue my search and let you know if I find a solution.

General: Re: Is it possible not to install full openoffice (lukethepunk, 16 Nov '06 - 7:13)
Looking around the forums and user groups on openoffice.org, I can't see any way of doing it without the full install, so...
 
I am going to write an application that will run as a server-side process, monitor a drop folder for incoming files, and convert them to PDFs. That way there won't be any need to install OpenOffice on all the client machines.
 
If anyone is interested once I've done it I can post the code up here
 
Cheers
Luke

Answer: Re: Is it possible not to install full openoffice (toastpooter, 7 Jun '07 - 6:21)
Try this article. It's still practically a full install but at least it's self contained...
 
http://www.codeproject.com/office/PortableOpenOffice.asp

General: Convert from Doc to PDF (intranet_man, 24 Apr '06 - 5:58)
In your article you said it was easy to change conversion methods. However, I cannot seem to get anything to convert, whether from .doc to .html to .pdf or from .sxw to .html to .pdf.
 
I changed the following lines:
 
If fs.Extension.Equals(".doc") Then
    SaveParams(1) = OOoPropertyValue(objServiceManager, "FilterName", "MS Word 97")
    Document.storeToURL(ConvertToURL(toHtmlFile), SaveParams)
    ' pdf export
    SaveParams(1) = OOoPropertyValue(objServiceManager, "FilterName", "writer_pdf_Export")
    Document.storeToURL(ConvertToURL(toPdfFile), SaveParams)
ElseIf fs.Extension.Equals(".sxw") Then
    SaveParams(1) = OOoPropertyValue(objServiceManager, "FilterName", "HTML (StarWriter)")
    Document.storeToURL(ConvertToURL(toHtmlFile), SaveParams)
    ' pdf export
    SaveParams(1) = OOoPropertyValue(objServiceManager, "FilterName", "writer_pdf_Export")
    Document.storeToURL(ConvertToURL(toPdfFile), SaveParams)
End If

 
I've stepped through the program and made sure to change the appropriate environment variables, but no cigar. Any suggestions?

General: Re: Convert from Doc to PDF (duwke, 24 Apr '06 - 6:29)
Does it work for other doc types? Does the original code work with .sxw and .sxc? Is it failing?

General: Re: Convert from Doc to PDF (intranet_man, 24 Apr '06 - 6:43)
It creates the index.html file with a command line message of:
 
4/24/2006 11:34:31 AM Docs2Web completed
 
But there is no PDF file for .sxw or .sxc.
 
There are no errors or warnings on compile, nor in the Immediate window.

General: Re: Convert from Doc to PDF (duwke, 24 Apr '06 - 6:47)
Did you also change line 52 from
 
For Each fi As System.IO.FileInfo In di.GetFiles("*.sxw")
to
 
For Each fi As System.IO.FileInfo In di.GetFiles("*.doc")

General: Re: Convert from Doc to PDF (intranet_man, 24 Apr '06 - 7:12)
Yes, the For statement was changed, as well as the first conversion statement at line 118. I found an interesting article here that is basically using your approach to convert, except it's using a macro in OpenOffice:
 
http://www.oooforum.org/forum/viewtopic.phtml?t=3772&postdays=0&postorder=asc&start=0
 
BTW, thanks for helping me out with this. This is a good resource...

General: Mail Merge (richardsawyer, 2 Oct '05 - 1:11)
Hi Duwke,
 
That is very nice code. Thanks. It's got me about halfway to what I want to do. I have been trying to convert the code snippets from Java, but I don't really understand them at all. What I want to do is automate OpenOffice mail merge from VB.NET. I just can't work it out, and I figure you might be able to help me. Any pointers?
 
The best info I have found so far is at:
http://codesnippets.services.openoffice.org/Writer/Writer.MailMerge.snip
and
http://qa.openoffice.org/source/browse/qa/qadevOOo/tests/java/mod/_sw/SwXMailMerge.java?rev=1.8.22.2&content-type=text/vnd.viewcvs-markup
and
http://api.openoffice.org/source/browse/api/offapi/com/sun/star/text/MailMerge.idl?rev=1.6&only_with_tag=cws_src680_swmailmerge&content-type=text/vnd.viewcvs-markup
 

Regards,
Richard.

General: Attachment (Quinton Viljoen, 30 Aug '05 - 19:42)
Hi
 
The link you provided is not working, no file at that location.

General: Re: Attachment (duwke, 30 Aug '05 - 19:53)
Which link?

General: Re: Attachment (Quinton Viljoen, 30 Aug '05 - 20:17)
The source link

General: Re: Attachment (dgsconseil, 31 Aug '05 - 0:39)
No, I don't think the link was broken.
I got an internal server error saying the server was too busy.
That's all.
By the way, the code used to set property values is from the OOo forum.
Take a look at http://www.oooforum.org/forum/viewtopic.phtml?t=9815&start=0&postdays=0&postorder=asc&highlight=activex
 
PS : i'm trying to translate this in C#. I'll post the result in this thread soon.

General: Re: Attachment (duwke, 31 Aug '05 - 3:04)
It worked fine for me at a different location. Could you try again?
 
Maybe we are referring to different links. Where is the link pointing to?

General: Re: Attachment (Quinton Viljoen, 31 Aug '05 - 3:17)
The link is http://www.codeproject.com/useritems/oo2html/oo2html.zip
 
I tried it again and it is working now. I was getting the "file not found" custom error from CodeProject earlier, though.

Last Updated 28 Aug 2005
Article Copyright 2005 by duwke