Generate PDF Using C#

By Predrag Tomasevic

Using OpenOffice to convert different document types to PDF.
C# (C# 1.0, C# 2.0, C# 3.0), Windows, .NET (.NET 3.5), VS2008, Visual Studio, Dev

Posted: 23 Jul 2008
Updated: 30 Jul 2008

Introduction

I must confess that I’m not a big fan of PDF. Still, it somehow manages to wiggle into almost every project I'm on: clients want to send out documents, Word is bound to Windows, HTML is lame, so PDF it is. Unfortunately, the situation with PDF and C# hasn’t changed much in the past couple of years. If it weren't for the new, fancy, commercial components, I would conclude that it’s almost the same as it was in .NET 1.1 times: it is a pain to create PDFs.

For those of you who have access to components which can convert popular formats to PDF, this article is pretty much useless. But for those who don’t want to, or simply can’t, shell out over $1000 for the ability to convert other formats to PDF, I hope that this solution will prove an attractive alternative.

Idea

During a talk with my friend Toni Ruža (who is primarily a Python developer) about a way to easily convert some WordML reports to PDF, he pointed me to the headless OpenOffice mode. It seems that it has been around for quite some time, but as it is mainly targeted at Java developers, it is no wonder that there was no big fuss about it in C# groups. Still, it promises much: you install OpenOffice, start it in service mode, send commands over the API, and get to use any feature it provides. More than anything else, my interest was to load any supported format into OpenOffice and then export it as PDF.

Just to note: in this article, I'll talk about creating PDFs from other documents, not from scratch. If you need to generate the source documents themselves, I encourage you to first take a look at my Generating Word Reports / Documents article. Follow it, and you'll easily create WordML files (like the one used here) from a database or XML.

Solution architecture

I was saddened to find out that the headless mode of OpenOffice just minimizes GUI operations rather than avoiding them entirely. As someone who has had pretty nasty experiences with Word.Application.Open() (using interactive applications such as Word by programmatically mimicking user actions), I started thinking about how to isolate OpenOffice and query it independently of the main application process, thus enabling loose coupling and a more stable environment. The result was a Windows Service which wraps the OpenOffice process, taking care of the security context and the usage, while providing the needed functionality over Remoting (am I a service-oriented freak or what? :)).

Here is a diagram presenting the classes used in the process:

Figure 1 – Class diagram

ConversionToPDF is the main class when it comes to performing useful work. It employs various classes from the unoidl.com.sun namespace to communicate with OpenOffice, and mimics operations such as opening a file, exporting it as PDF, etc. It also uses the OfficeController class, which is responsible for the lifecycle of OpenOffice’s process: it starts, monitors, and finally kills soffice.exe when it is not in use, to preserve resources.
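To make the lifecycle part more concrete, here is a minimal sketch of how a controller could prepare a headless OpenOffice launch and check whether the process is alive. This is not the code from the download: the class name HeadlessOfficeSketch, the command-line switches, and the UNO port are my assumptions for illustration, so check them against your OpenOffice version.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the article's source code.
public static class HeadlessOfficeSketch
{
    // Builds (but does not start) the start info for a headless soffice.exe.
    // The switches and the port number are assumptions; verify them for your OO version.
    public static ProcessStartInfo BuildStartInfo(string pathToSoffice, int unoPort)
    {
        string arguments = string.Format(
            "-headless -nofirststartwizard -accept=\"socket,host=localhost,port={0};urp;\"",
            unoPort);

        ProcessStartInfo info = new ProcessStartInfo(pathToSoffice, arguments);
        info.UseShellExecute = false; // launch the executable directly
        info.CreateNoWindow = true;   // no console window for the child process
        return info;
    }

    // Returns true if at least one process with the given name is running.
    // Process names are matched without a path; on Windows ".exe" is not part of the name.
    public static bool IsRunning(string processName)
    {
        Process[] matches = Process.GetProcessesByName(processName);
        bool found = matches.Length > 0;
        foreach (Process p in matches)
        {
            p.Dispose(); // release the handles we just acquired
        }
        return found;
    }
}
```

A caller would pass the result of BuildStartInfo to Process.Start when IsRunning reports that no OpenOffice process exists.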

Receiver is the class that my Windows Service registers for usage over Remoting. It implements the IReceiver interface (needed functionality), and serves as the bridge between the main applications and OpenOffice.

Finally, I’ve created the GenericSender class for those not familiar with Remoting. It provides the Init method that accepts an address on which the Windows Service wrapper listens (by default, it is tcp://localhost:6543/OpenOfficeServiceReceiver) and initializes a proxy receiver (available as a property). From that point on, everything is as simple as GenericSender.Receiver.ConvertToPDF(...).

How to start everything on my machine?

Let’s do this in a step-by-step fashion:

  1. Download OpenOffice from here and perform the standard installation. I’ve developed and tested the solution using version 2.4 of OO. If you are setting up everything on an x64 machine, be sure to add the OO program directory (by default: c:\Program Files (x86)\OpenOffice.org 2.4\program) to the PATH environment variable as described in this forum post. If you change the environment variable, be sure to restart the machine so that the change is picked up.
  2. Download the source code that accompanies this article and Build Solution using the Release configuration in Visual Studio 2008. When the build is complete, check OpenOfficeService/bin/Release and run svc_inst.bat. After that, you should see the OpenOffice Wrapper Service in the list of services when you go to Control Panel -> Administrative Tools -> Services. Right click on it, select Properties, go to the Log on tab and check Allow service to interact with desktop.
  3. Before you can start the service, you need to tweak the license agreement. Because the wrapper service will run under the LocalSystem account, you need to somehow tell OpenOffice that the LocalSystem user “accepted” the terms of use. To prevent the license agreement dialog from popping up and blocking everything, you need to modify the file at %OOInstallPath%\share\registry\data\org\openoffice\Setup.xcu by finding this part:
    <prop oor:name="ooSetupInstCompleted">
      <value>false</value>
    </prop>
    <prop oor:name="ooSetupShowIntro">
      <value>true</value>
    </prop>

    and replacing it with (note that LicenseAcceptDate must be later than the OpenOffice installation time):

    <prop oor:name="ooSetupInstCompleted" oor:type="xs:boolean">
     <value>true</value>
    </prop>
    <prop oor:name="LicenseAcceptDate" oor:type="xs:string">
     <value>2008-07-22T14:00:00</value>
    </prop>
    <prop oor:name="FirstStartWizardCompleted" oor:type="xs:boolean">
     <value>true</value>
    </prop>

    This step is taken from here, and I would like to thank Mirko Nasato for his great guide.

    Be sure to start any OpenOffice application (Start -> Programs -> OpenOffice.org -> OpenOffice.org Writer, for example) and validate that it loads without any glitches, to confirm that OO is properly installed and set up.

  4. Verify the service configuration (see the Configuration section), start the OpenOffice Wrapper Service, and use it to convert a document. If you have downloaded the source code, you can right click on Default.aspx (Test Applications -> PDFWeb project) in Solution Explorer and choose View in Browser... Here is a code excerpt that uses the GenericSender from OpenOfficeService.Objects.dll to perform the conversion:
    protected void GiveMePDFButton_Click(object sender, EventArgs e)
    {
        // Initialize Receiver in GenericSender
        OpenOfficeService.Objects.GenericSender.Init(
            "tcp://localhost:6543/OpenOfficeServiceReceiver");
    
        // Translate path and load up file in byte array, convert it
        string source = Server.MapPath("~/SomeWordML.xml");
        byte[] wordML = File.ReadAllBytes(source);
    
        byte[] result = 
          OpenOfficeService.Objects.GenericSender.Receiver.ConvertToPDF(wordML);
    
        // Write response to client
        Response.ContentType = "application/pdf";
        Response.AddHeader("Content-Disposition", "attachment; filename=result.pdf");
    
        Response.BinaryWrite(result);
    }

    Figure 2 – Testing page

Believe it or not, that’s it! You now have a functioning PDF converter which can be queried from C# over Remoting.
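Before writing the returned bytes to the response, it can be worth checking that they at least look like a PDF. The helper below is my own addition, not part of the download (the name PdfSanityCheck is made up). It relies only on the fact that PDF files begin with the marker %PDF-.

```csharp
using System;

// Hypothetical helper, not part of the article's source code.
public static class PdfSanityCheck
{
    // ASCII bytes of the "%PDF-" marker that opens every PDF file.
    private static readonly byte[] Header = { 0x25, 0x50, 0x44, 0x46, 0x2D };

    // Returns true if the buffer starts with the PDF header marker.
    public static bool LooksLikePdf(byte[] data)
    {
        if (data == null || data.Length < Header.Length)
        {
            return false;
        }
        for (int i = 0; i < Header.Length; i++)
        {
            if (data[i] != Header[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

This is only a cheap sanity check: it does not validate the document, but it catches an empty or obviously wrong result before it is sent to the browser as result.pdf.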

During the wrapper implementation, I thought about multi-threading and (hopefully) made calls to ConvertToPDF thread-safe. Conversion requests are queued and processed one by one, so the OpenOffice Wrapper Service can be used by more than one application and, why not, from multiple machines too (the generic sender for an application running on another machine should then be initialized with tcp://%machineHostingService%:6543/OpenOfficeServiceReceiver).
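The "one request at a time" idea can be sketched in a few lines. Note that this is not the queue from the download: for brevity it uses a plain lock, which guarantees that only one conversion runs at any moment but does not guarantee strict first-come-first-served order. The class name SerializedConverter and the delegate-based constructor are assumptions made for the example.

```csharp
using System;

// Hypothetical wrapper, not part of the article's source code.
public sealed class SerializedConverter
{
    private readonly object gate = new object();
    private readonly Func<byte[], byte[]> convert;

    // 'convert' stands in for the real call into OpenOffice.
    public SerializedConverter(Func<byte[], byte[]> convert)
    {
        if (convert == null) throw new ArgumentNullException("convert");
        this.convert = convert;
    }

    // Callers on any thread may invoke this; conversions never overlap.
    public byte[] ConvertToPDF(byte[] document)
    {
        if (document == null) throw new ArgumentNullException("document");
        lock (gate)
        {
            return convert(document);
        }
    }
}
```

If request ordering or per-request timeouts matter, an explicit queue with a single worker thread is the better fit; the lock is just the shortest way to show the serialization.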

Configuration

Currently, the OpenOffice Wrapper Service has the following settings:

  • Port – It’s the port on which the service will listen for requests. By default, it is 6543.
  • ProcessName – The name of the OpenOffice process (used when searching the process list to see if OO is alive). When you start OpenOffice in headless mode, it is soffice.bin (instead of soffice.exe).
  • PathToOpenOffice – Self-explanatory, eh? If you have installed OpenOffice on a path other than the default, you should change this setting (the default path is c:\Program Files\OpenOffice.org 2.4\program\soffice.exe; on x64 machines, add (x86) after Program Files).
  • SecondsIdleAllowed – When a conversion request is submitted, OpenOfficeController checks if OO is running in the background, and if not, starts soffice.exe in headless mode. By default, if no new request is made in 60 seconds, the OpenOffice process will be killed.
  • CheckIntervalInSeconds – The interval at which the service evaluates OpenOffice usage (tied to the previous setting). By default, it is 30 seconds.
  • RequestTimeoutInSeconds – The time in which a response is expected from OpenOffice. If the item stays in the queue for too long, or OpenOffice receives a file too large to process in time, a timeout exception will be thrown. The default wait is 30 seconds.
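To show how these settings relate to each other, here is a small sketch with the defaults listed above. The class WrapperSettingsSketch and its methods are hypothetical, and the comparison rules are my reading of the descriptions, not the service's actual code.

```csharp
using System;

// Hypothetical settings holder, not part of the article's source code.
public sealed class WrapperSettingsSketch
{
    // Defaults as listed in the article.
    public int Port = 6543;
    public int SecondsIdleAllowed = 60;
    public int CheckIntervalInSeconds = 30;
    public int RequestTimeoutInSeconds = 30;

    // Address a client would pass to GenericSender.Init for the given host.
    public string BuildAddress(string host)
    {
        return string.Format("tcp://{0}:{1}/OpenOfficeServiceReceiver", host, Port);
    }

    // Assumed rule: stop OpenOffice once it has been idle for the allowed time.
    public bool ShouldStopOffice(DateTime lastRequestUtc, DateTime nowUtc)
    {
        return (nowUtc - lastRequestUtc).TotalSeconds >= SecondsIdleAllowed;
    }

    // Assumed rule: a request times out once it has waited longer than the limit.
    public bool HasTimedOut(DateTime enqueuedUtc, DateTime nowUtc)
    {
        return (nowUtc - enqueuedUtc).TotalSeconds > RequestTimeoutInSeconds;
    }
}
```

Under these assumed rules, and with the defaults, the idle check runs every 30 seconds, so OpenOffice may stay alive for up to roughly 90 seconds after the last request before a check finds it idle for 60 seconds or more.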

Running in-process?

I would like to mention once again that the Windows Service I wrote is only there to provide a security context and serve as a bridge to OpenOfficeWrapper.dll, which implements the main stuff when it comes to communicating with OpenOffice. If you wish, you can directly reference OpenOfficeWrapper.dll and perform PDF conversions in-process, but you must be sure that your application will run with sufficient security privileges! In my testing, the conversion was successful only if I ran the application under an account that belonged to the Administrators group.

Also, you could run into trouble when trying to run OpenOfficeWrapper on x64 versions of Windows. I’ve had tons of trouble trying to get my Web Application to convert documents to PDF by using the OpenOfficeWrapper in-process on a Windows 2003 x64 machine. So, if you do not really need to have everything in your application’s process, leave the code that wraps OpenOffice separated and use it through a Windows Service.

Words of warning and words of thanks

To me, the documentation of OpenOffice is terrible. OK, I could be another C# "quasi-developer" who finds it easier to look at examples than to crawl through a bunch of Wiki pages, diagrams, and forum posts just to get a couple of lines of code that open a document. But for me, right after MSDN (the absolute champion of useless information, unrelated links, and broken search), the OO developer portal is one more example of how you do not want your documentation to be organized. From what I’ve seen, OpenOffice is a great product considering the cost ($0), and it is a shame that I can’t say the same for its documentation.

On the other hand, the posts of several users on the OOoForum are really helpful; I would specifically like to thank LarsB, tcedi, and DannyB. Most of the conversion code in ConvertToPDF.cs is taken from LarsB’s Commandline PDF convertor; so, thank you man, I hope you’ll continue to post useful snippets.

Conclusion

With this article, I aimed at a simple goal: to provide an easy-to-follow, free, and versatile solution for converting documents to PDF by using C#. I am aware that there are technically more robust solutions, but I do not know of any that is free. If you know of one, please post it in the comments section along with your impression of this article.

Enjoy! ;)

History

  • July 22nd, 2008 - Initial version of the article.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Predrag Tomasevic


I'm currently finishing my Master's thesis and military service. After that, I'll be looking for interesting projects to keep me occupied, so in case you have one, drop a mail to pele [at] beotel [dot] net.
Occupation: President
Company: Philosophers Inc.
Location: Serbia

Last Updated: 30 Jul 2008
Editor: Smitha Vijayan
Copyright 2008 by Predrag Tomasevic