Introduction
Some time ago I had to write a C# application that could convert documents into various formats. The hardest part was finding a way to create PDF files without using any third-party products. Here is a solution.
Background
The source shown here is taken from a larger conversion application. It is a stand-alone project for educational use and describes one possible way of converting documents. It took me a lot of work to figure this out, so please don't copy the code; drop me a line if you wish to use part of it. My comments in the code were originally written in German; I had no time to build a proper release, sorry for that.
Using the code
The code listed below is the main part of the program. Here is a brief look at the idea behind it. Crystal Reports (the .NET reporting system) is able to create PDF files. The only problem is that it can only create them from a database: it requires an OLE object in the database. But if you take a (very) close look at CR (with IDA :-) ) you will find that it can also process BMP, EMF, and WMF. So we only have to insert this kind of data into a table (as a blob) and hand it over to CR. EMF can be created with PowerPoint, PowerPoint can read HTML, and WinWord can create HTML. The only problem left is the organization of the pages: we have to split the document manually. I did this with a RichTextBox.
Now that we know the way, we can convert an RTF file into a PDF:
- We load the RTF into a RichTextBox.
- We split it into parts. Every part is loaded into WinWord and saved as HTML, the WinWord header is destroyed, and the HTML page is loaded into PowerPoint and saved as EMF. The EMF file is written into an Access database as a blob.
- Crystal Reports gets an rpt "template" and the database; the report is created and saved as PDF.
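The page-splitting step can be sketched without any UI control. This is only a simplified illustration, not the code from the project: `PageSplitter` and `SplitByHeight` are hypothetical names, and the input list stands in for the Y positions that `RichTextBox.GetPositionFromCharIndex` would report for each character. It also assumes the layout does not change when earlier pages are cut away, so positions are measured once instead of after every cut.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: decides where pages start from per-character Y positions.
public static class PageSplitter
{
    // charY[i] is the Y coordinate (in pixels) of character i, measured from
    // the top of the whole document. Returns the index of the first character
    // of every page, where a page may be at most pageHeight pixels tall.
    public static List<int> SplitByHeight(IReadOnlyList<int> charY, int pageHeight)
    {
        if (pageHeight <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageHeight));

        var pageStarts = new List<int>();
        if (charY.Count == 0)
            return pageStarts;

        pageStarts.Add(0);
        int pageTop = charY[0];
        for (int i = 1; i < charY.Count; i++)
        {
            // The first character that falls below the page limit opens a new page.
            if (charY[i] - pageTop > pageHeight)
            {
                pageStarts.Add(i);
                pageTop = charY[i];
            }
        }
        return pageStarts;
    }

    public static void Main()
    {
        // Seven characters on four text lines at Y = 0, 300, 700 and 1400.
        int[] charY = { 0, 0, 300, 300, 700, 700, 1400 };
        List<int> starts = SplitByHeight(charY, 650);
        Console.WriteLine(string.Join(", ", starts)); // prints "0, 4, 6"
    }
}
```

Each pair of consecutive start indices then delimits one part that would be saved as its own temporary RTF file.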
I tried to use Ole32 functions, but I didn't find a way to accomplish this; if you know a way in C# .NET, please let me know.

private static void DoRTF2ALL(
CrystalDecisions.Shared.ExportFormatType outTp)
{
int lastsplit = 0;
int nextsplit = 0;
int pageheight= 650;
int pcount= 1;
Point xx;
object Unknown =Type.Missing;
Word.Application newApp;
PowerPoint.Application app;
PowerPoint.Presentation ppp;
string[] TempEnt;
RichTextBox rtf = new RichTextBox();
rtf.Height=25000;
rtf.Width=4048;
rtf.LoadFile(scrfile, RichTextBoxStreamType.RichText);
nCoreHlp.EmptyDB(WorkDir + "\\" + Database);
while ((lastsplit+1)<rtf.Text.Length) ////start page split
{
// cut away the first few pages
rtf.SelectionStart = 0;
rtf.SelectionLength =lastsplit;
rtf.Cut();
for (int r=0;r<=rtf.Text.Length;r++) ////parse through whole text
{
xx = rtf.GetPositionFromCharIndex(r);
nextsplit = rtf.Text.Length;
if (xx.Y > pageheight)
{nextsplit=r-1;r=rtf.Text.Length;}
}
lastsplit=lastsplit+nextsplit;// cut away the end
rtf.SelectionStart = nextsplit;
rtf.SelectionLength =rtf.Text.Length-nextsplit;
rtf.Cut();
rtf.SaveFile(WorkDir + "\\temp.rtf",
RichTextBoxStreamType.RichText);
//////////////////////////////////////////// insert db
newApp = new Word.Application();
newApp.Visible = false;
object Source=WorkDir + "\\temp.rtf";
object Target=WorkDir + "\\temp.html";
newApp.Documents.Open(ref Source,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,ref Unknown);
object format = Word.WdSaveFormat.wdFormatHTML;// no XML; use it?
newApp.ActiveDocument.SaveAs(ref Target,ref format,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown,ref Unknown,
ref Unknown,ref Unknown);
newApp.Quit(ref Unknown,ref Unknown,ref Unknown);
//kill word head
StreamReader sr;
bool not=true;
while (not)
{
try
{
sr = new StreamReader(WorkDir + "\\temp.html");
not=false;
StreamWriter sw = new StreamWriter(WorkDir + "\\temp.txt");
String line;
while ((line = sr.ReadLine()) != null)
{
if (line.CompareTo(
"<meta name=ProgId content=FrontPage.Editor.Document>")!=0)
sw.WriteLine(line); else
//line.Replace //Marina
sw.WriteLine("<meta name=ProgId content=Word.Documens>");
}
sr.Close();
sw.Flush();
sw.Close();
}
catch (Exception) { /* temp.html could not be opened yet; the loop retries */ }
}// kill word head end
File.Delete(WorkDir + "\\temp.html");
File.Move(WorkDir + "\\temp.txt", WorkDir + "\\temp.html");
//File.Delete(WorkDir + "\\temp.txt");
app = new PowerPoint.Application();
ppp = app.Presentations.Open(WorkDir + "\\temp.html",
/*Microsoft.Office.Core.MsoTriState.msoCTrue*/0,
/*Microsoft.Office.Core.MsoTriState.msoTrue*/0,
Microsoft.Office.Core.MsoTriState.msoFalse);//visible? always no
ppp.SaveAs(WorkDir + "\\temp",
PowerPoint.PpSaveAsFileType.ppSaveAsEMF,
Microsoft.Office.Core.MsoTriState.msoFalse);
app.Quit();
//catch the output
TempEnt = Directory.GetFiles(WorkDir + "\\temp\\", "*.emf");
nCoreHlp.InsertDB1(WorkDir + "\\" + Database,TempEnt[0],pcount);
pcount++;
Console.WriteLine("1 Page Converted");// debug
//Console.ReadLine();
////////////////////////////////////////////////// insert db off
rtf.LoadFile(scrfile, RichTextBoxStreamType.RichText);
}//////////// page split done
///Create PDF
ReportDocument doc = new ReportDocument();
doc.Load(WorkDir + "\\DtoD.rpt");
doc.Database.Tables[0].Location = (WorkDir + "\\DtoD.mdb");
doc.ExportOptions.ExportFormatType = outTp;
doc.ExportOptions.ExportDestinationType =
ExportDestinationType.DiskFile;
//DiskFileDestinationOptions
DiskFileDestinationOptions diskOpts = new DiskFileDestinationOptions();
diskOpts.DiskFileName = dstfile;
doc.ExportOptions.DestinationOptions = diskOpts;
doc.Export();
doc.Close();
Directory.Delete(WorkDir + "\\temp\\", true);
File.Delete(WorkDir + "\\temp.rtf");
File.Delete(WorkDir + "\\temp.html");
}
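The "kill word head" step in the listing copies the HTML line by line and swaps one `<meta name=ProgId ...>` line for another. A minimal, string-based sketch of that idea follows. `HtmlHeaderPatcher` and `ReplaceMetaLine` are hypothetical names, and the two meta strings are simply the ones that appear in the listing, passed in as parameters; why exactly those values make PowerPoint accept the file is not something this sketch explains.

```csharp
using System;
using System.IO;

// Hypothetical helper mirroring the line-by-line header rewrite in the listing.
public static class HtmlHeaderPatcher
{
    // Returns a copy of html in which every line that is exactly equal to
    // oldMeta is replaced by newMeta; all other lines are copied unchanged.
    public static string ReplaceMetaLine(string html, string oldMeta, string newMeta)
    {
        using (var reader = new StringReader(html))
        using (var writer = new StringWriter())
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                bool isTarget = string.Equals(line, oldMeta, StringComparison.Ordinal);
                writer.WriteLine(isTarget ? newMeta : line);
            }
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Both meta strings are copied from the listing above.
        const string oldMeta = "<meta name=ProgId content=FrontPage.Editor.Document>";
        const string newMeta = "<meta name=ProgId content=Word.Documens>";

        string html = "<html>\n<head>\n" + oldMeta + "\n</head>\n<body>page 1</body>\n</html>";
        Console.Write(ReplaceMetaLine(html, oldMeta, newMeta));
    }
}
```

Working on strings keeps the rewrite testable; the file-based version in the listing additionally has to wait until Word has released `temp.html`, which is what its retry loop is for.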
Points of Interest
MS-Office 2000 or < has to be installed. I included the Word and PowerPoint interfaces in the project; the C# DLLs are only in the bin/debug directory because of the file size.
History
- 1st version of the demo project.
|
using System;
using System.IO;
using System.Collections.Generic;
using System.Text;
using PortableOpenOffice;

namespace ConvertPdfClass
{
    class Program
    {
        static string sLastError = "";

        static void Main(string[] args)
        {
            ConvertWordToPdf(@"c:\tmp\test.doc", @"c:\tmp\test.pdf",true);
        }

        static bool ConvertWordToPdf(string inputFilename,string outputFilename,bool bShow)
        {
            object oFile = inputFilename;
            object oFalse = false;
            object oTrue = true;
            object oMissing = System.Type.Missing;
            object pageBreak = Microsoft.Office.Interop.Word.WdBreakType.wdPageBreak;
            object outputFile = outputFilename;
            string tempFile = System.IO.Path.ChangeExtension(outputFilename,".ps");
            object oTempFile = tempFile;

            if (File.Exists(tempFile) == true)
            {
                try { File.Delete(tempFile); }
                catch (Exception ex) { sLastError = ex.Message; return false; }
            }
            if (File.Exists(outputFilename) == true)
            {
                try { File.Delete(outputFilename); }
                catch (Exception ex) { sLastError = ex.Message; return false; }
            }

            // Create a new Word application
            Microsoft.Office.Interop.Word._Application wordApplication = new Microsoft.Office.Interop.Word.Application();
            try
            {
                // Create a new file based on our template
                Microsoft.Office.Interop.Word._Document wordDocument = wordApplication.Documents.OpenOld(ref oFile,ref oFalse,ref oTrue,ref oFalse,ref oMissing,ref oMissing,ref oMissing,ref oMissing,ref oMissing,ref oMissing);
                // Make a Word selection object.
                Microsoft.Office.Interop.Word.Selection selection = wordApplication.Selection;
                object oRange = Microsoft.Office.Interop.Word.WdPrintOutRange.wdPrintAllDocument;
                object oItem = Microsoft.Office.Interop.Word.WdPrintOutItem.wdPrintDocumentContent;
                object oPageType = Microsoft.Office.Interop.Word.WdPrintOutPages.wdPrintAllPages;
                object oCopy = 1;
                wordDocument.PrintOut(ref oTrue,ref oFalse,ref oRange,ref oTempFile, ref oMissing,ref oMissing,ref oItem,ref oCopy,ref oMissing ,ref oPageType,ref oTrue,ref oMissing,ref oMissing,ref oMissing,ref oMissing,ref oMissing,ref oMissing,ref oMissing );
                wordDocument.Close(ref oFalse, ref oMissing, ref oMissing);
                wordDocument = null;
                if (File.Exists(tempFile) == false)
                {
                    sLastError = "PostScript fiel error!";
                    return false;
                }
                ACRODISTXLib.PdfDistillerClass thDist = new ACRODISTXLib.PdfDistillerClass();
                thDist.FileToPDF(tempFile, outputFilename, "");
                return true;
            }
            catch (Exception ex)
            {
                sLastError = ex.Message;
                return false;
            }
            finally
            {
                // Finally, Close our Word application
                wordApplication.Quit(ref oMissing, ref oMissing, ref oMissing);
            }
        }
    }
}
|
It's not working. The (ps) file is created, but creating the PDF generates an error. Kindly answer the query.
Ahmed ahmedsuria@yahoo.com
|
Bit of a Rube Goldberg machine, but I nonetheless learned that Crystal Reports can create PDF and, with a little effort, you can get it to export an RTF document to PDF. Interesting factoids nonetheless.
I think there is a bit of a culture divide reading through the comments. I believe the "thumbs downers" tend to favor solutions with clean but detailed APIs and the "thumbs uppers" tend to not want to delve into the details of how PDF works and just want to get the job done. In other words, the former are looking for a library, and the latter are looking for a tool.
My own "Rube Goldberg" PDF conversion tool using Open Office and GPL Ghostscript is posted here: http://www.codeproject.com/KB/java/PDFCM.aspx[^]. I only mention it because it at least does not require third party tools coming from Microsoft.
|
Hello! Sorry, but you did not mention anything about license. So, can I use it in commercial project?
Alex KraS
|
This is the best hacked together solution ever invented. Someone buy him a drink (not me though, i'm too busy building my mp3 to wmf to raw to wav to css to water-feature converter)
|
Sorry to flame this here, but pretty weak dude, wtf??? rtf to html to wmf, then import to access as blob, then use crystal? That is the single most ridiculous thing I have ever seen - and you people complaining about errors are even worse for trying to implement it. And yes you are using 3rd party libraries if you are using M$-office and crystal to spin a pdf, probably at least 10 dlls involved - talk about bloated. You could have cut all that cr@p out and used a simple open source library like itextsharp - copy and paste ONE FREAKING DLL and it works.
itextsharp - ported from java based itext, is the way to go if you want to implement exporting data to a pdf from .net based apps with out all the rigamorole of crystal. I've done crystal, it sucks, and I refuse to use it anymore- use active reports instead.
itextSharp is available for download for free from: http://sourceforge.net/projects/itextsharp/[^]
Computers let you make more mistakes than any other invention in history. With the possible exception of handguns and Tequila
|
Unfair. He is telling us about a tool that can do the PDF conversion in a special circumstance, he is not presenting a detailed API and library. It may lack in scope, but he makes up for it in ingenuity. (Granted he whiffed on the question of "third party products"...)
By the way, here's the opening from the Code Project http://www.codeproject.com/KB/graphics/iTextSharpTutorial.aspx[^] "iTextSharp" tutorial :
There are several ways to create PDFs. The hardest of them all is perhaps to create it on your own using C#. However, if you want to learn how to do so, you have to climb a steep learning curve. You can either read the 1300+ page specification document available free from Adobe's PDF Technology Center or use an open source library called iTextSharp. iTextSharp eases the learning curve a fair amount. But learning to use iTextSharp is itself non-trivial. The people behind iTextSharp have done a very nice job of putting together a set of tutorials. If you get through the tutorials, creating a PDF becomes somewhat easier. The tutorials, however, are based on .NET 1.x and cannot be used "out of the box" with .NET 2.0 without a fair amount of code rework.
So do you go for the Rube Goldberg machine that gets it done today? Or the fancy blue ribbon solution that requires a stable of programmers to explore and implement? Don't scare people away from the former just because you're inclined to the latter.
|
I found an amazing html to pdf converter library for .net at http://www.dotnet-reporting.com , is the same. It has full support for HTML tags and CSS and I created a PDF report from a ASP.NET page in a few minutes. That would have taken ages with other reporting tools.
Here are a few lines of code that I used to create the report
PdfConverter pdfConverter = new PdfConverter();
pdfConverter.PdfDocumentOptions.PdfPageSize = PdfPageSize.A4;
pdfConverter.PdfDocumentOptions.PdfPageOrientation = PDFPageOrientation.Portrait;
pdfConverter.PdfDocumentOptions.PdfCompressionLevel = PdfCompressionLevel.Normal;
pdfConverter.PdfDocumentOptions.GenerateSelectablePdf = true;
pdfConverter.PdfDocumentOptions.ShowFooter = false;
pdfConverter.PdfDocumentOptions.ShowHeader = false;
pdfConverter.LicenseFilePath = Server.MapPath(@"~/Bin" ;
byte[] downloadBytes = pdfConverter.GetPdfFromUrlBytes(MyURL);
There are other interesting PDF tools there, like PDF Merge, PDF Split, RTF to PDF Converter, and PDF Security tools.
Regards, Florin
html to pdf converter library for .net
|
Hi,
Can you help me with conversion code that loads an RTF file and converts it into PDF, DOC, and PPT?
|
I get an error in ReportDocument.Export().
This is the message:
An unhandled exception of type 'CrystalDecisions.CrystalReports.Engine.LogOnException' occurred in crystaldecisions.crystalreports.engine.dll
Additional information: Logon failed.
Can anybody help me?
...
|
I am doing some research on converting DOC to PDF using C++. Unfortunately there is only code in C#, .NET, etc. I have no idea of those languages. Who can help me? Thanks a lot!
|
Hi, this is Sridhar. In the following code, doc.Load(WorkDir + "\\DtoD.rpt");, how is the DtoD.rpt file created?
sridhar
|
I got an error:
No overload for method 'Open' takes '15' arguments, at line 472 and line 543.
|
What exactly do you have to do to run the demo code? Do I need to create an Access database and a Crystal report for it to use?
|
Hi Stefan!
I am intrigued by the topic and would like to use (with your permission) a part of your implementation that would do a conversion of DOC and PPT to a BMP or JPEG. Please let me know if you can provide some help.
Thanks much.
Best regards, Vishal K. Mehta
|
I'm also interested. I think you only have to remove the comment on the line that makes the file selection.
I haven't tested it yet because I'm missing the Access db....
|
(1) The application shows an error when running this line of code: ppp=app.Presentations.Open(WorkDir+"\\temp.html",0,0,Microsoft.Office.Core.MsoTriState.msoFalse);
Error message: Object not set to an instance.
(2) What must I do if I want to use the application on one of my web sites?
Warm regards.
Zu Luong
|
I don't want to knock your project, I'm not saying this won't work and it obviously took quite a bit of research and experimentation; but aren't Word, Access, Powerpoint (MS Office) and Crystal Reports considered 'third party products'?
|
I have a solution that uses free "third party products" and works with the print spooler so ANY application that can print can output a pdf... I haven't implemented it in code, so it really isn't worth posting a CP article on it, but I might try to make my process available somewhere.
In the meantime, if anyone is interested, contact me directly and I can provide a PDF (of course) that details everything. It isn't a perfect solution, but it definitely works.