
View PDF files in C# using the Xpdf and MuPDF libraries, and print to PostScript

By Antonio Sandoval, 26 Nov 2010
 

AFPDFLib

Introduction 

Xpdf is an open source library released under the GPL license. Its authors also offer an ActiveX control under a commercial license, but I wrote this wrapper library to render PDF files in C# some time ago, before I knew about that commercial control.

Background

The basic idea is to create a preview of PDF files in C#. After looking in many places on the Internet, I found this wonderful library; the only problem is that the library uses Xlib, and there is no Xlib available for Windows. Fortunately, Xpdf can render the PDF page into a Win32 DC.

Writing the wrapper

C++/CLI can mix managed and unmanaged code thanks to the IJW (It Just Works) technology, so I thought that maybe I could link the xpdf library directly into the wrapper. The problem is that xpdf has some classes that are also declared in .NET. The solution was to compile a separate C++ project with a class that includes only the files necessary to do the interop.
This library (AFPDFLib) contains a simple class that works like a proxy between C++ and C++/CLI, keeping the xpdf objects in the unmanaged heap.
The C# wrapper is linked to AFPDFLib statically, and it only includes:
AFPDFDocInterop.h
OutlineItemInterop.h
SearchResultInterop.h
The classes:
AFPDFDoc -> Implements the methods that need xpdf.
AFPDFDocInterop -> Exposes the methods to be wrapped for C#.
PDFWrapper -> The wrapped methods.
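
The proxy idea can be sketched in plain C++ as an opaque-pointer (pimpl) class: the public header only forward-declares the implementation, so a file that includes it never sees the headers whose class names would clash. This is a minimal sketch under assumptions, not the actual AFPDFLib code; DocInterop, DocImpl, LoadFile and PageCount are hypothetical names, and DocImpl merely stands in for the real xpdf objects.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- What the interop header would expose (hypothetical names) ----
class DocImpl;                        // forward declaration only: no library headers leak out

class DocInterop {
public:
    DocInterop();
    ~DocInterop();                    // defined below, where DocImpl is a complete type
    bool LoadFile(const char* fileName);
    int PageCount() const;
private:
    std::unique_ptr<DocImpl> impl_;   // the wrapped object lives on the native heap
};

// ---- What the native .cpp file would contain ----
class DocImpl {                       // stand-in for the real document objects
public:
    std::string path;
    int pages = 0;
};

DocInterop::DocInterop() : impl_(new DocImpl()) {}
DocInterop::~DocInterop() = default;  // releases the native object

bool DocInterop::LoadFile(const char* fileName) {
    if (fileName == nullptr || *fileName == '\0') return false;
    impl_->path = fileName;
    impl_->pages = 1;                 // placeholder: a real implementation would parse the file
    return true;
}

int DocInterop::PageCount() const {
    assert(impl_ != nullptr);
    return impl_->pages;
}
```

Because callers only ever hold a DocInterop, the implementation class can change, or pull in any native header it needs, without touching the code that includes the interop header.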

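Marshaling strings:
IntPtr ptr = Marshal::StringToCoTaskMemAnsi(fileName);
char *singleByte = (char*)ptr.ToPointer();
try {
     // Use singleByte here, for example pass it to the unmanaged call.
} finally {
     // Always free the unmanaged copy, even if the call throws.
     Marshal::FreeCoTaskMem(ptr);
}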

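To release the unmanaged resources it is necessary to implement IDisposable. In C++/CLI the destructor ~PDFWrapper() is exposed as Dispose(), and the finalizer !PDFWrapper() releases the unmanaged document:

!PDFWrapper()
{
   // Finalizer: free the unmanaged document.
   _pdfDoc->Dispose();
}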

Using the code  

The solution file xpdfWin-Interop.sln includes all the necessary files. You can also download the latest version of Xpdf from http://www.foolabs.com/xpdf/ and recompile it without the files that require Xlib.

The project build order is as follows: freetype, xpdf, AFPDFLib, PDFLibNet. Once PDFLibNet is compiled, it can be used from C# code:

OpenFileDialog dlg = new OpenFileDialog();
dlg.Filter = "Portable Document Format (*.pdf)|*.pdf";
if (dlg.ShowDialog() == DialogResult.OK)
{
    _pdfDoc = new PDFLibNet.PDFWrapper();
    _pdfDoc.LoadPDF(dlg.FileName);
    _pdfDoc.CurrentPage = 1;

    // The wrapper needs a window handle, so a PictureBox is used as the target.
    PictureBox pic = new PictureBox();
    pic.Width = 800;
    pic.Height = 1024;
    _pdfDoc.FitToWidth(pic.Handle);
    pic.Height = _pdfDoc.PageHeight;
    _pdfDoc.RenderPage(pic.Handle);

    // Draw the rendered page into a bitmap and show it.
    Bitmap _backbuffer = new Bitmap(_pdfDoc.PageWidth, _pdfDoc.PageHeight);
    using (Graphics g = Graphics.FromImage(_backbuffer))
    {
        IntPtr hdc = g.GetHdc();   // GetHdc is a method; keep the handle so it can be released
        try
        {
            _pdfDoc.RenderHDC(hdc);
        }
        finally
        {
            g.ReleaseHdc(hdc);
        }
    }
    pic.Image = _backbuffer;
}

It is necessary to create a PictureBox because the class only implements a method that accepts an HWND: at first, I was trying to implement the scrolling in the same control in which the PDF is rendered. In the included sample, scrolling is handled by a Panel container.

Xpdf can export the PDF to a PostScript file. For printing, this is the best option if you have a PostScript printer:

// Write pages fromPage..toPage to a PostScript file.
PSOutputDev *psOut = new PSOutputDev((char *)fileName, m_PDFDoc->getXRef(),
                                     m_PDFDoc->getCatalog(), fromPage, toPage, psModePS);
if (psOut->isOk()) {
    m_PDFDoc->displayPages(psOut, fromPage, toPage, PRINT_DPI, PRINT_DPI, 0,
                           gTrue, globalParams->getPSCrop(), gTrue);
}
delete psOut;

The file must be sent to the printer in RAW format (see http://support.microsoft.com/kb/322091).

JPG Export 

For an asynchronous export:

_doc.ExportJpg(filename,
    1,        // From page
    1,        // To page
    150,      // Resolution in DPI
    90);      // JPG quality

If you need a synchronous operation, it is possible to specify a wait time:

_doc.ExportJpg(filename,
    1,        // From page
    1,        // To page
    150,      // Resolution in DPI
    90,       // JPG quality
    -1);      // Time to wait; -1 waits indefinitely

If the file name does not contain a %d token (for the page number), the procedure replaces .jpg with -page%d.jpg.

PDFWrapper exposes two events, ExportJpgProgress and ExportJpgFinished. Both events are raised from the exporting thread, so it is necessary to marshal the call to the UI thread using Invoke; see frmExportJpg for a sample.
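
The file name rule for the %d token can be sketched in C++ as follows. This is only an illustration of the rule as described, not the library's actual code: MakePagePattern and PageFileName are hypothetical helper names, the extension match is assumed to be case-sensitive, and appending the suffix when there is no .jpg extension is an assumption.

```cpp
#include <cassert>
#include <string>

// Returns the per-page pattern: the name itself if it already holds a %d token,
// otherwise the name with ".jpg" replaced by "-page%d.jpg".
std::string MakePagePattern(const std::string& fileName) {
    if (fileName.find("%d") != std::string::npos) return fileName;
    const std::string ext = ".jpg";
    if (fileName.size() >= ext.size() &&
        fileName.compare(fileName.size() - ext.size(), ext.size(), ext) == 0) {
        return fileName.substr(0, fileName.size() - ext.size()) + "-page%d.jpg";
    }
    return fileName + "-page%d.jpg";  // assumption: no .jpg extension to replace
}

// Builds the file name for one page by substituting the first %d token.
// The substitution is done by hand so the name is never used as a printf format.
std::string PageFileName(const std::string& fileName, int page) {
    const std::string pattern = MakePagePattern(fileName);
    const std::string::size_type pos = pattern.find("%d");
    assert(pos != std::string::npos);  // MakePagePattern always yields a token
    return pattern.substr(0, pos) + std::to_string(page) + pattern.substr(pos + 2);
}
```

With these helpers, exporting page 3 to "out.jpg" would produce "out-page3.jpg", while "scan-%d.jpg" is used unchanged as the pattern.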

History:    

06\July\2009:
  • Full deployed solution.
  • Updated to xpdf 3.0.2 version.
  • FreeType updated to 2.3.1
  • When clicking a bookmark or a search result, the page scrolls to the correct position.
  • PostScript implemented. 
  • Now gets the Title, Author.  
08\July\2009
  • Some memory leaks corrected 
  • Prerender next page in new thread 
  • Cache of pages  
  • Mouse Scrolling 
  • Mouse Navigation 
  • Load links from page (LinkURI, LinkGoTo) 
11\July\2009 
  • Using DIB sections, which fixes the problem with the zoom.
  • Added the PageViewer control; now only the viewable area is rendered.
  • Open password protected files.
  • Export to txt
  • Export to jpg
12\July\2009
  • Added support for Unicode in Bookmarks, title, subject, keywords...
  • Added support for named destinations
13\July\2009
  • Fixed some bugs.
  • Added support for unicode search
20\July\2009  
  • Multithread jpg export
  • Fixed other bugs
07\Nov\2009 
  • Added MuPDF as second renderer. 
26\Nov\2010
Known issues:
  • MuPDF has some problems with transparency, but it is faster than xpdf.
  • A couple of memory leaks.

IMPORTANT:         

MuPDF uses recursion to analyze the document tree, so it is necessary to increase the stack size to at least 4 MB to avoid problems with some complex files (use editbin for C# and VB.NET executables). Sooner or later the recursion causes a stack overflow if the tree is big enough, so until this is fixed it is the most important issue.

To Do:        

- Apply the latest xpdf patches.
- Show multiple pages in the viewer.
- Improve the user interface.
- Implement LoadFromStream for MuPDF.

Some functionality that can be extracted from xpdf is still missing:
- Enable selection, image extraction and instant snapshot.
- Print to non-PostScript printers.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

About the Author

Antonio Sandoval
Engineer HidraQuim SA de CV
Mexico Mexico
I'm a chemical engineer who loves programming.
2003 - Graduated as Technical Programmer from UNIVA, México.
2009 - Graduated in Chemical Engineering from Universidad de Guadalajara, Mexico.
Programmer by hobby for the last 6 years.


Comments and Discussions

 
QuestionLinks?membertntblow26 Mar '13 - 10:03 
Isn't there any way to make links in pdf clickable?
QuestionHelp!!! I need .net 1.1 versionmemberxiaoheibai18 Nov '12 - 22:15 
How can I use this DLL file in a .NET Framework 1.1 project?
Please build a .NET 1.1 version!!!
QuestionWhere can I get .sln project file??memberPhiru3 Apr '12 - 14:05 
I go to the link listed on this article, but
 
I couldn't find anything like project file.
 
I only just found some folders including only .c files.
 
That's why i don't know how to compile them.
 
Please help me.
 
Additionally, I am a vs2010 user.
 
Thanks.
QuestionIs that easy to do a whole document search and list all results?memberMayaya6520 Feb '12 - 19:18 
Hi Antonio,
 
I am wondering if it is easy to do a whole document search for certain word and list all result in a "result pane"? Can you give me some hints? Thanks!
QuestionIs this Compatible with .Net FrameWork 4.0?memberSankaroo72 Feb '12 - 1:48 
Good Post!
Kindly reply me.
Is this Compatible with .Net FrameWork 4.0?
AnswerRe: Is this Compatible with .Net FrameWork 4.0?memberBarbara Post1 Jun '12 - 1:58 
Hi,
 
I could not make the VS 2008 to VS 2010 converted solution build as is : it couldn't upgrade 2 C++ projets, so hopes are tiny at first glance.
Questioncan I used PDFLibNet.dll in commercial applications on the CPOL license??memberkikimart.japan25 Dec '11 - 17:59 
can I used PDFLibNet.dll in commercial applications on the CPOL license??
QuestionPDFLibNet.dll for VS2003 [modified]memberlianganton6 Dec '11 - 18:38 
Help:Has a "PDFLibNet.dll" for VS2003 to add reference?if had,where is download?

modified 7 Dec '11 - 19:35.

QuestionText Positionmemberdavfeu29 Nov '11 - 21:23 
Is it possible to return the position of extracted textblocks and/or words on a pdfpage?
QuestionHello Antonio, however when I launch a Visual Basic versionmemberGL_Terminator11 Oct '11 - 7:17 
Hello Antonio, sorry for not writing to you in English; if you like, you can delete this comment afterwards. I used your project for a personal project of mine at the beginning of 2010. It was for a multimedia virtual exploration: when you stood in front of a painting in the virtual scene, a list appeared, and when you clicked it the book came up as a PDF (if you like, I can share it so you can see it). Apart from two or three things, I was able to use your wrapper without major problems. The problem is that I now have Windows 7 64-bit and the application does not start, but when I click your Visual Basic version it does start. Any suggestions?
AnswerRe: Hello Antonio, however when I launch a Visual Basic versionmemberAntonio Sandoval11 Oct '11 - 9:03 
Hello, my email is ssjantonio@hotmail.com. As soon as I have a little free time we can work on solving the problem.
QuestionHelp... System.AccessViolationExceptionmemberGabriel_7524 Aug '11 - 16:26 
Hi, first at all... great job.
 
I'm using to export to jpg a pdf, but i got an exception:
 
using (PDFLibNet.PDFWrapper pdf = new PDFLibNet.PDFWrapper())
{
    if (pdf.LoadPDF(path))
    {
        pdf.ExportJpg(pathJPG + "_1.jpg", 1, 1, 300, 100, -1);
        pdf.ExportJpg(pathJPG + "_2.jpg", 2, 2, 300, 100, -1);
    }
} <--- At this point AccessViolationException is thrown with the following message "Attempted to read or write protected memory ...."
 
Do you have any ideas how i can fix that?
AnswerRe: Help... System.AccessViolationExceptionmemberAntonio Sandoval11 Oct '11 - 9:04 
I will work on that bug as soon as I have some free time, thank you!
QuestionSearch function crashesmembercricrides3 Aug '11 - 2:38 
Hello
 
I use FindFirst and FindNext to search in my pdf documents.
It works fine on small pdfs but if I open a large pdf and try a search, it crashed after several calls of the findnext.
I tried various solutions like calling findfirst everytime the findnext goes to a new page.
Any idea how to avoid this error?
I use version 1.6.8 of the PDFLIBNET.DLL.
 
Thanks a lot
 
regards
 
christophe
QuestionHow can i get this dll CJK(chinese Japanise Koreal )supported?memberzhuoml2 Aug '11 - 16:47 
especially for the text extract part of this C# PDFLibdll,how can i make the text meaningfull.
Question.net 2.0.memberJosip Habjan19 Jul '11 - 0:11 
is there any chance to get this to work under .net 2.0.?
QuestionHow can I merge the code with the latest mupdf?memberwmjordan4 Jul '11 - 23:46 
I am new to C programming. Is it simple to merge the latest mupdf into this project?
GeneralI have the fix for the muPDF rendering memory leak [modified]memberRon Schuler29 May '11 - 15:57 
No matter what I tried, the _xref object just kept getting bigger and bigger after each new page render.
I found a post complaining about the same behavior in an Android project.
 
It's a one line fix (in two places):
 
mupdfEngine.cpp
 
HBITMAP mupdfEngine::renderBitmap(....
fz_pixmap* mupdfEngine::display(....
 
    fz_error error = pdf_runpagefortarget(_xref, page, dev, ctm);
    pdf_agestore(_xref->store, 3); //Tell pdf_xref object to do some house cleaning

modified on Monday, May 30, 2011 9:32 AM

AnswerRe: I have the fix for the muPDF rendering memory leakmemberAntonio Sandoval30 May '11 - 5:54 
Hi Ron, I have been reading your comments.
Yes, I have found the same as you; this is an old bug in MuPDF. After taking a look, I can see that each call to pdf_loadpagetree allocates a new block of memory for xref:
 
xref->pagerefs = fz_malloc(sizeof(fz_obj*) * xref->pagecap);
xref->pageobjs = fz_malloc(sizeof(fz_obj*) * xref->pagecap);
 
This is a very small block, but it may cause another memory leak when opening multiple files.
I have added pdf_freexref in mupdfEngine->LoadFile, like your first fix:
 
if(_xref!=NULL)
{
pdf_freexref(_xref);
_xref = NULL;
}
 
Regards
GeneralRe: I have the fix for the muPDF rendering memory leakmemberRon Schuler30 May '11 - 6:08 
Calling pdf_agestore(_xref->store, 3) at the end of the render routine fixes the leak.
No need to free _xref since _muPDF is deallocated in Dispose().
GeneralRe: I have the fix for the muPDF rendering memory leakmemberRon Schuler30 May '11 - 6:20 
I have tested memory leaks with 10 different pdf documents > 25mb each and > 400 pages in a loop 100 times with 100 pages rendered per file with muPDF and XPDF.
It does not leak.
I have the RenderNotifyFinished event handler disabled as posted earlier.
If you fix the disposal of the RenderNotifyFinished event handler, I am confident that all memory leaks will be fixed.
GeneralRe: I have the fix for the muPDF rendering memory leak [modified]memberPizzamaka32122 Nov '11 - 1:29 
Hi.
 
Have you already applied this fix to the downloadable version (not the .Net 4.0 - I am still on 3.5)? I have a memory leak, and I do not know, whether the fix is already implemented.
BTW: The memoryleak is not present in the version, that this guy uses: link But since that version does not support Rotation I cannot use it...
 
Oh, and great work!

modified 22 Nov '11 - 7:56.

GeneralRenderNotifyFinished handler not being removedmemberRon Schuler25 May '11 - 16:48 
I got a post on my pdfviewer page that is claiming that the handler for RenderNotifyFinished is preventing PdfPage and PdfWrapper from being garbage collected.
Can i put the code to remove the handlers in PDFWrapper.h in the !PDFWrapper() or ~PDFWrapper() methods?
I'm assuming that _evRenderNotifyFinished or _internalRenderFinished is the cause.
What is the syntax to remove/delete these event handlers?
GeneralRe: RenderNotifyFinished handler not being removed [modified]memberRon Schuler28 May '11 - 4:41 
After more examination, i think I see the root cause in PDFWrapper.cpp.
The RenderPage method is adding the RenderNotifyFinished handler regardless of the value of bEnableThread.
I would prefer to remove the handler during PDFWrapper.Dispose but I keep getting protected memory errors (probably due to me not understanding the event model in C++).
Instead, I am changing the other RenderPage methods to pass False for bEnableThread and changing the main RenderPage method to not subscribe to the event if bEnableThread is false.
The memory leak is fixed if bEnableThread is false but is still present if bEnableThread is True.
 
This is how I am modifying PDFWrapper.cpp:
 
	bool PDFWrapper::RenderPage(IntPtr handler, System::Boolean bForce, System::Boolean bEnableThread)
	{
		long hwnd=(long)handler.ToPointer();
		if (bEnableThread) {
			if(this->_internalRenderNotifyFinished==nullptr){		
				_internalRenderNotifyFinished=gcnew RenderNotifyFinishedHandler(this,&PDFWrapper::_RenderNotifyFinished);
				_gchRenderNotifyFinished = GCHandle::Alloc(_internalRenderNotifyFinished);
			}
			_pdfDoc->SetRenderNotifyFinishedHandler(Marshal::GetFunctionPointerForDelegate(_internalRenderNotifyFinished).ToPointer());
		}
		long ret =_pdfDoc->RenderPage(hwnd,bForce,bEnableThread);
		if(ret==10001)
			throw gcnew System::OutOfMemoryException(ret.ToString());
		
		return true;
	}
 
	bool PDFWrapper::RenderPage(IntPtr handler, System::Boolean bForce){
		return RenderPage(handler,bForce,false);
	}
 
	bool PDFWrapper::RenderPage(IntPtr handler)
	{
		return RenderPage(handler,false,false);
	}

modified on Saturday, May 28, 2011 12:25 PM

GeneralRe: RenderNotifyFinished handler not being removedmemberAntonio Sandoval28 May '11 - 7:33 
Hi Ron, this is a bug; I will put it on my long list :)


Last Updated 26 Nov 2010
Article Copyright 2009 by Antonio Sandoval
Everything else Copyright © CodeProject, 1999-2013