PDF Viewer Control Without Acrobat Reader Installed

Ron Schuler

6 Oct 2009

PDF document viewer control that does not require any Acrobat product to be installed

Introduction

This article discusses how to create a .NET PDF Viewer control that is not dependent on Acrobat software being installed.

Fundamental Concepts

The basic steps that need to take place in order to view a PDF document:

Get a page count of the PDF document that needs to be viewed to define your page number boundaries (iTextSharp or PDFLibNET)
Convert the PDF document (specific page on demand) to a raster image format (GhostScript API or PDFLibNET)
--(Deprecated) Extract only the current frame to be viewed from the raster image (FreeImage.Net)
Convert the current frame to be viewed into a System.Drawing.Image
Display the current frame in a PictureBox control
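
The page count from the first step is what keeps navigation inside the document. As a minimal sketch of that boundary check (the module and function names here are hypothetical and are not part of the PDFView project), a requested page number can be clamped before anything is rendered:

```vbnet
Imports System

Public Module PageBoundsSketch
    ' Hypothetical helper: keeps a requested 1-based page number inside 1..pageCount.
    Public Function ClampPageNumber(ByVal requested As Integer, _
                                    ByVal pageCount As Integer) As Integer
        If pageCount < 1 Then
            Throw New ArgumentOutOfRangeException("pageCount", "Document has no pages.")
        End If
        ' Requests below 1 snap to the first page; requests past the end snap to the last page
        Return Math.Max(1, Math.Min(requested, pageCount))
    End Function
End Module
```

The sketch assumes 1-based page numbers, which is how the PDF rendering routines later in this article are called.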

Several utility classes were created or added from others which expose functionality needed from the various helper libraries.

GhostScriptLib.vb (contains methods to convert PDF to TIFF for Viewing and Printing)
AFPDFLibUtil.vb (contains methods to convert PDF to System.Drawing.Image for Viewing and Printing as well as methods to create a Bookmark TreeView)
iTextSharpUtil.vb (contains methods for getting PDF page count, converting images to searchable PDF and for extracting PDF bookmarks into TreeNodes)
PrinterUtil.vb (contains methods for sending images to printers)
ImageUtil.vb (contains methods for image manipulation such as resize, rotation, conversion, etc.)
TesseractOCR.vb (contains methods for Optical Character Recognition from images)
PDFViewer.vb (contains the Viewer user control)

I was tempted to move every function over to PDFLibNet (XPDF), which is faster, but after a lot of testing I decided to use both Ghostscript and PDFLibNet. Ghostscript is used for printing, "PDF to image" conversion, and as a secondary renderer in case of XPDF incompatibility. PDFLibNet is used for quick PDF-to-screen rendering, searching, and bookmarks.

Using the Code

This project consists of 7 DLLs that must all be in the same directory:

FreeImage.dll
FreeImageNET.dll
gsdll32.dll
itextsharp.dll
PDFLibNET.dll
tessnet2_32.dll
PDFView.dll

Due to file size restrictions, I could not include the Ghostscript 8.64 DLL (gsdll32.dll) in the source code. Please download the Win32 Ghostscript 8.64 package from sourceforge.net and place the file "gsdll32.dll" into the \PDFView\lib directory where the other DLLs already exist.
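
Because gsdll32.dll has to be copied in by hand, it may be worth checking for missing files before the viewer is created. The following is only a sketch: the module and function names are hypothetical, and the file list is copied from the list above.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Public Module DependencyCheckSketch
    ' File names taken from the DLL list in this article
    Private ReadOnly RequiredDlls As String() = { _
        "FreeImage.dll", "FreeImageNET.dll", "gsdll32.dll", "itextsharp.dll", _
        "PDFLibNET.dll", "tessnet2_32.dll", "PDFView.dll"}

    ' Hypothetical helper: returns the names of required DLLs not found in the given folder.
    Public Function FindMissingDlls(ByVal folder As String) As List(Of String)
        Dim missing As New List(Of String)
        For Each dllName As String In RequiredDlls
            If Not File.Exists(Path.Combine(folder, dllName)) Then
                missing.Add(dllName)
            End If
        Next
        Return missing
    End Function
End Module
```

Calling FindMissingDlls(Application.StartupPath) at startup and showing the returned names is one way to turn a confusing load failure into a clear message.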

To place a PDF viewer control on a form:

VB.NET

Dim PDFFileName As String = "MyPDF.pdf"

Dim PDFViewer As New PDFView.PDFViewer
' Specify whether you want to see bookmarks in the control
' Bookmarks are enabled by default
' PDFViewer.AllowBookmarks = False 'Disable bookmarks

' Get the page count of the PDF document if you want to
' conditionally set properties of the PDFViewer control
' Dim PageCount As Integer = PDFViewer.PageCount(PDFFileName)

' To use Ghostscript, UseXPDF = False
' Ghostscript is slower, but is more compatible and has higher quality rendering
' To use XPDF, UseXPDF = True
' XPDF is quite a bit faster than Ghostscript since there is no file i/o involved
' PDFViewer.UseXPDF = False 'Disables use of XPDF and associated features

' PDFViewer displays the file as soon as the FileName property is set
' File can be a PDF or a TIFF
PDFViewer.FileName = PDFFileName

PDFViewer.Dock = DockStyle.Fill 'Autosize the viewer control

Me.Controls.Add(PDFViewer)

The essential part of this solution is extracting the current frame to be viewed from a multi-frame (or single frame) image. At first I used System.Drawing to implement it. I found this to be slower than other C++ solutions that use DIBs (Device Independent Bitmaps) to perform graphic conversions.

VB.NET

' Requires: Imports System.IO, System.Drawing, System.Drawing.Drawing2D, System.Drawing.Imaging
Public Shared Function GetFrameFromTiff( _
    ByVal Filename As String, ByVal FrameNumber As Integer) As Image
    ' The stream must stay open for as long as the source bitmap is in use
    Using fs As FileStream = File.Open(Filename, FileMode.Open, FileAccess.Read)
        Using bm As Bitmap = CType(Image.FromStream(fs), Bitmap)
            ' Select the requested zero-based frame of the multi-frame image
            bm.SelectActiveFrame(FrameDimension.Page, FrameNumber)
            ' Copy the frame into a new bitmap that no longer depends on the stream
            Dim temp As New Bitmap(bm.Width, bm.Height)
            Using g As Graphics = Graphics.FromImage(temp)
                g.InterpolationMode = InterpolationMode.NearestNeighbor
                g.DrawImage(bm, 0, 0, bm.Width, bm.Height)
            End Using
            Return temp
        End Using
    End Using
End Function
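
Before a routine like the one above is called, the caller needs to know how many frames the file contains. System.Drawing can report this through Image.GetFrameCount. The sketch below is an illustration only; the module and function names are hypothetical and are not taken from ImageUtil.vb.

```vbnet
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.IO

Public Module TiffFrameSketch
    ' Hypothetical helper: returns the number of pages (frames) in a TIFF file.
    Public Function GetTiffFrameCount(ByVal fileName As String) As Integer
        Using fs As FileStream = File.Open(fileName, FileMode.Open, FileAccess.Read)
            Using img As Image = Image.FromStream(fs)
                ' FrameDimension.Page counts the pages of a multi-page TIFF
                Return img.GetFrameCount(FrameDimension.Page)
            End Using
        End Using
    End Function
End Module
```

Valid frame numbers for SelectActiveFrame then run from 0 to GetTiffFrameCount(fileName) - 1.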

I then tried implementing FreeImage with a .NET wrapper which gave it a little speed boost. FreeImage also has a ton of image conversion functions which may come in handy if you wanted to extend this into an editor.

VB.NET

Public Shared Function GetFrameFromTiff2( _
    ByVal Filename As String, ByVal FrameNumber As Integer) As Image
    ' Open the multi-page file and lock the requested page
    Dim dib As FIMULTIBITMAP = FreeImage.OpenMultiBitmapEx(Filename)
    Dim page As FIBITMAP = FreeImage.LockPage(dib, FrameNumber)
    ' Convert the FreeImage page into a System.Drawing bitmap
    GetFrameFromTiff2 = FreeImage.GetBitmap(page)
    page.SetNull()
    FreeImage.CloseMultiBitmapEx(dib)
End Function

I ended up implementing PDFLibNET, which gave it a substantial speed boost since the number of file I/O operations was reduced. Another streamlined routine for extracting one page from a PDF was added to the Ghostscript utility class as well.

AFPDFLibUtil.vb

VB.NET

Public Shared Sub DrawImageFromPDF(ByRef pdfDoc As AFPDFLibNET.AFPDFDoc,
    ByVal PageNumber As Integer, ByRef oPictureBox As PictureBox)
    If pdfDoc IsNot Nothing Then
        ' Position the document at the top-left corner of the requested page
        pdfDoc.CurrentPage = PageNumber
        pdfDoc.CurrentX = 0
        pdfDoc.CurrentY = 0
        pdfDoc.RenderDPI = RENDER_DPI
        pdfDoc.RenderPage(oPictureBox.Handle.ToInt32())
        oPictureBox.Image = Render(pdfDoc)
    End If
End Sub

Public Shared Function Render(ByRef pdfDoc As AFPDFLibNET.AFPDFDoc) As Bitmap
    If pdfDoc IsNot Nothing Then
        Dim backbuffer As New Bitmap(pdfDoc.PageWidth, pdfDoc.PageHeight)
        ' Using disposes the Graphics object, so no separate Dispose call is needed
        Using g As Graphics = Graphics.FromImage(backbuffer)
            ' Let the library paint the rendered page onto the bitmap's device context
            Dim lhdc As Integer = g.GetHdc().ToInt32()
            pdfDoc.RenderHDC(lhdc)
            g.ReleaseHdc()
        End Using
        Return backbuffer
    End If
    Return Nothing
End Function

GhostScriptLib.vb

Public Shared Function GetPageFromPDF(ByVal filename As String,
    ByVal PageNumber As Integer, Optional ByVal ToPrinter As Boolean = False) As Image
    Dim converter As New ConvertPDF.PDFConvert
    Dim Converted As Boolean = False
    converter.RenderingThreads = Environment.ProcessorCount
    converter.OutputToMultipleFile = False
    If PageNumber > 0 Then
        ' Convert only the requested page
        converter.FirstPageToConvert = PageNumber
        converter.LastPageToConvert = PageNumber
    Else
        Return Nothing
    End If
    converter.FitPage = False
    converter.JPEGQuality = 70
    If ToPrinter Then 'Settings for decent print quality
        converter.TextAlphaBit = -1
        converter.GraphicsAlphaBit = -1
        converter.ResolutionX = PRINT_DPI
        converter.ResolutionY = PRINT_DPI
    Else 'Settings for screen resolution
        converter.TextAlphaBit = 4
        converter.GraphicsAlphaBit = 4
        converter.ResolutionX = VIEW_DPI
        converter.ResolutionY = VIEW_DPI
    End If
    converter.OutputFormat = COLOR_PNG_RGB
    Dim input As System.IO.FileInfo = New FileInfo(filename)
    ' Ghostscript writes the page to a temporary PNG file
    Dim output As String = System.IO.Path.GetTempPath & Now.Ticks & ".png"

    Converted = converter.Convert(input.FullName, output)
    If Converted Then
        GetPageFromPDF = New Bitmap(output)
        ImageUtil.DeleteFile(output)
    Else
        GetPageFromPDF = Nothing
    End If
End Function
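
One thing to keep in mind with temporary image files: a Bitmap created directly from a file path keeps that file locked until the Bitmap is disposed, so deleting the file straight away can fail. How ImageUtil.DeleteFile deals with this is not shown in this article. The sketch below shows one common way around the lock, which is to copy the pixels into memory first. The module and function names are hypothetical.

```vbnet
Imports System.Drawing
Imports System.IO

Public Module TempImageSketch
    ' Hypothetical helper: loads an image into memory, then deletes the source file.
    Public Function LoadAndDelete(ByVal fileName As String) As Bitmap
        Dim result As Bitmap
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            Using source As Image = Image.FromStream(fs)
                ' New Bitmap(Image) copies the pixels, so the result does not
                ' depend on the stream or the file once these blocks end
                result = New Bitmap(source)
            End Using
        End Using
        ' No handle to the file is open any more, so it can be removed
        File.Delete(fileName)
        Return result
    End Function
End Module
```

The copy is made in the default 32-bit pixel format, which is normally fine for display in a PictureBox but may differ from the format of the file on disk.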

In the PDFViewer code, a page number is specified and:

The page is loaded from the PDF file and converted to a System.Drawing.Image object.
The PictureBox is updated with the image.

VB.NET

Private Function ShowImageFromFile(ByVal sFileName As String,
    ByVal iFrameNumber As Integer, ByRef oPictureBox As PictureBox,
    Optional ByVal XPDFDPI As Integer = 0) As Image
    oPictureBox.Invalidate()
    ' iFrameNumber is zero-based; the PDF routines expect 1-based page numbers
    If mUseXPDF Then 'Use AFPDFLib (XPDF)
        If ImageUtil.IsPDF(sFileName) Then
            If XPDFDPI > 0 Then
                AFPDFLibUtil.DrawImageFromPDF(mPDFDoc, iFrameNumber + 1,
                    oPictureBox, XPDFDPI)
            Else
                AFPDFLibUtil.DrawImageFromPDF(mPDFDoc, iFrameNumber + 1, oPictureBox)
            End If
        End If
    Else 'Use Ghostscript if PDF or use System.Drawing if TIFF
        If ImageUtil.IsPDF(sFileName) Then 'render one page to an image for viewing
            oPictureBox.Image = ConvertPDF.PDFConvert.GetPageFromPDF(sFileName,
                iFrameNumber + 1)
        ElseIf ImageUtil.IsTiff(sFileName) Then
            oPictureBox.Image = ImageUtil.GetFrameFromTiff(sFileName, iFrameNumber)
        End If
    End If
    oPictureBox.Update()
    Return oPictureBox.Image
End Function

Points of Interest

This project was made possible due to various open source libraries that others were kind enough to distribute freely. I would like to thank all of the Ghostscript, FreeImage.NET, iTextSharp, TessNet, and AFPDFLib (PDFLibNet) developers for their efforts.

History

19th June, 2009: 1.0 Initial release
22nd June, 2009: Updated source code to correctly scale printed pages to the Printable Page Area of the printer that is selected
7th July, 2009: Updated source code to use AFPDFLib (XPDF) or Ghostscript for PDF rendering
15th July, 2009: Updated source code to use PDFLibNet (XPDF ver 3.02pl3) and added search/export options
22nd July, 2009: Added "Image to PDF" import, password prompt for encrypted PDF files, fallback rendering to Ghostscript if XPDF fails, latest version of PDFLibNet with various bug fixes applied, and LZW compression for "PDF to TIFF" export
20th August, 2009: Major changes:
- Added the ability to convert images into a searchable PDF (OCR is English only for now)
- Added the ability to export a PDF to an HTML Image Viewer
- Pages are only rendered at the DPI needed to fill the Viewer window (good speed increase)
- Rotated page settings are kept while viewing the document
- Added the ability to convert images into an encrypted PDF
- Changed bookmark tree generation to use recursion
- Multiple bug fixes (see SVN log on the repository)
5th October, 2009:
- Fixed problem with incorrect configuration error with PDFLibNet.dll
- Removed dependencies on FreeImage

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)