See more: C#4.0 PDF
Hi guys,

I would like to ask whether it is possible to search for and highlight multiple text strings in a PDF file using C#.

Your advice is very much appreciated. Thank you.
Posted 14-Jun-12 16:26pm

Solution 1

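Here is a sample in classic VB (VB6/VBA syntax rather than VB.NET) that searches a PDF for a word through the Adobe Acrobat automation objects. Note that it counts the matches rather than highlighting them, and that it needs a full Adobe Acrobat installation, not just the free Reader: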

Private Sub TextMarkButton1_Click()
    Dim AcroExchApp As Object, AcroExchAVDoc As Object
    Dim AcroExchPDDoc As Object, jso As Object
    Dim filey As String, searchText As String, word As String
    Dim nPages As Long, nWords As Long, nCount As Long
    Dim i As Long, j As Long
    '~~> Change this to your File Name
    filey = "c:\test.pdf"
    '~~> Late-bound Acrobat objects (requires Adobe Acrobat, not just Reader)
    Set AcroExchApp = CreateObject("AcroExch.App")
    Set AcroExchAVDoc = CreateObject("AcroExch.AVDoc")
    '~~> Open the pdf file; stop if it cannot be opened
    If Not AcroExchAVDoc.Open(filey, "") Then
        MsgBox "Could not open " & filey
        AcroExchApp.Exit
        Exit Sub
    End If
    '~~> Get the PDDoc associated with the open AVDoc
    Set AcroExchPDDoc = AcroExchAVDoc.GetPDDoc
    '~~> Search Text
    searchText = "Release"
    '~~> Get JavaScript Object
    Set jso = AcroExchPDDoc.GetJSObject
    '~~> Count of Matches Found
    nCount = 0
    If Not jso Is Nothing Then
        '~~> Total No of pages in pdf
        nPages = jso.numPages
        '~~> Loop through pages (page numbers are zero-based)
        For i = 0 To nPages - 1
            '~~> Get words Count in one Page
            nWords = jso.getPageNumWords(i)
            '~~> Loop through words in that Page
            For j = 0 To nWords - 1
                '~~> Get each word
                word = Trim(CStr(jso.getPageNthWord(i, j)))
                If word <> "" Then
                    '~~> Case-insensitive match of word against Search text
                    If StrComp(word, searchText, vbTextCompare) = 0 Then nCount = nCount + 1
                End If
            Next j
        Next i
    End If
    '~~> Display Number of matches found
    MsgBox nCount
    '~~> Clean Up: close the document without saving, quit Acrobat, release objects
    AcroExchAVDoc.Close True
    AcroExchApp.Exit
    Set jso = Nothing
    Set AcroExchPDDoc = Nothing
    Set AcroExchAVDoc = Nothing
    Set AcroExchApp = Nothing
End Sub
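
The sample above looks for a single word, while the question is about several search terms in C#. The matching step could be sketched roughly as follows. This is only an illustration under stated assumptions: TermCounter and CountMatches are made-up names, the pages are assumed to have already been extracted as arrays of words by whatever PDF library is used, and highlighting itself is not covered.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of any PDF library): counts how often each
// search term occurs in text that has already been split into words per page.
public static class TermCounter
{
    public static Dictionary<string, int> CountMatches(
        IEnumerable<string[]> pages, IEnumerable<string> terms)
    {
        // Case-insensitive keys, roughly like the vbTextCompare comparison.
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string term in terms)
        {
            counts[term] = 0;
        }

        foreach (string[] page in pages)
        {
            foreach (string rawWord in page)
            {
                // Trim each word and skip empty entries, as the VB loop does.
                string word = (rawWord ?? string.Empty).Trim();
                if (word.Length > 0 && counts.ContainsKey(word))
                {
                    counts[word]++;
                }
            }
        }
        return counts;
    }

    public static void Main()
    {
        // Stand-in data; real input would come from the PDF extraction step.
        var pages = new List<string[]>
        {
            new[] { "Release", "notes", "for", "release ", "2" },
            new[] { "Build", "and", "RELEASE", "build" }
        };
        var terms = new[] { "release", "build" };

        Dictionary<string, int> counts = TermCounter.CountMatches(pages, terms);
        foreach (string term in terms)
        {
            Console.WriteLine("{0}: {1}", term, counts[term]);
        }
        // prints:
        // release: 3
        // build: 2
    }
}
```

Keeping the terms in a dictionary means each word is checked once no matter how many terms are searched. To actually highlight the hits, the word positions on the page would also be needed, which depends on the PDF library in use.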
akosidab at 14-Jun-12 21:39pm

Can you please post the whole code? I don't know the objects you used, and they cause a lot of errors.

Solution 2

Alternatively, you can use a third-party library to manipulate PDF files, since the Base Class Library of the .NET Framework does not cover PDF:

Open source: iTextSharp library
Commercial: Spire.PDF

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last updated 14 Jun 2012