See more: C#4.0 PDF
Hi guys,
 
Is it possible to search for multiple text strings in a PDF file and highlight each match using C#?
 
Your advice is very much appreciated. Thank you.
Posted 14-Jun-12 16:26pm

Solution 1

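Here is sample VBA/VB6-style code (not VB.NET) that automates Adobe Acrobat through its COM interface. It counts the words in the PDF that match the search text; it does not highlight them. It needs a full Adobe Acrobat installation, since the free Reader does not expose these objects: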
 
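Private Sub TextMarkButton1_Click()
    '~~> Late-bound Acrobat objects and working variables
    Dim AcroExchApp As Object, AcroExchAVDoc As Object
    Dim AcroExchPDDoc As Object, jso As Object
    Dim filey As String, sustext As String, word As String
    Dim nPages As Long, nWords As Long, nCount As Long
    Dim i As Long, j As Long, result As Long

    '~~> Change this to your File Name
    filey = "c:\test.pdf"
    Set AcroExchApp = CreateObject("AcroExch.App")
    Set AcroExchAVDoc = CreateObject("AcroExch.AVDoc")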
 
    '~~> Open the pdf file; stop if it cannot be opened
    If Not AcroExchAVDoc.Open(filey, "") Then
        MsgBox "Could not open " & filey
        Exit Sub
    End If
 
    '~~> Get the PDDoc associated with the open AVDoc
    Set AcroExchPDDoc = AcroExchAVDoc.GetPDDoc
 
    '~~> Search Text
    sustext = "Release"
 
    '~~> get JavaScript Object
    Set jso = AcroExchPDDoc.GetJSObject
 
    '~~> Show application
    AcroExchApp.Show
 
    '~~> Count of Matches Found
    nCount = 0
 
    If Not jso Is Nothing Then
        '~~> Total No of pages in pdf
        nPages = jso.numpages
        '~~> Loop through pages
        For i = 0 To nPages - 1
            '~~> Get words Count in one Page
            nWords = jso.getPageNumWords(i)
            '~~> Loop through words in that Page
            For j = 0 To nWords - 1
                '~~> Get each word
                word = Trim(CStr(jso.getPageNthWord(i, j)))
                If word <> "" Then
                    '~~> Match word with Search text
                    result = StrComp(word, sustext, vbTextCompare)
                    '~~> If found (case-insensitive match), increment count
                    If result = 0 Then nCount = nCount + 1
                End If
            Next j
        Next i
    End If
 
    '~~> Display Number of matches found
    MsgBox nCount
 
    '~~> Clean Up
    Set jso = Nothing
    Set AcroExchPDDoc = Nothing
    Set AcroExchAVDoc = Nothing
    Set AcroExchApp = Nothing
End Sub
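
Since the question is about C#, the same word-by-word loop could be sketched roughly as follows, extended to several search terms at once. This is only a sketch under a few assumptions: a full Adobe Acrobat installation is present (so the AcroExch.AVDoc ProgID is registered), the project targets C# 4.0 with a reference to Microsoft.CSharp so that dynamic late binding works, and the class and method names (AcrobatWordSearch, CountMatches) are made up for illustration. Like the code above, it counts matches and does not highlight them.

```csharp
using System;
using System.Collections.Generic;

static class AcrobatWordSearch   // hypothetical helper, not part of any library
{
    // Counts case-insensitive whole-word matches for each search term.
    // Assumes Adobe Acrobat (not Reader) is installed and exposes AcroExch.AVDoc.
    public static Dictionary<string, int> CountMatches(string pdfPath, IEnumerable<string> terms)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string term in terms)
            counts[term] = 0;

        Type avDocType = Type.GetTypeFromProgID("AcroExch.AVDoc");
        if (avDocType == null)
            throw new InvalidOperationException("Acrobat COM automation is not available.");

        dynamic avDoc = Activator.CreateInstance(avDocType);
        if (!(bool)avDoc.Open(pdfPath, ""))
            throw new InvalidOperationException("Could not open " + pdfPath);

        try
        {
            // Same objects as the VBA code: AVDoc -> PDDoc -> JavaScript object.
            dynamic pdDoc = avDoc.GetPDDoc();
            dynamic jso = pdDoc.GetJSObject();

            int pageCount = Convert.ToInt32((object)jso.numPages);
            for (int page = 0; page < pageCount; page++)
            {
                int wordCount = Convert.ToInt32((object)jso.getPageNumWords(page));
                for (int i = 0; i < wordCount; i++)
                {
                    string word = (Convert.ToString((object)jso.getPageNthWord(page, i)) ?? "").Trim();
                    if (counts.ContainsKey(word))
                        counts[word]++;
                }
            }
        }
        finally
        {
            avDoc.Close(1);   // a positive value closes the document without saving
        }
        return counts;
    }
}
```

A call such as AcrobatWordSearch.CountMatches(@"c:\test.pdf", new[] { "Release", "Version" }) would return one count per term. Turning the matches into actual highlights would need an extra step, for example creating highlight annotations from the word coordinates that the Acrobat JavaScript API reports through getPageNthWordQuads; that part is not shown here.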
Comments
akosidab at 14-Jun-12 21:39pm
   
Hi,
 
Can you please post the whole code? I don't know which objects you used, and it causes a lot of errors.

Solution 2

Alternatively, you can use a third-party library, since the Base Class Library of the .NET Framework has no classes for manipulating PDF files:
Open-source library: iTextSharp

Commercial library: Spire.PDF
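
As a rough illustration of the iTextSharp route, the sketch below searches the extracted text of each page for several terms. It assumes iTextSharp 5.x is referenced (PdfReader and PdfTextExtractor come from that library); the class and method names (ITextSharpTermSearch, FindPages) are invented for this example. It only reports which pages contain each term and does not add highlights.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class ITextSharpTermSearch   // hypothetical helper for illustration
{
    // Returns, for each search term, the 1-based page numbers whose
    // extracted text contains the term (case-insensitive substring match).
    public static Dictionary<string, List<int>> FindPages(string pdfPath, IEnumerable<string> terms)
    {
        var hits = new Dictionary<string, List<int>>();
        foreach (string term in terms)
            hits[term] = new List<int>();

        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string text = PdfTextExtractor.GetTextFromPage(reader, page);
                foreach (string term in hits.Keys)
                {
                    if (text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
                        hits[term].Add(page);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return hits;
    }
}
```

Highlighting would additionally need the on-page coordinates of each match, which plain text extraction does not provide, so treat this as a starting point for the search half of the problem only.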

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 14 Jun 2012