See more: C# .NET PDF
Guys,
 
Does anyone know a good open-source or third-party .NET library for redacting sensitive information from PDFs?

I Googled for one, but none of the results were of use. Most libraries have the following limitations:
1) They can't redact a PDF using a regex.
2) After redaction, converting the redacted PDF to text still shows the sensitive information, which defeats the purpose.

Thanks in advance :)
Posted 1-May-13 5:11am
Edited 1-May-13 5:35am
v3
Comments
Kenneth Haugland at 1-May-13 10:35am
   
Have you looked at iText?
ganesh.rit at 1-May-13 11:00am
   
No... I am looking into how we can use iTextSharp for PDF redaction. If you have any ideas, please let me know.
Sergey Alexandrovich Kryukov at 1-May-13 10:36am
   
I think by "redact" you mean "edit". Please check with an English dictionary.
The question is really vague.
—SA
CHill60 at 1-May-13 10:40am
   
Sergey Alexandrovich Kryukov at 1-May-13 11:02am
   
I know, I just thought it should be "edit", but Richard explained by another possible meaning below.
Since in some languages a word like "redact" is used instead of "edit" (when there is no word "edit"), I assumed it was the influence of the OP's native language.
—SA
Richard MacCutchan at 1-May-13 10:48am
   
Redact is used nowadays to mean obliterating sensitive information in documents that are made public. It was used on a daily basis a couple of years ago when all British MPs' expense claims were published in the press.
Sergey Alexandrovich Kryukov at 1-May-13 11:04am
   
Ah, got it. Thank you very much for pointing it out, I did not know. I thought it was the influence of a different language, which often happens. I explained why I thought so, above...
—SA
ganesh.rit at 1-May-13 10:59am
   
PDF redaction is a facility through which we can remove sensitive information from a PDF. Adobe provides some paid tools through which we can redact PDFs in Acrobat Reader. What I want here is to redact PDFs programmatically using a .NET library.
Sergey Alexandrovich Kryukov at 1-May-13 11:05am
   
Ah, I got it, thank you. Please see my comments above explaining why I thought it was a mistake...
—SA
sangitasojitra at 26-Mar-14 11:36am
   
Ganesh,
I am also looking for a good redaction tool that removes sensitive information from a PDF, so that the information is no longer present after the PDF is converted to a text document.
Do you know of any tool that provides this?
Your help is appreciated a lot.
Thank you.
Sangita.

Solution 1

This third-party tool definitely supports regex, although it's not cheap: http://www.gnostice.com/nl_article.asp?id=238&t=PDF_Text_Redaction_Using_PDFOne_NET
Comments
ganesh.rit at 1-May-13 11:04am
   
Hi Chill60,
 
Thanks for your response,
I already tried PDFOne, but it's not of use. First, it doesn't succeed in redacting every time. Even when it does succeed, converting the redacted PDF to text still shows the redacted text, which defeats the purpose.
 
Thanks,
Ganesh.
CHill60 at 1-May-13 11:12am
   
Have you contacted their support center about the redaction not working every time?
On the redacted text still showing: a lot will depend on how you are converting the PDF to text. Again, that is something for the Gnostice support guys.
Sergey Alexandrovich Kryukov at 1-May-13 11:05am
   
My 5.
—SA
CHill60 at 1-May-13 11:06am
   
Thank you!
sangitasojitra at 26-Mar-14 11:34am
   
Ganesh,
I am also looking for a good redaction tool that removes sensitive information from a PDF, so that the information is no longer present after the PDF is converted to a text document.
Do you know of any tool that provides this?
Your help is appreciated a lot.
Thank you.
Sangita.
CHill60 at 26-Mar-14 13:03pm
   
It would rather depend on how you are converting to a text file: whichever tool is doing that would need to be able to recognise the redaction OR do the redaction as part of that process. The Gnostice tool should be able to do that but, as I said, it's not cheap: http://www.gnostice.com/XtremeDocumentStudio_dot_NET.asp

Solution 4

The only way I see this working is removing the text entirely and replacing it with black bar images.
 
As you've already found out, just drawing a black box over the text doesn't work, because the text is still there, unprotected. All anyone has to do is convert the PDF to text to get it, or open the PDF in some PDF editor and just remove the black boxes.
 
I don't know of any library that does this for you and I don't have any code samples to do this with any library. I think you're in uncharted ground here using free libraries.
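
To make the point above concrete, here is a minimal sketch of a check you could run after any redaction attempt. It assumes you have already converted the redacted PDF to text with whatever tool you use; the extraction step itself is not shown. The class name `RedactionCheck`, the method `FindLeaks`, the SSN-style pattern and the sample strings are all hypothetical and only serve as illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RedactionCheck
{
    // Returns every substring of the extracted text that still matches
    // one of the "sensitive" patterns. An empty list means the patterns
    // no longer appear in the text layer.
    public static List<string> FindLeaks(string extractedText, IEnumerable<string> patterns)
    {
        var leaks = new List<string>();
        foreach (string pattern in patterns)
        {
            foreach (Match match in Regex.Matches(extractedText, pattern))
            {
                leaks.Add(match.Value);
            }
        }
        return leaks;
    }

    public static void Main()
    {
        // Hypothetical pattern: a US SSN-like number.
        var patterns = new[] { @"\b\d{3}-\d{2}-\d{4}\b" };

        // Hypothetical text extracted from a PDF where a black box was only drawn on top.
        string overlayOnly = "Name: J. Doe SSN: 123-45-6789";
        // Hypothetical text extracted from a PDF where the text itself was removed.
        string textRemoved = "Name: J. Doe SSN: ";

        Console.WriteLine(FindLeaks(overlayOnly, patterns).Count); // prints 1
        Console.WriteLine(FindLeaks(textRemoved, patterns).Count); // prints 0
    }
}
```

A check like this only tells you whether the text layer still contains the patterns you search for. It says nothing about sensitive content inside images or metadata.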

Solution 2

Using iTextSharp, I am able to redact an area of a raster-image PDF when I know the page and the area to redact.
Comments
Member 10070308 at 23-Jan-14 7:00am
   
Can you please share the code? I want exactly the same thing.
ganesh.rit at 24-Jan-14 0:30am
   
I drew a black rectangle over the raster-image PDF at the page and area where I had to apply redaction. You can find code on the internet for drawing a rectangle using iTextSharp.
 
Thanks
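
For anyone looking for a starting point, the following is a rough sketch of drawing a filled black rectangle on one page. It is not the poster's actual code. It assumes iTextSharp 5.x (where `BaseColor` exists). The class name `RectangleOverlay`, the method `DrawBlackBox` and all parameters are hypothetical. Coordinates are in PDF points and are assumed to be measured from the bottom-left corner of an unrotated page.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class RectangleOverlay
{
    // Draws a filled black rectangle on top of the existing content of one page.
    // Assumption: iTextSharp 5.x is referenced by the project.
    public static void DrawBlackBox(string inputPath, string outputPath,
                                    int pageNumber, float x, float y,
                                    float width, float height)
    {
        var reader = new PdfReader(inputPath);
        try
        {
            using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
            {
                var stamper = new PdfStamper(reader, output);

                // "Over" content is painted above whatever is already on the page.
                PdfContentByte over = stamper.GetOverContent(pageNumber);
                over.SetColorFill(BaseColor.BLACK);
                over.Rectangle(x, y, width, height);
                over.Fill();

                // Closing the stamper writes the modified PDF to the output stream.
                stamper.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Note that this only paints over the page. The original image or text underneath is still stored in the file, so on its own it does not remove anything.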

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 26 Mar 2014
Copyright © CodeProject, 1999-2014