Hi All,

I have about 116,222 .pdf files, and I need to find out which of them are corrupted. Can anyone please tell me whether there is any software (free or paid) that can identify the corrupted files, or conversely the valid ones? I googled a lot but could not find any; all the results are for repair software.

Any suggestion will be very helpful.
Comments
Richard MacCutchan 22-Aug-11 3:49am    
Chances are that the only way to do this is to open every file with a PDF reader, or write your own application to analyse them.
arindamrudra 22-Aug-11 3:53am    
But the number of files is very high; that is the issue.
Richard MacCutchan 22-Aug-11 4:14am    
If these files already exist on your disk then there is nothing you can do without reading each individual file to check it. How else could you tell if it was corrupt?
arindamrudra 22-Aug-11 4:29am    
Yes, all the files are on my disk. Please have a look at OriginalGriff's solution (a very good tip) and at the 2nd and 3rd links from walterhevedeich, which are also of high quality. I am trying to follow these approaches.
Richard MacCutchan 22-Aug-11 4:37am    
Well one thing you may notice from all these links and suggestions is that you will have to read every file; there is no possible way to avoid this.

The problem is in deciding if the file is "corrupted".

If you don't have a SHA hash value for each file, or something similar, then the only way you can tell if the file is corrupted is to try to read it as a PDF file - if you can't, then it is either corrupt or uses a later version of the PDF specification than your reader software supports.
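As a rough illustration of "try to read it as a PDF file", here is a minimal Python sketch of a cheap structural check. It is an assumption of this note, not something from the thread: it only looks for the `%PDF-` header near the start of the file and the `%%EOF` marker near the end, so it catches truncated or non-PDF files but not damage inside the file. The function name `looks_like_pdf` and the 1024-byte windows are illustrative choices; a full check would need a real PDF parser.

```python
from pathlib import Path


def looks_like_pdf(path, window=1024):
    """Cheap structural check, not a full parse.

    Returns True if '%PDF-' appears in the first `window` bytes and
    '%%EOF' appears in the last `window` bytes of the file.
    """
    p = Path(path)
    size = p.stat().st_size
    with p.open("rb") as f:
        head = f.read(window)
        # Jump to the last `window` bytes (or the start for small files).
        f.seek(max(0, size - window))
        tail = f.read()
    return b"%PDF-" in head and b"%%EOF" in tail
```

A file that fails this check is a strong candidate for being corrupt; a file that passes may still be damaged internally and would need to be opened with a real reader or parser to be sure.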

If you can read them, then they probably aren't corrupt, so you could ignore them - to be certain, I suspect you would need a human to read them and ensure they look as they should.

I would process them through a reader and then record an SHA hash for each of them, so that any changes can be detected immediately next time.
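The hashing step could be sketched in Python roughly as follows. This is a hedged example, not the answerer's code: it assumes SHA-256 (the answer only says "SHA hash"), and the helper names `sha256_of`, `build_manifest` and `changed_files` are made up for illustration. Files are read in chunks so large PDFs are not loaded into memory at once.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each *.pdf under root (relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*.pdf"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List manifest entries that are now missing or have a different digest."""
    root = Path(root)
    changed = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            changed.append(rel)
    return changed
```

The manifest would be built once, after the files have been verified as readable, and stored somewhere safe (for example as JSON); a later run of `changed_files` then flags anything that has changed since. Note that the `*.pdf` pattern may be case-sensitive depending on the platform.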
 
Share this answer
 
Comments
arindamrudra 22-Aug-11 3:55am    
Thanks, very good tip. I am going to search for "SHA hash value for each file".
 
Share this answer
 
Comments
arindamrudra 22-Aug-11 4:17am    
I had gone through your first link before your post, but the third link seems very good. The second link may fail due to the number of files; the system may hang.
Hi,
Anyone still seeking a solution to arindamrudra's problem should take a look at a small, free, open-source program called 'Recursive finder of corrupted PDF files' (download link: http://sourceforge.net/projects/corruptedpdfinder/), which does just that: it recursively finds corrupted or password-protected PDF files within a folder of the user's choice.

Good luck.
CSilva.
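For readers who would rather script the recursive scan themselves, here is a minimal Python sketch of the general idea. It is not how the tool above works (its internals are not described in this thread); the names `has_pdf_header` and `find_suspect_pdfs` are hypothetical, and the default check only tests for the `%PDF-` header, so a stricter checker should be passed in for real use. The scan is a generator, so results are produced one at a time rather than being held in memory for all 116,222 files.

```python
import os


def has_pdf_header(path):
    """Deliberately minimal default check: file starts with '%PDF-'."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


def find_suspect_pdfs(root, check=has_pdf_header):
    """Walk root recursively and yield paths of .pdf files failing `check`.

    `check` is any callable taking a path and returning True for a good file.
    Files that cannot be opened at all are reported as suspect too.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            try:
                ok = check(path)
            except OSError:
                ok = False
            if not ok:
                yield path
```

Typical use would be a loop such as `for path in find_suspect_pdfs(folder): print(path)`, writing each suspect path to a log as it is found.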
 
Share this answer
 
Comments
arindamrudra 26-May-14 7:47am    
Nice one...
William van Velde 23-Jun-14 7:55am    
Perfect, this is the tool I needed.
 
Share this answer
 
Comments
Kornfeld Eliyahu Peter 12-Jan-15 4:56am    
Have you noticed how old this post is? And it was already answered!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)