Hi All,

I am developing a tool that reads PDF content for comparison, and I am using iTextSharp to read the content.
Along with the content, I also need to fetch the alignment properties of the PDF, such as the alignment of lines, titles, headers, footers, images, etc., for comparison.
I also need to get the spacing between two lines.

Please give me some ideas or techniques to do that.

Thanks in advance,
Posted 3-Mar-13 20:04pm
Sandeep Mewara 4-Mar-13 2:28am
Tried anything so far?
kanekhan 4-Mar-13 3:44am
I am using iTextSharp. This is the code I have written so far:

using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
public string ReadPdfFile()
{
    string strText = string.Empty;
    try
    {
        PdfReader reader = new PdfReader(@"\\FilePath");
        // Extract the text of each page (page numbers are 1-based)
        for (int page = 1; page <= reader.NumberOfPages; page++)
        {
            ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
            String s = PdfTextExtractor.GetTextFromPage(reader, page, its);
            s = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
            strText = strText + s;
        }
        reader.Close();
    }
    catch (Exception ex)
    {
        // handle or log the exception here
    }
    return strText;
}
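
The code above only returns plain text, so it carries no layout information. One possible direction, as a rough sketch rather than a tested solution: in iTextSharp 5 a custom render listener receives a TextRenderInfo for each text chunk, and its GetBaseline() gives the chunk's coordinates. Assuming those coordinates have already been collected, line spacing and a rough alignment guess could be computed as below. The TextChunk and LayoutMetrics types, the tolerance values, and the alignment labels are all hypothetical names made up for this illustration, not part of iTextSharp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container for one extracted text chunk (NOT an iTextSharp type).
// X is the left edge and Y the baseline, both in PDF points.
public sealed class TextChunk
{
    public string Text;
    public float X;
    public float Y;
    public float Width;
}

public static class LayoutMetrics
{
    // Collapses chunk baselines that lie within 'tolerance' of each other into
    // one line and returns the line baselines from top to bottom.
    // PDF y coordinates grow upward, so the top line has the largest y.
    public static List<float> Baselines(IEnumerable<TextChunk> chunks, float tolerance = 1f)
    {
        var result = new List<float>();
        foreach (var y in chunks.Select(c => c.Y).OrderByDescending(v => v))
        {
            if (result.Count == 0 || result[result.Count - 1] - y > tolerance)
                result.Add(y);
        }
        return result;
    }

    // Distance between each pair of consecutive baselines.
    public static List<float> LineSpacings(IList<float> baselines)
    {
        var gaps = new List<float>();
        for (int i = 1; i < baselines.Count; i++)
            gaps.Add(baselines[i - 1] - baselines[i]);
        return gaps;
    }

    // Rough alignment guess for one line, based on its distance to the
    // left and right edges of the text area.
    public static string Alignment(float lineLeft, float lineRight,
                                   float areaLeft, float areaRight, float tolerance = 2f)
    {
        float leftGap = lineLeft - areaLeft;
        float rightGap = areaRight - lineRight;
        if (Math.Abs(leftGap - rightGap) <= tolerance && leftGap > tolerance) return "center";
        if (leftGap <= tolerance && rightGap <= tolerance) return "justified";
        if (leftGap <= tolerance) return "left";
        if (rightGap <= tolerance) return "right";
        return "unknown";
    }
}
```

Comparing two PDFs would then mean comparing the spacing lists and alignment labels line by line. The guess is only as good as the tolerances, so they would need tuning against real documents.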

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 4 Mar 2013