I'm using the open-source tool iTextSharp to read a PDF file in my ASP.NET MVC 3 application, which is written in C#.

Below is my code.
C#
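// Assumes: using System; using System.IO; using System.Text; using iTextSharp.text.pdf;
// Save the uploaded file under the application directory, replacing any earlier copy.
filePath = Path.Combine(
    AppDomain.CurrentDomain.BaseDirectory,
    Path.GetFileName(Infile.FileName));
if (System.IO.File.Exists(filePath))
{
    System.IO.File.Delete(filePath);
}
Infile.SaveAs(filePath);

// reader2 is only used to get the page count.
PdfReader reader2 = new PdfReader(filePath);
StringBuilder text = new StringBuilder();

for (int page = 1; page <= reader2.NumberOfPages; page++)
{
    // A fresh strategy per page, so text from earlier pages is not repeated.
    iTextSharp.text.pdf.parser.ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
    // Note: reader2 could be reused here instead of reopening the file for each page.
    PdfReader reader = new PdfReader(filePath);
    String s = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(reader, page,its);

    // The extracted value is already a .NET string; the earlier round trip through
    // Encoding.Default could only lose characters, so it has been dropped.
    text.Append(s);
    reader.Close();
}
reader2.Close();
string strText = text.ToString();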

I'm getting the error on this line:
C#
String s = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(reader, page,its);


The error is: "Index was outside the bounds of the array."
Regards.
Comments
mnandikanti 3-Feb-12 18:03pm    
I am having this very same issue. Does anyone out there know a solution for this problem? In my case I am able to read some PDF files, but for others I get this "Index was outside ....." error.

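It looks like there is no such page, so the loop bounds are the first thing to check.

The usual off-by-one mistake would be to mix up the two conventions. A zero-based loop looks like this:
C#
for (int page = 0; page < reader2.NumberOfPages; page++) {/*...*/}
but iTextSharp numbers pages from 1 to NumberOfPages, so the loop you already have is the right one:
C#
for (int page = 1; page <= reader2.NumberOfPages; page++) {/*...*/}


Remember: in most cases indexing of elements is zero-based, but page numbers in iTextSharp are an exception. If the page range is valid, the exception is most likely thrown inside the library while it parses the content of that particular page.

Next time use the debugger: look at the value of page and at the stack trace of the exception. With some minimal experience you will be able to dig out the problem in no time.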

—SA
 
Comments
Santosh K. Tripathi 29-Jul-15 5:29am    
Hi SA,

Could you help me solve my problem?

http://www.codeproject.com/Questions/1013868/How-Do-I-Increase-Performance-While-Generating-Pdf

I encountered the same problem. It occurs if the PDF file contains images. This thread helped me solve it: http://itextsharp.10939.n7.nabble.com/Possible-bug-in-CMapAwareDocumentFont-ProcessUni2Byte-iTextSharp-5-4-3-td4480.html
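
Since the error seems to depend on the content of particular pages (some PDF files read fine, others do not), one option is to isolate each page so that a single failing page does not abort the whole extraction. The following is only a minimal sketch under that assumption, not a fix for the underlying library issue. `PageTextCollector` and `ExtractionResult` are hypothetical names introduced here for illustration and are not part of iTextSharp; the page extractor is passed in as a delegate so the loop itself does not depend on the library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical result type (not part of iTextSharp): the collected text
// plus the 1-based numbers of the pages that could not be read.
public sealed class ExtractionResult
{
    public string Text { get; set; }
    public List<int> FailedPages { get; set; }
}

// Hypothetical helper (not part of iTextSharp).
public static class PageTextCollector
{
    // extractPage receives a 1-based page number and returns that page's text.
    public static ExtractionResult Collect(int pageCount, Func<int, string> extractPage)
    {
        var text = new StringBuilder();
        var failed = new List<int>();

        // Pages are numbered 1..pageCount, matching the loop in the question.
        for (int page = 1; page <= pageCount; page++)
        {
            try
            {
                text.Append(extractPage(page));
            }
            catch (IndexOutOfRangeException)
            {
                // This is the exception type behind "Index was outside the
                // bounds of the array". Record the page and continue.
                failed.Add(page);
            }
        }

        return new ExtractionResult { Text = text.ToString(), FailedPages = failed };
    }
}
```

With iTextSharp, the delegate would wrap the call from the question, for example `p => PdfTextExtractor.GetTextFromPage(reader, p, new SimpleTextExtractionStrategy())`, with `reader.NumberOfPages` as the page count. The `FailedPages` list then shows which pages trigger the error, which may help when comparing against the thread linked above.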
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


