I am currently writing a program to search for files relating to different things (e.g. ECRs, RMSs, drawings, etc.) on a system. One part of the program (searching for ECRs) requires me to read around 450 PDF files and search within them for a specified product name, for example "NPS-0243", as the name of each file bears no relation to the products it references. I have achieved this by using a background worker that starts when the program launches, reads the files, and adds their text to a string array which the program can then search. The problem is that the array is not completely filled with the text from the files until about two minutes after the program is run, and this can of course cause problems for a user who does not want to sit and wait two minutes to search for a certain type of file.
My question, therefore, is: is there any way of speeding this process up? I have tried writing the text from each file to a single .txt file, then reading it back and searching for the end of each file's text before adding it to the array, but this is, if anything, slower than the previous method. In my opinion I'm going to have to live with the two-minute wait, but then again I'm not as clever as some of you!
Any help would be greatly appreciated.
Here is the background worker code I am currently using:
    public void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
    {
        int i = 0;
        foreach (string thisfile in filePaths)
        {
            // Skip the master log and the Windows thumbnail cache - they are not PDFs
            if (thisfile.Contains(".MASTER ECR LOG") || thisfile.Contains("Thumbs.db"))
                continue;

            string strText = string.Empty;
            using (PdfReader reader = new PdfReader(thisfile))
            {
                for (int page = 1; page <= reader.NumberOfPages; page++)
                {
                    ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
                    string s = PdfTextExtractor.GetTextFromPage(reader, page, its);
                    // Round-trip through UTF-8 to clean up mis-encoded characters
                    s = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
                    strText = strText + s;
                }
            }
            data[i, 1] = strText;
            data[i, 2] = thisfile;
            i++;
        }
    }
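For what it's worth, most of those two minutes are likely spent inside `PdfReader`/`PdfTextExtractor` rather than in the surrounding loop, so the most promising speed-up is reading several files at once. Below is a rough sketch (not a drop-in replacement) of how the same loop could be parallelised with `Parallel.ForEach`; `filePaths` and `data` are taken from the snippet above, while the shared counter and its lock are my own additions and assume `data` is large enough to hold every file.

```csharp
using System.Text;
using System.Threading.Tasks;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Extract each PDF on a worker thread. Writes to the shared results
// array happen under a lock because several files finish concurrently.
int next = 0;
object gate = new object();

Parallel.ForEach(filePaths, thisfile =>
{
    // Skip the master log and the Windows thumbnail cache - not PDFs
    if (thisfile.Contains(".MASTER ECR LOG") || thisfile.Contains("Thumbs.db"))
        return;

    var text = new StringBuilder();
    using (PdfReader reader = new PdfReader(thisfile))
    {
        for (int page = 1; page <= reader.NumberOfPages; page++)
            text.Append(PdfTextExtractor.GetTextFromPage(
                reader, page, new SimpleTextExtractionStrategy()));
    }

    lock (gate)
    {
        data[next, 1] = text.ToString();
        data[next, 2] = thisfile;
        next++;
    }
});
```

Two smaller points that help even without parallelism: open one `PdfReader` per file rather than one per page, and accumulate the pages in a `StringBuilder` instead of `strText = strText + s`, which re-copies the whole growing string on every page.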