Hi,

I am looking for some advice on how to retrieve data from a very large text file. The file has over 1,000,000 lines of data in it. The data on each line is of fixed length (see below):

GEO   26.509 S30 59  0.000 E154 44  0.000     -4.80      0.13
GEO   26.508 S30 59  0.000 E154 45  0.000     -4.63      0.26
GEO   26.505 S30 59  0.000 E154 46  0.000     -4.58      0.52
GEO   26.500 S30 59  0.000 E154 47  0.000     -4.52      0.78
GEO   26.492 S30 59  0.000 E154 48  0.000     -4.52      0.97
GEO   26.484 S30 59  0.000 E154 49  0.000     -4.52      1.04
GEO   26.476 S30 59  0.000 E154 50  0.000     -4.52      1.17


I'm unclear how to approach this problem and am looking for the quickest way to retrieve the data. So far I've tried creating a list of strings to read all the data from the file, then reading individual lines with the ElementAt method. However, this is quite slow when ElementAt is called many times (i.e. in a loop).

What I have tried:

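int lineNumber = 2435;

string fileName = @"c:\big_file.dat";
string line = string.Empty;

// File.ReadLines is lazy: nothing is read until the sequence is enumerated.
IEnumerable<string> lines = File.ReadLines(fileName);

for (int i = 0; i < 200; i++)
{
    // ElementAt(i) re-opens the file and reads from the first line on every call.
    line = lines.ElementAt(i);
    // do something else here..
}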



Any ideas?
Updated 17-Mar-21 21:15pm

Quote:
However this appears to be quite slow if using this method numerous times (ie in a loop).
As long as you need to process every line, you will have this overhead. Is there any way to optimize your line-processing function?

To avoid excess memory use, read the file one line at a time with a StreamReader and process each line as it is read:
using (StreamReader sr = new StreamReader(YourFilePath))
{
    string line;

    // ReadLine returns null once the end of the file is reached
    while ((line = sr.ReadLine()) != null)
    {
        // your code to process the line
        // YourDoSomething(line);
    }
}
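
If only a block of consecutive lines is needed (the question mentions a line number and a loop of 200 reads), a single pass can collect them without calling ElementAt repeatedly. The following is a minimal sketch, not a definitive implementation; the class name LineRange and its Read method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class LineRange
{
    // Returns up to `count` lines starting at zero-based line index `start`.
    // The file is enumerated once from the top, and reading stops as soon
    // as the requested range has been collected.
    public static List<string> Read(string path, int start, int count)
    {
        return File.ReadLines(path).Skip(start).Take(count).ToList();
    }
}
```

A call such as LineRange.Read(@"c:\big_file.dat", 2435, 200) would return a list that can then be indexed directly. Lines before the start index still have to be read and skipped, so this avoids the repeated rescans but not the initial scan.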
 
Comments
Maciej Los 18-Mar-21 3:16am    
5ed!
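Use a StringReader. Note that StringReader reads from a string already held in memory; to read lines straight from the file, StreamReader is the file-based counterpart.

StringReader Class (System.IO) | Microsoft Docs

For even higher performance, you can use a BinaryReader, which reads raw bytes rather than text lines. Since every record here has the same length, a fixed number of bytes can be read per record.

BinaryReader Class (System.IO) | Microsoft Docs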
 
Comments
Maciej Los 18-Mar-21 3:17am    
5ed!
Quote:
Quickest way to retrieve data from a very large text file?

The answer is: it depends on what you mean by 'retrieve data'.
- Do you need to process all lines?
- Do you know the line numbers of the lines you want to process?
- Do you need to process lines one by one and forget them afterwards?
- Do you need to find a line by its contents?
- Does the file grow or change over time?

Each answer implies a different way to speed up the operation. Loading the whole file into a list of lines is only the easiest approach for you, not necessarily the fastest.

The file functions let you read line by line or jump directly to a position.
You can also load the file into a database.
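
Jumping to a position suits this file well if its lines really are all the same length, because the byte offset of line n is then n times the record length. The sketch below assumes a single-byte encoding such as ASCII and that every line, including its line terminator, occupies the same number of bytes; neither is confirmed by the question. The class FixedRecordFile and its methods are hypothetical names used for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; assumes ASCII text and identical byte length per line.
public static class FixedRecordFile
{
    // Measures one record in bytes, including its line terminator,
    // by scanning the first line up to and including '\n'.
    public static int MeasureRecordLength(FileStream fs)
    {
        fs.Seek(0, SeekOrigin.Begin);
        int length = 0;
        int b;
        while ((b = fs.ReadByte()) != -1)
        {
            length++;
            if (b == '\n') break;
        }
        return length;
    }

    // Reads the record at zero-based index `lineIndex` by seeking
    // straight to its byte offset instead of reading earlier lines.
    public static string ReadRecord(FileStream fs, int recordLength, long lineIndex)
    {
        fs.Seek(lineIndex * recordLength, SeekOrigin.Begin);
        byte[] buffer = new byte[recordLength];
        int read = 0;
        while (read < recordLength)
        {
            int n = fs.Read(buffer, read, recordLength - read);
            if (n == 0) break; // reached end of file
            read += n;
        }
        // Strip the trailing line terminator before returning the text.
        return Encoding.ASCII.GetString(buffer, 0, read).TrimEnd('\r', '\n');
    }
}
```

With a stream opened via File.OpenRead, the record length is measured once and each later ReadRecord call costs a single seek plus one short read, regardless of how far into the file the line sits. If any line differs in length, the offsets drift and this approach returns wrong data, so it is worth validating the file first.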
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


