Tip/Trick

Splitting a file up into a list based on certain characters

23 Jan 2013 | CPOL | 4 min read
A useful piece of code for reading in and parsing a file.

Introduction 

The following piece of code lets you read in a file, split (or parse) its data on specified characters, and place the pieces into a List. For instance, you might want each line of a file to be a new index in a List.

Background 

I am constantly writing programs that store data in a file (specifically a .txt file). Because of this, I wanted an easy way to read in the data and handle it. I came up with this class, which I have since used in many of my programs.

The Code

Here is the code that performs the reading and splitting of data. It's a rather simple design that can be added to a program as its own class. You pass in the file name and a List<string> of the character sequences to split on. Whenever one of them occurs in the file, it breaks the string and creates a new index in the List<string> that is returned.

For now, let's just take a look at the code you'll need; I'll explain it in more detail in the next section.

C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

class readInFile
{
    // Reads fileName and splits its contents on every string in lineSplit.
    public static List<string> readIn(string fileName, List<string> lineSplit)
    {
        List<string> input = new List<string>();

        try
        {
            if (File.Exists(fileName)) // check if the file exists
            {
                string temp = File.ReadAllText(fileName); // read the whole file into a string

                // split the data using the entries in lineSplit
                input.AddRange(temp.Split(lineSplit.ToArray(),
                                          StringSplitOptions.RemoveEmptyEntries));
            }
            else // the selected file does not exist
            {
                MessageBox.Show("The File You Are Trying To Read In Does\n" +
                                "Not Exist. Please Select An Existing File",
                                "File Not Found", MessageBoxButtons.OK, MessageBoxIcon.Error);
            }
        }
        catch (Exception error) // an error occurred during the read
        {
            MessageBox.Show(error.ToString());
        }

        return input;
    }
}

As you can see, the code above is rather simple. Make sure you include the following using directives (the class also relies on System and System.Collections.Generic, which most project templates add by default).

C#
using System.IO; 

This is used for the File.Exists( ) and File.ReadAllText( ) calls. The other one is

C#
using System.Windows.Forms; 

This is used for MessageBox.Show( ). Note that the message boxes are specific to Windows Forms, so if you want to use this class in a console application, they should be swapped out. The message in the else branch can be whatever you want. For the one in the catch block, I recommend keeping the "error.ToString( )" part, based on personal experience.
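For a console application, one way to swap out the two MessageBox.Show( ) calls is sketched below. The names ConsoleFileReader and ReadIn are hypothetical, chosen only for this illustration, and writing to Console.Error is just one reasonable replacement for the dialogs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical console-friendly variant of the class above (names are illustrative).
static class ConsoleFileReader
{
    public static List<string> ReadIn(string fileName, List<string> lineSplit)
    {
        List<string> input = new List<string>();

        try
        {
            if (File.Exists(fileName))
            {
                string temp = File.ReadAllText(fileName);
                input.AddRange(temp.Split(lineSplit.ToArray(),
                                          StringSplitOptions.RemoveEmptyEntries));
            }
            else
            {
                // Replaces the "File Not Found" message box.
                Console.Error.WriteLine("File not found: " + fileName);
            }
        }
        catch (Exception error)
        {
            // Keeps error.ToString(), so the full exception details are still shown.
            Console.Error.WriteLine(error.ToString());
        }

        return input;
    }
}
```

With this variant the System.Windows.Forms directive is no longer needed, and a missing file still results in an empty List being returned.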


Understanding the Code and Using it to Your Needs

Now that you've seen the code, I want to point out a few parts of it, partly so you can understand it better, but also so you know how to use it and tweak it to your needs.

First of all, let's look at the following line:

C#
string temp = File.ReadAllText(fileName);

I point out this line because the documents I read from are .txt files. With other kinds of text documents, the results might not be what you hope for.

That doesn't mean the class is limited to .txt files. If you want to read a file in another format and this line doesn't work, all you have to do is change it, making sure that when all is said and done, the document you want to split up is stored as one string called "temp".

The next line I want to point out, and the main part of this code, is
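As one possible example of changing that line, the sketch below assumes the "other format" is simply a text file in a different encoding, and uses the File.ReadAllText overload that takes an Encoding. The helper name ReadWithEncoding and the choice of ISO-8859-1 are assumptions made for illustration, not part of the original class.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: reads the whole file using an explicitly chosen encoding.
    public static string ReadWithEncoding(string fileName, Encoding encoding)
    {
        string temp = File.ReadAllText(fileName, encoding);
        return temp;
    }

    public static void Main()
    {
        // The word "café" encoded as ISO-8859-1: the last byte is not valid UTF-8 on its own.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x63, 0x61, 0x66, 0xE9 });

        string temp = ReadWithEncoding(path, Encoding.GetEncoding("iso-8859-1"));
        Console.WriteLine(temp.Length); // 4 characters, one per byte

        File.Delete(path);
    }
}
```

Whatever replaces the read, the rest of the class can stay the same as long as the result still ends up in the string "temp".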

C#
input.AddRange(temp.Split(lineSplit.ToArray(), StringSplitOptions.RemoveEmptyEntries));

This line takes the string (temp), scans through it, and splits it up, creating a new index each time one of the desired strings is found. To do this, you pass in a List<string> containing the character(s) to look for. A good example is "\r\n". When this is used, the program splits temp so that each line in the file becomes a new index in the List<string> that is returned.

However, if you would rather not pass in a List<string> and would instead hard-code the separators into the class, you may use the following line. As an example, this piece of code checks for new lines and commas.
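To make the "\r\n" case concrete, here is a small sketch that writes a three-line temporary file and splits it the same way. SplitFile is a hypothetical helper that mirrors the two key lines of readIn without the message boxes, and the sample file contents are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class NewLineSplitDemo
{
    // Hypothetical helper: reads the file and splits it on the given separators.
    public static List<string> SplitFile(string fileName, List<string> lineSplit)
    {
        string temp = File.ReadAllText(fileName);
        return new List<string>(
            temp.Split(lineSplit.ToArray(), StringSplitOptions.RemoveEmptyEntries));
    }

    public static void Main()
    {
        // Write a small sample file with Windows-style line endings.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "alpha\r\nbeta\r\ngamma\r\n");

        List<string> lines = SplitFile(path, new List<string> { "\r\n" });

        foreach (string line in lines)
        {
            Console.WriteLine(line); // alpha, beta, gamma on separate lines
        }

        File.Delete(path);
    }
}
```

A file that uses only "\n" line endings will not be split by "\r\n" alone; passing both "\r\n" and "\n" in the List covers either case.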

C#
input.AddRange(temp.Split(new String [] { "\r\n", "," }, StringSplitOptions.RemoveEmptyEntries)); 

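Note that the characters you split on are not copied into the List. Because StringSplitOptions.RemoveEmptyEntries is used, empty entries (for example, between two adjacent separators) are dropped as well.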
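A short sketch of that behavior, using the hard-coded new line and comma separators on an in-memory string instead of a file. The class name SeparatorDemo, the method SplitText, and the sample text are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

static class SeparatorDemo
{
    // Splits on new lines and commas, as in the hard-coded line above.
    public static List<string> SplitText(string temp)
    {
        List<string> input = new List<string>();
        input.AddRange(temp.Split(new string[] { "\r\n", "," },
                                  StringSplitOptions.RemoveEmptyEntries));
        return input;
    }

    public static void Main()
    {
        // Two lines of comma-separated values; note the doubled comma.
        List<string> parts = SplitText("red,green\r\nblue,,yellow");

        // Prints: red|green|blue|yellow
        Console.WriteLine(string.Join("|", parts));
    }
}
```

None of the four entries contains a comma or a line break, and the empty entry between the two adjacent commas is gone.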

Conclusion

So there you have it. I know that's a lot of text for such a simple class, but it's a useful one. If you wish to use this code, I have included a download of it that is formatted better than the code you see here.

If you have any questions, concerns, or anything else, please don't hesitate to ask, even if it means asking for some example code showing how this works.

Points of Interest

I have to say that this class has become extremely useful for me, especially when I want to save simple data: I can simply use a text file and this code.

History

Version 1.0: Initial code posted.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Student
United States
I am currently a college student working on my Computer Science Security degree.

I quickly fell in love with programming after I started my first CS class (Java ... the first programming language I really learned). Since then I write programs for fun and find much enjoyment in them.

Comments and Discussions

 
Question: Not Just for New Lines
James IV, 24-Jan-13 11:40
While you guys do present much easier ways to split a document up by simply new lines, this parser does more than just that. The concept behind it was to allow for any variation that the program chooses to use.

For instance, I had a program where part of it read in a text file, parsed the data, and created an SQLite file. The files were .txt files containing password collections (or dumps from a site, for example). These files could easily contain over 10,000 entries, and how they were split up varied depending on where you got them from. Because of this, the user actually got to choose what options they wanted for parsing the file. No need to format the existing documents: just plug it in, select the options, and go.

Now I did scan the Riva thing, but I am somewhat tired right now, so I'll look over it again.
General: Thoughts (PIEBALDconsult, 23-Jan-13 17:30)
General: Re: Thoughts (Andrew Rissing, 24-Jan-13 4:00)
General: Re: Thoughts (PIEBALDconsult, 24-Jan-13 4:45)
General: Re: Thoughts (Andrew Rissing, 24-Jan-13 5:04)
General: Re: Thoughts (PIEBALDconsult, 24-Jan-13 5:06)
General: Re: Thoughts (Andrew Rissing, 24-Jan-13 5:13)