TextFileSplitter - A Class to Split Text Files

16 Oct 2002, 2 min read
A class for splitting a large text file into smaller sub_text files with an equal number of lines each

Introduction

The TextSplit class is useful when you need to split a text file into smaller sub_text files. Its constructor takes two parameters: the path/filename of the input file and the maximum number of lines each sub_text file should contain.

Real-world example: I wrote a console application that imports network and subnet mask information from a Cisco router. I then pipe the results into a command file, where each line launches a separate discovery process. If you have a lot of networks (say 1,000), you may want to automatically create sub_command files so that only 10 processes are started at a time. With this class, I can split the original command file into several smaller ones.

This class has been designed using the standard <fstream> library and should integrate easily with your console application.

Using the class

  1. First, create a console application and copy textsplit.h & textsplit.cpp to your project folder.
  2. Then, in your implementation file, include the header:
     #include "textsplit.h"
  3. Add textsplit.cpp to your project: select FileView, right-click on your project and select "Add Files to Project".
  4. Create a TextSplit object and call the CreateOutputFiles() method:
     TextSplit R(fileName, howManyLines);
     R.CreateOutputFiles();

First, the file name is validated. Then, depending on how many lines the input file contains and the maximum number of lines you specified, the correct number of output files is created with names in the format x_filename.extension (where x is the sequence number of the file). If the lines left over at the end number fewer than the specified maximum, they are written to the last file. The example program included with this article provides an input file (test.txt) with 10 lines and uses a maximum of 3 lines per sub_text file, which produces the following output:

  • 1_test.txt (lines 1-3)
  • 2_test.txt (lines 4-6)
  • 3_test.txt (lines 7-9)
  • 4_test.txt (line 10)
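The arithmetic behind that listing can be sketched as follows. This is only an illustration under stated assumptions, not the code shipped in textsplit.cpp: the helper names outputFileCount and planOutputFiles and the OutputFile struct are hypothetical, and the file name is assumed to be a bare name without a directory part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of TextSplit: number of output files needed
// when each file holds at most maxLines lines (ceiling division).
std::size_t outputFileCount(std::size_t totalLines, std::size_t maxLines) {
    assert(maxLines > 0);  // the caller must pass a positive limit
    return (totalLines + maxLines - 1) / maxLines;
}

// Hypothetical record describing one planned output file.
struct OutputFile {
    std::string name;       // "x_filename.extension"
    std::size_t firstLine;  // 1-based, inclusive
    std::size_t lastLine;   // 1-based, inclusive
};

// Plans the output files; the last one receives the leftover lines.
// Assumes fileName has no directory component.
std::vector<OutputFile> planOutputFiles(const std::string& fileName,
                                        std::size_t totalLines,
                                        std::size_t maxLines) {
    std::vector<OutputFile> plan;
    const std::size_t count = outputFileCount(totalLines, maxLines);
    for (std::size_t i = 0; i < count; ++i) {
        OutputFile f;
        f.name = std::to_string(i + 1) + "_" + fileName;
        f.firstLine = i * maxLines + 1;
        f.lastLine = std::min((i + 1) * maxLines, totalLines);
        plan.push_back(f);
    }
    return plan;
}
```

For 10 lines and a maximum of 3 lines per file, the plan has four entries, and the last one covers only line 10.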

Update (2002/10/15)

Special thanks to Hernan Berguan for pointing out the 1000-line limitation. I originally used plain arrays to hold each line from the source file (and we all know the limitations of arrays), so I switched to the vector<string> class instead. I tested the demo program with a text file of 500,000 lines, creating sub_text files with 50,000 lines each. As a final test I used one of the sub_text files as the source and created 1000 sub_"sub"_text files with 50 lines each. 50 lines is the default if the user input is < 1 (thanks again to Hernan). Everything seems to be working smoothly now!
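As a rough illustration of the two changes described in this update, the sketch below reads lines into a vector<string>, which grows as needed, and falls back to 50 lines per file when the requested value is below 1. The names kDefaultLinesPerFile, normalizeLinesPerFile and readAllLines are assumptions made for this example and do not come from the downloadable source.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed constant name for the default mentioned in the text.
const int kDefaultLinesPerFile = 50;

// Returns the requested value, or the default when the request is < 1.
int normalizeLinesPerFile(int requested) {
    const int result = requested < 1 ? kDefaultLinesPerFile : requested;
    assert(result >= 1);  // postcondition: always a usable chunk size
    return result;
}

// Reads every line from the stream. The vector grows dynamically, so there
// is no fixed upper bound such as the old 1000-line array limit.
std::vector<std::string> readAllLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Taking a std::istream& rather than a file name keeps the helper easy to try out with a std::istringstream as well as with a std::ifstream.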

Summary

Here is the public interface of the TextSplit class:

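// constructor: the file to split and the maximum number of lines per output file
TextSplit(string FileName, int numberOfLinesForEachFile); 

// writes the numbered output files (x_filename.extension)
void CreateOutputFiles(); 

// return number of lines in text file.
int GetNumberOfFiles() const;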

One more function is worth mentioning: GetNumberOfFiles(). Note that, despite its name, this function returns the number of lines in the input file.
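To show how such an interface could fit together with <fstream>, here is a minimal stand-in class. It is a sketch under stated assumptions and not the author's implementation: the class name TextSplitSketch and the method LineCount() are hypothetical, file name validation is omitted, and output files are written to the current working directory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in mirroring the interface above; NOT textsplit.cpp.
class TextSplitSketch {
public:
    // Loads all lines up front; a value < 1 falls back to 50 lines per file.
    TextSplitSketch(const std::string& fileName, int linesPerFile)
        : fileName_(fileName),
          linesPerFile_(linesPerFile < 1
                            ? 50
                            : static_cast<std::size_t>(linesPerFile)) {
        std::ifstream in(fileName_);
        std::string line;
        while (std::getline(in, line)) lines_.push_back(line);
    }

    // Writes 1_<name>, 2_<name>, ...; the last file gets the leftover lines.
    void CreateOutputFiles() const {
        assert(linesPerFile_ >= 1);  // invariant set by the constructor
        std::size_t fileIndex = 0;
        for (std::size_t start = 0; start < lines_.size();
             start += linesPerFile_) {
            ++fileIndex;
            std::ofstream out(std::to_string(fileIndex) + "_" + fileName_);
            const std::size_t end =
                std::min(start + linesPerFile_, lines_.size());
            for (std::size_t i = start; i < end; ++i) out << lines_[i] << '\n';
        }
    }

    // Number of lines read from the input file (hypothetical accessor name).
    std::size_t LineCount() const { return lines_.size(); }

private:
    std::string fileName_;
    std::size_t linesPerFile_;
    std::vector<std::string> lines_;
};
```

With an input file in the current directory, usage would follow the same two steps as the real class: construct `TextSplitSketch s("test.txt", 3);` and then call `s.CreateOutputFiles();`.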

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Written By
Web Developer
South Africa
I am an IT security consultant who focuses mainly on Oracle, Microsoft, Citrix, RSA, SUN, and Linux security. My work covers perimeter security design (firewalls, IDS, etc.) as well as internal and external network assessments (penetration testing).

On the programming side, my main selling point is VC++. I'm also strong in C# (ASP.NET), T-SQL, VC++.NET, STL, COM, ATL, Java, VBScript and ColdFusion.

Home page: http://www.starbal.net

Comments and Discussions

General: splitting at specific position in a text
dreyfus22-Apr-08 1:22
General: Re: splitting at specific position in a text
nums23-Apr-08 2:37
General: TextFileSplitter
shahjayesh152-May-07 19:44
Question: About text files combination!
snailflying15-Apr-07 23:59
General: crash
dan o30-Jan-04 1:30
Question: Line limit?
nacnuduk197517-Nov-03 14:20
Answer: Re: Line limit?
nums17-Nov-03 20:55
General: Re: Line limit?
nacnud_uk21-Nov-03 6:07
General: vector<string>
tja0129-Oct-03 11:25

The vector<string> line in the .h file causes several errors in VC++ 6.0

It seems correct, does anyone have any idea what the problem might be?



Peace,

Tom
General: Re: vector<string>
nums4-Nov-03 2:58
General: Link dos not work
User 269421-Oct-02 22:55
General: Re: Link dos not work
Pieter (nums)7-Oct-02 0:12