See more: C++ MFC
I have created a program to parse a CSV file, but I'm having a problem with consistency. I think my problem lies in where the app begins reading the information. For some reason the data I get starts near the end of the file, and it starts at that same location every time.
 
I'm doing the following:
 
CString myBuffer = (_T(""));
CFile myFile;
CFileException e;
BYTE buffer[0x1000];
 
if(myFile.Open(_T("c:\\myDir\\file.csv"), CFile::modeRead, &e) == 0)
{
    e.ReportError();
    e.Delete();
}
 
DWORD dwBytesRemaining = myFile.GetLength();
 
while(dwBytesRemaining)
{
    UINT nBytesRead = myFile.Read(buffer, sizeof(buffer));
    dwBytesRemaining -= nBytesRead;
}
 
Is there something wrong with this approach that would make this start reading 3/4 into the file?
Posted 18-Dec-11 7:41am by DrBones69 (edited 18-Dec-11 7:42am)

Solution 1

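Looks like you're reading the file into the buffer in portions, but each iteration overwrites the buffer's previous contents, so after the loop only the last chunk is left. You should add a string to your program in which you concatenate the whole content. Something like:
CString content;
while(dwBytesRemaining)
{
    UINT nBytesRead = myFile.Read(buffer, sizeof(buffer));
    if (nBytesRead == 0)
        break; // nothing more to read
    // buffer is not null-terminated, so append exactly nBytesRead bytes (assumes ANSI text)
    content += CString(reinterpret_cast<const char*>(buffer), nBytesRead);
    dwBytesRemaining -= nBytesRead;
}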
Comments
DrBones69 at 18-Dec-11 14:10pm
   
That's exactly what was happening! Thanks!
+5
Mika Wendelius at 18-Dec-11 14:22pm
   
You're welcome :)
DrBones69 at 18-Dec-11 15:34pm
   
My logic is screwed up, there are 12 strings I need to capture. The lines are made up like this:
 
"last name, first name(No Comma) middle name",street address,city,state,zip,DOB,sex,race,hair,eyes,height
 
I'm reading chars, when a quote is reached I skip it, but it adds a space to the array as a string. How do I keep this from happening? How would I parse the first and middle name so that it would separate them into 2 strings?
 
int nValue = 0;
LPCTSTR p = myBuffer;

//Read until p == end
while (*p != '\0')
{
    CString s; // String to hold this value
 
    while (*p != '\0' && *p != ',') //read each char until ',' or end of file
    {
        if(*p == '"') //If a quote, don't add
            break;
        s.AppendChar(*p++);
    }
    // Advance to next character (if not already end of string)
    if (*p != '\0')
        p++;
    // Add this string to value array
    if (nValue < arr.GetCount())
        arr[nValue] = s;
    else
        arr.Add(s);
    nValue++;
}
Mika Wendelius at 18-Dec-11 15:50pm
   
What if you first read the file line by line: http://msdn.microsoft.com/en-us/library/x5t0zfyf(v=vs.80).aspx
 
When you get the line, split it on commas. Now you should have an array with one entry per field. The last thing would be to break the names into two parts at the first occurrence of a space(?). See also: http://oopweb.com/CPP/Documents/CPPHOWTO/Volume/C++Programming-HOWTO-7.html / Tokenize
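One thing to watch with a plain comma split: the name field in the sample line is quoted and contains a comma itself. Below is a minimal sketch of a quote-aware split, written with the standard library rather than CString so it stands alone. The helper names splitCsvLine and splitAtFirst are made up for illustration, and the sketch assumes quotes are never escaped inside a field.

```cpp
#include <string>
#include <utility>
#include <vector>

// Split one CSV line into fields. Commas inside double quotes stay in the
// field, and the quote characters themselves are dropped.
// Assumption: quotes are never escaped (no "" inside a quoted field).
std::vector<std::string> splitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (char c : line)
    {
        if (c == '"')
        {
            inQuotes = !inQuotes;        // toggle state, do not store the quote
        }
        else if (c == ',' && !inQuotes)
        {
            fields.push_back(current);   // an unquoted comma ends the field
            current.clear();
        }
        else
        {
            current += c;
        }
    }
    fields.push_back(current);           // the last field has no trailing comma
    return fields;
}

// Split text at the first occurrence of sep. Spaces right after sep are
// skipped. If sep is absent, the whole text is returned as the first part.
std::pair<std::string, std::string> splitAtFirst(const std::string& text, char sep)
{
    const std::string::size_type pos = text.find(sep);
    if (pos == std::string::npos)
        return { text, std::string() };

    std::string::size_type rest = text.find_first_not_of(' ', pos + 1);
    if (rest == std::string::npos)
        rest = text.size();
    return { text.substr(0, pos), text.substr(rest) };
}
```

With this, the first field can be split once on ',' to separate the last name from the rest, and the rest split once on ' ' to separate first and middle name.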

Solution 2

Yes, your approach is wrong: you read every part of the file into the same buffer, so each read overwrites the previous content. You should either:
  • Process a chunk of data at a time, e.g.
    while(dwBytesRemaining)
    {
        UINT nBytesRead = myFile.Read(buffer, sizeof(buffer));
        process(buffer, nBytesRead); // process your chunk of data
        dwBytesRemaining -= nBytesRead;
    }
    
 
or
  • Read the whole file at once into a (bigger) buffer (of course this approach might be impractical for very large files).
 
or, finally
  • Read one line of the CSV file at a time, which follows the natural structure of the CSV format.
 
[added]
Using CStdioFile for line by line reading is really simple, e.g.
 
    CStdioFile sf;
    if (! sf.Open(_T("c:\\myDir\\file.csv"),CFile::modeRead) )
    {
      // handle error here
    }
    CString line;
    while ( sf.ReadString(line))
    {
      // do line parsing here
    }
[/added]
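If you go with the first option instead (processing a chunk at a time), keep in mind that a 4 KB chunk will usually end in the middle of a line. A rough sketch of one way to handle that follows, using the standard library so it stands alone. The class name LineAssembler and its callback are hypothetical, not part of MFC.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Collects raw chunks and hands complete lines to a callback.
// Text after the last '\n' is kept until the next chunk (or finish()) arrives.
class LineAssembler
{
public:
    explicit LineAssembler(std::function<void(const std::string&)> onLine)
        : m_onLine(std::move(onLine)) {}

    // Feed exactly `count` bytes; only these bytes of the buffer are valid.
    void feed(const char* data, std::size_t count)
    {
        m_pending.append(data, count);

        std::string::size_type start = 0;
        std::string::size_type nl;
        while ((nl = m_pending.find('\n', start)) != std::string::npos)
        {
            emit(m_pending.substr(start, nl - start));
            start = nl + 1;
        }
        m_pending.erase(0, start);   // keep the unfinished tail for later
    }

    // Call once after the last chunk to flush a final line without '\n'.
    void finish()
    {
        if (!m_pending.empty())
        {
            emit(m_pending);
            m_pending.clear();
        }
    }

private:
    void emit(std::string line)
    {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();         // tolerate Windows "\r\n" line endings
        m_onLine(line);
    }

    std::function<void(const std::string&)> m_onLine;
    std::string m_pending;
};
```

In the chunk loop you would call feed(buffer, nBytesRead) in place of process(...), then finish() once after the loop. CStdioFile::ReadString does this bookkeeping for you, which is why the line by line version is simpler.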
Comments
DrBones69 at 18-Dec-11 14:16pm
   
I'm adding this buffer to a CString variable, then parsing the information using an LPCTSTR to read each char and saving the result to a CStringArray.
 
Thanks for suggestions!
+5
DrBones69 at 18-Dec-11 23:30pm
   
How do I use CStdioFile?
 
I read MSDN and it is kind of vague. I would like to convert from CFile to CStdioFile so I can use the CStdioFile::ReadString member. Do I read the file line by line into a buffer until the buffer holds the whole file, or do I read one line into the buffer and then parse it? The code below is only working at about 85% (unacceptable!)
 
/************************************************************************/
/* Create handle to my file and open it */
/************************************************************************/
CFile myFile;
CFileException e;
BYTE buffer[0x1000]; //4kb(4,096Bytes) buffer to hold csv file
 
if(myFile.Open(_T("c:\\pawn\\casino\\accounts.csv"), CFile::modeRead, &e) == 0)
{
    e.ReportError();
    e.Delete();
}
 
DWORD dwBytesRemaining = myFile.GetLength();
/************************************************************************/
/* Read file into buffer, while created a new buffer and adding to it */
/* 4KB at a time. */
/************************************************************************/
while(dwBytesRemaining)
{
    UINT nBytesRead = myFile.Read(buffer, sizeof(buffer));
    for(int i = 0; i <= sizeof(buffer); i++)
    {
        myBuffer += buffer[i];
    }
    dwBytesRemaining -= nBytesRead;
}
DrBones69 at 19-Dec-11 23:10pm
   
@ CPallini, I have invoked CStdioFile, but I only get the last line in the file. Can you help me understand where I went wrong with the logic?
 
CString cStdstr(_T(""));

while(cStdFile.ReadString(cStdstr))
{
    int nValues = 0;
    LPCTSTR q = cStdstr;
 
    while (*q != '\0') //Keep reading until end of file is reached
    {
        // String to hold parsed value
        CString t;
        if(*q == '"')
            q++;
        //Add characters to t until a comma or EOF
        while (*q != '\0' && *q != ',')
        {
            if(*q == '"')
                break;
            t.AppendChar(*q++);
        }
        // Advance to next character (if not already end of string)
        if (*q != '\0')
            q++;
        // Add this string to value array
        if (nValues < cStdioArr.GetCount())
            cStdioArr[nValues] = t;
        else
            cStdioArr.Add(t);
        nValues++;
    }
    // Trim off any unused array values
    if (cStdioArr.GetCount() > nValues)
        cStdioArr.RemoveAt(nValues, cStdioArr.GetCount() - nValues);
}
DrBones69 at 20-Dec-11 23:15pm
   
Never mind, I've found my error. :-<
The first statement in my while loop made the array overwrite itself each time a line was read.
 
Thanks for your help!
DrB :-)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 19 Dec 2011
Copyright © CodeProject, 1999-2014