See more: C++ C Linux Parsing
How do I parse the following output?
COMMAND     PID       USER   FD      TYPE             DEVICE  SIZE/OFF       NODE NAME
init          1        ???  exe       ???                ???       ???        ??? /init
init          1        ???    0       ???                ???       ???        ??? /dev/__null__ (deleted)
init          1        ???    1       ???                ???       ???        ??? /dev/__null__ (deleted)
init          1        ???    2       ???                ???       ???        ??? /dev/__null__ (deleted)
init          1        ???    3       ???                ???       ???        ??? /dev/__kmsg__ (deleted)
init          1        ???    4       ???                ???       ???        ??? /dev/__properties__ (deleted)
init          1        ???    5       ???                ???       ???        ??? socket:[257]
init          1        ???    6       ???                ???       ???        ??? socket:[259]
init          1        ???    7       ???                ???       ???        ??? socket:[260]
init          1        ???  mem       ???              00:01         0         19 /init
init          1        ???  mem       ???              00:01     90112         19 /init
<snip>
 

I want to collect the PIDs (the values under the PID column) into an array. As you can see, the same integer repeats on many lines, but I want each value only once. The function should then return the array containing those unique PIDs.
Posted 7-Sep-12 4:32am
Edited 7-Sep-12 5:41am
Wes Aday94.3K
v2
Comments
Kuthuparakkal at 7-Sep-12 9:36am
   
RegEx may be an option
Tarun Batra at 7-Sep-12 9:38am
   
I don't think so. Can you explain in detail?
Kuthuparakkal at 7-Sep-12 9:48am
   
http://stackoverflow.com/questions/6689956/atl-regex-to-parse-csv-files
http://stackoverflow.com/questions/1120140/csv-parser-in-c
http://www.cplusplus.com/forum/general/13087/
http://www.daniweb.com/software-development/c/threads/97843/parsing-a-csv-file-in-c
 
Please take inspiration from the links above; I don't have an exact solution.
Wes Aday at 7-Sep-12 9:42am
   
Was it really necessary to post all of that? Did you try tokenizing?
Tarun Batra at 7-Sep-12 9:45am
   
Sorry, I will take care of that.

Solution 1

Pseudocode:
 
1. Read a line.
2. Use strtok (or any other parsing mechanism) to parse the line.
3. Get the value in the 2nd column.
4. Insert the value into a std::set<int>.
5. Go back to step 1 until there are no more lines.
6. Copy the std::set into an array or a std::vector<int>.
 
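Added: usage of std::set
#include <set>

std::set<int> pidSet;
while (readline)   // pseudocode: loop while another line can be read
{
  // extract pidValue (the 2nd column) from the line.
  pidSet.insert(pidValue);   // a std::set silently ignores duplicates
}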
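The steps above can be fleshed out roughly as follows. This is only a sketch under a few assumptions: the helper name collectUniquePids is made up for illustration, the input arrives through a std::istream, and the COMMAND column never contains spaces, so the PID is always the second whitespace-separated field.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): collects the distinct
// PIDs found in the second column of the listing read from `in`.
std::vector<int> collectUniquePids(std::istream& in)
{
    std::set<int> pidSet;
    std::string line;
    while (std::getline(in, line))                 // step 1: read a line
    {
        std::istringstream fields(line);           // step 2: split on whitespace
        std::string command;
        int pid;
        // Step 3: the extraction fails on the header row ("PID" is not a
        // number) and on blank lines, so those lines are simply skipped.
        if (fields >> command >> pid)
            pidSet.insert(pid);                    // step 4: duplicates are ignored
    }
    // Step 6: std::set iterates in ascending order, so the result is sorted.
    return std::vector<int>(pidSet.begin(), pidSet.end());
}
```

To read from a file, open a std::ifstream (from <fstream>) and pass it to the function; the returned std::vector<int> plays the role of the array the question asks for.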
 

 

Alternatively, you could put the values directly into a std::vector and remove the duplicates afterwards with the standard algorithms: std::sort, followed by std::unique and erase.
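For the vector-based alternative, the usual standard-library idiom is std::sort followed by std::unique and erase. A minimal sketch; the function name removeDuplicatePids is an assumption made for illustration:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: removes duplicate PIDs from the vector in place.
void removeDuplicatePids(std::vector<int>& pids)
{
    // std::unique only collapses *adjacent* duplicates, so sort first.
    std::sort(pids.begin(), pids.end());
    // std::unique moves the distinct values to the front and returns the
    // new logical end; erase then drops the leftover tail.
    pids.erase(std::unique(pids.begin(), pids.end()), pids.end());
}
```

Note that this leaves the PIDs in ascending order rather than in the order they appeared in the listing.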
Comments
Tarun Batra at 7-Sep-12 9:46am
   
Sir, can you explain step 4 with pseudocode?
JackDingler at 7-Sep-12 10:49am
   
pidSet.insert(pid);   // std::set has insert(), not push_back()

Solution 2

If you only want the pid, you can use a std::map<> to ensure you have a unique set of values.
 
#include <map>
#include <utility>

// Key: the pid. The bool value is only a placeholder; the keys alone
// guarantee uniqueness.
std::map<int, bool> PidMap;

bool AddPid(int NewPid)
{
  // Check to see if the pid is already in the map
  if (PidMap.find(NewPid) != PidMap.end())
     return false;

  PidMap.insert(std::pair<int, bool>(NewPid, true));

  return true;
}

bool ProcessPids(void)
{
   std::map<int, bool>::iterator Iter = PidMap.begin();
   std::map<int, bool>::iterator End  = PidMap.end();
   for (; Iter != End; ++Iter)
   {
       int Pid = Iter->first;

       // Do stuff with the Pid here.
       (void)Pid;   // silences the unused-variable warning until then
   }

   return true;
}

Solution 3

Did you not learn something from the last time we went through this issue?
Remember, (virtually) all complex problems can be broken down into a series of smaller, simpler ones.
 
Often the important step is to identify where/how the original problem can be broken down into smaller tasks. This skill improves with experience, though it is also a necessary way to look at the problem in order to minimize the time taken to solve it and debug it.
 

Applying this technique to your current question may be a worthwhile exercise.
 
Summary of requirements/observations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. We need to extract the values from each line of a text file
2. These values may be duplicated - we only want to store a single instance of each value.
3. Since values may be duplicated, we have no direct way of determining the number of unique values, HOWEVER - we do know that it must be less than (or equal to) the number of lines in the input file.
 
Choices:
1. Management of duplicate values
---------------------------------
A. Only maintain a list of distinct values. This requires examining up to all of the previously stored values for each line of the input file that we process. Line 1 will need to be compared against 0 previous values. Line 1000 will need to be compared against up to 999 values: fewer if the PID is found early, 999 compares if it isn't found.
 
B. Retain all values, sort them, then copy 1 instance of each distinct value found to a new list.
 
C. Use an STL data structure that will take care of duplicates for you (std::map comes to mind, though I've never used it; std::set would also work).
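Choice A can be sketched with a linear search over the values stored so far. The helper name addIfNew is made up for illustration, and the sketch assumes the distinct values live in a std::vector<int>:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper for choice A: appends `pid` only if it is not
// already stored. Returns true when the value was added.
bool addIfNew(std::vector<int>& distinctPids, int pid)
{
    // Linear search: in the worst case every stored value is compared.
    if (std::find(distinctPids.begin(), distinctPids.end(), pid) != distinctPids.end())
        return false;
    distinctPids.push_back(pid);
    return true;
}
```

Unlike the sort-based choice B, this keeps the PIDs in the order in which they were first seen, at the cost of one search per input line.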
 

2. Extraction of data items
---------------------------
A. tokenize the line, extract and use the tokens that you need - more complex & robust
B. extract the required elements directly - simpler and more error-prone.
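Choice A of the extraction step might look like the sketch below, which splits a line on whitespace using std::istringstream. The name tokenizeLine is an assumption for illustration; the PID would then be the token at index 1, provided the COMMAND column contains no spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line of the listing into
// whitespace-separated tokens.
std::vector<std::string> tokenizeLine(const std::string& line)
{
    std::vector<std::string> tokens;
    std::istringstream stream(line);
    std::string token;
    while (stream >> token)      // operator>> skips runs of spaces and tabs
        tokens.push_back(token);
    return tokens;
}
```

Keep in mind that a NAME such as "/dev/__null__ (deleted)" contains a space and therefore ends up as two tokens, so only the leading columns can be addressed reliably by index.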
 

 
Here's some code I smashed together while the kettle was boiling.
It will
(a) count the number of lines in the file
(b) display the COMMAND and PID columns of each line
 
You'll have to work out your preferred method for each of the 2 choices I've outlined above.
Good luck!
 
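#include <cstdio>

// Returns the number of lines in the file, or -1 if it cannot be opened.
int countTextFileLines(const char *filename)
{
    int i=0;
    char lineBuffer[1024];
    FILE *fp = fopen(filename, "rt");
    if (fp == NULL)
        return -1;
    while (fgets(lineBuffer, 1024, fp))
        i++;
    fclose(fp);
    return i;
}

// Prints the COMMAND and PID columns of every line in the file.
void displayPIDcolumn(const char *filename)
{
    int pID;
    char lineBuffer[1024], cmdBuffer[20];
    FILE *fp = fopen(filename, "rt");
    if (fp == NULL)
        return;
    while (fgets(lineBuffer, 1024, fp))
    {
        // NOTE: sscanf's return value is not checked, so a line without a
        // numeric PID (e.g. the header row) prints whatever pID last held.
        sscanf(lineBuffer, "%9s %5d", cmdBuffer, &pID);
        printf("CMD: %s   PID: %d\n", cmdBuffer, pID);
    }
    fclose(fp);
}

int main()
{
    const char *filename = "topListing.txt";
    printf("%s has %d lines.\n", filename, countTextFileLines(filename));
    displayPIDcolumn(filename);

    return 0;
}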
Comments
Tarun Batra at 10-Sep-12 10:55am
   
Sir, your code works correctly, but it shows garbage values in the first two lines:
C:\NEWlsof1.txt has 1612 lines.
CMD: COMMAND PID: -858993460
CMD: COMMAND PID: -858993460
CMD: init PID: 1
CMD: init PID: 1
How do I remove these?
enhzflep at 10-Sep-12 13:12pm
   
By cleaning up your input file, or by having a think about the problem. You're printing 2 lines of garbage (it looks like your header line is repeated). The garbage PID appears because sscanf cannot convert the text "PID" to an integer, so pID is never assigned for those lines. So, how do we get the data that is displayed for each line? Simple: with 'fgets'. And how do we ignore the first 2 lines? Also simple: call fgets 2 times before you start actually printing the data.
 
Of course, you could just delete the first 2 lines from the text file. Being a listing from 'top', though, it seems likely that you want to get this data 'on the fly'. In that case, it may be a better idea to throw away a couple of lines by calling fgets without actually doing anything with the result.
 
Incidentally, when I ran the code using the example listing of output from 'top' (I forget which post it was in now), I got 1 junk line. The code below (obviously?) just throws away the first line. Cheers
 

void displayPIDcolumn(const char *filename)
{
    int pID;
    char lineBuffer[1024], cmdBuffer[20];
    FILE *fp = fopen(filename, "rt");
    if (fp == NULL)
        return;

    fgets(lineBuffer, 1024, fp); // read a line and throw it away - this is the header-row

    while (fgets(lineBuffer, 1024, fp))
    {
        sscanf(lineBuffer, "%9s %5d", cmdBuffer, &pID);
        printf("CMD: %s PID: %d\n", cmdBuffer, pID);
    }
    fclose(fp);
}

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Web01 | 2.8.1411023.1 | Last Updated 7 Sep 2012
Copyright © CodeProject, 1999-2014
All Rights Reserved. Terms of Service