Hi. I would like to ask for some help with a utility I am working on.
I want to extract lines from a log file according to their date and time and their IP address.
This is an example of the file. Each line holds the IP address, the date and time, and the GET request:


172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0" 304 -
10.144.100.63 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_lite/files/iv.key HTTP/1.0" 200 114

I don't know how to parse each line into three parts so that I can filter on each one.
So far I have only managed to output the lines that contain "172.21.128.221", using this code:

using (StreamReader reader = new StreamReader(txtSource))
{
    using (StreamWriter writer = new StreamWriter(txtOutput))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Contains(Filter.Text)) // Filter.Text = "172.21.128.221"
            {
                writer.WriteLine(line);
                counter++;
                dr[col1] = line;
            }
        }
        dt.Rows.Add(dr);
        dgvResult.DataSource = dt;
    }
}

Without knowing more about your data and what you want to do with it, I would suggest a regex. Assuming you have already broken your text into lines:
C#
public static Regex regex = new Regex("(?<IPAddr>\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?\\[(?<Date>.*?)\\] (?<Action>.*)",
                                      RegexOptions.CultureInvariant | RegexOptions.Compiled);

    ...
    string InputText = "172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] \"GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0\" 304 -";
    Match m = regex.Match(InputText);
    // Read the groups only when the line matched and its IP equals the filter text.
    if (m.Success && m.Groups["IPAddr"].Value == Filter.Text)
    {
        string date = m.Groups["Date"].Value;
        string action = m.Groups["Action"].Value;
        ...
    }


"Oh. I'm sorry. Here.
Given that I have this data:


172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0" 304 -
10.144.100.63 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_lite/files/iv.key HTTP/1.0" 200 114 
10.144.100.64 - - [22/Jan/2013:16:00:00 +0900] "POST /iv/iv_lite/files/iv.key HTTP/1.0" 200 114 

and I would like to extract the data with the Date: [22/Jan/2013:16:00:00 +0900] and the Method: GET

How would I make my condition?"


Try:
(?<IPAddr>\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?\\[(?<Date>.*?)\\]\\s\"(?<Method>\\w+)(?<Action>.*)

You will then get a Group called "Method" added to your Match
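
As a rough sketch of how the condition on Date and Method could look with that pattern: the class name LogFilter and the method ByDateAndMethod are made up for illustration, and the pattern is written as a verbatim string so the backslashes are not doubled. Note that the Date group holds the text between the square brackets, so the filter value is given without the brackets.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class LogFilter
{
    // Same pattern as above, as a verbatim string ("" stands for one double quote).
    private static readonly Regex LineRegex = new Regex(
        @"(?<IPAddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?\[(?<Date>.*?)\]\s""(?<Method>\w+)(?<Action>.*)",
        RegexOptions.CultureInvariant | RegexOptions.Compiled);

    // Returns the lines whose Date and Method groups both equal the given values.
    public static List<string> ByDateAndMethod(IEnumerable<string> lines, string date, string method)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            Match m = LineRegex.Match(line);
            if (m.Success
                && m.Groups["Date"].Value == date
                && m.Groups["Method"].Value == method)
            {
                result.Add(line);
            }
        }
        return result;
    }
}
```

Calling LogFilter.ByDateAndMethod(lines, "22/Jan/2013:16:00:00 +0900", "GET") on the three sample lines should keep the two GET lines and drop the POST line.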

Get a copy of Expresso - it's free, and it examines and generates regular expressions.
 
Comments
Maciej Los 12-Feb-13 15:11pm    
Good work! +5!
lee gee 12-Feb-13 22:16pm    
Thank you so much for the reply. :)
This also works too.
OriginalGriff 13-Feb-13 4:09am    
You're welcome!
lee gee 13-Feb-13 1:20am    
How can I make a condition if I want to filter the IP and the METHOD?
OriginalGriff 13-Feb-13 4:10am    
What parts are you looking for?
Try giving an example - remember I can't see your screen, and have no idea what terms you use for what - and I hate guessing! :laugh:

To parse the parts of a single line, you can use a regular expression like this:
^(?<ip>\S*) - - \[(?<date>.*)\] "(?<method>\S+) (?<url>.+)" (?<code>\d+) (?<tail>.+)

The parts are then available as the named groups ip, date, method, url, code and tail, like this:
C#
if (line.StartsWith("172.21.128.221"))
{
    var match = ParseLineRegex.Match(line);

    if (match.Success)
    {
        var ip = match.Groups["ip"].Value;
        var date = match.Groups["date"].Value;
        // and so on...
    }
}

For performance reasons, make sure that your regex instance is created only once and is compiled:
C#
static readonly Regex ParseLineRegex = new Regex(@"^(?<ip>\S*)...", RegexOptions.Compiled);
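
For the follow-up question about filtering on both the IP and the method, one possible sketch is below. It uses the full pattern given earlier in this answer; the class name AccessLog and the method Matches are assumptions added for illustration, and comparing the method case-insensitively is a choice made here, not something the thread specifies.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class AccessLog
{
    // Created once and compiled, as recommended above.
    private static readonly Regex ParseLineRegex = new Regex(
        @"^(?<ip>\S*) - - \[(?<date>.*)\] ""(?<method>\S+) (?<url>.+)"" (?<code>\d+) (?<tail>.+)",
        RegexOptions.Compiled);

    // True when the line parses and both its ip and method groups match the filters.
    public static bool Matches(string line, string ip, string method)
    {
        Match match = ParseLineRegex.Match(line);
        return match.Success
            && match.Groups["ip"].Value == ip
            && string.Equals(match.Groups["method"].Value, method,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside the existing ReadLine loop, the condition would then become something like if (AccessLog.Matches(line, "172.21.128.221", "GET")), with the two filter values taken from wherever the utility stores them.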
 
Comments
Maciej Los 12-Feb-13 15:11pm    
Fully Professional! +5!
Matej Hlatky 12-Feb-13 15:17pm    
Thank you!
lee gee 12-Feb-13 21:58pm    
Hi. Thank you for the reply. I will try it right away. :)
lee gee 12-Feb-13 22:15pm    
Thank you so much for the help! It really works :)
I hope you can still help me as I go with my utility.
lee gee 13-Feb-13 1:19am    
How can I make a condition if I want to filter the IP and the METHOD?

The flow of your code could be as follows:
1. Start reading the file line by line (you are already doing that).
2. Read each line character by character:
foreach (char ch in strLine)
{
}
3. Keep appending characters to a string until you reach a space.
4. As soon as you reach a space, save the appended string in the IP address variable and reset the appended string to blank.
5. Now skip all characters until you reach an opening square bracket '['.
6. Start appending again until you reach a closing square bracket ']'.
7. As soon as you reach ']', save the appended string in the dtDatetime variable and reset the appended string to blank.
8. Now skip all characters until you reach a double quote '"'.
9. Start appending again until you reach the next double quote '"'.
10. As soon as you reach '"' the second time, save the appended string as getValue.
11. Repeat steps 2 to 10 until the end of the file.
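
The steps above could be sketched roughly as follows for a single line. The class name LineScanner, the method TryParse and the numeric state variable are assumptions made for this illustration; the outer file loop (steps 1 and 11) is the ReadLine loop you already have.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class LineScanner
{
    // Splits one log line into IP, date/time and quoted request by scanning characters.
    public static bool TryParse(string line, out string ip, out string dateTime, out string request)
    {
        ip = "";
        dateTime = "";
        request = "";
        var buffer = new StringBuilder();
        // 0: reading IP, 1: skipping to '[', 2: reading date,
        // 3: skipping to '"', 4: reading request
        int state = 0;

        foreach (char ch in line)
        {
            switch (state)
            {
                case 0: // steps 3 and 4
                    if (ch == ' ') { ip = buffer.ToString(); buffer.Clear(); state = 1; }
                    else buffer.Append(ch);
                    break;
                case 1: // step 5
                    if (ch == '[') state = 2;
                    break;
                case 2: // steps 6 and 7
                    if (ch == ']') { dateTime = buffer.ToString(); buffer.Clear(); state = 3; }
                    else buffer.Append(ch);
                    break;
                case 3: // step 8
                    if (ch == '"') state = 4;
                    break;
                case 4: // steps 9 and 10
                    if (ch == '"') { request = buffer.ToString(); return true; }
                    buffer.Append(ch);
                    break;
            }
        }
        return false; // the line ended before all three parts were found
    }
}
```

This avoids regular expressions entirely, at the cost of more code; lines that do not follow the expected layout simply make TryParse return false.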

Hope this will help you.

~Amol
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


