See more: C#3.0 C#
Hi. I would like to ask for some help with a utility I am currently working on.
I want to extract lines from a log file according to date and time and IP address.
This is an example of the file (IP address, date and time, GET value):
 

172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0" 304 -
10.144.100.63 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_lite/files/iv.key HTTP/1.0" 200 114
 
I don't know how to parse each line into 3 parts so that I can filter on each of them.
What I have done so far only outputs a line if it contains "172.21.128.221",
 
which is:
 
using (StreamReader reader = new StreamReader(txtSource))
using (StreamWriter writer = new StreamWriter(txtOutput))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        if (line.Contains(Filter.Text)) // Filter.Text = "172.21.128.221"
        {
            writer.WriteLine(line);
            counter++;
            // Create and add one row per matching line; adding a single row
            // after the loop would keep only the last match.
            dr = dt.NewRow();
            dr[col1] = line;
            dt.Rows.Add(dr);
        }
    }
    dgvResult.DataSource = dt;
}
Posted 11-Feb-13 22:38pm
lee gee372

Solution 2

Without knowing more about your data and what you want to do with it, I would suggest a regex. Assuming you have already broken your text into lines:
public static Regex regex = new Regex("(?<IPAddr>\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?\\[(?<Date>.*?)\\] (?<Action>.*)",
                                      RegexOptions.CultureInvariant | RegexOptions.Compiled);
 
...
string InputText = "172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] \"GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0\" 304 -";
Match m = regex.Match(InputText);
// Check m.Success first so that lines which do not match the pattern are skipped.
if (m.Success && m.Groups["IPAddr"].Value == Filter.Text)
{
    string date = m.Groups["Date"].Value;
    string action = m.Groups["Action"].Value;
    ...
}
 
"Oh. I'm sorry. Here.
Given that I have this data:

 
172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0" 304 -
10.144.100.63 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_lite/files/iv.key HTTP/1.0" 200 114 
10.144.100.64 - - [22/Jan/2013:16:00:00 +0900] "POST /iv/iv_lite/files/iv.key HTTP/1.0" 200 114 
and I would like to extract the data with the Date: [22/Jan/2013:16:00:00 +0900] and the Method: GET
 
How would I make my condition?"

 
Try:
(?<IPAddr>\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?\\[(?<Date>.*?)\\]\\s\"(?<Method>\\w+)(?<Action>.*)
You will then get a Group called "Method" added to your Match, which you can compare against "GET" in the same way as the IP address.
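As a rough sketch of how that condition could look, the following applies the pattern above line by line and keeps only the lines whose Date and Method groups equal the requested values. The class name LogFilter and the method FilterByDateAndMethod are made up for this illustration, and the pattern is written as a verbatim string so the backslashes do not need doubling. Note that the Date group does not include the square brackets.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are assumptions, not part of the original answer.
public static class LogFilter
{
    // Same pattern as above, as a verbatim string ("" is an escaped double quote).
    private static readonly Regex LineRegex = new Regex(
        @"(?<IPAddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?\[(?<Date>.*?)\]\s""(?<Method>\w+)(?<Action>.*)",
        RegexOptions.CultureInvariant | RegexOptions.Compiled);

    // Returns the lines whose Date and Method groups equal the given values.
    public static List<string> FilterByDateAndMethod(
        IEnumerable<string> lines, string date, string method)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            Match m = LineRegex.Match(line);
            if (m.Success
                && m.Groups["Date"].Value == date
                && m.Groups["Method"].Value == method)
            {
                result.Add(line);
            }
        }
        return result;
    }
}
```

With the three sample lines, calling LogFilter.FilterByDateAndMethod(lines, "22/Jan/2013:16:00:00 +0900", "GET") should return the two GET lines and drop the POST line.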
 
Get a copy of Expresso - it's free, and it examines and generates regular expressions.
Comments
Maciej Los at 12-Feb-13 15:11pm
   
Good work! +5!
lee gee at 12-Feb-13 22:16pm
   
Thank you so much for the reply. :)
This works too.
OriginalGriff at 13-Feb-13 4:09am
   
You're welcome!
lee gee at 13-Feb-13 1:20am
   
How can I make a condition if I want to filter the IP and the METHOD?
OriginalGriff at 13-Feb-13 4:10am
   
What parts are you looking for?
Try giving an example - remember I can't see your screen, and have no idea what terms you use for what - and I hate guessing! :laugh:
lee gee at 13-Feb-13 4:21am
   
Oh. I'm sorry. Here.
Given that I have this data:
 
172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0" 304 -
10.144.100.63 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_lite/files/iv.key HTTP/1.0" 200 114
10.144.100.64 - - [22/Jan/2013:16:00:00 +0900] "POST /iv/iv_lite/files/iv.key HTTP/1.0" 200 114
 
and I would like to extract the data with the Date: [22/Jan/2013:16:00:00 +0900] and the Method: GET
 
How would I make my condition?
 
Expected Output:
172.21.128.221 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_1_5_1/js/annt/con_annt_resize.js?v=200711220 HTTP/1.0" 304 -
10.144.100.63 - - [22/Jan/2013:16:00:00 +0900] "GET /iv/iv_lite/files/iv.key HTTP/1.0" 200 114
OriginalGriff at 13-Feb-13 4:34am
   
Answer updated
OriginalGriff at 15-Feb-13 3:56am
   
No idea - what are you trying to extract?
(Did you get a copy of Expresso? It can help you do this and may be faster than waiting for a response here.)

Solution 3

To parse the parts of a single line, you can use a regular expression like this:
^(?<ip>\S*) - - \[(?<date>.*)\] "(?<method>\S+) (?<url>.+)" (?<code>\d+) (?<tail>.+)
Parts will be accessible under ip, date, method, url, code and tail named groups like this:
if (line.StartsWith("172.21.128.221"))
{
    var match = ParseLineRegex.Match(line);
 
    if (match.Success)
    {
        var ip = match.Groups["ip"].Value;
        var date = match.Groups["date"].Value;
        // and so on...
    }
}
For performance reasons, make sure that your regex instance is created only once and is compiled:
static readonly Regex ParseLineRegex = new Regex(@"^(?<ip>\S*)...", RegexOptions.Compiled);
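Put together, a minimal sketch of this approach might look like the following. It uses the full pattern shown earlier in this solution; the class name AccessLogParser and the TryParse signature are assumptions made for this example only.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper; the names are assumptions, not part of the original answer.
public static class AccessLogParser
{
    // Created once and compiled. In a verbatim string, "" stands for one double quote.
    private static readonly Regex ParseLineRegex = new Regex(
        @"^(?<ip>\S*) - - \[(?<date>.*)\] ""(?<method>\S+) (?<url>.+)"" (?<code>\d+) (?<tail>.+)",
        RegexOptions.Compiled);

    // Returns true when the line matches; the out values are empty strings otherwise.
    public static bool TryParse(string line, out string ip, out string date,
                                out string method, out string url, out string code)
    {
        Match match = ParseLineRegex.Match(line);
        ip = match.Groups["ip"].Value;
        date = match.Groups["date"].Value;
        method = match.Groups["method"].Value;
        url = match.Groups["url"].Value;
        code = match.Groups["code"].Value;
        return match.Success;
    }
}
```

Because the url group runs up to the closing double quote, it also contains the protocol part (for example "HTTP/1.0") after the path.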
Comments
Maciej Los at 12-Feb-13 15:11pm
   
Fully Professional! +5!
Matej Hlatky at 12-Feb-13 15:17pm
   
Thank you!
lee gee at 12-Feb-13 21:58pm
   
Hi. Thank you for the reply. I will try it right away. :)
lee gee at 12-Feb-13 22:15pm
   
Thank you so much for the help! It really works :)
I hope you can still help me as I go with my utility.
lee gee at 13-Feb-13 1:19am
   
How can I make a condition if I want to filter the IP and the METHOD?
Matej Hlatky at 13-Feb-13 4:51am
   
You can generate your Regex pattern dynamically before the main loop. Just remember to use Regex.Escape(string) on the input (IP, method or whatever) filter.
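A minimal sketch of that idea, assuming the same line layout as in the solution above: the IP and method filters are escaped and spliced into the pattern in place of the ip and method groups. The names DynamicLogFilter and BuildFilter are invented for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the names are assumptions, not part of the original comment.
public static class DynamicLogFilter
{
    // Builds a regex that only matches lines with exactly this IP and HTTP method.
    // Regex.Escape keeps characters such as '.' in the IP from acting as wildcards.
    public static Regex BuildFilter(string ip, string method)
    {
        string pattern =
            "^" + Regex.Escape(ip) +
            @" - - \[(?<date>.*)\] """ + Regex.Escape(method) +
            @" (?<url>.+)"" (?<code>\d+) (?<tail>.+)";
        return new Regex(pattern, RegexOptions.Compiled);
    }
}
```

Build the filter once before the main loop, for example var filter = DynamicLogFilter.BuildFilter("172.21.128.221", "GET");, and then test each line with filter.IsMatch(line) or filter.Match(line).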
lee gee at 13-Feb-13 20:38pm
   
How do I parse this one?
 
2012:10:19 07:24:19 - DB ACCESS: db_query() - SELECT TEMPLATE_X, TEMPLATE_Y, TEMPLATE_W, TEMPLATE_H FROM TBL_CM_WTO WHERE IMG_URL='HTTP://APOST0011:9081/WTO-CCARD-SP-3//B100/SS001/104/2012/10/19/0000803557//SS001-2012101903820.jpg?SYSID=104&TxnCd=LO000000000000246773&QID=QMANULMASK&TOKDOCCODE=NA&TOKUSERID=NA&TOKEN=CC67DC70C4F2B5A5E04013AC3F6263E3&docTDesc= Application_form&localeId=EN&Orientation=90&UserCD=961887&' AND API_OP_CODE = 0;
 
2012:10:19 07:24:19 - HTTP PARAMS: SYSTEM_ID = 104& DOCUMENT_ID = AP000000000000255686& DOCUMENT_TYPE_CODE = Application& TEMPLATE_ID = GLF_F& BATCH_CODE = 00000000000000682499& IMAGE_ID = 00000000000001001967& IMAGE_URL = HTTP://APOST0011:9081/WTO-CCARD-SP-3//B100/SS001/104/2012/10/19/0000803557//SS001-2012101903820.jpg?SYSID=104&TxnCd=LO000000000000246773&QID=QMANULMASK&TOKDOCCODE=NA&TOKUSERID=NA&TOKEN=CC67DC70C4F2B5A5E04013AC3F6263E3&docTDesc= Application_form&localeId=EN&Orientation=90&UserCD=961887&& ORIENTATION = 90& USER_CD = 961887& LOCALE = EN
 
2012:10:19 07:44:46 - DB ACCESS: db_query() - SELECT TEMPLATE_X, TEMPLATE_Y, TEMPLATE_W, TEMPLATE_H FROM TBL_CM_WTO WHERE IMG_URL='http://apost0011:9081/WTO-CCARD-SP-3//B100/SS001/104/2012/10/19/0000803753//SS001-2012101903830.jpg?' AND API_OP_CODE = 0;
 
2012:10:19 07:44:46 - HTTP PARAMS: SYSTEM_ID = 104& DOCUMENT_ID = AP000000000000255689& DOCUMENT_TYPE_CODE = & TEMPLATE_ID = & BATCH_CODE = & IMAGE_ID = & IMAGE_URL = http://apost0011:9081/WTO-CCARD-SP-3//B100/SS001/104/2012/10/19/0000803753//SS001-2012101903830.jpg?& ORIENTATION = & USER_CD = & LOCALE =

Solution 1

Your code could flow as follows:
1. Start reading the file line by line (you are already doing that).
2. Read each line character by character:
foreach (char ch in strLine)
{
}
3. Append each character to a string until you reach a space.
4. As soon as you reach a space, save the appended string in the IP address variable and reset the appended string to blank.
5. Now skip all characters until you reach an opening square bracket '['.
6. Start appending again until you reach a closing square bracket ']'.
7. As soon as you reach ']', save the appended string in the dtDatetime variable and reset the appended string to blank.
8. Now skip all characters until you reach a double quote '"'.
9. Start appending again until you reach the next double quote '"'.
10. As soon as you reach '"' the second time, save the appended string as getValue.
11. Repeat steps 2 to 10 for every line until the end of the file.
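
The steps above could be sketched roughly as follows for a single line (steps 2 to 10). The class name CharScanParser, the TryParse signature and the numeric state variable are assumptions made for this illustration; the caller would run it once per line read from the file.

```csharp
using System.Text;

// Hypothetical helper; the names are assumptions, not part of the original answer.
public static class CharScanParser
{
    // Walks the line once and collects the text before the first space (IP),
    // between '[' and ']' (date/time) and between the first pair of double quotes.
    public static bool TryParse(string line, out string ip, out string dateTime, out string getValue)
    {
        ip = dateTime = getValue = null;
        var buffer = new StringBuilder();
        // 0: reading IP, 1: skipping to '[', 2: reading date,
        // 3: skipping to '"', 4: reading the quoted request
        int state = 0;

        foreach (char ch in line)
        {
            switch (state)
            {
                case 0:
                    if (ch == ' ') { ip = buffer.ToString(); buffer.Length = 0; state = 1; }
                    else buffer.Append(ch);
                    break;
                case 1:
                    if (ch == '[') state = 2;
                    break;
                case 2:
                    if (ch == ']') { dateTime = buffer.ToString(); buffer.Length = 0; state = 3; }
                    else buffer.Append(ch);
                    break;
                case 3:
                    if (ch == '"') state = 4;
                    break;
                case 4:
                    if (ch == '"') { getValue = buffer.ToString(); return true; }
                    buffer.Append(ch);
                    break;
            }
        }
        return false; // the line ended before all three parts were found
    }
}
```

Compared with a regex, this makes a single pass with no backtracking, but it has to be changed by hand whenever the line layout changes.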
 
Hope this will help you.
 
~Amol

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 13 Feb 2013
Copyright © CodeProject, 1999-2014
All Rights Reserved. Terms of Service