 |
Are you going to do a VS2010 update of this code? If not, I would like to publish a VS2010 version on CodePlex under the MS-PL, citing you as the original author.
 |
The project is now on CodePlex at http://wingrep.codeplex.com/. The code is now StyleCop compliant. I am looking at making it FxCop compliant as well.
modified 17 Feb '12.
 |
Hello! Any ideas on implementing text search within ZIP files?
We have a huge set of log files that have been zipped to save space. We now want to implement a text search program to search text within these zipped files.
Thanks,
Swapneel
 |
I used this code in one of my projects to open zipped log files and search them. It uses vjslib (the Visual J# class library); add it as a reference in your C# project.
Here is the code:
using System;
using System.Collections.Generic;
using System.IO;
using java.util;
using java.util.zip;
using java.io;

namespace PosTracker
{
    // Inflates every file entry of a ZIP archive into a destination folder.
    // Requires a reference to vjslib for the java.* namespaces.
    class ZipReader
    {
        string destinationPath;
        string zipFilePath;

        public ZipReader(string zipFilePath, string destPath)
        {
            this.destinationPath = destPath;
            this.zipFilePath = zipFilePath;
            ZipFile zipfile = new ZipFile(this.zipFilePath);
            try
            {
                List<ZipEntry> zipFiles = GetZipFiles(zipfile);
                foreach (ZipEntry zipFile in zipFiles)
                {
                    if (!zipFile.isDirectory())
                    {
                        InputStream s = zipfile.getInputStream(zipFile);
                        try
                        {
                            string outFile = Path.Combine(destinationPath, Path.GetFileName(zipFile.getName()));
                            // if the outfile already exists
                            // it means that we already have the inflated copy of the zip file
                            if (System.IO.File.Exists(outFile))
                                continue;
                            FileOutputStream dest = new FileOutputStream(outFile);
                            try
                            {
                                // copy the entry to disk in 7 KB chunks
                                int len = 0;
                                sbyte[] buffer = new sbyte[7168];
                                while ((len = s.read(buffer)) >= 0)
                                {
                                    dest.write(buffer, 0, len);
                                }
                            }
                            finally
                            {
                                dest.close();
                            }
                        }
                        finally
                        {
                            s.close();
                        }
                    }
                }
            }
            finally
            {
                // release the handle on the zip file itself
                zipfile.close();
            }
        }

        // Copies the archive's entries into a strongly typed list.
        private List<ZipEntry> GetZipFiles(ZipFile zipfil)
        {
            List<ZipEntry> lstZip = new List<ZipEntry>();
            Enumeration zipEnum = zipfil.entries();
            while (zipEnum.hasMoreElements())
            {
                ZipEntry zip = (ZipEntry)zipEnum.nextElement();
                lstZip.Add(zip);
            }
            return lstZip;
        }
    }
}
Vikas Singh
Good DAY
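For readers on .NET Framework 4.5 or later, a similar search may be possible without vjslib and without inflating the files to disk, using the built-in System.IO.Compression.ZipArchive class (add a reference to the System.IO.Compression assembly). What follows is only a sketch under that assumption; the ZipGrep class and its SearchZip method are hypothetical names and not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
public static class ZipGrep
{
    // Returns one "entry:lineNumber: text" string for every line,
    // in every file entry of the archive, that matches the pattern.
    public static List<string> SearchZip(Stream zipStream, string pattern)
    {
        List<string> hits = new List<string>();
        Regex regex = new Regex(pattern);
        using (ZipArchive archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (entry.Name.Length == 0)
                    continue;

                // Read the entry straight from the archive, line by line.
                using (StreamReader reader = new StreamReader(entry.Open()))
                {
                    string line;
                    int lineNo = 0;
                    while ((line = reader.ReadLine()) != null)
                    {
                        lineNo++;
                        if (regex.IsMatch(line))
                            hits.Add(entry.FullName + ":" + lineNo + ": " + line);
                    }
                }
            }
        }
        return hits;
    }
}
```

It could be called with something like `using (FileStream fs = File.OpenRead(zipPath)) { ZipGrep.SearchZip(fs, "ERROR"); }`, where zipPath is whatever archive you want to scan.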
 |
In the console grep, to speed things up a bit, just compile your regex and move some of the code outside the while(enm.MoveNext()) loop. Find the following code in the while loop:
    //Using Regular Expressions as a real Grep
    Match mtch;
    if(m_bIgnoreCase == true)
        mtch = Regex.Match(strLine, m_strRegEx, RegexOptions.IgnoreCase);
    else
        mtch = Regex.Match(strLine, m_strRegEx);
    if(mtch.Success == true)
    {
Then replace it with this:
    //Using Regular Expressions as a real Grep
    Match mtch = _xregex.Match( strLine );
    if ( mtch.Success == true)
    {
Add the following above the while loop:
    RegexOptions options = RegexOptions.Compiled;
    if (m_bIgnoreCase == true)
        options |= RegexOptions.IgnoreCase;
    Regex _xregex = new Regex(m_strRegEx, options);
Using the original console grep, starting from the root of my C: drive, with the following search, I find 6500+ files in about 2 minutes 40 seconds: grep /c /i /n /r /E:watch /F:*.cs
Using the compiled regex version it takes 16 seconds.
Also, wrap the body of the while loop in a try-catch (catch(Exception ex)) to handle any unanticipated issues.
Do the same sort of thing to the GetFiles routine - add a try-catch there too.
I hope this helps.
 |
Could I get a copy of your modified code to speed this app up? I know this is an old post and you may not have it around anymore. Thanks in advance.
Dale Sides
 |
Hi George,
nice program, but I changed the output:
1. Write each line immediately (it took too long for the first result to appear on a 14 MB file).
2. I changed Console.WriteLine to Console.Write because I was getting one line feed too many.
Thank you
Detlef
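For anyone who wants to try the same change, here is a minimal sketch of writing each hit as soon as it is found instead of collecting everything in one results string. It is not Detlef's actual modification; StreamingGrep and PrintMatches are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's code.
public static class StreamingGrep
{
    // Writes each matching line to the output as soon as it is found
    // and returns the number of matching lines.
    public static int PrintMatches(TextReader input, Regex regex, TextWriter output)
    {
        int count = 0;
        string line;
        while ((line = input.ReadLine()) != null)
        {
            if (regex.IsMatch(line))
            {
                // ReadLine strips the line terminator, so WriteLine
                // puts exactly one line break back.
                output.WriteLine(line);
                count++;
            }
        }
        return count;
    }
}
```

Passing Console.Out as the output makes the first hit appear immediately, even on a large file. Because every line is written exactly once with a single terminator, the doubled line feed should not occur with this approach.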
 |
There is a bug with System.UnauthorizedAccessException: if someone selects c:\ with Recursive on, the debug version reports that a System.UnauthorizedAccessException was not caught and exits the loop.
Any suggestions?
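One possible approach, sketched here on the assumption that the exception is thrown by Directory.GetFiles or Directory.GetDirectories when the recursive walk reaches a protected folder: catch UnauthorizedAccessException per directory and skip that directory. SafeFileWalker and CollectFiles are hypothetical names, not the article's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the article's code.
public static class SafeFileWalker
{
    // Adds all files under dir that match the wildcard pattern to the list.
    // Directories the current user may not read are skipped silently.
    public static void CollectFiles(string dir, string pattern, bool recursive, List<string> files)
    {
        try
        {
            files.AddRange(Directory.GetFiles(dir, pattern));
            if (recursive)
            {
                foreach (string sub in Directory.GetDirectories(dir))
                    CollectFiles(sub, pattern, true, files);
            }
        }
        catch (UnauthorizedAccessException)
        {
            // Access denied for this directory: skip it and carry on
            // with its siblings instead of aborting the whole search.
        }
    }
}
```

Because the try-catch sits inside the recursive method, one inaccessible folder only costs that folder; the rest of the drive is still searched.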
 |
It's probably because he accesses the GUI controls from a background thread - something you should never do.
I'd make these changes:
1) no threads, or else use the correct thread marshalling technique to update the GUI
2) use a finally to close the file
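Both suggestions could look roughly like this. It is only a sketch: it assumes the results go into a Windows Forms TextBox and needs a reference to System.Windows.Forms; UiHelpers, AppendTextSafe and CountLines are hypothetical names, not taken from the article.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

// Hypothetical helpers, not part of the article's code.
public static class UiHelpers
{
    // Appends text to a TextBox from any thread. When called from a
    // background thread, the call is marshalled to the UI thread first.
    public static void AppendTextSafe(TextBox box, string text)
    {
        if (box.InvokeRequired)
            box.Invoke(new Action<TextBox, string>(AppendTextSafe), box, text);
        else
            box.AppendText(text);
    }

    // Counts the lines of a file; the finally block closes the reader
    // even if reading throws.
    public static int CountLines(string path)
    {
        StreamReader reader = new StreamReader(path);
        try
        {
            int count = 0;
            while (reader.ReadLine() != null)
                count++;
            return count;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A using statement around the StreamReader would have the same effect as the explicit try/finally.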
 |
I've just downloaded the program, used it, and it works fine ! I'd also downloaded a shareware program that does about the same, but I'd have to buy it in 30 days. Your program takes 10 seconds to search and display results; shareware program takes 23 minutes. That's within a several GB file system. Gets my 5 !!
lurkb@t
 |
I have turned this code into a stand-alone class so that I can examine large log files from within an application.
If someone is interested, just send me a note.
--
Jean-Michel Bezeau
Computer Science Analyst
 |
Hello, can you share the code that you have written for large log files? I have a similar situation, and I also have a large number of log files to parse. Any suggestions are welcome.
Thanks,
Swapneel
 |
Yes I can, but I don't think I can include the code in a message here. So you need to send me an email and I will include the file in the reply.
--
Jean-Michel Bezeau
Computer Science Analyst
 |
Here is the source code for my class. I wrote it a long time ago and am no longer familiar with the code.
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions
Imports System.IO
Imports System.Security

'Traditionally grep stands for "Global Regular Expression Print".
'Global means that an entire file is searched.
'Regular Expression means that a regular expression string is used to establish a search pattern.
'Print means that the command will display its findings.
'Simply put, grep searches an entire file for the pattern you want and displays its findings.
'
'The use syntax is different from the traditional Unix syntax, I prefer a syntax similar to
'csc, the C# compiler.
'
' grep [/h|/H] - Usage Help
'
' grep [/c] [/i] [/l] [/n] [/r] [/p] /E:reg_exp /F:files
'
' /c - print a count of matching lines for each input file;
' /i - ignore case in pattern;
' /l - print just files (scanning will stop on first match);
' /n - prefix each line of output with line number;
' /r - recursive search in subdirectories;
' /p - consider expression as a string literal
'
' /E:reg_exp - the Regular Expression used as search pattern. The Regular Expression can be delimited by
' quotes like "..." and '...' if you want to include in it leading or trailing blanks;
'
' /F:files - the list of input files. The files can be separated by commas as in /F:file1,file2,file3
' and wildcards can be used for their specification as in /F:*file?.txt;
'
'Example:
'
' grep /c /n /r /E:" C Sharp " /F:*.cs

Namespace grep
    ''' <summary>
    ''' Wrapper class around the grep sample code from CodeProject.
    ''' </summary>
    Public Class grep
        'Option Flags
        Private m_bRecursive As Boolean
        Private m_bIgnoreCase As Boolean
        Private m_bJustFiles As Boolean
        Private m_bLineNumbers As Boolean
        Private m_bCountLines As Boolean
        Private m_bPlaintext As Boolean
        Private m_strRegEx As String
        Private m_vRegEx() As String
        Private m_strFiles As String
        Private m_strDir As String

        'ArrayList keeping the Files
        Private m_arrFiles As New ArrayList

        ' Result information
        'private ArrayList m_arrResultFiles = new ArrayList();
        'private ArrayList m_arrResultLineNumbers = new ArrayList();

        Public Event Display(ByVal strMessage As String)
        Public Event FilesFound(ByVal iNumber As Integer)
        Public Event DisplayHit(ByVal strFileName As String, ByVal iNbHit As Integer, ByVal strTrouve As String, ByVal strLigne As String)

        Public Sub New()
        End Sub 'New

        'Properties
        Public Property Recursive() As Boolean
            Get
                Return m_bRecursive
            End Get
            Set(ByVal Value As Boolean)
                m_bRecursive = Value
            End Set
        End Property

        Public Property IgnoreCase() As Boolean
            Get
                Return m_bIgnoreCase
            End Get
            Set(ByVal Value As Boolean)
                m_bIgnoreCase = Value
            End Set
        End Property

        Public Property JustFiles() As Boolean
            Get
                Return m_bJustFiles
            End Get
            Set(ByVal Value As Boolean)
                m_bJustFiles = Value
            End Set
        End Property

        Public Property LineNumbers() As Boolean
            Get
                Return m_bLineNumbers
            End Get
            Set(ByVal Value As Boolean)
                m_bLineNumbers = Value
            End Set
        End Property

        Public Property CountLines() As Boolean
            Get
                Return m_bCountLines
            End Get
            Set(ByVal Value As Boolean)
                m_bCountLines = Value
            End Set
        End Property

        Public Property Plaintext() As Boolean
            Get
                Return m_bPlaintext
            End Get
            Set(ByVal Value As Boolean)
                m_bPlaintext = Value
            End Set
        End Property

        'A single search expression; setting it clears the expression list
        Public Property Expression() As String
            Get
                Return m_strRegEx
            End Get
            Set(ByVal Value As String)
                m_vRegEx = Nothing
                m_strRegEx = Value
            End Set
        End Property

        'Several search expressions; setting them clears the single expression
        Public Property Expressions() As String()
            Get
                Return m_vRegEx
            End Get
            Set(ByVal Value As String())
                m_strRegEx = Nothing
                m_vRegEx = Value
            End Set
        End Property

        Public Property Files() As String
            Get
                Return m_strFiles
            End Get
            Set(ByVal Value As String)
                m_strFiles = Value
            End Set
        End Property

        Public Property Dir() As String
            Get
                Return m_strDir
            End Get
            Set(ByVal Value As String)
                m_strDir = Value
            End Set
        End Property

        'Build the list of Files
        Private Sub GetFiles(ByVal strDir As [String], ByVal strExt As [String], ByVal bRecursive As Boolean)
            'search pattern can include the wild characters '*' and '?'
            Dim fileList As String() = Directory.GetFiles(strDir, strExt)
            For i As Integer = 0 To fileList.Length - 1
                If File.Exists(fileList(i)) Then
                    m_arrFiles.Add(fileList(i))
                End If
            Next i
            If bRecursive = True Then
                'Get recursively from subdirectories
                Dim dirList As String() = Directory.GetDirectories(strDir)
                For i As Integer = 0 To dirList.Length - 1
                    GetFiles(dirList(i), strExt, True)
                Next i
            End If
        End Sub 'GetFiles

        'Search Function
        Public Sub Search()
            Dim strDir As [String] = m_strDir
            'First empty the list
            m_arrFiles.Clear()
            'Create recursively a list with all the files complying with the criteria
            Dim astrFiles As [String]() = m_strFiles.Split(New [Char]() {","c})
            For i As Integer = 0 To astrFiles.Length - 1
                'Eliminate white spaces
                astrFiles(i) = astrFiles(i).Trim()
                GetFiles(strDir, astrFiles(i), m_bRecursive)
            Next i
            'Now all the Files are in the ArrayList, open each one
            'iteratively and look for the search string
            Dim strResults As [String] = "Grep Results:" + ControlChars.Cr + ControlChars.Lf + ControlChars.Cr + ControlChars.Lf
            Dim strLine As [String]
            'Build the single expression once, outside the loops
            Dim exExpression As Regex = Nothing
            If m_vRegEx Is Nothing Then
                If m_bIgnoreCase = True Then
                    exExpression = New Regex(m_strRegEx, RegexOptions.IgnoreCase)
                Else
                    exExpression = New Regex(m_strRegEx)
                End If
            End If
            Dim iLine, iCount As Integer
            Dim bEmpty As Boolean = True
            Dim enm As IEnumerator = m_arrFiles.GetEnumerator()
            RaiseEvent FilesFound(m_arrFiles.Count)
            While enm.MoveNext()
                Try
                    Dim fi As New FileInfo(CStr(enm.Current))
                    'Using closes the reader even if an exception is thrown while reading
                    Using sr As New StreamReader(fi.OpenRead(), System.Text.Encoding.Default)
                        RaiseEvent DisplayHit(fi.Name, 0, Nothing, Nothing)
                        iLine = 0
                        iCount = 0
                        Dim bFirst As Boolean = True
                        strLine = sr.ReadLine()
                        While Not (strLine Is Nothing)
                            'Initialized explicitly: VB keeps the value of a loop-local
                            'variable between iterations unless it has an initializer
                            Dim mtch As Boolean = False
                            Dim found As String = Nothing
                            iLine += 1
                            If m_vRegEx Is Nothing Then
                                If m_bPlaintext Then
                                    'Note: plain-text matching is always case-sensitive here
                                    mtch = strLine.IndexOf(m_strRegEx) >= 0
                                Else
                                    'Using Regular Expressions as a real Grep
                                    mtch = exExpression.Match(strLine).Success
                                End If
                                found = m_strRegEx
                            Else
                                'Several expressions: stop at the first one that matches
                                For Each expr As String In m_vRegEx
                                    If m_bPlaintext Then
                                        mtch = strLine.IndexOf(expr) >= 0
                                    Else
                                        If m_bIgnoreCase = True Then
                                            mtch = Regex.Match(strLine, expr, RegexOptions.IgnoreCase).Success
                                        Else
                                            mtch = Regex.Match(strLine, expr).Success
                                        End If
                                    End If
                                    If mtch Then
                                        found = expr
                                        Exit For
                                    End If
                                Next
                            End If
                            'If m_bIgnoreCase = True Then
                            ' mtch = Regex.Match(strLine, m_strRegEx, RegexOptions.IgnoreCase)
                            'Else
                            ' mtch = Regex.Match(strLine, m_strRegEx)
                            'End If
                            If mtch Then
                                RaiseEvent DisplayHit(fi.Name, iCount, found, strLine)
                                bEmpty = False
                                iCount += 1
                                If bFirst = True Then
                                    If m_bJustFiles = True Then
                                        strResults += CStr(enm.Current) + ControlChars.Cr + ControlChars.Lf
                                        Exit While
                                    Else
                                        strResults += CStr(enm.Current) + ":" + ControlChars.Cr + ControlChars.Lf
                                    End If
                                    bFirst = False
                                End If
                                'Add the Line to Results string
                                If m_bLineNumbers = True Then
                                    strResults &= " " & iLine & ": " & strLine & ControlChars.Cr & ControlChars.Lf
                                Else
                                    strResults += " " + strLine + ControlChars.Cr + ControlChars.Lf
                                End If
                            End If
                            strLine = sr.ReadLine()
                        End While
                        If bFirst = False Then
                            If m_bCountLines = True Then
                                strResults &= " " & iCount & " Lines Matched" & ControlChars.Cr & ControlChars.Lf
                            End If
                            strResults &= ControlChars.Cr & ControlChars.Lf
                        End If
                    End Using
                Catch ex As Exception
                    DisplayMessage("Error: " & ex.Message)
                End Try
            End While
            If bEmpty = True Then
                DisplayMessage("No matches found!")
            Else
                DisplayMessage(strResults)
            End If
        End Sub 'Search

        Protected Sub DisplayMessage(ByVal strMessage As String)
            RaiseEvent Display(strMessage)
        End Sub 'DisplayMessage
    End Class 'grep
End Namespace 'grep
--
Jean-Michel Bezeau
Computer Science Analyst
 |
I have a problem with my C assignment. It requires us to search a file for a word or character entered by the user and display the word on the DOS screen. The options I have to support are:
f precedes each line with the name of the file in which the search string was found.
n precedes each line with its relative line number in the file.
i ignores the case of letters in pattern matching; that is, uppercase and lowercase in the input are considered identical.
v displays all lines except those that match the specified string (pattern). Useful for filtering unwanted lines out of a file.
Could someone out there please help me with this? If someone has such a program, please share it with me. It is due next week and I haven't started anything yet.
I would really appreciate it if someone could help me out with this.
Thank you.
 |
Hello,
Nice code.
I just found what I was looking for today.
By the way, I should post a modified version of the Command Line Parser, because with the help of remarks from others I managed to reduce the code to a third of its original size and fix a small bug. Look at the discussion threads for that article if you are curious.
But I'm happy it helped you make such a practical utility.
Cheers,
R. LOPES
Just programmer.
 |
Nice utility!
one bug:
The directory browse dialog won't let you select a directory (instead of
a file)
one suggestion:
add a checkbox to turn off regular expression matching so if the
user just wants to do an exact text search they don't have to
escape all the characters that are part of the RE syntax.
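The checkbox idea could be wired up with Regex.Escape, which turns the user's text into a pattern that matches it literally. A minimal sketch, assuming the checkbox state arrives as a bool; PatternBuilder and Build are hypothetical names, not part of the utility.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's code.
public static class PatternBuilder
{
    // Builds the Regex used for searching. When useRegex is false, the
    // user's text is escaped so that characters such as '.', '(' or '*'
    // are matched literally instead of being read as regex syntax.
    public static Regex Build(string userText, bool useRegex, bool ignoreCase)
    {
        string pattern = useRegex ? userText : Regex.Escape(userText);
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return new Regex(pattern, options);
    }
}
```

With the checkbox off, a search for a.b matches only the literal text "a.b"; with it on, the dot matches any character, so "axb" is a hit as well.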
 |
It's worth having a look at the ngrep tool that's included as a source sample with the Rotor code. It's a grep that is both recursive and can also do replacement in the files. Here is the online help:
ngrep: Regular expression searcher and replacer based on
the System.Text.RegularExpressions.Regex class.
Syntax: ngrep [-ismqleEt?] [-r replacePattern] pattern [files [files ...]]
Options:
-? show this help
-i ignore case
-s recurse into subdirectories
-m only show filenames, not matching lines
-q quiet: only show summary data
-l lines only: don't show filename/line number. Can't be used with -m
-e show only the part of the line matching the expression, not the whole line.
-E show original and replacement expressions (with -r). Expressions are separated with " -> "
-r replace all matches with the given pattern
-t test: pretend to replace, but don't modify the file. Use with -r.
Options may occur anywhere on the command line.
If no files are specified, stdin is used.
The combination of -etr is useful for printing only part of what was found. For
example,
grep Debug\.Assert\((\w+)\) -letr $1
on
Debug.Assert(excep);
will print only
excep
--
-Blake (com/bcdev/blake)
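The effect of the -etr combination can be reproduced with the Regex class directly. This is not ngrep's source, only a sketch of the same idea using Match.Result, which expands a replacement pattern such as $1 against a single match; MatchPart and Extract are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the idea, not taken from ngrep.
public static class MatchPart
{
    // Returns the replacement pattern expanded for the first match in
    // the line, or null when the line does not match at all.
    public static string Extract(string line, string pattern, string replacement)
    {
        Match m = Regex.Match(line, pattern);
        return m.Success ? m.Result(replacement) : null;
    }
}
```

Calling `MatchPart.Extract("Debug.Assert(excep);", @"Debug\.Assert\((\w+)\)", "$1")` returns "excep", the same text the help example prints.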
 |
A command line version of this would be pretty handy. I always have a console open for simple tasks, as it's usually easier for me to do them by typing rather than using a mouse.
Also, I recommend using the RegularExpression classes in .NET to provide broader functionality.
 |
I'm new to C#, so bear with me. I typed the following at the command line:
csc form1.cs
I get the following errors:
Form1.cs(7,18): error CS0234: The type or namespace name 'WinForms' does not
exist in the class or namespace 'System' (are you missing an assembly
reference?)
Form1.cs(15,39): error CS0234: The type or namespace name 'WinForms' does not
exist in the class or namespace 'System' (are you missing an assembly
reference?)
...
I don't have Visual Studio so can I compile this program without it?
 |
The article was written using a pre-release version of .NET; I will update the article soon. Meanwhile, you can fix the code by using System.Windows.Forms instead of System.WinForms. Cheers!
 |
I thought that the UNIX grep supports regular expressions (RXs)?
Since .NET has a RegularExpressions namespace, this should be rather easy to support, shouldn't it?
"Just" replace the IndexOf call with an RX query. Maybe the file must also be read completely rather than line by line.
--
See me: www.magerquark.de
Want a job? www.zeta-software.de/jobs
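Whether the file has to be read completely depends on the pattern. The sketch below illustrates the difference under the assumption that the text is already in a string; WholeFileSearch and its two methods are hypothetical names, not part of the article.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's code.
public static class WholeFileSearch
{
    // Counts the lines that contain a match (the classic grep approach).
    public static int CountLineMatches(string text, string pattern)
    {
        Regex regex = new Regex(pattern);
        int count = 0;
        using (StringReader reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (regex.IsMatch(line))
                    count++;
            }
        }
        return count;
    }

    // Counts the matches in the text as a whole, so a single match
    // may span several lines.
    public static int CountTextMatches(string text, string pattern)
    {
        return Regex.Matches(text, pattern).Count;
    }
}
```

A pattern such as `start\s+end` is found by CountTextMatches in a text where "start" and "end" sit on consecutive lines, but never by CountLineMatches. For patterns that stay within one line, reading line by line is enough and keeps memory use low on large files.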
 |
I will use your suggestions in a new version of the code! Thanks!
 |