Alternative Tip

Counting lines in a string

Andreas Gieriet, 28 Feb 2012
Great analysis!
 
I found out that Regex can be accelerated by a factor of about two.
 
Instead of
new Regex(@"\n", RegexOptions.Compiled|RegexOptions.Multiline);
 
you can speed things up by using:
new Regex(@"^.*?$", RegexOptions.Compiled|RegexOptions.Multiline);
 
But admittedly, nothing beats the native methods (IndexOf).
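
For reference, the IndexOf approach mentioned above could look roughly like the following. This is a minimal sketch, not the code from the original article: the class name LineCounting and the method name CountNewlines are made up for illustration, and it counts '\n' characters only (so the number of lines is that count plus one when the text does not end with a newline).

```csharp
using System;

static class LineCounting
{
    // Hypothetical helper: counts '\n' characters by repeatedly calling
    // String.IndexOf(char, int), restarting just after each hit.
    public static int CountNewlines(string text)
    {
        int count = 0;
        int pos = 0;
        while ((pos = text.IndexOf('\n', pos)) >= 0)
        {
            count++;
            pos++; // continue searching after the newline just found
        }
        return count;
    }

    static void Main()
    {
        // Three lines separated by two newlines.
        Console.WriteLine(CountNewlines("a\nb\nc")); // prints 2
    }
}
```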
 
[EDIT]
My statement above is wrong: I compared "$" (not "\n") against "^.*?$".
The measurements show that "\n" is the fastest of all the Regex matches, while "$" is the slowest (about 5 times slower than "\n"...!).
That's a real surprise to me.
 
The comparison:

Regex     Match [ms] for 2,500,000 lines    RegexOptions
\n        1847                              Compiled|Singleline
\n        1851                              Compiled|Multiline
^.*$      2282                              Compiled|Multiline
^.*?$     5327                              Compiled|Multiline
$         10100                             Compiled|Multiline
 
For comparison: IndexOf('\n') takes only 237 [ms].
 
[/EDIT]
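
A comparison like the one in the table could be reproduced with something along these lines. It is only a sketch under assumptions: the class and method names (RegexLineBenchmark, CountMatches, Measure) and the generated test input are hypothetical, not the setup used for the numbers above, and absolute timings will differ by machine and .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

static class RegexLineBenchmark
{
    // Counts all matches of a compiled pattern in the text.
    // Reading MatchCollection.Count forces every match to be evaluated.
    public static int CountMatches(string pattern, RegexOptions options, string text)
    {
        var regex = new Regex(pattern, RegexOptions.Compiled | options);
        return regex.Matches(text).Count;
    }

    // Times one pattern and prints the match count and elapsed milliseconds.
    static void Measure(string pattern, RegexOptions options, string text)
    {
        var watch = Stopwatch.StartNew();
        int matches = CountMatches(pattern, options, text);
        watch.Stop();
        Console.WriteLine("{0,-8} {1,10} matches {2,8} ms",
            pattern, matches, watch.ElapsedMilliseconds);
    }

    static void Main()
    {
        // Assumed test input: many short lines joined by '\n',
        // with no trailing newline after the last line.
        var sb = new StringBuilder();
        for (int i = 0; i < 100000; i++) sb.Append("some line of text\n");
        sb.Append("last line");
        string text = sb.ToString();

        Measure(@"\n", RegexOptions.Singleline, text);
        Measure(@"\n", RegexOptions.Multiline, text);
        Measure(@"^.*$", RegexOptions.Multiline, text);
        Measure(@"^.*?$", RegexOptions.Multiline, text);
        Measure(@"$", RegexOptions.Multiline, text);
    }
}
```

Note that the patterns do not count the same thing: "\n" counts line separators, while the line-based patterns count lines, so for input without a trailing newline they report one match more.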

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Andreas Gieriet
Founder eXternSoft GmbH
Switzerland
I feel comfortable on a variety of systems (UNIX, Windows, cross-compiled embedded systems, etc.) in a variety of languages, environments, and tools.
I have a particular affinity to computer language analysis, testing, as well as quality management.
 
More information about what I do for a living can be found at my LinkedIn Profile and on my company's web page (German only).

Comments and Discussions

 
Re: So Regex("^.*?$") is not faster than Regex("\n"), as you ori... (Ronald M. Martin, 29-Feb-12 7:06)
Re: Ah, I see your initial question. "*" is greedy match (match ... (Andreas Gieriet, 28-Feb-12 16:48)
Re: Let me rephrase my question. Assuming that your syntax (@"^.... (Ronald M. Martin, 28-Feb-12 3:50)
I don't understand the use of the question mark (?) in this ... (Ronald M. Martin, 27-Feb-12 17:27)
Re: I simple measured a difference of a factor of about two. No ... (Andreas Gieriet, 27-Feb-12 21:21)
I simply measured a difference of a factor of about two.
No idea why.
Obviously, the Regex behaves differently when you search for *all* occurrences of a single character (in this case the '\n') compared to searching for all lines (^.*?$).
 
As I said, no clue why; I found it only by chance.
Can anyone please explain why LinesCount2 is so slow? I thou... (Miller4, 26-Feb-12 6:54)

Last Updated 28 Feb 2012
Article Copyright 2012 by Andreas Gieriet
Everything else Copyright © CodeProject, 1999-2014