|
It is really amazing to see this code crash.
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com
|
|
|
|
|
|
Hi,
I made another function and thought I would share it with you.
/// <summary>
/// Replaces the string strToReplace with the string strByReplace in the string strValue. It works like the .NET string.Replace, but the comparison is case-insensitive.
/// </summary>
/// <param name="strValue">the string in which to replace</param>
/// <param name="strToReplace">the string which is to be replaced</param>
/// <param name="strByReplace">the string by which to replace the old value</param>
/// <returns></returns>
public static string ReplaceCaseInsensitive(string strValue, string strToReplace, string strByReplace)
{
int index = strValue.IndexOf(strToReplace, 0, StringComparison.CurrentCultureIgnoreCase);
if (index == -1)
return strValue;
else
{
strToReplace = strValue.Substring(index, strToReplace.Length);
return strValue.Replace(strToReplace, strByReplace);
}
}
To use it, call the function like this:
string str = ReplaceCaseInsensitive("Abcd", "BC","++" );
Cheers
Sohail Sayed
|
|
|
|
|
That's a great alternative for some scenarios when the upper-/lower-case structure of the pattern to replace is the same throughout the original string, however it does not work for this:
ReplaceCaseInsensitive("AbcdAbCd", "BC", "++");
But, based on that idea, we could place the main functionality in a loop, run it until no match remains, and then check whether it's faster than the other alternatives posted here.
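The loop idea could look roughly like the sketch below. It is not the original poster's code: the class and method names are made up for illustration, it splices in one match at a time instead of calling string.Replace, and it assumes StringComparison.OrdinalIgnoreCase so that a match always has the same length as the pattern.

```csharp
using System;

public static class LoopedReplaceSketch
{
    // Hypothetical helper: replaces every case-insensitive match, one at a time.
    public static string ReplaceCaseInsensitiveLoop(string strValue, string strToReplace, string strByReplace)
    {
        if (string.IsNullOrEmpty(strValue) || string.IsNullOrEmpty(strToReplace))
            return strValue;
        if (strByReplace == null)
            strByReplace = string.Empty;

        int index = strValue.IndexOf(strToReplace, 0, StringComparison.OrdinalIgnoreCase);
        while (index >= 0)
        {
            // Splice the replacement in at the position of this match.
            strValue = strValue.Substring(0, index)
                     + strByReplace
                     + strValue.Substring(index + strToReplace.Length);

            // Resume after the inserted text, so a replacement that itself
            // contains the pattern cannot cause an endless loop.
            index = strValue.IndexOf(strToReplace, index + strByReplace.Length,
                                     StringComparison.OrdinalIgnoreCase);
        }
        return strValue;
    }
}
```

Each pass builds a new string, so this is likely to be slow on inputs with many matches; it only shows that the mixed-case input above can be handled.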
|
|
|
|
|
Hi,
If we are talking about pure performance, further optimizations are possible: bulk copies of data inside or between arrays, optimized allocation, and optimized replacement (e.g. if replacing THESE with THIS, match THESE but only replace ESE with IS). Here is the logic for a simple improvement:
Scan the source string to find matches and build a list of matching positions
IF the new string is shorter than the one being replaced:
- Do an optimized Array.Copy on the char array of the original string and an Array.Copy of the new string
ELSE
- Only here allocate the destination char array with the correct size
- Fill the destination char array with Array.Copy from the original string and Array.Copy of the replacement string
And if you are really, really into performance, allocate the working/destination arrays as global variables (as many as you need) only once (yes, they must be big enough to support 99% of your strings; believe me, this pays off if you have a lot of replacements to do).
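A minimal sketch of the two-pass idea above, under some assumptions: the class and method names are invented for illustration, matching uses StringComparison.OrdinalIgnoreCase, and the destination array is always allocated once at its exact size. The in-place branch for shorter replacements and the reusable global buffer are left out.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPassReplaceSketch
{
    // Hypothetical helper illustrating "find positions first, then bulk copy".
    public static string Replace(string source, string oldValue, string newValue)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(oldValue))
            return source;
        if (newValue == null)
            newValue = string.Empty;

        // Pass 1: record the start of every non-overlapping match.
        var positions = new List<int>();
        int idx = source.IndexOf(oldValue, StringComparison.OrdinalIgnoreCase);
        while (idx >= 0)
        {
            positions.Add(idx);
            idx = source.IndexOf(oldValue, idx + oldValue.Length, StringComparison.OrdinalIgnoreCase);
        }
        if (positions.Count == 0)
            return source;

        // The exact result length is now known, so allocate only once.
        var dest = new char[source.Length + positions.Count * (newValue.Length - oldValue.Length)];

        // Pass 2: bulk-copy the untouched gaps and the replacement text.
        int srcPos = 0, dstPos = 0;
        foreach (int matchPos in positions)
        {
            int gap = matchPos - srcPos;
            source.CopyTo(srcPos, dest, dstPos, gap);
            dstPos += gap;
            newValue.CopyTo(0, dest, dstPos, newValue.Length);
            dstPos += newValue.Length;
            srcPos = matchPos + oldValue.Length;
        }
        // Copy whatever follows the last match.
        source.CopyTo(srcPos, dest, dstPos, source.Length - srcPos);
        return new string(dest);
    }
}
```

Whether this beats a StringBuilder in practice would have to be measured; the list of positions is itself an extra allocation.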
See you!
|
|
|
|
|
I wrote this a while ago and with such large chunks it seems to be even faster than yours, more flexible and easier to understand ... I win with 2.56s versus 2.99s with your first example and 0.35s versus 0.47s with your last example.
You can use it also with culture-specific comparison, here is the usage for simple case insensitivity:
MyToolsClass.Replace("MyOriginalString", "Original", "Replacement", StringComparison.OrdinalIgnoreCase)
Here goes the method:
static public string Replace(string original, string pattern, string replacement, StringComparison comparisonType)
{
return Replace(original, pattern, replacement, comparisonType, -1);
}
static public string Replace(string original, string pattern, string replacement, StringComparison comparisonType, int stringBuilderInitialSize)
{
if (original == null)
{
return null;
}
if (String.IsNullOrEmpty(pattern))
{
return original;
}
int posCurrent = 0;
int lenPattern = pattern.Length;
int idxNext = original.IndexOf(pattern, comparisonType);
StringBuilder result = new StringBuilder(stringBuilderInitialSize < 0 ? Math.Min(4096, original.Length) : stringBuilderInitialSize);
while (idxNext >= 0)
{
result.Append(original, posCurrent, idxNext - posCurrent);
result.Append(replacement);
posCurrent = idxNext + lenPattern;
idxNext = original.IndexOf(pattern, posCurrent, comparisonType);
}
result.Append(original, posCurrent, original.Length - posCurrent);
return result.ToString();
}
The secret might be the overload of the StringBuilder.Append method used here, which appends part of a string without having to create a substring from it first. That might be a feature that many have overlooked so far.
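To make the difference concrete, here is a small sketch comparing the two ways of appending a slice. The class and method names are only for illustration; StringBuilder.Append(string, int, int) and String.Substring(int, int) are the real framework calls.

```csharp
using System;
using System.Text;

public static class AppendRangeSketch
{
    // Creates a temporary substring, which is then copied into the builder.
    public static string WithSubstring(string text, int start, int count)
    {
        var sb = new StringBuilder();
        sb.Append(text.Substring(start, count));
        return sb.ToString();
    }

    // Copies the characters straight from the source string into the builder.
    public static string WithRange(string text, int start, int count)
    {
        var sb = new StringBuilder();
        sb.Append(text, start, count);
        return sb.ToString();
    }
}
```

Both return the same text, for example "Original" for ("MyOriginalString", 2, 8); the second simply skips the intermediate string.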
EDIT: Fixed a bug thanks to "Member 551508". Added an overload where you can specify an initial StringBuilder size, inspired by user tmbrye. The default value is the length of the original string, but not larger than 4096. A large initial size will increase performance slightly but also allocate more memory. Also remember, when specifying a large size, that replacing within a large string could theoretically result in a very tiny or even empty string.
modified on Thursday, April 9, 2009 5:17 AM
|
|
|
|
|
Indeed you do win. Your method is freakin fast! I just used it to replace the image paths on every single page that gets rendered. That's a lot of text. I didn't even notice any difference in speed between using your method and not using it at all. Thanks for sharing!
|
|
|
|
|
Beauty, I need a slight modification however as I want to replace overlapping strings in my csv string. ie ",0," -> ",," or
Replace("blah,0,0,0,00,rawr", ",0,", ",,") -> "blah,,,,00,rawr"
So I'll just change this line
Michael Epner wrote: result.Append(original, idxLast, idxPattern - idxLast);
to
if(idxLast > idxPattern)
result.Remove(result.Length + idxPattern - idxLast - 1, idxLast - idxPattern);
else
result.Append(original, idxLast, idxPattern - idxLast);
I hope it doesn't affect the performance too much.
|
|
|
|
|
That csv mod is pretty nifty, but I would separate it into a second method that leverages the original ReplaceString method. Something like ReplaceStringCsv, which would then allow you to incorporate more csv functions and switches.
By the way, I ended up wrapping your ReplaceString method into an extension method, and it is working out excellently in production. This is all I have to do now to use the super fast string replacer:
testStr.ReplaceString("as", "ii", StringComparison.OrdinalIgnoreCase)
...
public static class Extensions
{
static public string ReplaceString(this string original, string pattern,
string replacement, StringComparison comparisonType)
{
if (original == null)
return null;
if (String.IsNullOrEmpty(pattern))
return original;
int lenPattern = pattern.Length;
int idxPattern = -1;
int idxLast = 0;
StringBuilder result = new StringBuilder();
while (true)
{
idxPattern = original.IndexOf(pattern, idxPattern + 1, comparisonType);
if (idxPattern < 0)
{
result.Append(original, idxLast, original.Length - idxLast);
break;
}
result.Append(original, idxLast, idxPattern - idxLast);
result.Append(replacement);
idxLast = idxPattern + lenPattern;
}
return result.ToString();
}
}
|
|
|
|
|
Please see the updated version in my original post, as there was a bug.
|
|
|
|
|
I actually married this code with the original code and found it to be even faster in my testing. Here is the change I made:
Changed:
StringBuilder result = new StringBuilder();
Changed it to:
int inc = (original.Length / pattern.Length) * (replacement.Length - pattern.Length);
StringBuilder result = new StringBuilder(original.Length + Math.Max(0, inc));
|
|
|
|
|
Sure it's a good idea for better performance to initialize the StringBuilder with a starting size. Maybe it could be initialized with the original string size, or there could be an additional parameter for it in the Replace method. Calculating the size like in the example is a bit futile however, because it doesn't take into account that the pattern could appear more than once. Calculating the exact resulting size would make us need to count the occurrences which would take almost the time of building the result string, so that would be some kinda redundant approach too.
|
|
|
|
|
Thanks guys, one more wheel I don't need to re-invent.
|
|
|
|
|
It also has a bug. If you do Replace("abababa", "aba", "~", StringComparison.CurrentCulture) you get a Runtime error on this line: result.Append(original, idxLast, original.Length - idxLast);
Here's a tighter VB.NET version of the code with some fixes:
Public Shared Function Replace(ByVal s As String, ByVal oldValue As String, ByVal newValue As String, ByVal comparisonType As StringComparison) As String
If s Is Nothing Then Return Nothing
If String.IsNullOrEmpty(oldValue) OrElse newValue Is Nothing Then Return s
Dim result As New StringBuilder()
Dim lenOldValue As Integer = oldValue.Length
Dim curPosition As Integer = 0
Dim idxNext As Integer = s.IndexOf(oldValue, comparisonType)
While idxNext >= 0
result.Append(s, curPosition, idxNext - curPosition)
result.Append(newValue)
curPosition = idxNext + lenOldValue
idxNext = s.IndexOf(oldValue, curPosition, comparisonType)
End While
result.Append(s, curPosition, s.Length - curPosition)
Return result.ToString()
End Function
And some NUnit tests to prove the fix:
<Test()> _
Public Sub TestReplace()
Assert.AreEqual("wxyz wxyz wxyz", StringHelper.Replace("asdf ASDF aSdF", "asdf", "wxyz", StringComparison.CurrentCultureIgnoreCase))
Assert.AreEqual("wxyz ASDF aSdF", StringHelper.Replace("asdf ASDF aSdF", "asdf", "wxyz", StringComparison.CurrentCulture))
Assert.IsNull(StringHelper.Replace(Nothing, "asdf", "wxyz", StringComparison.CurrentCulture))
Assert.AreEqual("", StringHelper.Replace("", "a", "b", StringComparison.CurrentCulture))
Assert.AreEqual("lmnop", StringHelper.Replace("lmnop", Nothing, Nothing, StringComparison.CurrentCulture))
Assert.AreEqual("lmnop", StringHelper.Replace("lmnop", "a", "b", StringComparison.CurrentCulture))
Assert.AreEqual("cbxzxBxzx", StringHelper.Replace("cbABABABA", "aba", "xzx", StringComparison.CurrentCultureIgnoreCase))
Assert.AreEqual("cbABABABA", StringHelper.Replace("cbABABABA", "aba", "xzx", StringComparison.CurrentCulture))
Assert.AreEqual("~b~ba", StringHelper.Replace("ababababa", "aba", "~", StringComparison.CurrentCulture))
Assert.AreEqual("~ABA~b~ABA~ba", StringHelper.Replace("ababababa", "aba", "~ABA~", StringComparison.CurrentCulture))
Assert.AreEqual("~ABA~~ABA~", StringHelper.Replace("abaaba", "aba", "~ABA~", StringComparison.CurrentCulture))
End Sub
|
|
|
|
|
Thank you, I fixed the bug and used your modification to update the C# code in my original post.
|
|
|
|
|
Nice work. I've put it into an extension method so it can just be called on a string.
public static string Replace(this string original,
string pattern, string replacement, StringComparison comparisonType)
{
if (original == null)
{
return null;
}
if (String.IsNullOrEmpty(pattern))
{
return original;
}
int lenPattern = pattern.Length;
int idxPattern = -1;
int idxLast = 0;
StringBuilder result = new StringBuilder();
while (true)
{
idxPattern = original.IndexOf(pattern, idxPattern + 1, comparisonType);
if (idxPattern < 0)
{
result.Append(original, idxLast, original.Length - idxLast);
break;
}
result.Append(original, idxLast, idxPattern - idxLast);
result.Append(replacement);
idxLast = idxPattern + lenPattern;
}
return result.ToString();
}
|
|
|
|
|
Please see the updated version in my original post, as there was a bug (II).
|
|
|
|
|
I know I should spend the time to do this on my own, but thought I would ask for suggestions on how to modify to only replace match at end of original string. I use a regex right now, but would rather use this faster approach.
|
|
|
|
|
You mean replace only the last occurrence? That shouldn't be a big deal even with regex; however, something like this might be a tad faster (untested):
int idxOldValue = original.LastIndexOf(oldValue, comparisonType);
if (idxOldValue >= 0)
{
original = String.Concat(original.Substring(0, idxOldValue), newValue, original.Substring(idxOldValue + oldValue.Length));
}
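If the question instead meant replacing the pattern only when it sits at the very end of the original string, a sketch along these lines might do. The class and method names are hypothetical, and it assumes an ordinal comparison type so that the matched suffix has the same length as oldValue.

```csharp
using System;

public static class ReplaceAtEndSketch
{
    // Hypothetical helper: replaces oldValue only if original ends with it.
    public static string ReplaceAtEnd(string original, string oldValue, string newValue,
                                      StringComparison comparisonType)
    {
        if (string.IsNullOrEmpty(original) || string.IsNullOrEmpty(oldValue))
            return original;

        // Nothing to do unless the string actually ends with the pattern.
        if (!original.EndsWith(oldValue, comparisonType))
            return original;

        // Keep everything before the suffix and attach the new ending.
        return original.Substring(0, original.Length - oldValue.Length) + newValue;
    }
}
```

For example, ReplaceAtEnd("report.TXT", ".txt", ".csv", StringComparison.OrdinalIgnoreCase) gives "report.csv", while "a.txt.bak" is returned unchanged.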
|
|
|
|
|
|
I might be doing something wrong, but here's what I observed:
1. When searching in short strings, your method seems to have an advantage.
2. As the size of the input string increases, the VB dll replace method eventually catches up and before long becomes considerably faster.
I guess it depends on the scenario. In my line of work the token replacement usually is needed for some html/xml/whatever template processing and more often than not these templates are of size reasonable enough to be loaded whole in memory. Here the "VB replace" method seems to clearly outperform the proposed one.
|
|
|
|
|
Michael, I like your code and would like to know if you are OK with me using it without restriction, for example, under MIT licensing? If not, what licensing applies?
|
|
|
|
|
Yes, sure, you can use it. Go ahead.
|
|
|
|
|
A program that finds the most frequently occurring word in a given text and converts it to upper case. If there are several most frequent words, all of them should be processed. A separator is anything that is not a word. It makes no distinction between upper and lower case. The input is read from the file problem3.txt and the output is written to the console.
|
|
|
|
|
Pardon my ignorance...
But what's wrong with what I do currently?
public static string Replace(string expression, string oldValue, string newValue, bool caseSensitive)
{
string replaced = "";
if (caseSensitive == true)
{
replaced = Regex.Replace(expression, oldValue, newValue, RegexOptions.None);
}
else
{
replaced = Regex.Replace(expression, oldValue, newValue, RegexOptions.IgnoreCase);
}
return replaced;
}
Thanks,
HyperX.
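Speed aside, one thing to watch with this approach is that Regex.Replace treats oldValue as a regular expression and newValue as a substitution pattern, so characters such as "." or "$" change the meaning. A hedged sketch of a literal-text variant follows; the class name is made up, while Regex.Escape and the "$$" substitution are standard .NET behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexReplaceSketch
{
    // Hypothetical variant of the method above that treats both strings literally.
    public static string Replace(string expression, string oldValue, string newValue, bool caseSensitive)
    {
        RegexOptions options = caseSensitive ? RegexOptions.None : RegexOptions.IgnoreCase;

        // Escape regex metacharacters so "1.5" matches only the text "1.5".
        string pattern = Regex.Escape(oldValue);

        // In a replacement string "$" starts a substitution; "$$" yields a literal "$".
        string replacement = newValue.Replace("$", "$$");

        return Regex.Replace(expression, pattern, replacement, options);
    }
}
```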
|
|
|
|
|