How to Toggle String Case in .NET






Of the 5 alternatives, somehow the obvious suggestion of dropping the ToUpper() after calling IsUpper() (or ToLower/IsLower) was missed. That is,
char.IsUpper(c) ? char.ToLower(c) : char.ToUpper(c)
becomes
char.IsUpper(c) ? char.ToLower(c) : c
This saves half a method call per character "on average". Note that the shorter form leaves lower-case characters untouched, so it only applies when that branch does not need converting.
Also, you should tailor the use of IsUpper or IsLower to the most common input. I'd imagine that
char.IsLower(c) ? c : char.ToLower(c)
would probably perform faster, because I'd assume that upper-case characters occur less often than lower-case ones (e.g. in English).
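To make the difference between these expressions concrete, here is a minimal sketch that puts them side by side. The class and method names are made up for illustration. It uses char.ToLower and char.ToUpper as in the text, which follow the current culture.

```csharp
using System;

public static class TernaryVariants
{
    // Full toggle: upper-case becomes lower-case, everything else goes
    // through ToUpper (which returns digits and punctuation unchanged).
    public static char Toggle(char c) =>
        char.IsUpper(c) ? char.ToLower(c) : char.ToUpper(c);

    // Shortened form: the ToUpper call is dropped, so only upper-case
    // input is changed and lower-case input is returned as-is.
    public static char LowerViaIsUpper(char c) =>
        char.IsUpper(c) ? char.ToLower(c) : c;

    // Same result as LowerViaIsUpper, but the test is ordered so that
    // lower-case input (assumed to be the common case) skips the call.
    public static char LowerViaIsLower(char c) =>
        char.IsLower(c) ? c : char.ToLower(c);
}
```

Only Toggle actually swaps case in both directions. The two shortened forms agree with each other and differ only in which inputs reach the ToLower call.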
Another possible way to look at this is a mapping array.
Have an array (e.g. 8-bit ASCII for a simple explanation). Preload the array with
index[i] = char.ToUpper(i)
for i in [0..255] (for a true toggle, store the opposite-case character of i instead). N.B. This is best done as a declaration with a pre-initialised array. That way, the array is initialised when the assembly is loaded and no pre-looping is needed.
(No, the concept of Unicode hasn't been lost on me. If it's a matter of performance, then you can trade space and required capability for speed. If needed, you could map every possible char value: that is 65,536 entries of 2 bytes each, about 128 KB, which isn't that huge in the context of saving those precious cycles. You just need to decide: am I toggling a few characters, or a few billion?)
The algorithm then becomes the following. N.B. This assumes Mapping is a pre-initialised array, done through a declaration so that it is created as efficiently as possible when the assembly is loaded.

    protected string ToggleCase(string s)
    {
        // Another optimisation: pre-allocate the size of the output, which should be faster.
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
            sb.Append(Mapping[c]);
        return sb.ToString();
    }

This removes a test and a method call per character.
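The lookup above indexes Mapping with any char, so an 8-bit table would throw for characters above 255. Below is a minimal sketch that assumes a table covering every char value and holding the opposite-case character for each letter. The class name and the BuildMapping helper are hypothetical. The table is filled by a loop in a static initialiser rather than written out as a literal, purely to keep the example short, and it uses the invariant-culture conversions as a further assumption.

```csharp
using System;
using System.Text;

public static class TableToggler
{
    // Assumption: one entry per possible char value (65,536 entries),
    // so Mapping[c] can never be out of range.
    private static readonly char[] Mapping = BuildMapping();

    private static char[] BuildMapping()
    {
        var table = new char[char.MaxValue + 1];
        for (int i = 0; i < table.Length; i++)
        {
            char c = (char)i;
            // Store the opposite case so a single lookup toggles a character.
            table[i] = char.IsUpper(c) ? char.ToLowerInvariant(c)
                     : char.IsLower(c) ? char.ToUpperInvariant(c)
                     : c;
        }
        return table;
    }

    public static string ToggleCase(string s)
    {
        var sb = new StringBuilder(s.Length); // pre-allocate the output
        foreach (char c in s)
            sb.Append(Mapping[c]); // one array lookup per character
        return sb.ToString();
    }
}
```

With this table, TableToggler.ToggleCase("Hello, World 42!") returns "hELLO, wORLD 42!". The cost of filling the table is paid once, the first time the class is used.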
Whilst I haven't checked this for performance, I'd find it strange if the CLI didn't optimise array access much as you can in assembly (e.g. pointer-relative addressing).
I suspect that this approach would be faster than the previous approaches (prove me wrong ;) ).
It would be fastest if you were toggling large amounts of text.
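Since none of this has been measured, a rough timing harness is one way to check the claim. The sketch below is only an assumption about how one might do it: the Measure helper is hypothetical, and a proper benchmarking tool would give more trustworthy numbers.

```csharp
using System;
using System.Diagnostics;

public static class ToggleTiming
{
    // Hypothetical helper: runs one candidate implementation repeatedly
    // and returns the total elapsed time.
    public static TimeSpan Measure(Func<string, string> toggle, string input, int iterations)
    {
        toggle(input); // warm-up call so JIT compilation is not timed
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            toggle(input);
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Passing each candidate toggle method to Measure with the same large input and iteration count gives a first comparison. Results will vary with the runtime, the hardware and the mix of characters in the input.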
Oh, and it probably should also be a static or an extension method. That way you also escape the extra reference that gets inserted for the "this" reference in the method call. After all, nothing here is related to the class instance, and it is about speed too.
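As a minimal sketch of the extension-method form, assuming a hypothetical StringCaseExtensions class and using the plain ternary toggle to keep the example short:

```csharp
using System;
using System.Text;

public static class StringCaseExtensions
{
    // Static extension method: the string being toggled is the only
    // argument, and no containing-class instance is involved.
    public static string ToggleCase(this string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
            sb.Append(char.IsUpper(c) ? char.ToLower(c) : char.ToUpper(c));
        return sb.ToString();
    }
}
```

It can then be called directly on a string, for example "aBc".ToggleCase(), which returns "AbC".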