Adding some notes on the string reversal and examining unsafe code (even though the original rules didn't allow for it)...
Austin Ehlers made the point of submitting an unsafe code-block to do the string reverse. Since he took the time, and even managed to argue a good point with me, I figured I'd give him some verbage on my blog. He made the point of saying, that when perf mattered forget who you stomp on. I tend to agree with this most of the time, but with the concept of playing fair still in mind I made the following changes to his initial algorithm:
- Instead of partying on the passed in copy, I now do a string.Copy and get a private copy of the string to party on. This makes the algorithm thread safe.
- I took out a bunch of pre-work he had done and changed it to inline. This sped his algorithm up enough to thwart the pointer arithmetic version I'll be using as a second example.
- There are many ways to do unsafe code, so I'm throwing in a second example that uses pointer indirection.
Why wouldn't you want to use the unsafe version? Well, it is unsafe, so you might not be able to do it within your security settings. It is also very unsafe to party on top of an allocated string, since they are supposed to be immutable and all. I'm going to argue that by making a private local copy of the string one problem with string immutability is solved and that is the multiple access problem. The second problem is the hidden bits issue, and I propose we get around this because we never CHANGE the contents of the string only the ordering. In other words, any bits that are set before we party on the string, will be set after, and the effects of us partying on the string, wouldn't have any effect to change the bits into a different state. Brian, if you want to call bullshit on this one I'll be more than happy to hear about it.
| Contributor |
Code |
Austin Ehlers Myself |
Austin propose the original algorithm in some comments. I'm posting it here in a revised form where I've sped his algorithm up by about 5-6% becaus he was being extra careful and I figured what the hell, why be careful. It isn't really less careful, we just don't do nearly as much pre-work and we get rid of a nasty div statement. I've kept his naming conventions as you'll see. private unsafe static string private unsafe static string ReverseUnsafe2(string input) {
string output = string.Copy(input);
fixed(char* sfixed=output) {
char* s=sfixed;
char t;
for(int x=0, y = output.Length - 1; x<y; x++, y--) {
t=s[x];
s[x]=s[y];
s[y]=t;
}
}
return output;
}
|
| Myself |
This is how I would have probably written the routine based on my C++ background. Using the indexer expressions is always foreign to me because I don't immediately recognize that pointer work is being done, but in effect it is (see A.5.3 of the C# Language Specification) private unsafe static string ReverseUnsafe(string input) {
string output = string.Copy(input);
fixed(char* pStr = output) {
char* pStart = pStr;
char* pEnd = pStr + (output.Length - 1);
char temp;
for(;pStart < pEnd; pStart++, pEnd--) {
temp = *pStart;
*pStart = *pEnd;
*pEnd = temp;
}
}
return output;
}
|
For some strange reason that I'm still trying to figure out, the results of the two algorithms are spurious in terms of performance output. Before the optimizations to the first algorithm, the pointer version always one, while the element access version always lost. After the updates, the element access version actually wins more often than not, which defies conventional wisdom. I'm hot to find out why this is, and I'm sure it'll be something bloggable. Anyway, these methods are insanely fast compared to the other methods for obvious reasons (unsafe code and pointer arithmetic). I'd labor to say though that the Array.Reverse method would probably be just as fast, if we were able to avoid the extra copy when creating the string from the character array (aka, if we had direct access to the character buffer of an existing string, Array.Reverse would perform at the same speed as our pointer shuffle above, because it is doing the same work).