Increment differences in C++ and C#
A friend was a bit surprised today to find that the postfix increment operator (i++) doesn’t always work exactly the same way in C++ and C#. I’m not a language lawyer, and this is the kind of thing I usually file under “you shouldn’t be doing that in the first place,” but I thought it might serve as a good example.
Here is the offending (offensive) code:
int i = 1;
i = i++;
What is the value of ‘i’ after this code has run? Don’t ask your compiler. Instead, try to figure it out based on your understanding of the language rules.
The reason I put this into the category of “you shouldn’t be doing that in the first place” is that it’s dangerous to write expressions whose operands have side effects. The problem is that C and C++ leave the order in which the operands of most operators are evaluated unspecified, and modifying the same variable twice without anything to order the two modifications is undefined behavior. As with many of the things that are left “undefined” in C and C++, this is to allow compilers to optimize the code without unnecessary constraints. (C# is stricter and evaluates operands left to right, so its result here is at least predictable.) I would still argue that this isn’t a difference worth caring about: what the C++ compiler produces just happens to be how one implementation handles undefined behavior.
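A minimal sketch of what “order of evaluation” means in practice, assuming two made-up helper functions, first() and second(), that record when they are called. None of these names come from a real library; they exist only for illustration.

```cpp
#include <vector>

// Hypothetical log of which helper ran, in call order.
std::vector<int> call_log;

// Hypothetical helpers: each records its own call and returns a value.
int first()  { call_log.push_back(1); return 1; }
int second() { call_log.push_back(2); return 2; }

int sum_with_side_effects()
{
    call_log.clear();
    // The sum is always 3, but C++ does not say whether first() or
    // second() is called first, so call_log may hold {1, 2} or {2, 1}.
    return first() + second();
}
```

The return value is safe to rely on here because the two side effects touch the log, not the values being added. The order recorded in call_log is the part you should not depend on.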
So as a general rule, you should consider the result of an expression that modifies a value more than once to be undefined. Can you think of a case where this is not true? Why, the comma operator, of course! The lesser-known comma operator always evaluates its operands left to right, finishing each one before starting the next.
i = 1, ++i, i++;
In this example ‘i’ becomes 3. Of course, unlike most of the other C++ operators, the comma operator wasn’t inherited by C#.
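The comma example can be wrapped in a complete function to check the arithmetic. This is only a sketch; the function name comma_example is made up for illustration.

```cpp
// The built-in comma operator finishes its left operand (including side
// effects) before it starts its right operand, so every change to i is
// complete before the next one begins.
int comma_example()
{
    int i = 0;
    i = 1, ++i, i++;   // parsed as ((i = 1), ++i), i++
    return i;          // 1 -> 2 -> 3
}
```

Because assignment binds tighter than the comma, the first operand is the whole of i = 1, not just the literal 1.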
Now let’s go back to the original example:
int i = 1;
i = i++;
Although the result is undefined in C++, it can be interesting (in a useless sort of way) to examine what different compilers do with it. The reason I’m still talking about this is that some people like to think the results of one compiler are somehow “better” than the results of another. Let’s take a quick look at the difference between the Visual C++ and Visual C# compilers, and you will see that the results, although different, are equally meaningless. Here I use MSIL as a common medium for discussion.
The Visual C# compiler essentially uses the following logic:
Instructions   Stack      Variable
(start)        (empty)    1
ldloc i        1          1
dup            1, 1       1
ldc.i4.1       1, 1, 1    1
add            1, 2       1
stloc i        1          2
stloc i        (empty)    1
The compiler pushes the value of ‘i’ onto the stack and duplicates it so that the previous value can be retrieved later. It then increments the variable by pushing the constant 1 onto the stack and adding the two values at the top of the stack. Next it pops the result off the stack and writes it back to ‘i’ as the result of the increment. Finally, it uses the previous value as the result of the postfix expression and, not realizing that the assignment refers to the same variable, writes that previous value of ‘i’ back to ‘i’. Boy, that was a lot of instructions for nothing.
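The steps above can be replayed by hand. The following is a sketch only: it models the evaluation stack with a std::vector, and the function name replay_csharp_sequence is made up for illustration. It is not compiler output and says nothing about how the runtime actually executes MSIL.

```cpp
#include <vector>

// Hypothetical helper: replays the six instructions listed above on a
// hand-rolled evaluation stack and returns the final value of i.
int replay_csharp_sequence()
{
    int i = 1;
    std::vector<int> stack;

    stack.push_back(i);                        // ldloc i   -> stack: 1
    int top = stack.back();
    stack.push_back(top);                      // dup       -> stack: 1, 1
    stack.push_back(1);                        // ldc.i4.1  -> stack: 1, 1, 1

    int rhs = stack.back(); stack.pop_back();  // add pops two values...
    int lhs = stack.back(); stack.pop_back();
    stack.push_back(lhs + rhs);                // ...and pushes the sum -> stack: 1, 2

    i = stack.back(); stack.pop_back();        // stloc i   -> i == 2, stack: 1
    i = stack.back(); stack.pop_back();        // stloc i   -> i == 1, stack empty

    return i;
}
```

Every statement here is fully sequenced, so unlike the original one-liner this version behaves the same on any C++ compiler and returns 1.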
The Visual C++ compiler, on the other hand, goes about things a little differently:
Instructions   Stack      Variable
(start)        (empty)    1
ldloc i        1          1
stloc i        (empty)    1
ldloc i        1          1
ldc.i4.1       1, 1       1
add            2          1
stloc i        (empty)    2
The compiler pushes the value of ‘i’ onto the stack and then assigns it to ‘i’ by popping it off the stack not realizing that it’s the same variable. It then continues to increment the variable by pushing the value of ‘i’ onto the stack again followed by the constant 1 and adds the values. Finally it pops the value off the stack and writes it back to ‘i’ as the result of the increment.
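This sequence can be replayed by hand as well. Again, this is only a sketch: the evaluation stack is modeled with a std::vector, and the function name replay_cpp_sequence is made up for illustration.

```cpp
#include <vector>

// Hypothetical helper: replays the six instructions listed above on a
// hand-rolled evaluation stack and returns the final value of i.
int replay_cpp_sequence()
{
    int i = 1;
    std::vector<int> stack;

    stack.push_back(i);                        // ldloc i   -> stack: 1
    i = stack.back(); stack.pop_back();        // stloc i   -> i == 1, stack empty

    stack.push_back(i);                        // ldloc i   -> stack: 1
    stack.push_back(1);                        // ldc.i4.1  -> stack: 1, 1

    int rhs = stack.back(); stack.pop_back();  // add pops two values...
    int lhs = stack.back(); stack.pop_back();
    stack.push_back(lhs + rhs);                // ...and pushes the sum -> stack: 2

    i = stack.back(); stack.pop_back();        // stloc i   -> i == 2, stack empty

    return i;
}
```

Here the assignment is carried out first and the increment last, which is why the value that survives is 2.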
At the end of the day, the C# compiler produces a value of 1 and the C++ compiler produces a value of 2. Neither is more “right” than the other: C# spells out its evaluation order, while C++ leaves the result undefined.
We need to give the Visual C++ compiler credit. It is so focused on optimizing the code that it cuts to the chase and produces the equivalent of the following for optimized (release) builds:
int i = 2;
What’s the moral of the story? Don’t rely on undefined behavior.
© 2006 Kenny Kerr