Hello people,
In this post I’d like to share a topic that is very important, mainly for those who turn on optimization in their C compiler. It’s well known that C was designed to be a powerful language that compiles to efficient code.
So the C standard leaves compiler writers a certain freedom to decide what to do in some situations, so that they can generate more efficient code, since not all architectures have the same instruction set. I’ll show an example to give you an idea of what we are talking about. I won’t (re)write much more, since at the end I’ll list some links where you can find very useful information and, of course, several other examples.
Division by zero:
This is the classic. But wait, isn’t it enough to write it like this and we are fine?
int division_dumb(int a, int b) { return a/b; }
Not really (of course, this example is too simple). To start, the x86 architecture raises a hardware exception when we try to divide by zero, whereas PowerPC just ignores it. For the C language, dividing by zero is undefined behavior. That way the standard lets the C compiler choose whatever gives the best performance on the target.
Ok, what can happen now?
1) On PowerPC, nothing. As discussed, PowerPC just ignores the division by zero and doesn’t raise an exception as x86 does.
2) The generated code can return zero, or any other value.
3) When optimization is on, your code can be optimized away (this won’t happen in this dumb example; we’ll see a case further down). This is a huge problem, both for debugging and for portability.
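To stay out of undefined behavior, the check has to come before the division. Here is a minimal sketch, assuming we are free to change the function’s signature; the name division_checked and the bool return convention are my own choices for illustration. It also rejects INT_MIN / -1, the other case where int division is undefined because the result does not fit in an int.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a / b in *quotient and returns true,
 * or returns false when the division would be undefined behavior. */
static bool division_checked(int a, int b, int *quotient)
{
    if (b == 0)
        return false;               /* division by zero */
    if (a == INT_MIN && b == -1)
        return false;               /* result would overflow int */
    *quotient = a / b;
    return true;
}
```

The caller then decides what a failed division means for the program, instead of leaving that decision to the hardware or the optimizer.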
Let’s see another, real case, which happened in PostgreSQL (taken from one of the articles):
if (arg2 == 0)
    ereport(ERROR,
            (errcode(ERRCODE_DIVISION_BY_ZERO),
             errmsg("division by zero")));
/* No overflow is possible */
PG_RETURN_INT32((int32) arg1 / arg2);
In this example the message “division by zero” may never appear, even if arg2 == 0. Why? The compiler reasons that a division with arg2 == 0 in PG_RETURN_INT32 is undefined behaviour, so it may assume arg2 is never zero at that point. As far as it can tell, ereport returns and the division is always executed, so it moved the division before the if, and since arg2 == 0 should never happen, it ignored the ereport call. Remember, the compiler always tries to optimize. In this case there is one less “if” to evaluate and one less function call to make.
The problem is that ereport, once called with ERROR, never returns, but the compiler was not told this, and by then the damage was done!
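One way to give the compiler this information is to mark the error function as non-returning. The sketch below is not the PostgreSQL code: report_error, checked_div, try_div and recovery_point are hypothetical names, and I am assuming a C11 compiler with <stdnoreturn.h>. The error function leaves through longjmp, so control never comes back to the division.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdnoreturn.h>

static jmp_buf recovery_point;

/* Hypothetical stand-in for an error reporter that never returns.
 * The noreturn specifier tells the compiler that code placed after
 * a call to this function is not reached through that call. */
static noreturn void report_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(recovery_point, 1);
}

static int checked_div(int arg1, int arg2)
{
    if (arg2 == 0)
        report_error("division by zero");
    /* Reached only when arg2 != 0. */
    return arg1 / arg2;
}

/* Returns 0 and stores the result in *out, or -1 if an error was reported. */
static int try_div(int a, int b, int *out)
{
    if (setjmp(recovery_point) != 0)
        return -1;                  /* we arrive here via longjmp */
    *out = checked_div(a, b);
    return 0;
}
```

With this, try_div(10, 2, &r) stores 5 and returns 0, while try_div(1, 0, &r) prints the error and returns -1 without ever executing the division. Older compilers offer the same hint in their own spelling, for example __attribute__((noreturn)) in GCC and Clang.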
There are other “cool” examples, like signed int overflow, oversized shifts, dereferencing a null pointer, etc. It’s very important to understand these concepts, both to avoid them and to be able to track down the “strange” bugs that appear after optimization is turned on or the compiler is changed!
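To give a taste of the signed overflow case: a test such as a + b < a, written to detect overflow after the fact, relies on the overflow itself, so an optimizing compiler is allowed to assume it is always false for positive b and drop it. A minimal sketch of a check that never overflows, with add_checked being a name I made up for illustration:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true,
 * or returns false when the sum does not fit in an int.
 * The comparisons are done before adding, so no overflow occurs. */
static bool add_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

The idea is the same as in the division case: test the operands first, and only then do the operation.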
Here is where I learned a little more:
Undefined Behavior: What Happened to My Code?
What Every C Programmer Should Know About Undefined Behavior (3 parts)
A Guide to Undefined Behavior in C and C++ (3 parts)
That’s all folks!