So, stories about undefined behavior have been making the rounds again in my Twitter and RSS feeds (two things I wasn't supposed to be using anymore, but anyway). That brought me some new thoughts, and reminded me of some others I meant to blog about ages ago but forgot.
The most recent one was this comment on Hacker News (via @pcwalton, via @jamesiry), which presents the following code. It is supposed to take a circular linked list, take note of the head of the list, and walk around the list freeing each node until it finds the node that points back to the head (and thus the end of the list):
void free_circularly_linked_list(struct node *head)
{
    struct node *tmp = head;
    do {
        struct node *next = tmp->next;
        free(tmp);
        tmp = next;
    } while (tmp != head);
}
This looks (to me) like C code as good as it gets. However, this code triggers undefined behavior: once the first iteration of the loop frees the node pointed to by head, the value of the head pointer becomes indeterminate, so even the tmp != head comparison is undefined behavior, despite the fact that head is never dereferenced.
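For what it's worth, the loop can be rewritten so that no freed pointer is ever used again: walk the rest of the list first and free the head last. A sketch (my own rewrite, not from the linked comment; it assumes a minimal struct node with just a next field):

```c
#include <stdlib.h>

struct node {
    struct node *next;
};

/* Free a circular list without ever comparing a dangling pointer:
   walk the rest of the list first, free head last. */
void free_circularly_linked_list(struct node *head)
{
    if (head == NULL)
        return;
    struct node *tmp = head->next;
    while (tmp != head) {          /* both pointers are still valid here */
        struct node *next = tmp->next;
        free(tmp);
        tmp = next;
    }
    free(head);                    /* head is freed last */
}
```

Every pointer compared or dereferenced here is still live at the time it is used, so the strict reading of the standard is satisfied.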
I don't know what the rationale behind this is. (One rationale sometimes given is segmented architectures, where merely loading a dangling pointer into an address register could fault.) Or maybe the idea was to make it possible to run C in a garbage-collected environment where, as soon as an object is freed, all references to it are zeroed out. (The fact that no one has ever done this (as far as I know) is a mere detail. The fact that in a garbage-collected environment free would likely be a no-op is a mere detail too.)
The feeling I had after I read this is that C is in a kind of identity crisis: C allows you to do all sorts of unsafe operations because (I'd assume) it's supposed to let you do the kind of bit-bashing you often want to do in low-level code; at the same time, modern standards forbid that very bit-bashing. What is the point of programming in C anymore?
[Addendum: To be more clear, what is the purported goal of the C language? The feeling I have is that it has moved from its original function as a "higher-level assembly" that is good for systems programming, and is trying to serve a wider audience more preoccupied with performance, but in doing so it is not serving either audience very well.]
And the standards forbid these operations in the worst possible way: by declaring that the behavior is undefined, i.e., that compilers are free to do whatever the hell they please with code performing such operations. Compilers keep getting better and better at exploiting this sort of undefinedness to better "optimize" code (for speed, anyway). Meanwhile, they keep breaking existing code and opening shiny new security vulnerabilities in programs. The NSA probably loves this.
At this point, I'm beginning to think that C does not serve its purpose well anymore. The problem is that there seems to be no real alternative available. Maybe Rust can be it, although I don't really know how well Rust does in the bit-twiddling camp (e.g., can you easily perform bitwise and/or with a pointer in Rust? Well, come to think of it, even C does not allow that; you have to cast to an integer first.)
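Speaking of bitwise operations on pointers: for the record, here is a sketch of how the classic low-bit pointer tagging trick looks in C, going through uintptr_t. This is my own illustration, and it assumes malloc returns pointers aligned to at least 2 bytes (which holds everywhere I know of, since malloc'd storage must be suitably aligned for any fundamental type):

```c
#include <stdint.h>
#include <stdlib.h>
#include <assert.h>

/* Store a 1-bit tag in the low bit of a pointer, exploiting the fact
   that malloc'd pointers are aligned and so have zero low bits. */
static inline uintptr_t tag_ptr(void *p, uintptr_t tag)
{
    assert(((uintptr_t)p & 1) == 0);   /* low bit must be free */
    return (uintptr_t)p | (tag & 1);
}

static inline uintptr_t get_tag(uintptr_t tagged)
{
    return tagged & 1;
}

static inline void *untag_ptr(uintptr_t tagged)
{
    return (void *)(tagged & ~(uintptr_t)1);
}
```

Note that all the bit-bashing happens on the integer, never on the pointer itself, which is exactly the cast dance the language forces on you.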
* * *
The other undefined behavior I've been reading about lately is signed overflow. In C, signed overflow is undefined, which means that code like:
if (value + increment < value) {
    printf("Overflow occurred! Aborting!\n");
    exit(1);
} else {
    printf("No overflow; proceeding normally\n");
    value += increment;
}
is broken, because the compiler is likely to optimize the overflow check and the then branch away and just leave the else branch. I have seen two rationales given for that:
Pointer arithmetic. In the good old days, an int and a pointer had the same size. People happily used ints as array indices. Array indexing is just pointer arithmetic, and on some architectures (like x86), you can often perform the pointer arithmetic plus the load in a single instruction.
Then came 64-bit architectures. For reasons I don't really get (compatibility?), on x86-64 and other 64-bit architectures ints remained 32-bit even though pointers became 64-bit. The problem now is that transformations that assumed integers and pointers to be the same size don't work anymore, because now their point of overflow is different. For example, suppose you had code like:
void walk_string(char *s)
{
    for (int i = 0; s[i]; i++) {
        do_something(s[i]);
    }
}
Usually, the compiler would be able to replace this with:
void walk_string(char *s)
{
    for (; *s; s++) {
        do_something(*s);
    }
}
which is potentially more efficient. If ints and pointers have the same size, then this transformation is okay regardless of overflow, because the int would only overflow at the same point the pointer would anyway. But if ints are supposed to wrap at 32 bits while pointers wrap at 64 bits, the transformation is no longer valid, because the pointer version does not preserve the overflow behavior of the original. By making signed overflow undefined, the problem is sidestepped entirely: at the point of overflow the compiler is free to do whatever the hell it pleases, so the fact that the overflow behavior of the original is not preserved does not matter.
Now, there are a number of things wrong in this scenario:
Now ints cannot be used to index into an arbitrarily-sized array, only into ones with at most 2³² elements. In any sane language (I'd argue), the standard integer type would be one encompassing the full range of valid array indices. (Common Lisp explicitly requires fixnums to have at least that range, for instance.)
Unsigned overflow is defined. That's good, but at the same time it seems kind of strange that we have been using signed integers for array indices all this time, to the point that the undefinedness was chosen precisely to enable optimizing this usage, even though, come to think of it, a signed index seems like the wrong choice anyway (if your array takes up more than half the address space, for instance, a signed int cannot index all of it).
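(The type C itself offers for this job is size_t, which by definition can index any object; a sketch of an index loop written with it, my own example rather than anything from the discussion above:)

```c
#include <stddef.h>

/* Sum an array, using size_t for the index: unlike int, size_t is
   guaranteed to be able to index any object, however large. */
long sum_array(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```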
Optimizations based on "real math". The other reason I am aware of for making signed overflow undefined is to enable optimizations based on the properties of actual mathematical integers. An example is assuming that x+1 > x (which is what breaks the overflow test mentioned before). Another example is assuming that in a loop like:
for (i=0; i<=limit; i++) { ... }
the halting condition i<=limit will eventually be true, and therefore the loop will finish; if i were defined to wrap around, this loop would be infinite when limit == INT_MAX. Knowing that a loop terminates enables some optimizations. The linked article mentions enabling the use of loop-specific instructions which assume termination on some architectures. Another advantage of knowing that a loop terminates is being able to move code around: non-termination is an externally-visible effect, so, for example, a store that appears after a possibly-infinite loop cannot be hoisted above it, since in a non-terminating execution that store was never supposed to happen.

Now, something that occurred to me back when I read that post is that it assumes a dichotomy between treating overflow as undefined and defining it to wrap around. But there are other possibilities not explored there. I don't necessarily claim that they are feasible or better, but it is interesting to think about which optimizations they would enable or preclude. For instance:
One possibility is to actually make integers behave like their mathematical counterparts. This is what Common Lisp and Python do, for instance, by automatically promoting fixnums to bignums when they exceed the representation capacity of fixnums. In this case, all optimizations based on actual arithmetic are not only possible but safe, because the language integers actually respect those properties. The drawback is that now you have to deal with bignums, and check whether a number is a fixnum or a bignum before performing arithmetic.
Another possibility is to make integer overflow trap, i.e., raise an exception, abort execution, or something along those lines. In this case, an assumption like x+1 > x is true as long as x+1 does not trap. This might not enable the same amount of optimization as making overflow undefined, but at least it makes it known that a loop terminates (either by hitting the termination condition or by aborting on overflow). Then again, aborting on overflow is an externally-visible event too, so I don't know whether you could still perform the aforementioned code motion. I'd really need to see an example of code motion that benefits from knowing about termination; I need to research this. [I remember seeing some code involving multiple threads and the observable behavior of one thread from another, but (1) I'd have to find it; (2) I'd have to be convinced that the mentioned optimization would be a good idea anyway.]
Now, the problem with the trapping semantics is that you have to check for overflow on every operation. This could be costly, but there are people working on making it more efficient. Seriously, this is the kind of thing that would be trivial if only architectures helped a little bit. Having the processor trap on overflow (either via special trapping arithmetic instructions, or via a special mode/flag enabling trapping) would make this essentially costless, I think.

Another nice-to-have would be a set of arithmetic instructions which treated the lower bits of words specially as flags, and trapped when the flags were not, say, all zeros. This could drastically reduce the cost of having fixnums and bignums in the language: the instructions would trap on non-fixnums and invoke a handler to perform the bignum arithmetic (or even find out that the operands are not numbers at all, and signal a dynamic type error), and perform the fast integer arithmetic when the operands had the fixnum flag. Unfortunately, we cannot just invent our own instructions, as we typically want to be able to use our programming languages on existing platforms. We could try to lobby Intel/AMD, though (as if).
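Coming back to the broken overflow test from the beginning of this section: under current C semantics, the UB-free way to write it is to test before adding, so the signed overflow never actually happens. A sketch (GCC and Clang also offer __builtin_add_overflow, which typically compiles down to a check of the processor's overflow flag):

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores the sum in *result if value + increment
   does not overflow; returns false otherwise.  The addition is only
   performed once we know it is safe, so no UB is ever triggered. */
bool checked_add(int value, int increment, int *result)
{
    if (increment > 0 ? value > INT_MAX - increment
                      : value < INT_MIN - increment)
        return false;              /* would overflow */
    *result = value + increment;
    return true;
}
```

Note how roundabout this is compared to the natural (but broken) value + increment < value test; this is the price of overflow being undefined rather than trapping or wrapping.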
(At this point I'm not really thinking about semantics for C anymore, just about the possible semantics for integers in a new programming language. Even if, say, Clang incorporated an efficient mechanism for trapping integer overflow, the standard would still say that signed overflow is undefined, so I'm not sure there is much hope for C in this situation.)
Marcus Aurelius, 2016-05-20 11:17:21 -0300 #

“At this point, I'm beginning to think that C does not serve its purpose well anymore. The problem is that there seems to be no real alternative available.”

I guess the current alternative to standard-C that people use is C-with-this-compiler-and-exactly-these-command-line-options... Standard C was never very good at exposing the hardware anyway; its advantage was that it simply didn't care very much and allowed you to do things with little overhead. It always annoyed me that processors had all those cool flags (overflow, carry, borrow, floating point rounding modes) but C didn't expose them. So much for portable assembly... https://marcuscf.wordpress.com/2008/02/13/static_castdouble_var/

Unfortunately, sticking to a single compiler does not scale very well in a world with tools that are constantly moving forward, and kind of defeats the point of having a standard...

Vítor De Araújo #

@Marcus: The problem is that it's gradually becoming "C with this exact compiler *version*", as all compiler definedness gradually erodes with each successive version. :P

Indeed, there are many processor features that are not directly available in C even though they are available in a wide variety of processors (the overflow flag is a prime example). A similar situation that comes to mind is that a 32-bit × 32-bit multiplication yields a 32-bit result in C even though most processors produce the full 64-bit result (though in this case you can just cast the operands and hope the compiler figures out that the cast is there only to force a 64-bit result, I guess).

But at least the machine operations that C _does_ expose used to work reliably as you would expect from their corresponding machine-level operations. But then people got jealous that Fortran performed better than C for numerical software because of Fortran's stronger aliasing guarantees, and people decided "hey, if we just declare that stuff can't alias, we can do much better!", and from there on it was only decadence. :P

I guess part of what was/is going on is that undefined behavior had two different intents: there was UB meant to enable optimizations (like strict aliasing), and UB of the form "this may not work on some architectures, or may produce unpredictable results". Of course, semantically they are the same thing (no meaning is assigned to a certain construct, therefore anything can happen), but I guess the original intent of many instances of UB was to allow architecture/implementation variability, not to give the compiler carte blanche to do anything. But since they are semantically the same thing, compilers began to exploit all forms of UB for optimization in the same way.

I had more things to say but I haven't really thought them out properly, so I'll leave that for another discussion.

Copyright © 2010-2024 Vítor De Araújo

The content of this blog, unless otherwise specified, may be used under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.