What to do with unclear standards?

Discussion of the ForwardCom instruction set and corresponding hardware and software

Moderator: agner

agner
Site Admin
Posts: 184
Joined: 2017-10-15, 8:07:27

What to do with unclear standards?

Post by agner »

The C++ standard has many cases of 'undefined' behavior which can have quite bad consequences. The ForwardCom design should avoid this and behave in a well-defined and predictable way as far as possible. Here I will propose some solutions to the undefined and unclear situations:

1. Signed integer overflow
This is a nasty one. I have seen the gcc compiler optimize away an overflow check, which it is actually allowed to do under the most pedantic interpretation of 'undefined'. ForwardCom will, of course, wrap around in case of integer overflow, as every computer with 2's complement representation does. In addition, there is an option bit for generating a trap (software interrupt) in case of overflow. The abs instruction has option bits for deciding the result of abs(INT_MIN). There are also optional instructions for saturating arithmetic.
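
To illustrate, here is a minimal C sketch (the function names are only for illustration) of the kind of check that the 'undefined' rule allows a compiler to delete, together with a version that does not rely on wraparound:

```c
#include <limits.h>

/* Intended overflow test. Because signed overflow is undefined in C,
   the compiler may assume x + 1 never wraps and reduce this to 'return 0'. */
int will_overflow(int x) {
    return x + 1 < x;
}

/* Well-defined alternative that does not depend on wraparound behavior. */
int will_overflow_safe(int x) {
    return x == INT_MAX;
}
```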

2. Array index out of bounds
This is a common bug with serious consequences. ForwardCom has an optional addressing mode with boundary check which can be used for arrays if the size is known at compile time.

3. Signed integer shift right
A signed right shift of a negative integer is undefined. This should of course use sign-extension as most computers do, e.g.: -4 >> 1 = -2.
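
For example, the same bit pattern gives different results depending on whether the shift is arithmetic (signed) or logical (unsigned); a small C sketch:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t  s = -4;
    uint32_t u = (uint32_t)s;   /* same bit pattern 0xFFFFFFFC */
    printf("%d\n", s >> 1);     /* arithmetic shift: -2 (sign bit copied in) */
    printf("%u\n", u >> 1);     /* logical shift: 2147483646 (zero filled in) */
    return 0;
}
```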

4. Integer shift out of range
Shifting an integer right or left by a count that is out of range gives undefined behavior. For example, the logical result of 1 << 0x101 would be 0 because the 1-bit is shifted all the way out. But x86 processors take the shift count modulo the number of bits, so that 1 << 0x101 = 2.
I would prefer the former behavior for ForwardCom, i.e. zero for an unsigned or positive number shifted by a count that is out of range. A negative number shifted right in a signed operation should give -1 if the shift count is out of range.
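
A sketch in C of the two behaviors for unsigned operands (the helper names are only illustrative):

```c
#include <stdint.h>

/* What x86 SHL actually does: the shift count is taken modulo 32,
   so shifting by 0x101 behaves like shifting by 1. */
uint32_t shift_x86_style(uint32_t x, unsigned n) {
    return x << (n & 31);
}

/* The behavior proposed above: an out-of-range count shifts everything out,
   giving zero for an unsigned or positive value. */
uint32_t shift_proposed(uint32_t x, unsigned n) {
    return n >= 32 ? 0 : x << n;
}
```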

5. Bit scan of zero
A "bit scan forward" instruction gives the index to the least significant bit that is 1. "Bit scan reverse" gives the index to the most significant bit that is 1. The behavior is undefined if the input is zero. The x86 instructions BSF and BSR leave the destination register unchanged if the input is zero, but it is officially undefined. A later modification, LZCNT, is the same as BSR but gives the operand size if the input is zero. My suggestion for ForwardCom is to return -1 if the input is zero. It is inconvenient to have a result that depends on the operand size.

6. NAN + NAN
A floating point NAN (not a number) contains payload bits which can contain information about the kind of error that caused the NAN. This payload is propagated through the subsequent calculations so that the source of error can be traced in the final result. This is a nice feature defined in the IEEE 754 floating point standard. This feature is supported in most processors but hardly ever used.

I think NAN propagation will be very useful in ForwardCom where we have vectors with variable size. Assume, for example, that we have two arrays: float a[8], b[8]; and we want to calculate a/b in a loop. Elements number 0 and 4 in both arrays are zero. If we use interrupts to detect 0/0 then we will have two interrupts if the vector length is 4 elements, but only one interrupt if the vector length is 8 elements. If we use NAN propagation instead to trace the errors then we have consistent results regardless of the vector length.

However, there are some deficiencies in this standard. The standard says that when two NAN's are combined, the result will be one of them. For example, most processors give NAN1 + NAN2 = NAN1. This is unfortunate because a+b and b+a don't give the same result. The compiler can swap the operands so the result may be different for different compilers or different optimization options.

My proposal is to return the bitwise OR combination of the two NAN's. If different bits in the NAN payload indicate different error conditions then the result will indicate if there are multiple error conditions. This feature can be turned off when strict conformance to the standard is needed. We should define different payload bits for different error conditions, such as 0/0, ∞ - ∞, sqrt(-1), log(-1), asin(-2), etc. This has not been done, as far as I know.
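
The proposed combining rule can be illustrated in C for 32-bit floats (this is only a sketch of the semantics, not of a hardware implementation):

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Addition where two NAN inputs produce the bitwise OR of their bit patterns,
   so that payload bits indicating different error conditions are all preserved. */
float add_or_nan(float a, float b) {
    if (isnan(a) && isnan(b)) {
        uint32_t ua, ub, r;
        memcpy(&ua, &a, sizeof ua);
        memcpy(&ub, &b, sizeof ub);
        r = ua | ub;                 /* exponent stays all ones, payloads are OR'ed */
        memcpy(&a, &r, sizeof r);
        return a;                    /* still a NAN, now carrying both payloads */
    }
    return a + b;                    /* ordinary IEEE 754 behavior otherwise */
}
```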

7. min and max with a NAN input
The IEEE 754 standard says that the max and min functions with one NAN and one finite input will return the finite input. This is unfortunate when we want NAN propagation. ForwardCom will return the NAN value. This feature can be turned off when strict conformance to the standard is needed.
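
For comparison, a sketch in C of the standard behavior and the NAN-propagating behavior (fmin follows the IEEE 754 minNum rule and drops the NAN):

```c
#include <math.h>

/* fmin(NAN, 1.0) returns 1.0, so the error information disappears.
   This variant keeps the NAN instead, as proposed for ForwardCom. */
double min_keep_nan(double a, double b) {
    if (isnan(a)) return a;
    if (isnan(b)) return b;
    return fmin(a, b);
}
```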

Are there other ambiguous situations that I have forgotten? Are there any computer systems that I am not aware of that have tried to fix some of these ambiguities and perhaps proposed improved standards?
HubertLamontagne
Posts: 80
Joined: 2017-11-17, 21:39:51

Re: What to do with unclear standards?

Post by HubertLamontagne »

Off the top of my head:

Divide by 0 exceptions: There are two camps on this one... "Make it fail as hard as possible" and "Crashing the whole app does way more damage than the numerically crazy values it guards against". Generally it's the server backend people (where your database is untouched and the failure stops the errant program from doing more damage) vs the video-game people (where the player loses any unsaved progress, which makes them angry).

Float to int cast truncates downwards in the positive range, but upwards in the negative range: I don't think this one can be fixed. Ideal behavior would probably be floor(x + 0.5) or floor(x). I know some people would like banker's rounding or such, but for my kind of applications it's better if n+0.5 always rounds in the same direction.
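
A quick sketch of the difference (truncation toward zero versus a consistent floor-based rounding):

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    printf("%d\n", (int)(-2.7));            /* -2: the cast truncates toward zero */
    printf("%d\n", (int)floor(-2.7));       /* -3: floor rounds downwards consistently */
    printf("%d\n", (int)floor(-2.5 + 0.5)); /* -2: floor(x + 0.5) rounds n+0.5 the same way everywhere */
    return 0;
}
```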

Integer division rounds downwards in the positive range but upwards in the negative range: This is another one that can go both ways. Making it always round downwards would give some unexpected results like -1 / 8 = -1 (instead of 0 as in the positive range), but it would let the compiler optimize /2, /4, /8 etc. into >>1, >>2, >>3...
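
A sketch of why the flooring definition would be cheaper, assuming 32-bit int:

```c
/* C's truncating division cannot be a plain arithmetic shift for signed values;
   compilers add a bias so that negative numbers round toward zero like '/' does. */
int div8_truncating(int x) {
    return (x + ((x >> 31) & 7)) >> 3;   /* roughly what compilers emit for x / 8 */
}

/* If division were defined to round downwards, a single shift would be enough,
   at the cost of results like -1 / 8 == -1. */
int div8_flooring(int x) {
    return x >> 3;
}
```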

Integer shift by negative amounts: Arguably you would want >> -1 to behave as << 1, and << -1 to behave as >> 1 (which would make arithmetic and logical shift left different).

Negation of int_min is int_min: This is similar to the abs(int_min) = int_min case. In practice it's pretty rare to run into this problem, and the standard handling is okay, but just for the sake of completeness...
HubertLamontagne
Posts: 80
Joined: 2017-11-17, 21:39:51

Re: What to do with unclear standards?

Post by HubertLamontagne »

1. Signed integer overflow:
https://kristerw.blogspot.ca/2016/02/ho ... ables.html explains why they do the "exploit" (aka exploit a flaw in the C/C++ specs that was left in because of some archaic 36-bit Honeywell and fault-on-overflow IBM hardware) of assuming that ints can be supersized willy-nilly by the compiler. Some of the points are valid - in particular, the point that it's undefined overflow that allows the assumption that a[i] and a[i+1] are adjacent in 64-bit address arithmetic, and potentially the inferential-propagation optimizations on loop tests (for(int i=0; i<limit; i+=stride) can assume that i isn't going to wrap). Though a lot of the suggested optimizations look pretty useless to me.

I'm not sure how the standard should be rewritten, but my feeling is that the compiler should assume that ints can be infinitely upgraded in size without change in behavior (which allows for loop test optimizations and promotion to 64-bit for indexing), but not that they produce a "poison" value that recursively overrides tests downstream and upstream. And possibly with an exception for >> so that x = x >> 31 works (equivalent to x = x >= 0 ? 0 : -1, useful for masks).
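
A sketch of the kind of loop where that assumption pays off:

```c
/* With undefined signed overflow, the compiler may assume that i never wraps:
   the loop test can be simplified and i can be promoted to a 64-bit index,
   so that a[i] and a[i+1] are treated as adjacent in 64-bit address arithmetic. */
void scale(float *a, int n, int stride) {
    for (int i = 0; i < n; i += stride)
        a[i] *= 2.0f;
}
```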

3. Signed integer shift right
This one looks like a leftover from the possibility of running C code on an archaic machine using 1's complement for negative numbers (which would evaluate -3 >> 1 as -1 instead of -2 as on any modern platform).

4. Integer shift out of range
Apparently some guy thought it would be a good idea to code GCC to assume that this produces a "poison" value, which propagates backwards into GCC's syntax. This one is not a good idea because I'm pretty sure it doesn't lead to any real speed gains and is only a potential bug source.

---

Another one that was very controversial on introduction but now seems to be mostly accepted is strict aliasing (letting the compiler assume that a float* pointer and an int* pointer never overlap so that it can reorder the memory accesses).
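
For example (a sketch), the first function below is undefined under strict aliasing, while memcpy is the sanctioned way to reinterpret bits:

```c
#include <stdint.h>
#include <string.h>

/* Undefined: the compiler may assume f and i never alias and
   reorder the store through f with the load through i. */
uint32_t store_then_load(float *f, uint32_t *i) {
    *f = 1.0f;
    return *i;
}

/* Well-defined: memcpy reinterprets the bits, and compilers
   typically turn it into a single register move. */
uint32_t float_bits(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```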
marioxcc
Posts: 6
Joined: 2018-05-25, 15:16:00

Re: What to do with unclear standards?

Post by marioxcc »

HubertLamontagne wrote: 2018-02-27, 16:26:40 1. Signed integer overflow:
https://kristerw.blogspot.ca/2016/02/ho ... ables.html explains why they do the "exploit" (aka exploit a flaw in the C/C++ specs that was left in because of some archaic 36-bit Honeywell and fault-on-overflow IBM hardware) of assuming that ints can be supersized willy-nilly by the compiler. [...]
I disagree with the implicit premise of this article. C is not just abstracted assembly. It is not a convenient abstraction for the instructions of your machine. It is a programming language for a very abstract machine. When you write “a + b”, you are performing that operation in the abstract machine, which has undefined behavior for overflow of “int” and other signed integer types. Likewise, bit shifts are not the bit shift instructions of your machine, but of this abstract machine.

There is no such thing as “the compiler broke my program”. Either your program was broken to begin with and it invoked undefined behavior that happened to match what you wanted, or the compiler has errata. But people rarely program in C. We see people all the time using “-fno-strict-aliasing” or assuming that integer division by 0 will result in a trap; that is no longer C.
HubertLamontagne
Posts: 80
Joined: 2017-11-17, 21:39:51

Re: What to do with unclear standards?

Post by HubertLamontagne »

marioxcc wrote: 2018-06-05, 15:15:11 I disagree with the implicit premise of this article. C is not just abstracted assembly. It is not a convenient abstraction for the instructions of your machine. It is a programming language for a very abstract machine. [...]
Yes and no... C has evolved into an abstracted assembly for the collective group of "general purpose 32-bit processors" (and their 64-bit upgrades), generally covering MIPS and all its clones such as DEC Alpha, RISC-V and PA-RISC, plus x86, ARM, 68k, SPARC, POWER and SuperH.

This locks in a lot of assumptions:
- 8-bit byte
- Byte addressing
- The main integer type is stored to memory in 32bits (including in 64bit mode)
- Pointers are 32/64bits
- No integer overflow faults
- 2's complement
- IEEE floating point
- Struct members are stored consecutively, with padding for alignment
- Memory loads and stores happen in a well defined order on a definite number of bytes (this one has an extremely large effect in terms of possible optimizations)

I'm sure if you look hard enough, you can find architectures that violate these (DSPs and weird IBM mainframes that use EBCDIC), but they are generally rare as hen's teeth. In my view, this "practical C machine" is much less abstract, and there is tons of code that targets it.
marioxcc
Posts: 6
Joined: 2018-05-25, 15:16:00

Re: What to do with unclear standards?

Post by marioxcc »

HubertLamontagne wrote: 2018-06-07, 19:21:05 Yes and no... C has evolved into an abstracted assembly for the collective group of "general purpose 32bit processors" (and their 64bit upgrades), generally covering MIPS and all its clones such as DEC Alpha and Risc V and PA-RISC, x86, ARM, 68k, SPARC, POWER, SuperH. [...]

This locks in a lot of assumptions: [...]
Incorrect. That is what run-of-the-mill programmers think that C is, but they are wrong. C is what I described: a programming language for a highly abstract machine whose constructs are not representations of instructions in the machine ISA. Hence “a + b” with “a” and “b” of type “int” should not be assumed to map to an addition instruction, because the abstract machine has very different semantics than, say, x86 ADD. x86 ADD uses modulo arithmetic and is a total function. C's addition of integers is a partial function that is undefined when the result would overflow.

Likewise, pointer operations do not correspond at all to memory access constructs in the machine ISA. The memory of the abstract machine of C remembers types, and reading a float as an int is not allowed in the C abstract machine. The fact that x86 allows one to write an integer and then read it back as a float is utterly irrelevant when programming in C. Again, you are programming the C abstract machine, not x86, not MIPS, and not whatever architecture your machine uses.

People who want an “abstracted assembly” should not use C, for C is simply not that. Using a language while pretending its semantics are entirely different than what they really are is bound to result in catastrophic failure (and in practice, it does). This is as absurd as using a screwdriver as if it was a hammer.
HubertLamontagne
Posts: 80
Joined: 2017-11-17, 21:39:51

Re: What to do with unclear standards?

Post by HubertLamontagne »

marioxcc wrote: 2018-07-17, 16:01:09 Incorrect. That is what run-of-the-mill programmers think that C is, but they are wrong. C is what I described: a programming language for a highly abstract machine whose constructs are not representations of instructions in the machine ISA. [...]
Mhmm... Question: is there any reason for C to be this highly abstract machine, rather than the real world "sum-of-x86-ARM-MIPS-PPC-68k-SPARC-ALPHA-ITANIUM-PARISC-SUPERH-32/64bit-compiled-on-GCC/LLVM/MSVC/ICC" de-facto architecture (for lack of a better term), OTHER than for compiler optimizations and for targeting extremely rare architectures?
csdt
Posts: 6
Joined: 2018-04-13, 9:39:49

Re: What to do with unclear standards?

Post by csdt »

HubertLamontagne wrote: 2018-07-18, 5:21:09 Mhmm... Question: is there any reason for C to be this highly abstract machine, rather than the real world "sum-of-x86-ARM-MIPS-PPC-68k-SPARC-ALPHA-ITANIUM-PARISC-SUPERH-32/64bit-compiled-on-GCC/LLVM/MSVC/ICC" de-facto architecture (for lack of a better term), OTHER than for compiler optimizations and for targeting extremely rare architectures?
I would say your real-world "sum-of-x86-ARM-MIPS-PPC-68k-SPARC-ALPHA-ITANIUM-PARISC-SUPERH-32/64bit-compiled-on-GCC/LLVM/MSVC/ICC" already has behaviors so different that you cannot expect much more than what C is providing.

The simplest example I can think of is the sign of `char`: in C it is implementation-defined; on x86 it is typically signed and on ARM/POWER it is typically unsigned.
The signedness of `char` was chosen as whatever each architecture handles fastest (which might not be relevant anymore as architectures have evolved).

Basically, to be as fast as one can expect, C was designed not to constrain the sign of `char`.
Actually, the whole C standard is designed like that: being more abstract in order to be more portable (and future proof).
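
A small sketch of the visible difference in the `char` case:

```c
#include <stdio.h>

int main(void) {
    char c = '\xFF';
    /* Prints -1 where char is signed (typical x86 ABIs) and
       255 where char is unsigned (typical ARM/POWER ABIs). */
    printf("%d\n", (int)c);
    return 0;
}
```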

The undefined signed overflow is also one of them, but might not be relevant anymore: the assumption was that not all architectures encoded signed numbers in 2's complement, so the result of signed overflow couldn't be specified further.
Now the undefinedness of signed overflow is used by compilers to optimize loops (llvm post about undefined behaviors: http://blog.llvm.org/2011/05/what-every ... -know.html).
agner wrote: 2018-02-25, 10:30:39 The C++ standard has many cases of 'undefined' behavior which can have quite bad consequences. The ForwardCom design should avoid this and behave in a well-defined and predictable way as far as possible. Here I will propose some solutions to the undefined and unclear situations:
Going back to Agner's first post, there is a distinction between language undefined behaviors and architecture undefined behaviors.
As far as I understand, architecture undefined behaviors should be as few as possible. But language undefined/unspecified behaviors are there to be more generic and to support more architectures efficiently. Those should not affect how you design an architecture.

agner wrote: 2018-02-25, 10:30:39 6. NAN + NAN
[...] My proposal is to return the bitwise OR combination of the two NAN's. If different bits in the NAN payload indicate different error conditions then the result will indicate if there are multiple error conditions. This feature can be turned off when strict conformance to the standard is needed. [...]
About combining NANs, the problem faced here is that IEEE 754 is over-specified and does not let you do what you think is the right thing to do. If it were unspecified, you could have done it. This is why undefined/unspecified behaviors can be good.
HubertLamontagne
Posts: 80
Joined: 2017-11-17, 21:39:51

Re: What to do with unclear standards?

Post by HubertLamontagne »

csdt wrote: 2018-07-18, 9:06:56 I would say your real-world "sum-of-x86-ARM-MIPS-PPC-68k-SPARC-ALPHA-ITANIUM-PARISC-SUPERH-32/64bit-compiled-on-GCC/LLVM/MSVC/ICC" has already so different behaviors that you cannot expect much more than what C is providing.
[...]
I beg to differ. There's a wide range of behaviors that are consistent across all of these but are undefined behavior in C/C++:

- integer ADD, SUB, MUL, &, |, ^, <, <=, >, >=, ==, != are consistent across all these architectures (both signed and unsigned)
- <<, >> and signed >> are consistent if the shift is in the 0..31 range
- uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, uint64_t, int64_t, short, int, long long, float and double are consistent. (char and long are broken but can easily be avoided)
- For 32bit platforms, pointer arithmetic is actually consistent, even in wild cases (64bit is a different issue though)
- Violations of strict aliasing are consistent except for little-endian vs big-endian, padding differences and alignment faults
- References pointing to address 0 (valid as long as you don't actually perform a load or store)