I was looking at this video. Bjarne Stroustrup says that unsigned ints are error-prone and lead to bugs. So, you should only use them when you really need them. I've also read in one of the questions on Stack Overflow (but I don't remember which one) that using unsigned ints can lead to security bugs.
How do they lead to security bugs? Can someone clearly explain it by giving a suitable example?
Numeric conversion rules in C and C++ are a byzantine mess. Using unsigned types exposes you to that mess to a much greater extent than using purely signed types.
Take, for example, the simple case of a comparison between two variables, one signed and the other unsigned.
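A minimal sketch of the surprise (variable names are illustrative):

```cpp
#include <iostream>

int main()
{
    int a = -1;
    unsigned b = 1;
    // The usual arithmetic conversions convert a to unsigned,
    // so -1 becomes UINT_MAX before the comparison happens.
    std::cout << (a < b) << '\n';  // prints 0: "-1 is not less than 1"
}
```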
To take another example, consider multiplying two unsigned integers of the same size.
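One well-known instance, sketched under the assumption that `int` is 32 bits:

```cpp
#include <cstdint>

std::uint32_t square(std::uint16_t x)
{
    // Integer promotion turns both operands into (signed) int.
    // With x == 0xFFFF, x * x == 0xFFFE0001, which exceeds INT_MAX,
    // so the multiplication overflows signed int -- undefined
    // behavior -- even though every declared type here is unsigned.
    return x * x;
}
```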
One possible aspect is that unsigned integers can lead to somewhat hard-to-spot problems in loops, because underflow leads to large numbers. I cannot count (even with an unsigned integer!) how many times I made a variant of this bug.
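A minimal sketch of that variant (the function and names are illustrative):

```cpp
#include <cstddef>
#include <vector>

void walk_backwards(const std::vector<int>& v)
{
    // Intended: visit the elements from last to first.
    // BUG: i is unsigned, so i >= 0 is always true; once i reaches 0,
    // --i wraps around to SIZE_MAX and v[i] reads far out of bounds.
    for (std::size_t i = v.size() - 1; i >= 0; --i) {
        // ... use v[i] ...
    }
}
```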
Note that, by definition, `i >= 0` is always true. (What causes this in the first place is that if `i` is signed, the compiler will warn about the signed/unsigned comparison with the `size_t` returned by `size()`.)

There are other reasons mentioned in "Danger – unsigned types used here!", the strongest of which, in my opinion, is the implicit type conversion between signed and unsigned.
One big factor is that it makes loop logic harder: Imagine you want to iterate over all but the last element of an array (which does happen in the real world). So you write your function:
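A sketch of the function described (`do_something` is a stand-in helper):

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

void do_something(int x) { std::cout << x << '\n'; }  // stand-in helper

void process_all_but_last(const std::vector<int>& vec)
{
    // Intended: visit every element except the last one.
    for (std::size_t i = 0; i < vec.size() - 1; ++i)
        do_something(vec[i]);
}
```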
Looks good, doesn't it? It even compiles cleanly with very high warning levels! So you put this in your code, all tests run smoothly and you forget about it.
Now, later on, somebody comes along and passes an empty `vector` to your function. With a signed integer, you hopefully would have noticed the sign-compare compiler warning, introduced the appropriate cast, and not published the buggy code in the first place. But in your implementation with the unsigned integer, `vec.size() - 1` wraps and the loop condition becomes `i < SIZE_MAX`. Disaster, UB and most likely a crash!

This is also a security problem; in particular, it is a buffer overflow. One way to possibly exploit it would be if `do_something` did something that can be observed by the attacker. They might be able to find out what input went into `do_something`, and that way data the attacker should not be able to access would be leaked from your memory. This would be a scenario similar to the Heartbleed bug. (Thanks to ratchet freak for pointing that out in a comment.)

In addition to the range/wrap issues with unsigned types, mixing unsigned and signed integer types can carry a significant performance cost on the processor. The cost is smaller than that of a floating-point cast, but too large to simply ignore. Additionally, the compiler may insert range checks for the value, which can change the behavior of further checks.
I'm not going to watch a video just to answer a question, but one issue is the confusing conversions which can happen if you mix signed and unsigned values. For example:
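A sketch of the kind of mix-up meant here (the names `i` and `u` are illustrative):

```cpp
#include <iostream>

int main()
{
    unsigned u = 1;
    int i = -1;
    if (i < u)
        std::cout << "-1 < 1, as expected\n";
    else
        std::cout << "-1 >= 1 ?!\n";  // this branch runs
}
```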
The promotion rules mean that `i` is converted to `unsigned` for the comparison, giving a large positive number and a surprising result.

The problem with unsigned integer types is that depending upon their size they may represent one of two different things:
- Unsigned types smaller than `int` (e.g. `uint8_t`) hold numbers in the range 0..2ⁿ-1, and calculations with them will behave according to the rules of integer arithmetic provided they don't exceed the range of the `int` type. Under present rules, if such a calculation exceeds the range of an `int`, a compiler is allowed to do anything it likes with the code, even going so far as to negate the laws of time and causality (some compilers will do precisely that!), even in cases where the result of the calculation would be assigned back to an unsigned type smaller than `int`.
- Unsigned types `unsigned int` and larger hold members of the abstract wrapping algebraic ring of integers congruent mod 2ⁿ; this effectively means that if a calculation goes outside the range 0..2ⁿ-1, the system will add or subtract whatever multiple of 2ⁿ is required to get the value back in range.

Consequently, given `uint32_t x=1, y=2;`, the expression `x-y` may have one of two meanings depending upon whether `int` is larger than 32 bits:

- If `int` is larger than 32 bits, the expression will subtract the number 2 from the number 1, yielding the number -1. Note that a variable of type `uint32_t` can't hold the value -1 regardless of the size of `int` (storing -1 would cause such a variable to hold 0xFFFFFFFF), but unless or until the value is coerced to an unsigned type it will behave like the signed quantity -1.
- If `int` is 32 bits or smaller, the expression will yield a `uint32_t` value which, when added to the `uint32_t` value 2, will yield the `uint32_t` value 1 (i.e. the `uint32_t` value 0xFFFFFFFF).

IMHO, this problem could be solved cleanly if C and C++ were to define new unsigned types [e.g. `unum32_t` and `uwrap32_t`] such that a `unum32_t` would always behave as a number, regardless of the size of `int` (possibly requiring the right-hand operand of a subtraction or unary minus to be promoted to the next larger signed type if `int` is 32 bits or smaller), while a `uwrap32_t` would always behave as a member of an algebraic ring (blocking promotions even if `int` were larger than 32 bits). In the absence of such types, however, it's often impossible to write code which is both portable and clean, since portable code will often require type coercions all over the place.