Difference between char and signed char in c++?

2019-01-23 12:11发布

问题:

Consider the following code :

#include <iostream>
#include <type_traits>

int main(int argc, char* argv[])
{
    std::cout<<"std::is_same<int, int>::value = "<<std::is_same<int, int>::value<<std::endl;
    std::cout<<"std::is_same<int, signed int>::value = "<<std::is_same<int, signed int>::value<<std::endl;
    std::cout<<"std::is_same<int, unsigned int>::value = "<<std::is_same<int, unsigned int>::value<<std::endl;
    std::cout<<"std::is_same<signed int, int>::value = "<<std::is_same<signed int, int>::value<<std::endl;
    std::cout<<"std::is_same<signed int, signed int>::value = "<<std::is_same<signed int, signed int>::value<<std::endl;
    std::cout<<"std::is_same<signed int, unsigned int>::value = "<<std::is_same<signed int, unsigned int>::value<<std::endl;
    std::cout<<"std::is_same<unsigned int, int>::value = "<<std::is_same<unsigned int, int>::value<<std::endl;
    std::cout<<"std::is_same<unsigned int, signed int>::value = "<<std::is_same<unsigned int, signed int>::value<<std::endl;
    std::cout<<"std::is_same<unsigned int, unsigned int>::value = "<<std::is_same<unsigned int, unsigned int>::value<<std::endl;
    std::cout<<"----"<<std::endl;
    std::cout<<"std::is_same<char, char>::value = "<<std::is_same<char, char>::value<<std::endl;
    std::cout<<"std::is_same<char, signed char>::value = "<<std::is_same<char, signed char>::value<<std::endl;
    std::cout<<"std::is_same<char, unsigned char>::value = "<<std::is_same<char, unsigned char>::value<<std::endl;
    std::cout<<"std::is_same<signed char, char>::value = "<<std::is_same<signed char, char>::value<<std::endl;
    std::cout<<"std::is_same<signed char, signed char>::value = "<<std::is_same<signed char, signed char>::value<<std::endl;
    std::cout<<"std::is_same<signed char, unsigned char>::value = "<<std::is_same<signed char, unsigned char>::value<<std::endl;
    std::cout<<"std::is_same<unsigned char, char>::value = "<<std::is_same<unsigned char, char>::value<<std::endl;
    std::cout<<"std::is_same<unsigned char, signed char>::value = "<<std::is_same<unsigned char, signed char>::value<<std::endl;
    std::cout<<"std::is_same<unsigned char, unsigned char>::value = "<<std::is_same<unsigned char, unsigned char>::value<<std::endl;
    return 0;
}

The result is :

std::is_same<int, int>::value = 1
std::is_same<int, signed int>::value = 1
std::is_same<int, unsigned int>::value = 0
std::is_same<signed int, int>::value = 1
std::is_same<signed int, signed int>::value = 1
std::is_same<signed int, unsigned int>::value = 0
std::is_same<unsigned int, int>::value = 0
std::is_same<unsigned int, signed int>::value = 0
std::is_same<unsigned int, unsigned int>::value = 1
----
std::is_same<char, char>::value = 1
std::is_same<char, signed char>::value = 0
std::is_same<char, unsigned char>::value = 0
std::is_same<signed char, char>::value = 0
std::is_same<signed char, signed char>::value = 1
std::is_same<signed char, unsigned char>::value = 0
std::is_same<unsigned char, char>::value = 0
std::is_same<unsigned char, signed char>::value = 0
std::is_same<unsigned char, unsigned char>::value = 1 

Which means that int and signed int are considered as the same type, but not char and signed char. Why is that ?

And if I can transform a char into signed char using make_signed, how to do the opposite (transform a signed char to a char) ?

回答1:

It's by design, C++ standard says char, signed char and unsigned char are different types. I think you can use static cast for transformation.



回答2:

There are three distinct basic character types: char, signed char and unsigned char. Although there are three character types, there are only two representations: signed and unsigned. The (plain)char uses one of these representations. Which of the other two character representations is equivalent to char depends on the compiler.

In an unsigned type, all the bits represent the value. For example, an 8-bit unsigned char can hold the values from 0 through 255 inclusive.

The standard does not define how signed types are represented, but does specify that the range should be evenly divided between positive and negative values. Hence an 8-bit signed char is guaranteed to be able to hold values from -127 through 127.


So how to decide which Type to use?

Computations using char are usually problematic. Char is by default signed on some machines and unsigned on others. So we should not use (plain)char in arithmetic expressions. Use it only to hold characters. If you need a tiny integer, explicitly specify either signed char or unsigned char.



回答3:

Indeed, the Standard is precisely telling that char, signed char and unsigned char are 3 different types. A char is usually 8 bits but this is not imposed by the standard. An 8-bit number can encode 256 unique values; the difference is only in how those 256 unique values are interpreted. If you consider a 8 bit value as a signed binary value, it can represent integer values from -128 (coded 80H) to +127. If you consider it unsigned, it can represent values 0 to 255. By the C++ standard, a signed char is guaranteed to be able to hold values -127 to 127 (not -128!), whereas a unsigned char is able to hold values 0 to 255.

When converting a char to an int, the result is implementation defined! the result may e.g. be -55 or 201 according to the machine implementation of the single char 'É' (ISO 8859-1). Indeed, a CPU holding the char in a word (16bits) can either store FFC9 or 00C9 or C900, or even C9FF (in big and little endian representations). Using signed or unsigned char do guarantee the char to int conversion outcome.