Why does QVector::size() return int?

Posted 2019-01-18 18:25

Question:

std::vector::size() returns a size_type, which is unsigned and usually the same as size_t, e.g. it is 8 bytes on 64-bit platforms.

In contrast, QVector::size() returns an int, which is usually 4 bytes even on 64-bit platforms, and on top of that it is signed, which means it can only go halfway to 2^32.

Why is that? This seems quite illogical and also technically limiting, and while it is not very likely that you will ever need more than 2^32 elements, the use of a signed int cuts that range in half for no apparent good reason. Perhaps it is to avoid compiler warnings for people too lazy to declare i as a uint rather than an int, who decided that making all containers return a size type that makes no sense is a better solution? The reason could not possibly be that dumb?
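For reference, a minimal sketch (assuming Qt 5, where QVector::size() returns int, on a typical 64-bit build) that shows the size difference in question:

#include <QVector>
#include <vector>
#include <cstdio>

int main()
{
    // std::vector<int>::size_type is typically size_t: 8 bytes on 64-bit platforms.
    std::printf("std::vector size_type: %zu bytes\n", sizeof(std::vector<int>::size_type));

    // QVector::size() returns int: typically 4 bytes, and signed.
    QVector<int> v;
    std::printf("QVector size() type:   %zu bytes\n", sizeof(v.size()));
    return 0;
}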

Answer 1:

This has been discussed several times since at least Qt 3, and the QtCore maintainer stated a while ago that no change would happen before Qt 7, if it ever does.

When the discussion was going on back then, I thought someone would bring it up on Stack Overflow sooner or later... and probably on several other forums and Q&A sites, too. Let us try to demystify the situation.

In general, you need to understand that there is no better or worse here, as QVector is not a replacement for std::vector. The latter does not do any copy-on-write (COW), and that comes with a price. QVector is meant for a different use case: it is mostly used inside Qt applications and the framework itself, originally for QWidget-based code in the early days.

After all, size_t has its own issues, too, as I will indicate below.

Rather than interpreting the maintainer for you, I will just quote Thiago directly to convey the official stance:

For two reasons:

1) it's signed because we need negative values in several places in the API: indexOf() returns -1 to indicate a value not found; many of the "from" parameters can take negative values to indicate counting from the end. So even if we used 64-bit integers, we'd need the signed version of it. That's the POSIX ssize_t or the Qt qintptr.

This also avoids sign-change warnings when you implicitly convert unsigneds to signed:

-1 + size_t_variable        => warning
size_t_variable - 1     => no warning

2) it's simply "int" to avoid conversion warnings or ugly code related to the use of integers larger than int.

io/qfilesystemiterator_unix.cpp

size_t maxPathName = ::pathconf(nativePath.constData(), _PC_NAME_MAX);
if (maxPathName == size_t(-1))

io/qfsfileengine.cpp

if (len < 0 || len != qint64(size_t(len))) {

io/qiodevice.cpp

qint64 QIODevice::bytesToWrite() const
{
    return qint64(0);
}

return readSoFar ? readSoFar : qint64(-1);
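To make the first point concrete, here is a minimal sketch (Qt 5 container API assumed; this is my illustration, not part of Thiago's email) of how the signed return type and negative index values are used in practice:

#include <QVector>
#include <QDebug>

int main()
{
    QVector<int> v { 10, 20, 30 };

    int found   = v.indexOf(20);          // 1
    int missing = v.indexOf(99);          // -1: "not found" needs a signed type
    int fromEnd = v.lastIndexOf(10, -1);  // negative "from" counts from the end

    // A signed size() keeps mixed arithmetic intuitive and warning-free:
    int last = v.size() - 1;              // 2: no sign-conversion warning, no wrap-around

    qDebug() << found << missing << fromEnd << last;
    return 0;
}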

That was one email from Thiago, and there is another where you can find a more detailed answer:

Even today, software that has a core memory of more than 4 GB (or even 2 GB) is an exception, rather than the rule. Please be careful when looking at the memory sizes of some process tools, since they do not represent actual memory usage.

In any case, we're talking here about having one single container addressing more than 2 GB of memory. Because of the implicitly shared & copy-on-write nature of the Qt containers, that will probably be highly inefficient. You need to be very careful when writing such code to avoid triggering COW and thus doubling or worse your memory usage. Also, the Qt containers do not handle OOM situations, so if you're anywhere close to your memory limit, Qt containers are the wrong tool to use.

The largest process I have on my system is qtcreator and it's also the only one that crosses the 4 GB mark in VSZ (4791 MB). You could argue that it is an indication that 64-bit containers are required, but you'd be wrong:

  • Qt Creator does not have any container requiring 64-bit sizes, it simply needs 64-bit pointers

  • It is not using 4 GB of memory. That's just VSZ (mapped memory). The total RAM currently accessible to Creator is merely 348.7 MB.

  • And it is using more than 4 GB of virtual space because it is a 64-bit application. The cause-and-effect relationship is the opposite of what you'd expect. As a proof of this, I checked how much virtual space is consumed by padding: 800 MB. A 32-bit application would never do that, that's 19.5% of the addressable space on 4 GB.

(padding is virtual space allocated but not backed by anything; it's only there so that something else doesn't get mapped to those pages)
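As a side note on the copy-on-write point above, here is a small sketch (my illustration, not from the email) of how detaching a shared QVector briefly needs two copies of the payload in memory:

#include <QVector>

int main()
{
    // Imagine 'big' holds close to 2 GB of data.
    QVector<char> big(100 * 1024 * 1024, 'x');   // 100 MB here, just for illustration

    QVector<char> copy = big;   // cheap: shares the same buffer (implicit sharing)

    // The first non-const access detaches: a full deep copy is made,
    // so for a moment roughly twice the memory is needed.
    copy[0] = 'y';

    return 0;
}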

Going even further into this topic with Thiago's responses, see this:

Personally, I'm VERY happy that Qt collection sizes are signed. It seems nuts to me that an integer value potentially used in an expression using subtraction be unsigned (e.g. size_t).

An integer being unsigned doesn't guarantee that an expression involving that integer will never be negative. It only guarantees that the result will be an absolute disaster.

On the other hand, the C and C++ standards define the behaviour of unsigned overflows and underflows.

Signed integers do not overflow or underflow. I mean, they do because the types and CPU registers have a limited number of bits, but the standards say they don't. That means the compiler will always optimise assuming you don't over- or underflow them.

Example:

for (int i = 1; i >= 1; ++i)

This is optimised to an infinite loop because signed integers do not overflow. If you change it to unsigned, then the compiler knows that it might overflow and come back to zero.

Some people didn't like that: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
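For completeness, a compilable sketch of the signed-versus-unsigned loop behaviour described above (GCC or Clang at -O2 assumed; the signed variant is left commented out because it would never terminate):

#include <cstdio>

int main()
{
    // Signed: overflow is undefined behaviour, so the compiler may assume
    // i >= 1 stays true forever and turn this into an infinite loop.
    // for (int i = 1; i >= 1; ++i) { }

    // Unsigned: wrap-around is well defined, so the loop must terminate
    // once u wraps back to 0 and the condition u >= 1 becomes false.
    unsigned long long iterations = 0;
    for (unsigned int u = 1; u >= 1; ++u)
        ++iterations;

    std::printf("unsigned loop ran %llu times\n", iterations);  // UINT_MAX iterations
    return 0;
}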



Answer 2:

Unsigned numbers are values mod 2^n for some n.

Signed numbers are bounded integers.

Using unsigned values as approximations for 'positive integers' runs into the problem that common values such as 0 are near the edge of the domain, where unsigned values behave differently from plain integers.

The advantage is that the unsigned approximation reaches higher positive integers, and underflow and overflow are well defined (even if they look arbitrary when viewed as a model of Z).

But really, ptrdiff_t would be better than int.
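A small sketch of the edge-of-domain problem described above (standard C++, nothing Qt-specific):

#include <vector>
#include <cstddef>
#include <cstdio>

int main()
{
    std::vector<int> v;                       // empty: size() == 0

    // size() is unsigned, so 0 - 1 wraps around to a huge value
    // instead of becoming -1:
    std::size_t wrapped = v.size() - 1;       // 18446744073709551615 on a 64-bit platform
    std::printf("%zu\n", wrapped);

    // With a signed type such as ptrdiff_t, the same expression
    // stays an ordinary bounded integer:
    std::ptrdiff_t last = static_cast<std::ptrdiff_t>(v.size()) - 1;   // -1
    std::printf("%td\n", last);
    return 0;
}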