stringstream overflow at 4GB

Posted 2019-09-21 20:18

I'm having trouble getting past a 4GB limit on stringstream, even when running on a 64-bit Linux machine with plenty of memory.

The test code below (revised after reading your comments) dumps core once the stream reaches 4GB.

From the gdb trace, stringstream uses the default std::char_traits, in which int_type is a 32-bit int rather than a 64-bit size_t. Any suggestions for how to solve this?

#include <stdint.h>
#include <iostream>
#include <sstream>
#include <math.h>
#include <stdlib.h>   // exit()
using namespace std;

int main(int narg, char** argv)
{
    string str;
    stringstream ss;
    const size_t GB = (size_t)pow(2, 30);

    str.resize(GB, '1');
    cerr << "1GB=" << str.size()
             << ", string::max_size=" << str.max_size()/GB << "GB"
             << ", sizeof(int)=" << sizeof(int)
             << ", sizeof(int64_t)=" << sizeof(int64_t)
             << ", sizeof(size_t)=" << sizeof(size_t)
             << endl;
    string().swap(str);

    str.resize(6*GB, '6');
    cerr << "str.size()=" << (str.size() / GB) << "GB allocated successfully"
            << ", ended with " << str.substr(str.size()-5, 5) << endl;
    string().swap(str);

    str.resize(GB/4, 'Q');
    cerr << "writing to stringstream..." << std::flush;
    for (int i = 0; i < 30; ++i) {
        ss << str << endl;
        cerr << double(ss.str().size())/GB << "GB " << std::flush;
    }
    cerr << endl;
    exit(0);
}

The output is:

1GB=1073741824, string::max_size=4294967295GB, sizeof(int)=4, sizeof(int64_t)=8, sizeof(size_t)=8
str.size()=6GB allocated successfully, ended with 66666
writing to stringstream...0.25GB 0.5GB 0.75GB 1GB 1.25GB 1.5GB 1.75GB 2GB 2.25GB 2.5GB 2.75GB 3GB 3.25GB 3.5GB 3.75GB Segmentation fault (core dumped)

The gdb stack trace:

(gdb) where
#0  0x00002aaaaad5e0c1 in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow(int) () from /usr/lib64/libstdc++.so.6
#1  0x00002aaaaad62cbd in std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long) () from /usr/lib64/libstdc++.so.6
#2  0x00002aaaaad5657d in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstdc++.so.6
#3  0x000000000040112b in main ()

The binary appears to be 64-bit.

$ file a.out
a.out: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped

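Nick, good point about checking the preprocessor output. Somehow std::char_traits overrides __gnu_cxx::char_traits and redefines int_type as int, rather than the unsigned long used in __gnu_cxx::char_traits. This is very surprising!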

namespace __gnu_cxx
{
# 61 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3
  template <class _CharT>
    struct _Char_types
    {
      typedef unsigned long int_type;
      typedef std::streampos pos_type;
      typedef std::streamoff off_type;
      typedef std::mbstate_t state_type;
    };
# 86 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3
  template<typename _CharT>
    struct char_traits
    {
      typedef _CharT char_type;
      typedef typename _Char_types<_CharT>::int_type int_type;
      typedef typename _Char_types<_CharT>::pos_type pos_type;
      typedef typename _Char_types<_CharT>::off_type off_type;
      typedef typename _Char_types<_CharT>::state_type state_type;
              ....
               };

namespace std
{
# 224 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3
  template<class _CharT>
    struct char_traits : public __gnu_cxx::char_traits<_CharT>
    { };



  template<>
    struct char_traits<char>
    {
      typedef char char_type;
      typedef int int_type;
      typedef streampos pos_type;
      typedef streamoff off_type;
      typedef mbstate_t state_type;
              ...
              };
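
To make the override in the excerpt above concrete, here is a minimal sketch using hypothetical stand-in templates (base_demo::traits and demo::traits are illustrative names, not libstdc++ ones). It assumes a C++11 compiler for static_assert and <type_traits>. An explicit specialization is a separate class that does not derive from the primary template's base, which would explain how char_traits<char> ends up with int as its int_type:

```cpp
#include <type_traits>

// Hypothetical stand-in for __gnu_cxx::char_traits.
namespace base_demo {
    template <class CharT>
    struct traits { typedef unsigned long int_type; };
}

// Hypothetical stand-in for std::char_traits.
namespace demo {
    // Primary template: inherits int_type (unsigned long) from the base.
    template <class CharT>
    struct traits : public base_demo::traits<CharT> {};

    // Explicit specialization: an unrelated class with its own int_type.
    template <>
    struct traits<char> { typedef int int_type; };
}

// Types without a specialization still see the inherited typedef.
static_assert(std::is_same<demo::traits<short>::int_type, unsigned long>::value,
              "primary template inherits from the base");

// The char specialization replaces the primary template entirely.
static_assert(std::is_same<demo::traits<char>::int_type, int>::value,
              "specialization defines its own int_type");

// Runtime-checkable helper so the result can also be inspected in a program.
inline bool char_specialization_uses_int() {
    return std::is_same<demo::traits<char>::int_type, int>::value;
}
```

If both static_asserts compile, the specialization has taken precedence over the inherited typedef, which matches what the preprocessed header shows.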

Edit: the following is from the STL website http://www.cplusplus.com/reference/string/char_traits/, where the char_traits specializations use int as int_type.

typedef INT_T int_type;
Where INT_T is a type that can represent all the valid characters representable by a char_type plus an end-of-file value (eof) which is compatible with iostream class member functions.
For char_traits<char> this is int, and for char_traits<wchar_t> this is wint_t

Answer 1:

Could you post the preprocessed source file?

Reference to _Char_types removed, as it was misleading.

I'm not sure that int_type is relevant; the specialization for char does use int for int_type.

int_type is a typedef for holding a single character (i.e. at least one byte for ASCII, larger for wchar_t). It is not used to hold a range of characters (see std::streamoff below).
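
As a small illustration of that point, the sketch below round-trips one character through int_type, the way a single character is passed to overflow(int_type). The Traits alias and the round_trips helper are names made up for this example, and a C++11 compiler is assumed for static_assert:

```cpp
#include <string>       // std::char_traits
#include <type_traits>

typedef std::char_traits<char> Traits;

// For char, int_type is int: wide enough for every char value plus eof().
static_assert(std::is_same<Traits::int_type, int>::value,
              "char_traits<char>::int_type is int");

// Convert one character to int_type and back. A real character never
// compares equal to eof(), and the conversion is lossless.
inline bool round_trips(char c) {
    Traits::int_type i = Traits::to_int_type(c);
    return !Traits::eq_int_type(i, Traits::eof())
        && Traits::eq(Traits::to_char_type(i), c);
}
```

Nothing here depends on how many characters the buffer holds, so a 32-bit int_type by itself should not impose a 4GB limit.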


From your preprocessed output, I suspect std::streamoff could be the culprit. From the headers on my system:

# 90 "/usr/include/c++/4.6/bits/postypes.h" 3
  typedef long streamoff;

std::streamoff is used in std::streampos, and if it is defined incorrectly I think that could cause the failure in the overflow function that stringstream calls.
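
One way to test this suspicion on the affected machine is to check how wide the stream offset and size types really are. This is only a sketch: the helper names stream_types_are_64bit and can_address are invented for the example, and it does not touch stringstream itself.

```cpp
#include <ios>      // std::streamoff, std::streamsize
#include <limits>

// True when both stream types are at least 64 bits wide.
inline bool stream_types_are_64bit() {
    return sizeof(std::streamoff) >= 8 && sizeof(std::streamsize) >= 8;
}

// True when std::streamoff can represent an offset of `bytes` bytes.
inline bool can_address(unsigned long long bytes) {
    return bytes <= static_cast<unsigned long long>(
                        std::numeric_limits<std::streamoff>::max());
}
```

Calling can_address(4ULL << 30) from main and printing the result would show whether offsets at or beyond 4GB are representable. If it returns true, std::streamoff is probably not the limiting factor and the problem lies elsewhere.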

Have you checked that the correct headers are being included?

# 61 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3

resolves to the following path:

/usr/include/c++/4.1.2/bits/char_traits.h

-Nick



Source: stringstream overflow at 4GB