Reading IBM floating-point in C++ [duplicate]

This question already has an answer here:

IBM Single Precision Floating Point data conversion to intended value 1 answer

I have a binary file format with a bunch of headers and floating-point data. I am working on code that parses the binary file. Reading the headers was not hard, but when I tried to read the data I ran into some difficulties.

I opened the file and read the headers as follows:

ifs.open(fileName, std::ifstream::in | std::ifstream::binary);
char textHeader[3200];
BinaryHeader binaryHeader;
ifs.read(textHeader,sizeof(textHeader));
ifs.read(reinterpret_cast<char *>(&binaryHeader), sizeof(binaryHeader));

The documentation says the data is stored as 4-byte IBM floating-point, and I tried something similar:

vector<float> readData(int sampleSize){
    float tmp;
    std::vector<float> tmpVector;
    for (int i = 0; i<sampleSize; i++){
        ifs.read(reinterpret_cast<char *>(&tmp), sizeof(tmp));
        std::cout << tmp << std::endl;
        tmpVector.push_back(tmp);
    }
    return tmpVector;
}

Sadly, the result does not seem correct. What am I doing wrong?

EDIT: I forgot to mention that the binary data is big-endian, but when I print the tmp values the data does not look correct either way.

Conclusion: the 4-byte IBM floating-point format is not the same as float.

Tags: c++ floating-point binaryfiles binaryreader fileparsing

1 Answer

来，给爷笑一个

Reply #2 · 2019-09-22 08:30

There are a few things to consider:

The first one: I'm not 100% sure whether this makes a difference, but you are using an array of char for your header, char textHeader[3200];. You could try changing this to an array of unsigned char instead, although ifs.read() takes a char * either way, so this is unlikely to be what corrupts the sample values.
The second one concerns your readData function itself, and it is about performance rather than correctness. You build a local std::vector<float> inside the function and return it by value. That is safe: the returned vector is moved or elided into the caller's object (it would only be a problem if you returned a reference or pointer to the local), so it cannot explain the wrong values. If you want to reuse a buffer that the caller owns and avoid reallocating on every call, you could change the declaration and definition of the function to take the vector by reference.

I would change it from what you currently have:

vector<float> readData(int sampleSize)

to this:

void readData( int sampleSize, std::vector<float>& data )

The third, which is probably the most important of the three, was raised as a question in the comments by user RetiredNinja while I was writing this: what is the endianness of the stored data? This can be a major factor. I think the biggest concern here is how the data is actually represented in the file, both its byte order and its floating-point format.

Given that your documentation states the data is stored as a 4-byte IBM floating-point type and that it is big-endian, I have found this specification by IBM that may be of help to you.
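As a rough illustration of why reading the bytes straight into a float fails, here is a sketch of decoding the 32-bit IBM hexadecimal floating-point layout as it is commonly documented: 1 sign bit, a 7-bit base-16 exponent biased by 64, and a 24-bit fraction. The function name ibm32ToDouble is hypothetical, and the sketch assumes the four bytes from the file have already been assembled into a std::uint32_t in big-endian order. Please check it against the specification for your file format before relying on it.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert one 32-bit IBM hexadecimal floating-point
// word to a double. Assumed layout: bit 31 = sign, bits 30..24 = exponent
// (base 16, excess 64), bits 23..0 = fraction with the radix point on
// its left.
double ibm32ToDouble(std::uint32_t word) {
    const bool negative = (word >> 31) != 0;
    const int exponent = static_cast<int>((word >> 24) & 0x7Fu);
    const std::uint32_t fraction = word & 0x00FFFFFFu;

    if (fraction == 0) {
        return negative ? -0.0 : 0.0;  // a zero fraction means the value is zero
    }

    // value = (fraction / 2^24) * 16^(exponent - 64)
    //       = fraction * 2^(4 * (exponent - 64) - 24)
    const double magnitude =
        std::ldexp(static_cast<double>(fraction), 4 * (exponent - 64) - 24);
    return negative ? -magnitude : magnitude;
}
```

For example, the word 0xC276A000 decodes to -118.625 and 0x41100000 decodes to 1.0. The result is returned as a double because the IBM format can hold magnitudes larger than an IEEE float can, so narrowing to float is left to the caller.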
