I have a binary file format with a bunch of headers and floating-point data. I am working on code that parses the binary file. Reading the headers was not hard, but when I tried to read the data I ran into some difficulties.
I opened the file and read the headers as follows:
ifs.open(fileName, std::ifstream::in | std::ifstream::binary);
char textHeader[3200];
BinaryHeader binaryHeader;
ifs.read(textHeader,sizeof(textHeader));
ifs.read(reinterpret_cast<char *>(&binaryHeader), sizeof(binaryHeader));
The documentation says the data is stored as 4-byte IBM floating-point, so I tried something similar:
std::vector<float> readData(int sampleSize){
    float tmp;
    std::vector<float> tmpVector;
    for (int i = 0; i<sampleSize; i++){
        ifs.read(reinterpret_cast<char *>(&tmp), sizeof(tmp));
        std::cout << tmp << std::endl;
        tmpVector.push_back(tmp);
    }
    return tmpVector;
}
Sadly, the result does not seem correct. What am I doing wrong?
EDIT: I forgot to mention that the binary data is big-endian, but if I print the tmp values out, the data does not seem correct either way.
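Conclusion: The 4-byte IBM floating-point format is not the same as the native float (IEEE 754 on most platforms), so the bytes cannot simply be read straight into a float.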
There are a few things to consider:
The first one, and I'm not 100% sure whether it makes a difference, is that you are using an array of char for your header, char textHeader[3200];. Maybe you could try changing this to an array of unsigned char instead.
The second one, which has more to do with performance, is within your readData function itself. You are creating a local std::vector of floats inside that function and then returning it by value. This is safe: the vector is moved into the caller (or the copy is elided), so it is not destroyed before the calling code can use it, and since C++11 no element-wise copy is made. If you would still prefer to fill a vector owned by the caller, for example to reuse its allocation across calls, I would suggest changing the declaration and definition of this function. I would change it from what you currently have:
to this:
RetiredNinja, as I was originally writing this, asked you a very good question about the endianness of the data being stored. This can also be a major factor. I think the biggest concern here is the actual representation of the data as it is physically stored. Since your documentation states that the data is stored as a 4-byte IBM floating-point type and that it is big-endian, I have found this specification by IBM that may be of help to you.
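To make the representation issue concrete, here is a minimal sketch of decoding one IBM single-precision word, assuming the classic System/360 hexadecimal layout: 1 sign bit, a 7-bit excess-64 exponent in base 16, and a 24-bit fraction with no hidden bit. The function name `ibmToDouble` is hypothetical, and the input is assumed to be a 32-bit word already assembled from the four big-endian bytes in the file.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: converts one IBM hexadecimal single-precision word
// (already assembled in big-endian order) to a double.
// Assumed layout: bit 31 = sign, bits 30..24 = excess-64 base-16 exponent,
// bits 23..0 = fraction interpreted as fraction / 2^24.
double ibmToDouble(std::uint32_t word) {
    const bool negative = (word >> 31) != 0;
    const int exponent = static_cast<int>((word >> 24) & 0x7Fu);
    const std::uint32_t fraction = word & 0x00FFFFFFu;

    if (fraction == 0) {
        return negative ? -0.0 : 0.0;  // a zero fraction means zero
    }

    // value = (fraction / 2^24) * 16^(exponent - 64)
    //       = fraction * 2^(4 * (exponent - 64) - 24)
    const double magnitude =
        std::ldexp(static_cast<double>(fraction), 4 * (exponent - 64) - 24);
    return negative ? -magnitude : magnitude;
}
```

As a check, the word 0xC276A000 decodes to -118.625 under this layout. The sketch returns a double because the IBM format reaches magnitudes beyond the range of an IEEE single-precision float; if the samples are known to fit, the result can be narrowed with static_cast<float> before being pushed into the vector.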