I have a binary file containing "messages" and I am trying to map the bytes onto the right variables using structs. My example uses two types of messages: Tmessage and Amessage.
#include <iostream>
#include <fstream>
#include <stdlib.h>
#include <string>
#include <iomanip>
using namespace std;
struct Tmessage
{
    unsigned short int Length;
    char MessageType;
    unsigned int Second;
};
struct Amessage
{
    unsigned short int Length;
    char MessageType;
    unsigned int Timestamp;
    unsigned long long int OrderReferenceNumber;
    char BuySellIndicator;
    unsigned int Shares;
    char Stock[6];
    unsigned int Price;
};
int main(int argc, char* argv[])
{
    const char* filename = argv[1];
    fstream file(filename, ios::in | ios::binary);
    unsigned long long int pi = 0;
    if(file.is_open()){ cout << filename << " OPENED" << endl; }
    else { cout << "FILE NOT OPENED" << endl; return 1; } // nothing to parse without a file
    unsigned char* memblock;
    memblock = new unsigned char[128];
    file.read((char *)memblock, 128);
    cout << "BINARY DATA" << endl;
    while (pi < 128)
    {
        cout << setw(2) << hex << static_cast<unsigned int>(memblock[pi]) << " ";
        pi++;
        if((pi%16)==0) cout << endl;
    }
    unsigned int poi = 0;
    Tmessage *Trecord;
    Trecord = (Tmessage *)memblock;
    cout << "Length: " << hex << (*Trecord).Length << endl;
    cout << "Message type: " << hex << (*Trecord).MessageType << endl;
    cout << "Second: " << hex << (*Trecord).Second << endl;
    poi = poi + 7; cout << endl;
    Amessage *Arecord;
    Arecord = (Amessage *)(memblock+poi);
    cout << "Length: " << hex << (*Arecord).Length << endl;
    cout << "Message type: " << hex << (*Arecord).MessageType << endl;
    cout << "Timestamp: " << hex << (*Arecord).Timestamp << endl;
    cout << "OrderReferenceNumber: " << hex << (*Arecord).OrderReferenceNumber << endl;
    cout << "BuySellIndicator: " << hex << (*Arecord).BuySellIndicator << endl;
    cout << "Shares: " << hex << (*Arecord).Shares << endl;
    cout << "Stock: " << hex << (*Arecord).Stock << endl;
    cout << "Price: " << hex << (*Arecord).Price << endl;
    delete[] memblock; // allocated with new[], so it must be released with delete[]
    file.close();
    cout << endl << "THE END" << endl;
    return 0;
}
The output when I run the program:
stream OPENED
BINARY DATA
0 5 54 0 0 62 72 0 1c 41 0 f 42 40 0 0
0 0 0 4 2f 76 53 0 0 3 e8 53 50 59 20 20
20 0 11 5 d0 0 1c 41 0 f 42 40 0 0 0 0
0 4 2f 78 42 0 0 3 e8 53 50 59 20 20 20 0
10 f7 5c 0 1c 41 0 f 42 40 0 0 0 0 0 4
2f 90 53 0 0 1 2c 53 50 59 20 20 20 0 11 2
b0 0 5 54 0 0 62 76 0 d 44 14 25 78 80 0
0 0 0 0 4 2f 90 0 d 44 14 25 78 80 0 0
Length: 500
Message type: T
Second: 726200
Length: 1c00
Message type: A
Timestamp: 40420f
OrderReferenceNumber: 53762f0400000000
BuySellIndicator:
Shares: 20595053
Stock:
Price: 420f0041
THE END
The program places the bytes inside the Tmessage struct correctly.
(0 5 54 0 0 62 72)
However, something goes wrong when it parses Amessage.
(0 1c 41 0 f 42 40 0 0 0 0 0 4 2f 76 53 0 0 3 e8 53 50 59 20 20 20 0 11 5 d0)
The Length, MessageType and Timestamp are correct, but OrderReferenceNumber contains the "53" byte, which belongs to BuySellIndicator, and all the variables after it are incorrect as well.
The correct A message output should be:
Length: 1c 0
Message type: 41
Timestamp: 40 42 f 0
OrderReferenceNumber: 76 2f 4 0 0 0 0 0
BuySellIndicator: 53
Shares: e8 3 0 0
Stock: 53 50 59 20 20 20
Price: d0 5 11 0
My two questions: a) Why does OrderReferenceNumber contain the "53" byte? b) I think that "char Stock[6]" does not work, because there are more than 6 bytes between the bytes of Shares and the bytes of Price. How can I fit the 6 bytes into a char array or a string?
Note: I am aware that I have to swap the bytes because the binary data is big-endian. "Stock" is a character array, so it should not be swapped. Thank you very much for your help! Kind regards,
There may be unnamed padding bytes between data members of a struct.
In order to read binary data from a file in a portable manner, you should read each member of the struct individually.
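As a rough sketch of what reading member by member could look like for the "A" message, assuming the 30-byte big-endian layout described in the question (2 + 1 + 4 + 8 + 1 + 4 + 6 + 4 bytes): the names `AddOrder`, `readBigEndian`, `parseAddOrder` and `kAddOrderWireSize` are invented for this example and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical decoded form of the question's Amessage.
struct AddOrder
{
    std::uint16_t length;
    char          messageType;
    std::uint32_t timestamp;
    std::uint64_t orderReferenceNumber;
    char          buySellIndicator;
    std::uint32_t shares;
    char          stock[7];   // 6 characters from the file plus a terminating '\0'
    std::uint32_t price;
};

// Assumed size of the message on disk, including the 2-byte length field.
constexpr std::size_t kAddOrderWireSize = 30;

// Assemble a big-endian unsigned integer of type T from the bytes at p.
template <typename T>
T readBigEndian(const unsigned char* p)
{
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value = static_cast<T>((value << 8) | p[i]);
    return value;
}

// Decode one message field by field; no struct is ever overlaid on the buffer,
// so padding and host byte order do not matter.
AddOrder parseAddOrder(const unsigned char* p, std::size_t available)
{
    assert(available >= kAddOrderWireSize);
    (void)available;  // silence the unused-parameter warning when NDEBUG is set

    AddOrder m;
    m.length               = readBigEndian<std::uint16_t>(p); p += 2;
    m.messageType          = static_cast<char>(*p);           p += 1;
    m.timestamp            = readBigEndian<std::uint32_t>(p); p += 4;
    m.orderReferenceNumber = readBigEndian<std::uint64_t>(p); p += 8;
    m.buySellIndicator     = static_cast<char>(*p);           p += 1;
    m.shares               = readBigEndian<std::uint32_t>(p); p += 4;
    std::memcpy(m.stock, p, 6);                               p += 6;
    m.stock[6] = '\0';  // character data is copied as-is, never byte-swapped
    m.price                = readBigEndian<std::uint32_t>(p);
    return m;
}
```

With the buffer from the question this would be called as `parseAddOrder(memblock + 7, 128 - 7)`. For the bytes shown in the dump I would expect Length 0x1c, MessageType 'A', Timestamp 0xf4240, OrderReferenceNumber 0x42f76, BuySellIndicator 'S', Shares 0x3e8, Stock "SPY   " and Price 0x1105d0. Because the integers are assembled byte by byte, the big-endian conversion happens in the same step and no separate swap is needed.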
You should also use the exact-width types specified in <cstdint> (Boost has an implementation of this header if your standard library doesn't have it yet); this will allow you to ensure that the sizes of your data members match the sizes of the fields in the message.

The compiler is probably inserting pad bytes between members of your struct. One way you can get around this is to use #pragma pack. Note that this is non-standard, but it works on g++ and Visual C++.
What's going on in the code above: the #pragma pack tells the compiler not to insert the padding it would normally add so that accesses to the members of the struct are aligned. The push/pop pair is there so you can nest #pragma packs (for example, when including header files) and still have a way to go back to the previously set pack options.
See MSDN for an explanation that's probably better than the one I could give. http://msdn.microsoft.com/en-us/library/2e70t5y1%28VS.80%29.aspx