Convert integer to big endian binary file in Python

Posted 2019-08-06 13:37

Question:

I'm trying to write a 2D array of integers to a big-endian binary file in Python, like this:

import struct

fh=open('file.bin','wb')
for i in range(width):
    for j in range(height):
        fh.write(struct.pack('>i2',data[i,j]))

fh.close()

When I read it back with NumPy:

a=np.fromfile('file.bin',dtype='>i2')

The result is an array with zeros interleaved between the original values:

[266,   0, 267,   0, 268,
   0, 272,   0, 268,   0, 264,   0, 266,   0, 264,   0, 263,   0,
 263,   0, 267,   0, 263,   0, 265,   0, 266,   0, 266,   0, 267,
   0, 267,   0, 266,   0, 265,   0, 270,   0, 270,   0, 270,   0,
 272,   0, 273,   0, 275,   0, 274,   0, 275]

This is what I'm trying to obtain:

[266,   267,  268,  272,  268,   264,  266,  264,  263, 
 263,   267,  263,  265,  266,   266,  267,
 267,   266,  265,  270,  270,   270,  272,  273,  275,  274,  275]

Do you know what is wrong with my code?

Answer 1:

Replacing i2 with I2 in the dtype works for me:

a = np.fromfile('file.bin',dtype='>I2')

That said, the behaviour of np.fromfile looks odd at first: it reports '>i2' as int16, yet passing np.int16 explicitly gives different values.

In [63]: np.fromfile('npfile.bin',dtype='>i2')
Out[63]: array([  0, 345,   0, 245,   0, 345,   0, 245], dtype=int16)

In [64]: np.fromfile('npfile.bin',dtype=np.int32)
Out[64]: array([1493237760, -184549376, 1493237760, -184549376], dtype=int32)

In [65]: np.fromfile('npfile.bin',dtype=np.uint32)
Out[65]: array([1493237760, 4110417920, 1493237760, 4110417920], dtype=uint32)

In [66]: np.fromfile('npfile.bin',dtype=np.int16)
Out[66]: array([    0, 22785,     0, -2816,     0, 22785,     0, -2816], dtype=int16)
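
The differing outputs above are consistent with a byte-order difference: '>i2' requests big-endian 16-bit integers, while np.int16 uses the machine's native order, which is little-endian on most desktop hardware. A minimal sketch, assuming the file held the values 345 and 245 packed as big-endian 32-bit integers (inferred from the output, not stated in the answer), and using np.frombuffer on an in-memory buffer instead of a file:

```python
import struct
import numpy as np

# Assumed contents: two big-endian 32-bit integers, 345 and 245.
raw = struct.pack('>2i', 345, 245)

# Same bytes, three explicit interpretations.
big16 = np.frombuffer(raw, dtype='>i2')     # big-endian int16
little16 = np.frombuffer(raw, dtype='<i2')  # little-endian int16
big32 = np.frombuffer(raw, dtype='>i4')     # big-endian int32

print(big16)     # [  0 345   0 245]
print(little16)  # [    0 22785     0 -2816]
print(big32)     # [345 245]
```

Spelling out the byte order ('<' or '>') in the dtype string removes the dependence on the machine the code runs on.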


Answer 2:

First of all, the struct format code i is a 4-byte integer, so struct.pack writes 4 bytes for every value. Reading the file back as 2-byte integers ('>i2') then splits each value into two shorts: one half holding the actual value and the other half holding zero.

Thus, when you read the file with NumPy, the values come back interleaved with zeros.
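
The width mismatch can be seen on a single value. The following is a small sketch rather than the asker's code; the names packed, size and halves are illustrative only:

```python
import struct
import numpy as np

# Pack one value the way the question's loop does for each element.
packed = struct.pack('>i', 266)
size = struct.calcsize('>i')  # number of bytes 'i' occupies with '>'

# Interpret those same bytes as 16-bit big-endian integers.
halves = np.frombuffer(packed, dtype='>i2')

print(packed)  # b'\x00\x00\x01\n'
print(size)    # 4
print(halves)  # [  0 266]
```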

To fix it, change the format string from '>i2' to '>i' both when packing with struct and when loading with NumPy. That should give you the expected result.
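
The key point is that the writer and the reader agree on the integer width. Below is a hedged sketch of three consistent pairings, using a small hypothetical array in place of the asker's data and in-memory buffers in place of file.bin. '>i4' is used as the explicit NumPy spelling of a 4-byte big-endian integer, and the 2-byte variants assume every value fits in 16 bits:

```python
import struct
import numpy as np

# Hypothetical sample standing in for the asker's 2D array.
data = np.array([[266, 267, 268], [272, 268, 264]])

# Option 1: 4 bytes on both sides (struct 'i', NumPy '>i4').
buf32 = b''.join(struct.pack('>i', int(v)) for v in data.ravel())
back32 = np.frombuffer(buf32, dtype='>i4')

# Option 2: 2 bytes on both sides (struct 'h' is a 16-bit short).
buf16 = b''.join(struct.pack('>h', int(v)) for v in data.ravel())
back16 = np.frombuffer(buf16, dtype='>i2')

# Option 3: skip struct and let NumPy convert the whole array at once.
buf_np = data.astype('>i2').tobytes()

print(back32)           # [266 267 268 272 268 264]
print(back16)           # [266 267 268 272 268 264]
print(buf_np == buf16)  # True
```

For writing to disk, data.astype('>i2').tofile('file.bin') would replace the explicit loop and produces a file that np.fromfile('file.bin', dtype='>i2') reads back directly.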