How can I convert a string of bytes into an int in python?
Say like this: 'y\xcc\xa6\xbb'
I came up with a clever/stupid way of doing it:
sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))
I know there has to be something builtin or in the standard library that does this more simply...
This is different from converting a string of hex digits for which you can use int(xxx, 16), but instead I want to convert a string of actual byte values.
UPDATE:
I kind of like James' answer a little better because it doesn't require importing another module, but Greg's method is faster:
>>> from timeit import Timer
>>> Timer('struct.unpack("<L", "y\xcc\xa6\xbb")[0]', 'import struct').timeit()
0.36242198944091797
>>> Timer("int('y\xcc\xa6\xbb'.encode('hex'), 16)").timeit()
1.1432669162750244
My hacky method:
>>> Timer("sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))").timeit()
2.8819329738616943
FURTHER UPDATE:
Someone asked in comments what's the problem with importing another module. Well, importing a module isn't necessarily cheap, take a look:
>>> Timer("""import struct\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""").timeit()
0.98822188377380371
Including the cost of importing the module negates almost all of the advantage that this method has. I believe that this will only include the expense of importing it once for the entire benchmark run; look what happens when I force it to reload every time:
>>> Timer("""reload(struct)\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""", 'import struct').timeit()
68.474128007888794
Needless to say, if you're doing a lot of executions of this method per one import than this becomes proportionally less of an issue. It's also probably i/o cost rather than cpu so it may depend on the capacity and load characteristics of the particular machine.
As Greg said, you can use struct if you are dealing with binary values, but if you just have a "hex number" but in byte format you might want to just convert it like:
...this is the same as:
...except it'll work for any number of bytes.
You can also use the struct module to do this:
In Python 3.2 and later, use
or
according to the endianness of your byte-string.
This also works for bytestring-integers of arbitrary length, and for two's-complement signed integers by specifying
signed=True
. See the docs forfrom_bytes
.I use the following function to convert data between int, hex and bytes.
Source: http://opentechnotes.blogspot.com.au/2014/04/convert-values-to-from-integer-hex.html
Warning: the above is strongly platform-specific. Both the "I" specifier and the endianness of the string->int conversion are dependent on your particular Python implementation. But if you want to convert many integers/strings at once, then the array module does it quickly.
In Python 2.x, you could use the format specifiers
<B
for unsigned bytes, and<b
for signed bytes withstruct.unpack
/struct.pack
.E.g:
Let
x
='\xff\x10\x11'
And:
That
*
is required!See https://docs.python.org/2/library/struct.html#format-characters for a list of the format specifiers.