In PHP, unpack() has the "*" flag, which means "repeat this format until the end of input". For example, this prints 97, 98, 99:
$str = "abc";
$b = unpack("c*", $str);
print_r($b);
Is there something like this in Python? Of course, I can do
import struct
s = b"abc"
print(struct.unpack("b" * len(s), s))
but I'm wondering if there is a better way.
There is no such facility built into struct.unpack, but it is possible to define such a function:
import struct

def unpack(fmt, astr):
    """
    Return struct.unpack(fmt, astr) with the optional single * in fmt replaced with
    the appropriate number, given the length of astr.
    """
    # http://stackoverflow.com/a/7867892/190597
    try:
        return struct.unpack(fmt, astr)
    except struct.error:
        # Size of the format with the '*' removed (the starred code counted once).
        flen = struct.calcsize(fmt.replace('*', ''))
        alen = len(astr)
        idx = fmt.find('*')
        before_char = fmt[idx-1]
        # Integer division: the repeat count must be an int on Python 3.
        n = (alen-flen)//struct.calcsize(before_char)+1
        fmt = ''.join((fmt[:idx-1], str(n), before_char, fmt[idx+1:]))
        return struct.unpack(fmt, astr)

print(unpack('b*', b'abc'))
# (97, 98, 99)
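When the whole input is a run of one numeric format code, the repeat count can also be computed up front instead of parsing a "*". The following is only a minimal sketch, assuming Python 3 and a bytes input; unpack_all is a hypothetical helper name, and it assumes code is a single numeric format character such as 'b' or 'H' with no byte-order prefix (for 's' a numeric prefix means a string length, not a repeat count).

```python
import struct

def unpack_all(code, data):
    """Unpack data as a run of one repeated numeric format code (hypothetical helper)."""
    size = struct.calcsize(code)
    count, remainder = divmod(len(data), size)
    if remainder:
        raise ValueError("data length is not a multiple of the item size")
    # A numeric prefix repeats the code: '3b' is equivalent to 'bbb'.
    return struct.unpack("%d%s" % (count, code), data)

print(unpack_all("b", b"abc"))
```

This avoids building a long format string such as "b" * len(data), since the count prefix expresses the same thing.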
In Python 3.4 and later, you can use the new function struct.iter_unpack.
struct.iter_unpack(fmt, buffer)
Iteratively unpack from the buffer buffer according to the format string fmt. This function returns an iterator which will read equally-sized chunks from the buffer until all its contents have been consumed. The buffer’s size in bytes must be a multiple of the size required by the format, as reflected by calcsize().
Each iteration yields a tuple as specified by the format string.
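The size requirement in the description above is easy to trip over, so here is a small sketch of what happens when the buffer is not a multiple of the record size. The format '<H' and the sample bytes are made up for illustration; dropping the trailing partial record is just one possible way to handle leftovers, not something the documentation prescribes.

```python
import struct

fmt = "<H"                      # one little-endian unsigned 16-bit integer
data = b"\x01\x00\x02\x00\x03"  # 5 bytes: not a multiple of calcsize(fmt) == 2

rejected = False
try:
    list(struct.iter_unpack(fmt, data))
except struct.error as exc:
    rejected = True
    print("rejected:", exc)

# Illustrative workaround: drop the trailing partial record before unpacking.
usable = len(data) - len(data) % struct.calcsize(fmt)
values = [v for (v,) in struct.iter_unpack(fmt, data[:usable])]
print(values)
```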
Let's say we want to unpack the array b'\x01\x02\x03'*3 with the repeating format string '<2sc' (2 characters followed by a single character, repeat until done).
With iter_unpack, you can do the following:
>>> import struct
>>> some_bytes = b'\x01\x02\x03'*3
>>> fmt = '<2sc'
>>>
>>> tuple(struct.iter_unpack(fmt, some_bytes))
((b'\x01\x02', b'\x03'), (b'\x01\x02', b'\x03'), (b'\x01\x02', b'\x03'))
If you want to un-nest this result, you can do so with itertools.chain.from_iterable.
>>> from itertools import chain
>>> tuple(chain.from_iterable(struct.iter_unpack(fmt, some_bytes)))
(b'\x01\x02', b'\x03', b'\x01\x02', b'\x03', b'\x01\x02', b'\x03')
Of course, you could just employ a nested comprehension to do the same thing.
>>> tuple(x for subtuple in struct.iter_unpack(fmt, some_bytes) for x in subtuple)
(b'\x01\x02', b'\x03', b'\x01\x02', b'\x03', b'\x01\x02', b'\x03')
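Applied to the example from the question, the same un-nesting approach might look like the sketch below. It assumes Python 3.4+ and a bytes input; the names data and codes are only illustrative.

```python
import struct
from itertools import chain

data = b"abc"
# 'b' is one signed byte per record, so each yielded tuple has a single element.
codes = tuple(chain.from_iterable(struct.iter_unpack("b", data)))
print(codes)  # (97, 98, 99)
```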