I have a file which mixes binary data and text data. I want to parse it through a regular expression, but I get this error:
TypeError: can't use a string pattern on a bytes-like object
I'm guessing that message means that Python doesn't want to parse binary files.
I'm opening the file with the "rb" flag.
How can I parse binary files with regular expressions in Python?
EDIT: I'm using Python 3.2.0
In your re.compile call you need to use a bytes object as the pattern, signified by an initial b on the literal. This is Python 3 being picky about the difference between strings and bytes.
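As a minimal sketch of what that looks like (the sample data and the name=... pattern are made up for illustration; in practice the bytes would come from reading a file opened with "rb"):

```python
import re

# Hypothetical sample standing in for open(path, "rb").read() on a mixed file.
data = b"\x00\x01HEADER\xff\xfename=example\x00\x02"

# The pattern is a bytes literal (leading b), so it can be applied to bytes.
# The br prefix keeps backslashes raw, as with r"..." for str patterns.
pattern = re.compile(br"name=(\w+)")

match = pattern.search(data)
# Groups of a bytes pattern are bytes objects, not str.
found = match.group(1) if match else None
print(found)
```

Note that the match results are bytes too, so they need an explicit .decode() if you want text.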
This works for me in Python 2.6.
I think you are using Python 3.
Then, in Python 3, since a binary stream from a file is a stream of bytes, a regex to analyse a stream from a file must be defined with a sequence of bytes, not a sequence of characters.
So you will define your regex with a bytes literal, written with a leading b, and not with an ordinary string literal.
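To illustrate the difference, here is a small sketch (the sample bytes are invented for the example) that applies a str pattern and a bytes pattern to the same binary data:

```python
import re

# Hypothetical binary data with some readable text in it.
data = b"\x89PNG\r\n\x1a\n some text"

# A str pattern against bytes raises TypeError in Python 3.
try:
    re.search(r"PNG", data)
    str_pattern_error = None
except TypeError as exc:
    str_pattern_error = str(exc)

# The same pattern written as a bytes literal works.
bytes_match = re.search(b"PNG", data)

print(str_pattern_error)
print(bytes_match.group(0))
```

The first print shows the TypeError message from the question (its exact wording varies between Python versions); the second shows the matched bytes.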
More explanations here:
15.6.4. Can’t use a string pattern on a bytes-like object