I'm looking for an equivalent to sscanf()
in Python. I want to parse /proc/net/*
files, in C I could do something like this:
int matches = sscanf(
buffer,
"%*d: %64[0-9A-Fa-f]:%X %64[0-9A-Fa-f]:%X %*X %*X:%*X %*X:%*X %*X %*d %*d %ld %*512s\n",
local_addr, &local_port, rem_addr, &rem_port, &inode);
I thought at first to use str.split
, however it doesn't split on the given characters, but the sep
string as a whole:
>>> lines = open("/proc/net/dev").readlines()
>>> for l in lines[2:]:
>>> cols = l.split(string.whitespace + ":")
>>> print len(cols)
1
Which should be returning 17, as explained above.
Is there a Python equivalent to sscanf
(not RE), or a string splitting function in the standard library that splits on any of a range of characters that I'm not aware of?
There is also the
parse
module.parse()
is designed to be the opposite offormat()
(the newer string formatting function in Python 2.6 and higher).There is an ActiveState recipe which implements a basic scanf http://code.activestate.com/recipes/502213-simple-scanf-implementation/
If the separators are ':', you can split on ':', and then use x.strip() on the strings to get rid of any leading or trailing whitespace. int() will ignore the spaces.
You can split on a range of characters using the
re
module.You can parse with module
re
using named groups. It won't parse the substrings to their actual datatypes (e.g.int
) but it's very convenient when parsing strings.Given this sample line from
/proc/net/tcp
:An example mimicking your sscanf example with the variable could be:
Upvoted orip's answer. I think it is sound advice to use re module. The Kodos application is helpful when approaching a complex regexp task with Python.
http://kodos.sourceforge.net/home.html