I am making an inverted index using Hadoop and Python. I want to know how I can include the byte offset of a line/word in Python. I need something like this:
hello hello.txt@1124
I need the locations for making a full inverted index. Please help.
Like this? file.tell():
Return the file’s current position, like stdio's ftell().
http://docs.python.org/library/stdtypes.html#file-objects
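As a rough illustration of how tell() could be used for this on a regular file, here is a minimal sketch assuming Python 3. The helper name line_offsets is hypothetical, not something from the question. The file is opened in binary mode so that positions are plain byte counts, and readline() is used rather than a for loop over the file object.

```python
def line_offsets(path):
    """Yield (byte_offset, line) for every line in the file at `path`.

    `line_offsets` is a hypothetical helper name used for illustration.
    """
    # Binary mode: tell() then reports a plain byte position.
    with open(path, "rb") as f:
        while True:
            offset = f.tell()      # position before reading the line
            line = f.readline()    # bytes, including the trailing newline
            if not line:           # empty bytes object means end of file
                break
            yield offset, line
```

Each yielded offset is the position of the first byte of the line, so the offset of a word is that value plus the word's index within the line.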
Unfortunately, tell() does not work here because the OP is reading from stdin rather than a regular file. It is not hard to build a wrapper around stdin that gives you what you need, though.
Then you can use this instead:
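A minimal sketch of such a wrapper, assuming Python 3 and a binary input stream. The names OffsetReader and map_stream are hypothetical, and the tab between the word and its location is an assumption about the desired output layout. The wrapper simply counts the bytes of every line it hands out, so it never needs tell().

```python
import re


class OffsetReader:
    """Wrap a binary stream and report the byte offset of each line.

    Hypothetical helper: it counts bytes itself instead of calling tell(),
    so it also works on pipes such as stdin.
    """

    def __init__(self, stream):
        self.stream = stream
        self.offset = 0  # bytes consumed so far

    def __iter__(self):
        for line in self.stream:
            start = self.offset
            self.offset += len(line)  # line is bytes, so len() counts bytes
            yield start, line


def map_stream(stream, filename):
    """Yield one 'word<TAB>filename@offset' record per word in the stream."""
    for line_start, line in OffsetReader(stream):
        # Each run of non-whitespace bytes is treated as a word.
        for match in re.finditer(rb"\S+", line):
            word = match.group().decode("utf-8", "replace")
            yield "%s\t%s@%d" % (word, filename, line_start + match.start())
```

In a streaming mapper this could be driven with something like `for record in map_stream(sys.stdin.buffer, name): print(record)`, where name might be taken from the mapreduce_map_input_file environment variable if your Hadoop version exports it (an assumption worth checking). Note that the offsets are counted from the start of whatever the mapper receives, so they match file offsets only when the mapper sees the file from its first byte.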