How to get the byte offset of a word in a file in Python

I am building an inverted index using Hadoop and Python. I want to know how to get the byte offset of a string or word in Python. I need output like this:

hello hello.txt@1124

      

I need these positions to build a full inverted index. Please help.



1 answer


Like this?

file.tell()

      

Return the file's current position, like stdio's ftell().

http://docs.python.org/library/stdtypes.html#file-objects
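As a rough illustration of tell() on a regular file, here is a minimal sketch that records the byte offset at which each line starts. The helper name line_offsets and the temporary demo file are made up for this example. The file is opened in binary mode on the assumption that you want real byte offsets; in Python 3 a text-mode tell() returns an opaque number rather than a byte count.

```python
import os
import tempfile

def line_offsets(path):
    """Yield (byte_offset, line) for each line of the file at `path`."""
    with open(path, "rb") as fp:      # binary mode: tell() counts bytes
        while True:
            offset = fp.tell()        # position before the line is read
            line = fp.readline()
            if not line:              # empty bytes object means end of file
                break
            yield offset, line

# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"hello world\nhello again\n")
    demo_path = tmp.name

offsets = list(line_offsets(demo_path))
os.remove(demo_path)
print(offsets)
```

This only works when the input is a real, seekable file, which is not the case for a pipe.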



Unfortunately, tell() doesn't work here, because the OP reads from stdin rather than from a regular file. It's not hard, though, to write a wrapper around the stream that tracks the position for you.

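class file_with_pos(object):
    """Wrap a non-seekable stream and count the data read through it."""
    def __init__(self, fp):
        self.fp = fp
        self.pos = 0  # bytes for a binary stream, characters for a text stream
    def read(self, *args):
        data = self.fp.read(*args)
        self.pos += len(data)
        return data
    def readline(self, *args):
        line = self.fp.readline(*args)
        self.pos += len(line)
        return line
    def tell(self):
        # Offset of the next unread item, counted from where wrapping began.
        return self.pos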

      

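Then use the wrapper in place of the raw stream:

import sys

fp = file_with_pos(sys.stdin)  # on Python 3, wrap sys.stdin.buffer so offsets count bytes, not characters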
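To get from a running position to the word@offset pairs the question asks for, something along these lines might do. It is only a sketch under a few assumptions: words are whatever is separated by whitespace, the stream is binary, and the input decodes as UTF-8. The name word_offsets and the sample data are hypothetical. It keeps its own counter instead of calling tell(), so it works on a pipe as well.

```python
import io
import re

# Assumption: a "word" is any run of non-whitespace bytes.
TOKEN_RE = re.compile(rb"\S+")

def word_offsets(stream):
    """Yield (word, byte_offset) for each word read from a binary stream."""
    pos = 0  # bytes consumed before the current line
    for line in stream:
        for match in TOKEN_RE.finditer(line):
            word = match.group().decode("utf-8", "replace")
            yield word, pos + match.start()
        pos += len(line)  # advance past the whole line, newline included

# Demo with an in-memory stream standing in for stdin.
sample = io.BytesIO(b"hello world\nhello again\n")
postings = list(word_offsets(sample))
print(postings)
```

In a streaming mapper you would pass sys.stdin.buffer (Python 3) instead of the BytesIO object and print each pair together with the file name. Note that the offsets are counted from the start of whatever the mapper receives on stdin, so with Hadoop they are relative to the input split rather than to the whole file, unless the split's starting offset is added.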

      
