I'm trying to read the source code from a website 100 lines at a time.
For example:
self.code = urllib.request.urlopen(uri)
#Get 100 first lines
self.lines = self.getLines()
...
#Get 100 next lines
self.lines = self.getLines()
My getLines code is like this:
def getLines(self):
    res = []
    i = 0
    while i < 100:
        res.append(str(self.code.readline()))
        i += 1
    return res
But the problem is that getLines() always returns the first 100 lines of the code.
I've seen some solutions with next() or tell() and seek(), but it seems that those functions are not implemented in the HTTPResponse class.
This worked for me.
According to the documentation, urllib.request.urlopen(uri) returns a file-like object, so you should be able to iterate over it with itertools.islice and take 100 lines per call. There's more information on islice in the itertools documentation. Using iterators will avoid the while loop and manual increments.
If you absolutely must use readline(), it's advisable to use a for loop.
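Both approaches can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names next_lines and next_lines_readline are made up for the example, the page is assumed to be UTF-8 encoded, and io.BytesIO stands in for the object returned by urllib.request.urlopen(uri) so the snippet runs without a network connection.

```python
import io
from itertools import islice


def next_lines(response, count=100):
    """Return up to `count` decoded lines, continuing where the last call stopped."""
    # islice pulls lines lazily from the response's own iterator, so each
    # call advances the read position instead of starting over.
    # Assumption: the content is UTF-8; undecodable bytes are replaced.
    return [raw.decode("utf-8", errors="replace") for raw in islice(response, count)]


def next_lines_readline(response, count=100):
    """Same idea using readline() in a for loop rather than a counted while loop."""
    lines = []
    for _ in range(count):
        raw = response.readline()
        if not raw:  # readline() returns b"" once the stream is exhausted
            break
        lines.append(raw.decode("utf-8", errors="replace"))
    return lines


# Hypothetical stand-in for urllib.request.urlopen(uri): 250 lines of bytes.
fake_response = io.BytesIO(b"".join(b"line %d\n" % i for i in range(250)))

first = next_lines(fake_response)            # lines 0 to 99
second = next_lines(fake_response)           # lines 100 to 199
rest = next_lines_readline(fake_response)    # the remaining 50 lines
```

With a real response, the question's getLines() could simply return next_lines(self.code). Decoding the bytes explicitly also avoids the b'...' wrapper that str() puts around each line.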