I am trying to visit a list of webpages and check whether each site offers a way to contact its owner.
Here is the full code: http://pastebin.com/12rLXQaz
This is the function that each thread calls:
def getpage():
    try:
        curl = urls.pop(0)
        print "working on " + str(curl)
        thepage1 = requests.get(curl).text
        global ctot
        if "Contact Us" in thepage1:
            slist.write("\n" + curl)
            ctot = ctot + 1
    except:
        pass
    finally:
        if len(urls) > 0:
            getpage()
But the program's memory usage (pythonw.exe) keeps growing. Since each thread simply calls the function again while URLs remain, I would expect memory usage to stay at roughly the same level. For a list of about 100k URLs, the program is already using more than 3 GB, and it keeps climbing.
I had a look at your code: http://pastebin.com/J4Rd3NhA
I would start 100 threads and join them all; a minimal sketch of that pattern is below. How does this perform? If something is wrong, tell me.
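Roughly like this — a sketch only, assuming the shared urls list from your code and a non-recursive getpage (a loop-based version is shown further down):

import threading

# Start 100 worker threads, then join them so the main thread
# waits until every URL has been processed.
threads = [threading.Thread(target=getpage) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()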
Your program is recursive for no reason. The recursion means that for every page you fetch, you create a new stack frame with a new set of local variables. Because the function never returns, those locals stay referenced, the garbage collector never gets a chance to reclaim them, and the program keeps eating memory forever.
Read up on the while statement; it's the one you want to use instead of recursion here.
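For illustration, here is a loop-based rewrite of your function — just a sketch, under the same assumptions as your snippet (urls, slist, and ctot defined at module level):

import requests

def getpage():
    # Loop instead of recursing: each iteration's locals go out of
    # scope when the iteration ends, so memory usage stays flat.
    global ctot
    while len(urls) > 0:
        try:
            curl = urls.pop(0)
            print "working on " + str(curl)
            thepage1 = requests.get(curl).text
            if "Contact Us" in thepage1:
                slist.write("\n" + curl)
                ctot = ctot + 1
        except Exception:
            pass

(Popping from a plain list across threads mostly gets away with it under the GIL, but if you hit race conditions, Queue.Queue is the thread-safe way to hand out work.)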