I have a program that fetches info from other pages and parses them using BeautifulSoup and Twisted's getPage. Later on in the program I print info that the deferred process creates. Currently my program tries to print it before the differed returns the info. How can I make it wait?
def twisAmaz(contents): #This parses the page (amazon api xml file)
stonesoup = BeautifulStoneSoup(contents)
if stonesoup.find("mediumimage") == None:
imageurl.append("/images/notfound.png")
else:
imageurl.append(stonesoup.find("mediumimage").url.contents[0])
usedPdata = stonesoup.find("lowestusedprice")
newPdata = stonesoup.find("lowestnewprice")
titledata = stonesoup.find("title")
reviewdata = stonesoup.find("editorialreview")
if stonesoup.find("asin") != None:
asin.append(stonesoup.find("asin").contents[0])
else:
asin.append("None")
reactor.stop()
deferred = dict()
for tmpISBN in isbn: #Go through ISBN numbers and get Amazon API information for each
deferred[(tmpISBN)] = getPage(fetchInfo(tmpISBN))
deferred[(tmpISBN)].addCallback(twisAmaz)
reactor.run()
.....print info on each ISBN
What it seems like is you're trying to make/run multiple reactors. Everything gets attached to the same reactor. Here's how to use a DeferredList
to wait for all of your callbacks to finish.
Also note that twisAmaz
returns a value. That value is passed through the callbacks
DeferredList
and comes out as value
. Since a DeferredList
keeps the order of the things that are put into it, you can cross-reference the index of the results with the index of your ISBNs.
from twisted.internet import defer
def twisAmazon(contents):
stonesoup = BeautifulStoneSoup(contents)
ret = {}
if stonesoup.find("mediumimage") is None:
ret['imageurl'] = "/images/notfound.png"
else:
ret['imageurl'] = stonesoup.find("mediumimage").url.contents[0]
ret['usedPdata'] = stonesoup.find("lowestusedprice")
ret['newPdata'] = stonesoup.find("lowestnewprice")
ret['titledata'] = stonesoup.find("title")
ret['reviewdata'] = stonesoup.find("editorialreview")
if stonesoup.find("asin") is not None:
ret['asin'] = stonesoup.find("asin").contents[0]
else:
ret['asin'] = 'None'
return ret
callbacks = []
for tmpISBN in isbn: #Go through ISBN numbers and get Amazon API information for each
callbacks.append(getPage(fetchInfo(tmpISBN)).addCallback(twisAmazon))
def printResult(result):
for e, (success, value) in enumerate(result):
print ('[%r]:' % isbn[e]),
if success:
print 'Success:', value
else:
print 'Failure:', value.getErrorMessage()
callbacks = defer.DeferredList(callbacks)
callbacks.addCallback(printResult)
reactor.run()
Another cool way to do this is with @defer.inlineCallbacks. It lets you write asynchronous code like a regular sequential function: http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html#inlineCallbacks
First, you shouldn't put a reactor.stop() in your deferred method, as it kills everything.
Now, in Twisted, "Waiting" is not allowed. To print results of you callback, just add another callback after the first one.