tweepy wait_on_rate_limit not working

Posted 2019-07-05 21:57

Question:

So, first off, I realize there are a number of questions about handling the Twitter rate limits. I have no idea why, but none of the ones I've found so far work for me.

I'm using tweepy to get a list of all the followers of the followers of a user. As expected, I can't pull everything down at once because of Twitter's rate limits. I have tweepy v3.5 installed, so I'm referring to http://docs.tweepy.org/en/v3.5.0/api.html. To get the list of followers of the originating user I use:

auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

followerIDs = []
for page in tweepy.Cursor(api.followers_ids, screen_name=originatingUser, wait_on_rate_limit = True, wait_on_rate_limit_notify = True).pages():
    followerIDs.extend(page)

followers = api.lookup_users(follower)

This works for a bit but quickly turns into:

tweepy.error.TweepError: [{u'message': u'Rate limit exceeded', u'code': 88}]

My plan would then be to retrieve the followers of each user, for each followerID, using something like this:

for followerID in followerIDs:
        for page in tweepy.Cursor(api.followers_ids, id=followerID, wait_on_rate_limit = True, wait_on_rate_limit_notify = True).pages():
                followerIDs.extend(page)

The other problem I have is when I try to look up the user names. For this, I use the grouper recipe from the itertools documentation to break the followers up into groups of 100 (api.lookup_users accepts at most 100 IDs at a time) and use

followerIDs = grouper(followerIDs,100)
for followerGroup in followerIDs:
        followerGroup=filter(None, followerGroup)
        followers = api.lookup_users(followerGroup,wait_on_rate_limit = True)
        for follower in followers:
                print (originatingUser + ", " + str(follower.screen_name))

That gets a different error, namely:

 TypeError: lookup_users() got an unexpected keyword argument 'wait_on_rate_limit'

which I find confusing, because the tweepy API documentation suggests that it should be an accepted argument.

Any ideas as to what I'm doing wrong?

Cheers Ben.

Answer 1:

I know this might be a little late, but here goes.

You pass the wait_on_rate_limit argument (and wait_on_rate_limit_notify) to the Cursor constructor and to lookup_users(), while the tweepy documentation states that it should be passed to the API() constructor.
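
A minimal sketch of that change, assuming the tweepy v3.5 API used in the question. The helper names build_api, chunked, and follower_screen_names are hypothetical and only for illustration. tweepy is imported inside the functions so that chunked can be used on its own.

```python
def build_api(consumer_key, consumer_secret, access_token, access_secret):
    """Return a tweepy v3.x client that sleeps through rate limits."""
    import tweepy  # assumption: tweepy v3.5, as in the question

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    # The rate-limit flags go to API(), not to Cursor() or lookup_users().
    return tweepy.API(auth, wait_on_rate_limit=True,
                      wait_on_rate_limit_notify=True)


def chunked(items, size=100):
    """Yield consecutive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def follower_screen_names(api, screen_name):
    """Collect the screen names of all followers of `screen_name`."""
    import tweepy

    follower_ids = []
    # No rate-limit keywords here: the api object already carries them.
    for page in tweepy.Cursor(api.followers_ids, screen_name=screen_name).pages():
        follower_ids.extend(page)

    names = []
    for group in chunked(follower_ids):  # lookup_users takes at most 100 IDs
        names.extend(user.screen_name for user in api.lookup_users(user_ids=group))
    return names
```

With an api object built this way, the Cursor and lookup_users calls from the question should no longer need any rate-limit keyword arguments, which would also avoid the TypeError.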



Answer 2:

The Twitter API is rate limited, as described here: https://dev.twitter.com/rest/public/rate-limiting

A quick workaround is to catch the rate limit error, put your application to sleep for a while, and then continue where you left off.

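import time
import tweepy

pages = tweepy.Cursor(api.followers_ids, id=followerID).pages()
while True:
    try:
        page = next(pages)
        followerIDs.extend(page)
    except tweepy.TweepError:
        # Most likely the rate limit: wait out the 15-minute window,
        # then retry the same page.
        time.sleep(60 * 15)
        continue
    except StopIteration:
        # No more pages.
        break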

This should do the trick. I'm not sure it will behave exactly as you expect, but that is the basic idea.
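
One caveat with catching every TweepError is that an unrelated failure (bad credentials, a protected account) would also sleep and retry forever. A slightly more defensive sketch of the same idea is below. The name drain_with_wait and its parameters are hypothetical. It assumes the page iterator can be retried after an exception, which I believe holds for tweepy's cursor iterators but would not hold for a plain generator.

```python
import time


def drain_with_wait(pages, rate_limit_error, wait_seconds=15 * 60, sleep=time.sleep):
    """Collect every item from `pages`, pausing whenever `rate_limit_error` is raised.

    Any other exception propagates to the caller instead of being retried.
    """
    collected = []
    while True:
        try:
            page = next(pages)
        except StopIteration:
            break  # no more pages
        except rate_limit_error:
            sleep(wait_seconds)  # wait out the window, then retry the same page
            continue
        collected.extend(page)
    return collected
```

Assuming tweepy v3.x, which exposes tweepy.RateLimitError, it could be called as drain_with_wait(tweepy.Cursor(api.followers_ids, id=followerID).pages(), tweepy.RateLimitError).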