Get twitter followers using tweepy and multiple AP

Posted 2020-02-29 01:57

Question:

I have multiple Twitter developer keys that I use to get the followers of a list of handles. There are two ways I can do this, but I have a problem with both. The first:

try:
    ....
    for user in tweepy.Cursor(api.followers, screen_name=screenName).items():
    ....
except tweepy.TweepError as e:
    # Twitter error code 88 means the rate limit was exceeded
    errorCode = e.message[0]['code']
    if errorCode == 88:
        print "Rate limit exceeded."
        rotateKeys()

The issue here is that every time I rotate keys, the for loop starts from scratch and fetches the followers all over again. I tried to get around this by splitting up the for loop:

try:
    items = tweepy.Cursor(api.followers, screen_name=s).items()

I then loop through them manually using next(items).

However, rotating API keys does not work here, because the iterator was created with the first API key and will always try to use that one.

I need a way to rotate keys and continue from where the previous key left off.

Answer 1:

You can get the cursor that was in use when the rate limit was hit through the next_cursor attribute of the iterator being used. When you create a new Cursor with the new API instance, you can pass that previous cursor as a parameter:

current_cursor = cursor.iterator.next_cursor
# re-create the cursor using the new api instance
cursor = tweepy.Cursor(api.followers, screen_name=s, cursor=current_cursor)
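
The general idea (keep the cursor outside any one API instance, and retry the same cursor with the next key) can be sketched without Tweepy at all. This is only an illustration under assumptions: RateLimitError and fetch_all_ids are hypothetical names, and each "fetcher" stands for a function you would write yourself that calls one API instance for a given cursor and returns (ids, next_cursor).

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit error (code 88)."""


def fetch_all_ids(fetchers, start_cursor=-1):
    """Collect every ID, switching fetcher whenever one is rate limited.

    `fetchers` is a list of callables, one per key. Each takes a cursor and
    returns (ids, next_cursor), or raises RateLimitError.
    """
    ids = []
    cursor = start_cursor      # lives outside any single client
    active = 0                 # index of the fetcher currently in use
    failures = 0               # consecutive rate-limited attempts
    while cursor:              # a next cursor of 0 means no more pages
        try:
            page, next_cursor = fetchers[active](cursor)
        except RateLimitError:
            failures += 1
            if failures == len(fetchers):
                raise          # every key is exhausted; let the caller wait
            active = (active + 1) % len(fetchers)
            continue           # retry the SAME cursor with the next key
        failures = 0
        ids.extend(page)
        cursor = next_cursor
    return ids
```

Because the cursor is only advanced after a successful page, a rate-limit error never loses or repeats a page; the next key simply picks up at the same position.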


Answer 2:

I ended up abandoning the cursored method in favor of manually setting the next cursor. The nice thing about this is that the "non-cursored" method returns the previous and next cursors as part of its return value.

Here's how I achieved what you are going for (note: adding a try/except is probably in order):

users = ['user_one', 'user_two', 'user_three']

current_profile = 9 # I HAVE TEN IN AN ARRAY

tweepy_api = get_api(auth_profiles[current_profile]) #A FUNCTION I CREATED TO REINITIALIZE API'S

for user in users:

    next_cursor = -1 # START EVERY NEW USER RETRIEVAL WITH -1

    print 'CURRENT USER:', user, 'STARTING CURSOR:', next_cursor

    while next_cursor: # THAT IS, WHILE CURSOR IS NOT ZERO

        print 'AUTH PROFILE', current_profile, 'CURRENT CURSOR:', next_cursor

        # RETURNS A TUPLE WITH ELEMENT[0] A LIST OF IDS, ELEMENT [1][0] PREVIOUS CURSOR, AND ELEMENT[1][1] NEXT CURSOR
        ids, cursors = tweepy_api.followers_ids(screen_name=user, count=5000, cursor=next_cursor)

        next_cursor = cursors[1] # STORE NEXT CURSOR

        # FUNCTION I CREATED TO GET STATUS FROM API.rate_limit_status()
        status = get_rate_limit_status(tweepy_api, '/followers/ids')

        print 'ID\'S RETRIEVED:', len(ids), 'NEXT CURSOR:', cursors[1], 'REMAINING:', status['remaining']

        if not status['remaining']: # IF REMAINING IS ZERO

            print ''
            print 'RATE LIMIT REACHED'

            if current_profile < len(auth_profiles) - 1: # IF THE CURRENT PROFILE IS LESS THAN NINE (IN MY CASE)

                print 'INCREMENTING CURRENT PROFILE:', current_profile, '<', len(auth_profiles) - 1

                current_profile += 1 # INCREMENT THE PROFILE

                print 'CURRENT PROFILE:', current_profile

            else: # ELSE, IT MUST EQUAL NINE (COULD BE NEG I SUPPOSE BUT...)

                print 'RESETTING CURRENT PROFILE TO ZERO:', current_profile, '=', len(auth_profiles) - 1

                current_profile = 0 # RESET CURRENT PROFILE TO THE BEGINNING

                print 'CURRENT PROFILE:', current_profile

            tweepy_api = get_api(auth_profiles[current_profile]) # GET NEW TWEEPY API WITH NEW AUTH
            print ''

The output should be something like this (I've removed some of the print statements for simplicity):

CURRENT USER: user_one STARTING CURSOR: -1
AUTH PROFILE 9 CURRENT CURSOR: -1

ID'S RETRIEVED: 5000 NEXT CURSOR: 1594511885763407081 REMAINING: 14
…
ID'S RETRIEVED: 5000 NEXT CURSOR: 1582249691352919104 REMAINING: 0

RATE LIMIT REACHED
RESETTING CURRENT PROFILE TO ZERO: 9 = 9
CURRENT PROFILE: 0

ID'S RETRIEVED: 5000 NEXT CURSOR: 1580277475971792716 REMAINING: 14
…
ID'S RETRIEVED: 4903 NEXT CURSOR: 0 REMAINING: 7

CURRENT USER: user_two STARTING CURSOR: -1
AUTH PROFILE 0 CURRENT CURSOR: -1

ID'S RETRIEVED: 5000 NEXT CURSOR: 1592820762836029887 REMAINING: 6
…
ID'S RETRIEVED: 5000 NEXT CURSOR: 1592737463603654258 REMAINING: 0

RATE LIMIT REACHED
INCREMENTING CURRENT PROFILE: 0 < 9
CURRENT PROFILE: 1
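
The if/else that advances current_profile in the loop above could also be collapsed into a single modulo step. A minimal sketch; next_profile is a hypothetical helper name, not part of the original code:

```python
def next_profile(current_profile, profile_count):
    """Return the index of the next auth profile, wrapping back to 0."""
    return (current_profile + 1) % profile_count
```

With ten profiles, next_profile(9, 10) gives 0 and next_profile(0, 10) gives 1, which matches the two transitions in the sample output.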

As a side note, if you are going to use the cursored version: at least in Tweepy 3.5.0, the previous and next cursors are stored in cursor.iterator.prev_cursor and cursor.iterator.next_cursor. I think this is also the case for 3.6.0 (see Cursor and CursorIterator in cursor.py).

For me, cursor.page_iterator.next_cursor raises:

AttributeError: 'Cursor' object has no attribute 'page_iterator'
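
If you are unsure which attribute your Tweepy version exposes, one cautious option is to look the saved cursor up defensively rather than assume a layout. A sketch only: saved_next_cursor is a hypothetical helper, and it relies on nothing beyond the cursor.iterator.next_cursor attributes mentioned above.

```python
def saved_next_cursor(cursor):
    """Return cursor.iterator.next_cursor, or None if either attribute is missing."""
    iterator = getattr(cursor, 'iterator', None)
    return getattr(iterator, 'next_cursor', None)
```

A None result then signals that the attribute was not found, instead of an AttributeError in the middle of a key rotation.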