I saw in a question on Stack Exchange that the limit is a function of the number of requests per 15 minutes and can also depend on the complexity of the algorithm, but mine is not a complex one.
So I use this code:
import tweepy
import sqlite3
import time

db = sqlite3.connect('data/MyDB.db')

# Get a cursor object
cursor = db.cursor()
cursor.execute('''CREATE TABLE IF NOT EXISTS MyTable(id INTEGER PRIMARY KEY, name TEXT, geo TEXT, image TEXT, source TEXT, timestamp TEXT, text TEXT, rt INTEGER)''')
db.commit()

consumer_key = ""
consumer_secret = ""
key = ""
secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(key, secret)

api = tweepy.API(auth)

search = "#MyHashtag"

for tweet in tweepy.Cursor(api.search,
                           q=search,
                           include_entities=True).items():
    while True:
        try:
            cursor.execute('''INSERT INTO MyTable(name, geo, image, source, timestamp, text, rt) VALUES(?,?,?,?,?,?,?)''',(tweet.user.screen_name, str(tweet.geo), tweet.user.profile_image_url, tweet.source, tweet.created_at, tweet.text, tweet.retweet_count))
        except tweepy.TweepError:
            time.sleep(60 * 15)
            continue
        break
db.commit()
db.close()
I always get the Twitter limitation error:
Traceback (most recent call last):
  File "stream.py", line 25, in <module>
    include_entities=True).items():
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 153, in next
    self.current_page = self.page_iterator.next()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 98, in next
    data = self.method(max_id = max_id, *self.args, **self.kargs)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 200, in _call
    return method.execute()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 176, in execute
    raise TweepError(error_msg, resp)
tweepy.error.TweepError: [{'message': 'Rate limit exceeded', 'code': 88}]
If you want to avoid errors and respect the rate limit, you can use the following function, which takes your api object as an argument. It retrieves the number of remaining requests of the same type as the last request and, if desired, waits until the rate limit has been reset.
Just replace
with
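A minimal sketch of a helper in that spirit, not the answerer's original function. It assumes tweepy's api.rate_limit_status() returns the usual Twitter payload with a "resources" mapping, and it asks about a named endpoint instead of inspecting the last response. The name wait_for_rate_limit and the parameters family, endpoint, buffer_seconds, sleep and now are all hypothetical and only there for illustration.

```python
import time


def wait_for_rate_limit(api, family="search", endpoint="/search/tweets",
                        wait=True, buffer_seconds=1.0,
                        sleep=time.sleep, now=time.time):
    """Return the remaining request count for `endpoint`.

    If nothing is left and `wait` is true, sleep until the reported reset
    time (plus a small buffer) before returning.
    """
    # Assumption: the payload looks like
    # {"resources": {"search": {"/search/tweets": {"limit": ..., "remaining": ..., "reset": ...}}}}
    info = api.rate_limit_status()["resources"][family][endpoint]
    remaining = info["remaining"]

    if remaining == 0 and wait:
        # "reset" is a Unix timestamp; never sleep for a negative duration.
        delay = max(info["reset"] - now(), 0) + buffer_seconds
        sleep(delay)

    # The count observed before any waiting took place.
    return remaining
```

Calling wait_for_rate_limit(api) before each search request would pause only when the window is exhausted. Note that the status call is itself rate limited, so calling it before every single request may be wasteful.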
For anyone who stumbles upon this on Google, tweepy 3.2+ has additional parameters for the tweepy.API class, in particular:
wait_on_rate_limit – Whether or not to automatically wait for rate limits to replenish
wait_on_rate_limit_notify – Whether or not to print a notification when Tweepy is waiting for rate limits to replenish
Setting these flags to True will delegate the waiting to the API instance, which is good enough for most simple use cases.

The problem is that your try: except: block is in the wrong place. Inserting data into the database will never raise a TweepError - it's iterating over Cursor.items() that will. I would suggest refactoring your code to call the next method of Cursor.items() in an infinite loop. That call should be placed in the try: except: block, as it can raise an error.
Here's (roughly) what the code should look like:
This works because when Tweepy raises a TweepError, it hasn't updated any of the cursor data. The next time it makes the request, it will use the same parameters as the request which triggered the rate limit, effectively repeating it until it goes through.
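For the wait_on_rate_limit flags mentioned earlier in the thread, a minimal sketch of how they might be passed, assuming tweepy 3.2 or later in the 3.x series (later major versions changed these parameters). The helper make_waiting_api and its api_class parameter are hypothetical; api_class exists only so the sketch can be exercised without tweepy installed.

```python
def make_waiting_api(auth, api_class=None):
    """Build an API client that waits out rate limits on its own."""
    if api_class is None:
        # Imported lazily so this sketch loads even without tweepy installed.
        import tweepy
        api_class = tweepy.API
    return api_class(
        auth,
        wait_on_rate_limit=True,         # sleep until the limit replenishes
        wait_on_rate_limit_notify=True,  # print a notice while waiting
    )
```

In the question's script this would stand in for api = tweepy.API(auth), after which the original for loop over Cursor.items() could stay as it is.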