According to http://www.theverge.com/2014/11/18/7242477/twitter-search-now-lets-you-find-any-tweet-ever-sent, Twitter search now lets you find any tweet ever sent.
But when I try to get tweets from 2014 to 2015 using tweepy, it returns only recent ones:
query = 'Nivea'
max_tweets = 1000
searched_tweets = [json.loads(status.json) for status in tweepy.Cursor(
    api.search,
    q=query,
    count=100,
    # since_id="24012619984051000",
    since="2014-02-01",
    until="2015-02-01",
    result_type="mixed",
    lang="en",
).items(max_tweets)]
I tried both since="2014-02-01" and since_id, but neither made a difference.
I use my own piece of code, which uses an HttpURLConnection and a Twitter search URL. I then use a regular expression to pull out the last 20 matching tweets. Luckily, as I'm deleting the tweets, I can simply search again until I can't find any more tweets. I'm including the code; it's in Java, but the same approach would apply in any language. First I use a class to actually search for tweets and record their details:
We then need the Tweet class itself so we can group tweets up and do things with them. It's just a bean like this:
... and so that was all just standard Java. To make use of the above code, I use the Twitter4J API and do this:
That's it. I don't use comments but I hope it's easy to see what's going on and that this helps you out.
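For readers following along in Python, the "search again until nothing is left" loop described in this answer could be sketched roughly as below. This is only an illustration of the control flow, not the Java code referred to above: drain_matches, search, handle and max_rounds are hypothetical names, and the actual searching and deleting are passed in as plain callables.

```python
from typing import Callable, List


def drain_matches(search: Callable[[], List[str]],
                  handle: Callable[[str], None],
                  max_rounds: int = 100) -> int:
    """Search, handle every result, and repeat until a search comes back empty.

    `search` is assumed to return one page of matching tweet ids and
    `handle` is assumed to remove a tweet from future results (for example
    by deleting it). `max_rounds` is a safety cap so the loop cannot spin
    forever if `handle` does not actually remove anything.
    """
    handled = 0
    for _ in range(max_rounds):
        batch = search()
        if not batch:
            # An empty page means there is nothing left to process.
            break
        for tweet_id in batch:
            handle(tweet_id)
            handled += 1
    return handled
```

The loop only terminates naturally because each handled tweet disappears from the next search, which is the same reason the approach works when deleting tweets.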
Unfortunately, you cannot access past data through the search API. It's not a problem with the library you're using (Tweepy, Twitter4J, whatever); Twitter simply won't provide any data older than roughly two weeks.
To get historical data you'll need access to the firehose, either directly through Twitter or through a third-party reseller like Gnip.
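To see why the query in the question comes back without any 2014 tweets, here is a minimal sketch that checks a requested date range against the search window. The function name and the window_days default are assumptions for illustration: 14 days simply mirrors the "roughly two weeks" figure in this answer and is not an official limit.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def clamp_to_search_window(since: date,
                           until: date,
                           today: date,
                           window_days: int = 14) -> Optional[Tuple[date, date]]:
    """Return the part of [since, until] that a recent-only search could cover.

    Returns None when the whole requested range is older than the window.
    `window_days` is an assumed value, not one published by Twitter.
    """
    earliest = today - timedelta(days=window_days)
    start = max(since, earliest)
    end = min(until, today)
    if start > end:
        # The requested range ends before the window begins.
        return None
    return start, end
```

With the dates from the question (since 2014-02-01, until 2015-02-01) and any "today" more than two weeks after 2015-02-01, the function returns None, so under this assumption no choice of since or until can bring those tweets back.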