Getting historical data from Twitter [closed]

Posted 2019-01-11 20:42

For a research project I would like to get the last three months' worth of Twitter messages. Technical challenges aside, is this possible, perhaps by using some sort of slow polling mechanism to keep the rate limiter at bay?

The Twitter API documentation states, "Clients may request up to 3,200 statuses via the page and count parameters for timeline REST API." Is that per hour? Per day? Or ever?

Any suggestions? Would it even be theoretically possible? Has anyone done something similar before?

Thanks! Marco

7 answers
祖国的老花朵
#2 · 2019-01-11 20:52

Keyhole can get you historical tweets in XLS format or present them in a visual dashboard. The preview samples only a few of the most recent tweets; however, you can request historical data by emailing them.

See: http://keyhole.co/conversation_tracking

一纸荒年 Trace。
#3 · 2019-01-11 20:56

You could use the Search API without a search term, return the maximum of 100 tweets per page, and go through the pages twice a minute (120 requests an hour, 30 times less than the rate limit). If my math is correct, that could give you 12,000 tweets an hour. The problem is that Twitter has added approximately 1.75 billion tweets over the past 3 months, so at that rate it would take about 6,076 days, or more than 16 years, to complete this.
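
A quick way to check an estimate like this is to work it through in a few lines of Python. This is only a sketch: the inputs are the figures quoted in this answer (100 tweets per request, two requests a minute, roughly 1.75 billion tweets in three months), and the function and constant names are made up for illustration.

```python
# Back-of-envelope estimate; every input is a figure quoted in the answer.
TWEETS_PER_REQUEST = 100        # maximum page size quoted in the answer
REQUESTS_PER_HOUR = 2 * 60      # two requests a minute
TOTAL_TWEETS = 1_750_000_000    # approximate three-month volume quoted in the answer


def estimate(total, per_request, requests_per_hour):
    """Return (tweets per hour, days needed, years needed)."""
    tweets_per_hour = per_request * requests_per_hour
    hours = total / tweets_per_hour
    days = hours / 24
    return tweets_per_hour, days, days / 365


per_hour, days, years = estimate(TOTAL_TWEETS, TWEETS_PER_REQUEST, REQUESTS_PER_HOUR)
print(f"{per_hour:,} tweets/hour, about {days:,.0f} days ({years:.1f} years)")
```

Changing `REQUESTS_PER_HOUR` shows how much a higher request allowance would shorten the crawl.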

You could ask this question over on the Twitter Development talk on Google Groups, or contact Twitter to get white-listed so you could make up to 20,000 requests an hour.

Personally, I don't think it's possible.

Summer. ? 凉城
#4 · 2019-01-11 20:56

You can read historical Twitter data using Gnip's Historical PowerTrack tool. It gives you access to all Twitter data since the first tweet, and it is a fairly simple tool to use.

Anthone
#5 · 2019-01-11 20:57

Twitter notoriously does not make tweets older than three weeks "available". In some cases you can only get one week. You're better off storing tweets for the next three months. Many rightly doubt whether they're even persisted by Twitter.

Are you looking for just any tweets? If so, check out the Streaming API's statuses/sample method. The Streaming API uses persistent HTTP sockets that can be a pain to program, but it's quite graceful once you get it working. I'd recommend setting up a little script to dump tweets from statuses/sample into a DB. You should have a TON of data after just a few days.
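
As a rough sketch of the "dump into a DB" half of that script, the following stores tweets in SQLite. It leaves out the HTTP connection entirely and assumes the stream arrives as one JSON object per line, with blank keep-alive lines in between and `id` and `text` fields on each tweet. The `store_tweets` helper and the table layout are hypothetical, not part of any Twitter library.

```python
import json
import sqlite3


def store_tweets(lines, conn):
    """Insert each JSON tweet from an iterable of text lines; return how many were new."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, text TEXT, raw TEXT)"
    )
    before = conn.total_changes
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank keep-alive line
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed record
        if not isinstance(tweet, dict) or "id" not in tweet or "text" not in tweet:
            continue  # some other message type, not a tweet
        # OR IGNORE drops tweets whose id is already stored.
        conn.execute(
            "INSERT OR IGNORE INTO tweets (id, text, raw) VALUES (?, ?, ?)",
            (tweet["id"], tweet["text"], line),
        )
    conn.commit()
    return conn.total_changes - before


# Stand-in for the stream: a list of lines instead of a live socket.
sample_lines = ['{"id": 1, "text": "hello"}', "", '{"id": 2, "text": "world"}']
demo_conn = sqlite3.connect(":memory:")
print(store_tweets(sample_lines, demo_conn))
```

In a real script, `lines` would be whatever iterator your HTTP client exposes over the open streaming connection, and the connection would point at a file rather than `:memory:`.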

在下西门庆
#6 · 2019-01-11 21:05

DataSift claims to have a Twitter historical data API coming soon; you can sign up to be notified when it's available.

放荡不羁爱自由
#7 · 2019-01-11 21:07

This may not have existed when you first asked the question, but the "PeopleBrowsr" API is perfect for this, and you can go back 1400 days with a single API call: https://developer.peoplebrowsr.com/pb

Hope that helps!
