Tweepy tracking terms and following users

Posted 2019-04-14 11:32

I'm trying to build an app that tracks certain terms from specific users using the Twitter streaming API.

I made a working Python script using tweepy for the streaming API, based on this tutorial. However, it only works if I track tweets by terms or by user ids, but not by both. When I try to filter using both, the API returns tweets from any user. My code is here:

import tweepy

# Authenticate with the Twitter API using the keys
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token_key, access_token_secret)

# Create the API object with tweepy
api = tweepy.API(auth)

# Start the stream and pass what to look for on Twitter
sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
list_users = ['11111', '22222']   # Some ids
list_terms = ['term1', 'term2']   # Some terms
sapi.filter(follow=list_users, track=list_terms)

These two variables (list_users, list_terms) are a list of user ids and a list of terms, respectively.

How can I filter the tweet stream by users AND by terms? Is there a way to do it with the tweepy filter, or should I check each tweet after retrieving it?

1 answer
手持菜刀,她持情操
#2 · 2019-04-14 12:27

The Twitter streaming API combines the different filter parameters with OR logic: it returns the union of tweets containing the terms and tweets from the users. So you have to implement a custom on_data function in order to filter with AND.
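As a rough illustration of that AND check, here is a minimal sketch of a helper that a custom on_data could call on each raw message. The function name matches_user_and_terms is hypothetical, and the sketch assumes the standard v1.1 tweet payload with the author id in user.id_str and the body in text. It uses a plain case-insensitive substring match, which is simpler than the matching Twitter itself applies to track terms, and it does not look at extended or retweeted text.

```python
import json


def matches_user_and_terms(raw_data, user_ids, terms):
    """Return True only if the tweet was written by one of user_ids
    AND its text contains at least one of terms (hypothetical helper)."""
    message = json.loads(raw_data)

    # Non-tweet messages (limit notices, deletions, ...) carry no "user" key.
    user = message.get("user")
    if not user:
        return False

    # Condition 1: the author must be one of the followed users.
    if user.get("id_str") not in user_ids:
        return False

    # Condition 2: the text must contain at least one tracked term.
    # Simplification: case-insensitive substring match.
    text = message.get("text", "").lower()
    return any(term.lower() in text for term in terms)


if __name__ == "__main__":
    sample = json.dumps({"user": {"id_str": "11111"}, "text": "Some Term1 here"})
    print(matches_user_and_terms(sample, {"11111", "22222"}, ["term1", "term2"]))
```

Inside the listener, on_data(self, data) would call this helper with the raw data string and process the tweet only when it returns True.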

Note that you're limited to following up to 5,000 users and tracking up to 400 terms. Since the rate limit may also be an issue, pass the API whichever condition yields the smaller tweet stream, and apply the remaining conditions to the incoming data in post-processing.

You can track up to 5,000 users and 400 keywords. Rate limiting takes effect at 1% of the Firehose: if at any moment the tweet volume from the union of your keywords and users rises above 1% of all tweets happening in "real time" on the Firehose, you'll get up to 1% of the tweets along with a rate limit notice informing you of how many tweets you missed.
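To make the rate limit notice concrete, here is a small sketch that pulls the missed-tweet count out of a raw stream message. It assumes the notice arrives as a JSON object of the form {"limit": {"track": N}}, as in the v1.1 streaming messages; the function name missed_tweet_count is hypothetical.

```python
import json


def missed_tweet_count(raw_data):
    """Return the undelivered-tweet count from a limit notice,
    or None if the message is not a limit notice (hypothetical helper)."""
    message = json.loads(raw_data)
    limit = message.get("limit")
    if isinstance(limit, dict):
        return limit.get("track")
    return None


if __name__ == "__main__":
    notice = '{"limit": {"track": 1234}}'
    print(missed_tweet_count(notice))
```

Logging this value alongside the filtered tweets gives a rough idea of how much of the stream was dropped before any local filtering took place.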
