Save full text of a tweet with tweepy

Posted 2020-07-23 04:31

I am a novice Python programmer. I am having trouble extracting the text of a series of tweets with tweepy and saving it to a text file (I omit the authentication and so on):

search = api.search("hello", count=10)

textlist=[]

for i in range(0,len(search)):
    textlist.append( search[i].text.replace('\n', '' ) )

f = open('temp.txt', 'w')
# iterate over textlist (idlist was never defined)
for i in range(0,len(textlist)):
    f.write(textlist[i].encode('utf-8') + '\n')
f.close()

But in some long tweets the text is truncated and an ellipsis ("...") appears at the end of the string, so sometimes I lose links or hashtags. How can I avoid this?

3 answers
做个烂人
#2 · 2020-07-23 04:53

The ... (ellipsis) is added when the tweet is a retweet (and is therefore truncated). This is mentioned in the documentation for the truncated field:

Indicates whether the value of the text parameter was truncated, for example, as a result of a retweet exceeding the 140 character Tweet length. Truncated text will end in ellipsis, like this ...

There is no way to avoid this, unless you take each individual tweet, search for any retweets of it, and build the complete timeline. Obviously this isn't practical for a simple search, though you could do it if you were fetching a particular handle's timeline.

You can also simplify your code:

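import io

results = api.search('hello', count=10)

# io.open accepts an encoding on both Python 2 and 3
with io.open('temp.txt', 'w', encoding='utf-8') as f:
    for tweet in results:
        # a Status object has no decode(); write its text attribute
        f.write(u'{}\n'.format(tweet.text))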
老娘就宠你
#3 · 2020-07-23 05:13

With Tweepy, you can get the full text by passing tweet_mode='extended' (this is not documented in the Tweepy docs). For instance:

(not extended)

print(api.get_status('862328512405004288')._json['text'])

@tousuncotefoot @equipedefrance @CreditAgricole @AntoGriezmann @KMbappe @layvinkurzawa @UmtitiSam J'ai jamais vue d… https://t.co/kALZ2ki9Vc

(extended)

print(api.get_status('862328512405004288', tweet_mode='extended')._json['full_text'])

@tousuncotefoot @equipedefrance @CreditAgricole @AntoGriezmann @KMbappe @layvinkurzawa @UmtitiSam J'ai jamais vue de match de foot et cela ferait un beau cadeau pour mon copain !!
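
Applied to the search from the question, the same idea might look like the sketch below. It is only an illustration: save_full_texts is a hypothetical helper name, not part of Tweepy, and it assumes each result exposes full_text when the request was made with tweet_mode='extended' and only text otherwise.

```python
import io


def save_full_texts(tweets, path):
    """Write one tweet per line, preferring the untruncated text.

    `tweets` is any iterable of status-like objects (an assumption of
    this sketch); returns the number of lines written.
    """
    count = 0
    # io.open accepts encoding= on both Python 2 and 3
    with io.open(path, "w", encoding="utf-8") as f:
        for tweet in tweets:
            # extended mode provides full_text; otherwise fall back to text
            text = getattr(tweet, "full_text", None) or tweet.text
            # keep each tweet on a single line
            f.write(text.replace("\n", " ") + u"\n")
            count += 1
    return count
```

Assuming api is the authenticated tweepy.API object from the question, a call could be save_full_texts(api.search("hello", count=10, tweet_mode="extended"), "temp.txt"). Newer Tweepy releases (4.x) rename search to search_tweets, so the method name depends on the installed version.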

太酷不给撩
#4 · 2020-07-23 05:18

This is the default behaviour for retweets. You can access the full text under the retweeted_status object.

Twitter API entities section about the change:

https://dev.twitter.com/overview/api/entities-in-twitter-objects#retweets

Twitter API documentation (look for "truncated"):

https://dev.twitter.com/overview/api/tweets
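
A minimal sketch of reading the text through retweeted_status, as described above. The helper name full_text_of is made up for illustration, and the sketch assumes a status object carries a retweeted_status attribute only when it is a retweet, and a full_text attribute only when fetched in extended mode.

```python
def full_text_of(status):
    """Return the untruncated text of a status-like object.

    For a retweet, the text is taken from the original tweet stored
    under retweeted_status rather than from the truncated wrapper.
    """
    original = getattr(status, "retweeted_status", None)
    source = original if original is not None else status
    # full_text exists in extended mode; text is the fallback
    return getattr(source, "full_text", None) or source.text
```

Note that this returns the original tweet's text without the "RT @user:" prefix that the truncated retweet text carries, which may or may not be what you want in the output file.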
