I am a novice Python programmer. I am having trouble extracting the text of a series of tweets with tweepy
and saving it to a text file (I omit the authentication and so on):
search = api.search("hello", count=10)
textlist = []
for i in range(0, len(search)):
    # Strip newlines so each tweet ends up on a single line
    textlist.append(search[i].text.replace('\n', ''))
f = open('temp.txt', 'w')
for i in range(0, len(textlist)):
    f.write(textlist[i].encode('utf-8') + '\n')
f.close()
But in some long tweets the text at the end is truncated, and a three dot character "..." appears at the end of each string, so sometimes I lose links or hashtags. How can I avoid this?
This is the default behaviour for retweets. You can access the full text under the retweeted_status
object.
Twitter API entities section about the change:
https://dev.twitter.com/overview/api/entities-in-twitter-objects#retweets
Twitter API documentation (look for "truncated"):
https://dev.twitter.com/overview/api/tweets
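As a rough sketch of the retweeted_status approach: the helper below returns the original tweet's text when a status is a retweet, and the status's own text otherwise. The names untruncated_text and FakeStatus, and the sample strings, are made up for illustration; FakeStatus only stands in for the objects tweepy returns, so the snippet runs without credentials.

```python
def untruncated_text(status):
    """Return the original tweet's text if `status` is a retweet."""
    # Only retweets carry a `retweeted_status` attribute.
    original = getattr(status, "retweeted_status", None)
    return original.text if original is not None else status.text


class FakeStatus(object):
    """Hypothetical stand-in for a tweepy status object (illustration only)."""
    def __init__(self, **fields):
        self.__dict__.update(fields)


original = FakeStatus(text=u"Read the whole post here #python")
retweet = FakeStatus(text=u"RT @someone: Read the whole post he\u2026",
                     retweeted_status=original)
plain = FakeStatus(text=u"hello")

print(untruncated_text(retweet))
print(untruncated_text(plain))
```

In the question's loop, this would mean appending untruncated_text(search[i]) instead of search[i].text.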
With tweepy, you can get the full text using tweet_mode='extended'
(not documented in the Tweepy doc). For instance:
(not extended)
print api.get_status('862328512405004288')._json['text']
@tousuncotefoot @equipedefrance @CreditAgricole @AntoGriezmann @KMbappe @layvinkurzawa @UmtitiSam J'ai jamais vue d… https://t.co/kALZ2ki9Vc
(extended)
print api.get_status('862328512405004288', tweet_mode='extended')._json['full_text']
@tousuncotefoot @equipedefrance @CreditAgricole @AntoGriezmann @KMbappe @layvinkurzawa @UmtitiSam J'ai jamais vue de match de foot et cela ferait un beau cadeau pour mon copain !!
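Applied to the search from the question, the same idea might look like the sketch below. It assumes api.search accepts tweet_mode='extended' in the same way api.get_status does, and that each result then exposes full_text. The helper name save_full_texts is made up for illustration, and it falls back to text when full_text is missing.

```python
import io


def save_full_texts(statuses, path):
    """Write one tweet per line, preferring the untruncated `full_text`."""
    # io.open takes care of the UTF-8 encoding, so no manual .encode() call.
    with io.open(path, "w", encoding="utf-8") as f:
        for status in statuses:
            # `full_text` is only present when tweet_mode='extended' was used.
            text = getattr(status, "full_text", None) or status.text
            # Strip newlines as in the question, so each tweet stays on one line.
            f.write(text.replace(u"\n", u"") + u"\n")


# Assumed usage with the authenticated `api` object from the question:
# search = api.search("hello", count=10, tweet_mode="extended")
# save_full_texts(search, "temp.txt")
```

Iterating over the results directly also avoids the index bookkeeping of the two range loops.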