I am a novice Python programmer. I am having trouble extracting the text of a series of tweets with tweepy and saving it to a text file (I omit the authentication and so on):
search = api.search("hello", count=10)
textlist = []
for i in range(0, len(search)):
    textlist.append(search[i].text.replace('\n', ''))
f = open('temp.txt', 'w')
for i in range(0, len(textlist)):
    f.write(textlist[i].encode('utf-8') + '\n')
f.close()
But in some long tweets the text at the end is truncated, and a three dot character "..." appears at the end of each string, so sometimes I lose links or hashtags. How can I avoid this?
The "..." (ellipsis) is added when the tweet is part of a retweet (and is thus truncated). This is mentioned in the documentation.
There is no way to avoid this unless you take each individual tweet, search for any retweets of it, and build the complete timeline (obviously this isn't practical for a simple search; you could do this if you were fetching a particular handle's timeline).
You can also simplify your code:
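One possible simplification, as a sketch rather than a definitive version: iterate over the results directly instead of indexing them, and let a with block close the file. The helper name save_tweet_texts is made up for illustration, and the sketch assumes Python 3 and that each result exposes a .text attribute, as in the question's code.

```python
def save_tweet_texts(tweets, path):
    """Write the text of each tweet to `path`, one tweet per line."""
    # Opening with an explicit encoding avoids calling .encode() by hand.
    with open(path, 'w', encoding='utf-8') as f:
        for tweet in tweets:
            # Strip embedded newlines so each tweet stays on one line.
            f.write(tweet.text.replace('\n', '') + '\n')


# Assumed usage with the authenticated `api` object from the question:
# save_tweet_texts(api.search("hello", count=10), 'temp.txt')
```

This does not change the truncation itself; it only removes the intermediate list and the index loops.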
With tweepy, you can get the full text by passing tweet_mode='extended' (not documented in the Tweepy doc). For instance:
(not extended)
(extended)
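A rough sketch of the difference, assuming tweepy 3.x (where the search call is api.search) and an authenticated api object as in the question: in extended mode the status carries its text in full_text rather than text. The helper name tweet_text is hypothetical and only picks whichever attribute is present.

```python
def tweet_text(status):
    """Return full_text if the status was fetched in extended mode, else text."""
    full = getattr(status, 'full_text', None)
    return full if full is not None else status.text


# Assumed usage (not run here; it needs credentials and network access):
# for status in api.search("hello", count=10, tweet_mode='extended'):
#     print(tweet_text(status))
```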
This is the default behaviour for retweets. You can access the full text under the retweeted_status object.
Twitter API entities section about the change:
https://dev.twitter.com/overview/api/entities-in-twitter-objects#retweets
Twitter API documentation (look for "truncated"):
https://dev.twitter.com/overview/api/tweets
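The retweet case can be sketched in code as follows, assuming the statuses are tweepy objects on which retweeted_status is present only for retweets. The helper name untruncated_text is made up for illustration: when a status is a retweet, it reads the text from the nested original tweet, otherwise from the status itself.

```python
def untruncated_text(status):
    """Prefer the original tweet's text when `status` is a retweet."""
    # Retweets carry the original tweet under `retweeted_status`.
    source = getattr(status, 'retweeted_status', status)
    # `full_text` is only set in extended mode; otherwise fall back to `text`.
    full = getattr(source, 'full_text', None)
    return source.text if full is None else full
```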