Tweepy StreamListener to CSV

2020-07-22 10:19发布

问题:

I'm a newbie on python and I'm trying to develop an app that retrieves data from Twitter with Tweepy and the Streaming APIs and converts the data on a CSV file. The problem is that this code doesn't create an output CSV file, maybe because I should set the code to stop when it achieves for eg. 1000 tweets but I'm not able to set this stop point

here's the code

import sys
import tweepy
import csv

#pass security information to variables
consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""


#use variables to access twitter
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

#create an object called 'customStreamListener'

class CustomStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        print (status.author.screen_name, status.created_at, status.text)


    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream


streamingAPI = tweepy.streaming.Stream(auth, CustomStreamListener())
streamingAPI.filter(track=['Dallas', 'NewYork'])

def on_status(self, status):
    with open('OutputStreaming.txt', 'w') as f:
        f.write('Author,Date,Text')
        writer = csv.writer(f)
        writer.writerow([status.author.screen_name, status.created_at, status.text])

Any suggestion?

回答1:

The function that you are trying to write the csv with is never called. I assume you wanted to write this code in CustomStreamListener.on_status. Also, you have to write the titles to the file once (outside the stream listener). Take a look at this code:

import sys
import tweepy
import csv

#pass security information to variables
consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""


#use variables to access twitter
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

#create an object called 'customStreamListener'

class CustomStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        print (status.author.screen_name, status.created_at, status.text)
        # Writing status data
        with open('OutputStreaming.txt', 'a') as f:
            writer = csv.writer(f)
            writer.writerow([status.author.screen_name, status.created_at, status.text])


    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

# Writing csv titles
with open('OutputStreaming.txt', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(['Author', 'Date', 'Text'])

streamingAPI = tweepy.streaming.Stream(auth, CustomStreamListener())
streamingAPI.filter(track=['Dallas', 'NewYork'])