How To Get All Tweets on Hashtag using LinqToTwitt

2019-02-05 10:29发布

问题:

I'm trying to get all tweets(count total tweet number) belong to hashtag. My function is here, how to I use maxID and sinceID for get all tweets. What is the instead of "count"? I dont'know.

if (maxid != null)
        {
            var searchResponse =
                await
                (from search in ctx.Search
                 where search.Type == SearchType.Search &&
                 search.Query == "#karne" &&
                 search.Count == Convert.ToInt32(count)
                 select search)
                 .SingleOrDefaultAsync();

            maxid = Convert.ToString(searchResponse.SearchMetaData.MaxID);

            foreach (var tweet in searchResponse.Statuses)
            {
                try
                {
                    ResultSearch.Add(new KeyValuePair<String, String>(tweet.ID.ToString(), tweet.Text));
                    tweetcount++;
                }
                catch {}
            }

            while (maxid != null && tweetcount < Convert.ToInt32(count))
            {
                maxid = Convert.ToString(searchResponse.SearchMetaData.MaxID);
                searchResponse =
                    await
                    (from search in ctx.Search
                     where search.Type == SearchType.Search &&
                     search.Query == "#karne" &&
                     search.Count == Convert.ToInt32(count) && 
                     search.MaxID == Convert.ToUInt64(maxid)
                     select search)
                     .SingleOrDefaultAsync();
                foreach (var tweet in searchResponse.Statuses)
                {
                    try
                    {
                        ResultSearch.Add(new KeyValuePair<String, String>(tweet.ID.ToString(), tweet.Text));
                        tweetcount++;
                    }
                    catch { }
                }
            }

        }

回答1:

Here's an example. Remember that MaxID is for the current session and prevents re-reading tweets you've already processed in the current session. SinceID is the oldest tweet you've ever received for this search term and helps you avoid re-reading tweets that you've already processed for this search term during previous sessions. Essentially, you're creating a window where MaxID is the newest tweet to get next and SinceID is the oldest tweet that you don't want to read past. On the first session for a given search term, you would set SinceID to 1 because you don't have an oldest tweet yet. After the session, save SinceID so that you don't accidentally re-read tweets.

    static async Task DoPagedSearchAsync(TwitterContext twitterCtx)
    {
        const int MaxSearchEntriesToReturn = 100;

        string searchTerm = "twitter";

        // oldest id you already have for this search term
        ulong sinceID = 1;

        // used after the first query to track current session
        ulong maxID; 

        var combinedSearchResults = new List<Status>();

        List<Status> searchResponse =
            await
            (from search in twitterCtx.Search
             where search.Type == SearchType.Search &&
                   search.Query == searchTerm &&
                   search.Count == MaxSearchEntriesToReturn &&
                   search.SinceID == sinceID
             select search.Statuses)
            .SingleOrDefaultAsync();

        combinedSearchResults.AddRange(searchResponse);
        ulong previousMaxID = ulong.MaxValue;
        do
        {
            // one less than the newest id you've just queried
            maxID = searchResponse.Min(status => status.StatusID) - 1;

            Debug.Assert(maxID < previousMaxID);
            previousMaxID = maxID;

            searchResponse =
                await
                (from search in twitterCtx.Search
                 where search.Type == SearchType.Search &&
                       search.Query == searchTerm &&
                       search.Count == MaxSearchEntriesToReturn &&
                       search.MaxID == maxID &&
                       search.SinceID == sinceID
                 select search.Statuses)
                .SingleOrDefaultAsync();

            combinedSearchResults.AddRange(searchResponse);
        } while (searchResponse.Any());

        combinedSearchResults.ForEach(tweet =>
            Console.WriteLine(
                "\n  User: {0} ({1})\n  Tweet: {2}",
                tweet.User.ScreenNameResponse,
                tweet.User.UserIDResponse,
                tweet.Text));
    }

This approach seems like a lot of code, but really gives you more control over the search. e.g. you can examine tweets and determine how many times to query based on the contents of a tweet (like CreatedAt). You can wrap the query in a try/catch block to watch for HTTP 429 when you've exceeded your rate limit or twitter has a problem, allowing you to remember where you were and resume. You could also monitor twitterContext RateLimit properties to see if you're getting close and avoid an exception for HTTP 429 ahead of time. Any other technique to blindly read N tweets could force you to waste rate-limit and make your application less scalable.

  • Tip: Remember to save SinceID for the given search term, if you're saving tweets, to keep from re-reading the same tweets the next time you do a search with that search term.

For more info on the mechanics of this, read Working with Timelines in the Twitter docs.



回答2:

Just wanted to say that with Tweetinvi it would be as simple as :

// If you want to handle RateLimits
RateLimit.RateLimitTrackerOption = RateLimitTrackerOptions.TrackAndAwait;

var tweets = Search.SearchTweets(new TweetSearchParameters("#karne")
{
    MaximumNumberOfResults = 10000
    MaxId = 243982 // If you want to start at a specific point
});


回答3:

TweetInvi is even simpler now. All you need to do is:

var matchingTweets = Search.SearchTweets("#AutismAwareness");