How to search Twitter for keywords

Posted 2019-04-12 19:38

I am trying to build a service that continuously monitors Twitter and performs keyword searches on behalf of multiple users. There seem to be four different ways to accomplish this, each with its own drawbacks. I have gone through the Twitter and twitter4j documentation and cannot find any other approaches.

  1. Use the Twitter REST API to perform searches (https://dev.twitter.com/docs/api/1/get/search). This API is rate-limited: make too many requests and you will be throttled. I have to keep track of the last tweet read so I don't duplicate results, and a timer is needed to poll the API. If there are multiple search terms, it is simple to make multiple calls.

  2. Use the public stream (https://dev.twitter.com/docs/streaming-apis/streams/public). While this is great for constant searching, Twitter only allows one connection per account, and there are limits on how many terms can be passed to Twitter. Definitely impossible for my use case.

  3. Try to use User Streams for filtering. I did this but found that it was difficult to quickly determine whether a tweet came from the search or from the user stream. Also, Twitter states that they will limit the number of user streams per IP address, so this approach does not scale. (Twitter has been talking up something called SiteStreams, but it is a very limited beta without any documentation, so it is not something I can consider.)

  4. Go to a third party that purchases the entire firehose from Twitter (e.g. Datasift) and search the Twitter stream there. This gets expensive: $3K/month for the base plan. Searching for a single word 24/7 costs ~$45/month.
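
On the difficulty in #3 of telling search hits apart from ordinary user-stream tweets, one possible workaround is to match each incoming tweet against the stored keywords locally. The sketch below assumes a hypothetical `subscriptions` object that maps user names to keyword lists, and it uses a plain case-insensitive substring test, which only approximates however Twitter itself matches tracked terms.

```javascript
// Return the users whose keyword list matches the tweet text.
// `subscriptions` is a hypothetical shape: { userName: [keyword, ...] }.
function matchSubscribers(tweetText, subscriptions) {
    var text = tweetText.toLowerCase();
    return Object.keys(subscriptions).filter(function (user) {
        // A user matches if any of their keywords occurs in the text.
        return subscriptions[user].some(function (keyword) {
            return text.indexOf(keyword.toLowerCase()) !== -1;
        });
    });
}
```

An empty result would suggest the tweet arrived as part of the user's own timeline rather than because of a tracked keyword.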

My question for the community is: have I exhausted all possibilities? If yes, then it appears to me that #1 (using the REST API with a timer and tracking the last tweet found) is the right approach. Does anyone disagree? If so, can you point me to the documentation (or library) that would help me resolve this issue?
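
For reference, a minimal sketch of what #1 might look like: poll each keyword on a timer and remember the newest tweet ID seen, so that later polls skip anything already delivered. `fetchTweets(keyword, sinceId)` is a hypothetical function supplied by the caller (it could, for instance, pass `sinceId` as the search API's `since_id` parameter), and the tweets are assumed to carry a string `id_str` field.

```javascript
// Tweet IDs can exceed Number.MAX_SAFE_INTEGER, so compare them as BigInt.
function newerThan(idA, idB) {
    return BigInt(idA) > BigInt(idB);
}

// Keep only tweets newer than `lastSeenId` and compute the new high-water mark.
function filterNew(tweets, lastSeenId) {
    var fresh = [];
    var newest = lastSeenId;
    tweets.forEach(function (tweet) {
        if (newerThan(tweet.id_str, lastSeenId)) {
            fresh.push(tweet);
            if (newerThan(tweet.id_str, newest)) {
                newest = tweet.id_str;
            }
        }
    });
    return { fresh: fresh, lastSeenId: newest };
}

// Poll every keyword once. `state` maps keyword -> last seen ID ('0' if unseen).
// `fetchTweets` (hypothetical) must return a Promise of an array of tweets.
function pollOnce(keywords, state, fetchTweets, onTweets) {
    return Promise.all(keywords.map(function (keyword) {
        var since = state[keyword] || '0';
        return fetchTweets(keyword, since).then(function (tweets) {
            var result = filterNew(tweets, since);
            state[keyword] = result.lastSeenId;
            if (result.fresh.length > 0) {
                onTweets(keyword, result.fresh);
            }
        });
    }));
}
```

The timer itself could then be as simple as `setInterval(function () { pollOnce(keywords, state, fetchTweets, handleTweets); }, 60000);`, with the interval chosen to stay under whatever rate limit applies.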

Thanks all

Tags: twitter
3 answers
冷血范
#2 · 2019-04-12 20:19

The response from Twitter was to use #4: purchase access from a vendor such as Datasift.

家丑人穷心不美
#3 · 2019-04-12 20:24

I put together a JSFiddle that should answer most of your questions when it comes to dealing with the Twitter API. The web app grabs the trending locales, lets you drill down to the trending topics, and then shows the tweets within each topic.

I also included a standard Twitter search submission box, so in a weird way, this is a barebones TweetDeck client for you to examine. Also, to push the adoption of the newer jQuery libraries, I have used 1.9.1, which uses the .on() click event syntax in place of .live() and .bind().

Enjoy

http://jsfiddle.net/jdrefahl/5M3Gn/

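// Query the Twitter search endpoint and render the results into #tweets.
function searchTwitter(query) {
    $.ajax({
        url: 'http://search.twitter.com/search.json?' + jQuery.param(query),
        dataType: 'jsonp',
        success: function (data) {
            var tweets = $('#tweets');
            var results = data.results || [];
            tweets.empty();
            // Iterate by index; for...in is not meant for arrays.
            for (var i = 0; i < results.length; i++) {
                // .text() escapes tweet content rather than injecting it as HTML.
                var item = $('<div>').text(results[i].from_user + ' wrote: ');
                item.append($('<p>').text(results[i].text));
                tweets.append(item, '<br />');
            }
        }
    });
}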

$(document).ready(function () {

function getTrendsByID(id) {
    $.ajax({
        url: 'http://api.twitter.com/1/trends/' + id + '.json',
        dataType: 'jsonp',
        success: function (data) {
            $.each(data[0].trends, function (i) {
            });
        }
    });
};

function getLocales() {
    $.ajax({
        url: 'https://api.twitter.com/1/trends/available.json',
        dataType: 'jsonp',
        success: function (data) {
            var locales = $('ul#locales');
            locales.html('');
            $.each(data, function (i) {
                localeID[i] = data[i].woeid;
                $('ul#locales').append('<li>' + data[i].name + '</li>');
            });
        }
    });

};

function getTrends(id) {
    $.ajax({
        url: 'https://api.twitter.com/1/trends/' + id + '.json',
        dataType: 'jsonp',
        success: function (data) {
            var trends = $('ul#currentTrends');
            trends.html('');
            $.each(data[0].trends, function (i) {
                $('ul#currentTrends').append('<li>' + data[0].trends[i].name + '</li>');
            });
        }
    });
};

// Event Handlers
$(document).on("click", "#locales li", function () {
    var $this = $(this);
    var localesHdr = $('#currentTrendsCont h3');
    var tweets = $('#tweets');
    var trendsHdr = $('#tweetsCont h3');
    trendsHdr.html('');
    tweets.html('');
    localesHdr.html('');
    $('#currentTrendsCont h3').html($this.text());
    getTrends(localeID[$this.index()]);
});

$(document).on("click", "#currentTrends li", function () {
    var $this = $(this);
    var trendsHdr = $('#tweetsCont h3');
    trendsHdr.html('');
    $('#tweetsCont h3').html($this.text());
    var params = {
        q: $this.text(),
        rpp: 10
    };
    searchTwitter(params);
});

$('#submit').click(function () {
    var trendsHdr = $('#tweetsCont h3');
    var trends = $('#currentTrends');
    var local = $('#currentTrendsCont h3');
    local.html('');
    trendsHdr.html('');
    trends.html('');
    $('#tweetsCont h3').html('search query: '+$('#query').val());
    var params = {
        q: $('#query').val(),
        rpp: 10
    };
    searchTwitter(params);
});

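// Shared state: WOEIDs indexed by position in the #locales list.
// The declaration is hoisted, and the assignment runs before getLocales() is called.
var localeID = [];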

// Init!
getLocales();

});
聊天终结者
#4 · 2019-04-12 20:24

How frequently do you want to search Twitter, and what is the likely search volume (i.e. how many users, and how many keywords per user)?
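
To make the volume question concrete, here is a rough back-of-the-envelope sketch. It assumes one search request per keyword per polling cycle and a rate limit expressed as "N requests per window"; the function name and every number passed to it are illustrative assumptions, not Twitter's published limits.

```javascript
// Shortest polling interval (in seconds) that stays within a request budget.
// Assumes one request per keyword per cycle; all inputs are caller-supplied guesses.
function minPollIntervalSeconds(users, keywordsPerUser, requestsPerWindow, windowSeconds) {
    var searchesPerCycle = users * keywordsPerUser;
    return (windowSeconds * searchesPerCycle) / requestsPerWindow;
}

// Example with made-up numbers: 20 users, 5 keywords each,
// and a budget of 150 requests per 900-second window.
console.log(minPollIntervalSeconds(20, 5, 150, 900)); // 600
```

With those illustrative numbers each keyword could be refreshed only once every ten minutes, which is why the expected number of users and keywords matters so much.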

Have you also considered an in-browser scraping tool? That is, leave a browser running on a server that is kept up to date with search results, and develop a simple plugin for that browser that captures the data and posts it to your database or sends it somewhere to be processed.
