I am keen to get a list of usernames and full names from a specific Twitter list using R. I could not find a function for this in any package, but this code works:
library(XML)
library(httr)

# Fetch the members page of the list and parse the returned HTML
url.name <- "https://twitter.com/TwitterUK/lists/premier-league-players/members"
url.get <- GET(url.name)
url.content <- content(url.get, as = "text")
pagehtml <- htmlParse(url.content)

# Extract screen names and full names by their class attributes
screenNames <- xpathSApply(pagehtml, '//*/span[@class="username js-action-profile-name"]', xmlValue)
realName <- xpathSApply(pagehtml, '//*/strong[@class="fullname js-action-profile-name"]', xmlValue)
However, it only returns the first 20 members (presumably those shown on screen when the page first loads), whereas the list is much longer.
If there is an rvest solution, that would also be welcome.
cheers
The solution from Molx does not seem to work any more. The problem seems to lie in
This URL does not seem valid for any twlist or twowner that I tried. EDIT: I think the problem comes from the authentication, as I get
I think I'm authenticated with this
Where does the problem come from?
EDIT: When I enter
get_oauth_sig()
I get the result below. Is this normal?
EDIT: I solved the problem by replacing POST with GET.
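For anyone landing here later, this is roughly what the GET version of the call might look like. It is only a sketch: the endpoint URL (the v1.1 lists/members endpoint), the count parameter, and the helper names build_members_url and fetch_list_members are my assumptions rather than something copied from the answer, and the request only works if setup_twitter_oauth() has already been run and the endpoint is still available to your account.

```r
# Sketch only: the endpoint URL, the count parameter and both helper names
# are assumptions made for illustration, not taken from the original answer.
build_members_url <- function(slug, owner, count = 5000) {
  base <- "https://api.twitter.com/1.1/lists/members.json"
  query <- c(slug = slug, owner_screen_name = owner, count = count)
  # Percent-encode each value so unusual characters survive in the query string
  encoded <- vapply(as.character(query), utils::URLencode, character(1),
                    reserved = TRUE)
  paste0(base, "?", paste(names(query), encoded, sep = "=", collapse = "&"))
}

fetch_list_members <- function(slug, owner) {
  # twitteR:::get_oauth_sig() is an internal twitteR function; it stops with
  # an error unless setup_twitter_oauth() has been called in this session.
  httr::GET(build_members_url(slug, owner),
            httr::config(token = twitteR:::get_oauth_sig()))
}

# Building the URL needs no authentication, so it can be checked on its own
members_url <- build_members_url("premier-league-players", "TwitterUK")
```

Calling fetch_list_members("premier-league-players", "TwitterUK") would then send the authenticated GET request. Keeping the URL construction in its own function makes it easy to print the URL and check it by eye when the request fails.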
If you want to work with R and Twitter, you should take a look at the twitteR package. It doesn't have a function to retrieve the information you want, but we can take advantage of its internal functions to use OAuth and then send the correct API call. The advantage of using API calls is that you don't rely on parsing the HTML page; you're actually doing what developers are supposed to do.
The code below assumes you have already authenticated using setup_twitter_oauth(). You can find tutorials on this easily, since it's the package basics. Once authenticated, let's load the packages we need:
Now to do the API call, we'll use POST. The URL has a slug parameter, which is the Twitter list name, and an owner_screen_name parameter, which is the Twitter account that owns the list. We'll use the internal twitteR:::get_oauth_sig() to authenticate the call.
This returns a JSON response, which we can read using fromJSON:
Now we have a list where each element is the Twitter data of one Twitter-list member. To extract their names and user_names:
Which are:
Now, the best part of this code is that it opens up pretty much the entire Twitter API from R as already-authenticated requests. You can check the response list and its sublists for all the information available from each query.
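As an illustration of the extraction step, here is a minimal sketch that runs without any network access. It assumes the parsed response is a nested list (not simplified to a data frame) whose members sit under a users element, each with name and screen_name fields. The response_list object, the two invented members, and the extract_field helper are all made up for the example.

```r
# Hand-made stand-in for a parsed API response; both members are invented
response_list <- list(
  users = list(
    list(name = "Example Player One", screen_name = "example_one"),
    list(name = "Example Player Two", screen_name = "example_two")
  )
)

# Pull one character field out of every member record.
# vapply() fails loudly if a record lacks the field or it is not a string.
extract_field <- function(users, field) {
  vapply(users, function(u) u[[field]], character(1))
}

user_names <- extract_field(response_list$users, "name")
screen_names <- extract_field(response_list$users, "screen_name")

# Combine both vectors into one table, one row per list member
members <- data.frame(name = user_names, screen_name = screen_names,
                      stringsAsFactors = FALSE)
```

The same helper works for any other top-level string field in the member records, which is a convenient way to explore what each query returns.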