I thought I had parsed the data correctly using jsonlite and tidyjson. However, I am noticing that only the data from the first page is being parsed. Please advise how I could parse all the pages correctly. The JSON output reports over 1,300 pages in total, so I think the data is available but is not being parsed correctly.
Note: I have used tidyjson, but am open to using jsonlite or any other library too.
library(dplyr)
library(tidyjson)
library(jsonlite)
req <- httr::GET("http://svcs.ebay.com/services/search/FindingService/v1?OPERATION-NAME=findItemsByKeywords&SERVICE-VERSION=1.0.0&SECURITY-APPNAME=xxxxxx&GLOBAL-ID=EBAY-US&RESPONSE-DATA-FORMAT=JSON&callback=_cb_findItemsByKeywords&REST-PAYLOAD&keywords=harry%20potter&paginationInput.entriesPerPage=100")
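txt <- httr::content(req, "text", encoding = "UTF-8") # httr is not attached, so qualify content()
# strip the JSONP wrapper: /**/_cb_findItemsByKeywords( ... )
json <- sub("/**/_cb_findItemsByKeywords(", "", txt, fixed = TRUE)
json <- sub("\\)\\s*$", "", json) # escape the parenthesis and allow trailing whitespace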
data1 <- json %>%
  as.tbl_json %>%
  enter_object("findItemsByKeywordsResponse") %>% gather_array %>%
  enter_object("searchResult") %>% gather_array %>%
  enter_object("item") %>% gather_array %>%
  spread_values(
    ITEMID = jstring("itemId"),
    TITLE = jstring("title")
  ) %>%
  select(ITEMID, TITLE) # select only what is needed
############################################################
*Note: "paginationOutput":[{"pageNumber":["1"],"entriesPerPage":["100"],"totalPages":["1393"],"totalEntries":["139269"]}]
* &_ipg=100&_pgn=1"
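The totalPages value in the note above can be read out of the response once the JSONP wrapper is removed. The following is only a minimal sketch: strip_jsonp and total_pages are hypothetical helper names, the sample text is a stand-in built from the paginationOutput fragment quoted above, and it assumes that paginationOutput sits directly under the first element of findItemsByKeywordsResponse.

```r
library(jsonlite)

# Hypothetical helper: remove the JSONP wrapper added by the callback parameter.
strip_jsonp <- function(txt, callback = "_cb_findItemsByKeywords") {
  body <- sub(paste0("/**/", callback, "("), "", txt, fixed = TRUE)
  sub("\\)\\s*$", "", body)
}

# Hypothetical helper: read totalPages from the cleaned JSON text.
# Assumes paginationOutput is a child of the first findItemsByKeywordsResponse element.
total_pages <- function(json) {
  parsed <- fromJSON(json, simplifyVector = FALSE)
  pagination <- parsed$findItemsByKeywordsResponse[[1]]$paginationOutput[[1]]
  as.integer(pagination$totalPages[[1]])
}

# Stand-in for the real response body, built from the note above.
sample_txt <- paste0(
  '/**/_cb_findItemsByKeywords({"findItemsByKeywordsResponse":[{',
  '"paginationOutput":[{"pageNumber":["1"],"entriesPerPage":["100"],',
  '"totalPages":["1393"],"totalEntries":["139269"]}]}]})'
)

n_pages <- total_pages(strip_jsonp(sample_txt))
print(n_pages)
```

With the real response text in place of sample_txt, n_pages could then drive a loop that requests each page in turn, assuming the API accepts a paginationInput.pageNumber parameter alongside paginationInput.entriesPerPage.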
No need for tidyjson. You will need to write another function/set of calls to get the total number of pages (it's over 1,400) to use the following, but that should be fairly straightforward. Try to compartmentalize your operations a bit more and use the full power of httr when you can to parameterize things: