I'm using the package RCurl to download some prices from a website in Brazil, but in order to load the data I must first choose a city from a form.
The website is: "http://www.muffatosupermercados.com.br/Home.aspx"
and I want the prices from CURITIBA, id=53.
I'm trying to use the solution provided in this post: "How do I use cookies with RCurl?"
And this is my code:
library("RCurl")
library("XML")
#Set your browsing links
loginurl = "http://www.muffatosupermercados.com.br"
dataurl = "http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"
#Set form parameters (city id, 53 = CURITIBA) and user agent
pars = list(
  id = "53"
)
agent = "Mozilla/5.0" #or whatever
#Set RCurl pars
curl = getCurlHandle()
curlSetOpt(cookiejar="cookies.txt", useragent = agent, followlocation =TRUE, curl=curl)
#Also if you do not need to read the cookies.
#curlSetOpt( cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)
#Post the city-selection form
html = postForm(loginurl, .params = pars, curl = curl)
#Go wherever you want
html=getURL(dataurl, curl=curl)
C1 <- htmlParse(html, asText=TRUE, encoding="UTF-8")
#Extract the price text with the XML package (htmlParse returns an XML document, so no rvest pipe is needed)
Preco <- trimws(xpathSApply(C1, "//li[@class='preco']", xmlValue))
But when I run the code I only get the page behind the form, not the intended page:
"http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"
I have also tried to play with cookies, but with no luck.
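To rule out the parsing step as the cause, here is a minimal offline check of the XPath extraction. The HTML fragment is made up for illustration (it is not copied from the site) and only assumes that prices sit in `<li class="preco">` elements, as my XPath expects:

```r
library(XML)

# Hypothetical fragment standing in for the real product page
sample_html <- "<html><body><ul>
  <li class='preco'> R$ 4,99 </li>
  <li class='preco'> R$ 12,50 </li>
  <li class='nome'>Arroz</li>
</ul></body></html>"

# Parse the string and pull the text of every <li class='preco'> node
doc <- htmlParse(sample_html, asText = TRUE, encoding = "UTF-8")
precos <- trimws(xpathSApply(doc, "//li[@class='preco']", xmlValue))
print(precos)
```

This works on the fragment, so the extraction itself seems fine; the problem appears to be that the page I download is not the one with the prices.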
Does anyone have an idea on how to submit this form and load the correct page?
Thanks in advance.