RCurl - submit a form and load a page

Posted 2019-07-18 13:21

Question:

I'm using the package RCurl to download some prices from a website in Brazil, but in order to load the data I must first choose a city from a form.

The website is: "http://www.muffatosupermercados.com.br/Home.aspx"

and I want the prices from CURITIBA, id=53.

I'm trying to use the solution provided in this post: "How do I use cookies with RCurl?"

And this is my code:

    library("RCurl")
    library("XML")

    #Set your browsing links 
    loginurl = "http://www.muffatosupermercados.com.br"
    dataurl  = "http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"

    #Set form parameters (city id) and user agent
    pars=list(
            id = "53"
    )
    agent="Mozilla/5.0" #or whatever 

    #Set RCurl pars
    curl = getCurlHandle()
    curlSetOpt(cookiejar="cookies.txt",  useragent = agent, followlocation =TRUE, curl=curl)
    #Also if you do not need to read the cookies. 
    #curlSetOpt(  cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)

    #Post the city-selection form
    html=postForm(loginurl, .params = pars, curl=curl)

    #Go wherever you want
    html=getURL(dataurl, curl=curl)
    C1 <- htmlParse(html, asText=TRUE, encoding="UTF-8")
    #Extract the price text with the XML package (htmlParse output is not an rvest/xml2 document)
    Preco <- xpathSApply(C1, "//li[@class='preco']", xmlValue, trim = TRUE)

But when I run the code I only get the page behind the form, not the intended page:

"http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"

I have also tried to play with cookies, but with no luck.

Does anyone have an idea on how to submit this form and load the correct page?

Thanks in advance.
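
One direction that may be worth checking, sketched here under assumptions: pages served as `.aspx` often carry hidden form fields (for example `__VIEWSTATE`) that the server expects to receive back along with the visible form values. Whether this particular site needs them, and whether the city field is really named `id`, is not confirmed. The helper `hidden_fields` below is hypothetical, and the sample HTML is made up purely to show the shape of the result.

```r
library("XML")

# Hypothetical helper: collect name/value pairs of all hidden <input> fields
# in an HTML string, so they can be posted back together with the city id.
hidden_fields <- function(html) {
  doc <- htmlParse(html, asText = TRUE, encoding = "UTF-8")
  nodes <- getNodeSet(doc, "//input[@type='hidden'][@name]")
  values <- lapply(nodes, function(n) {
    v <- xmlGetAttr(n, "value")
    if (is.null(v)) "" else v   # a missing value attribute becomes an empty string
  })
  names(values) <- vapply(nodes, function(n) xmlGetAttr(n, "name"), character(1))
  values
}

# Made-up sample page, only for illustration (not taken from the real site)
sample_html <- '<html><body><form>
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="q" value="ignored" />
</form></body></html>'

fields <- hidden_fields(sample_html)

# Merge the hidden fields with the city id (field name "id" is an assumption)
pars <- c(fields, list(id = "53"))
```

With the real site, the idea would be to fetch the home page first with `getURL(loginurl, curl=curl)`, pass that HTML to `hidden_fields`, and then hand the merged `pars` to `postForm` using the same curl handle so the cookies are kept.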

Tags: xml r rcurl rvest