What is the correct form of work with cookies in s

I'm new to Scrapy and am scraping a website that uses cookies. This is a problem for me: I can obtain data from a site that doesn't need cookies, but getting data from a site that uses cookies is difficult. I have this code structure:

class mySpider(BaseSpider):
    name='data'
    allowed_domains =[]
    start_urls =["http://...."]

    def parse(self, response):
        sel = HtmlXPathSelector(response)
        items = sel.xpath('//*[@id=..............')

        vlrs =[]

        for item in items:
            myItem['img'] = item.xpath('....').extract()
            yield myItem

This works: I can obtain the data without cookies using this code structure. I found out how to work with cookies at this URL, but I don't understand where to put that code so that I can then extract the data with XPath.

I'm testing this code

request_with_cookies = Request(url="http://...",cookies={'country': 'UY'})

but I don't know how to use it or where to put it. I put it inside the parse function to obtain the data:

def parse(self, response):
    request_with_cookies = Request(url="http://.....",cookies={'country':'UY'})

    sel = HtmlXPathSelector(request_with_cookies)
    print request_with_cookies

I tried to use XPath on this new URL with cookies and then print the scraped data. I thought it would work like a URL without cookies, but when I run this I get an error: 'Request' object has no attribute 'body_as_unicode'. What is the proper way to work with these cookies? I'm a little lost. Thank you very much.

Tags: python cookies xpath scrapy scrapy-spider

1 answer

Explosion°爆炸

Post #2 · 2019-07-02 10:40

You are very close! The contract for the parse() method is that it yields (or returns an iterable of) Items, Requests, or a mix of both. In your case, all you should have to do is

yield request_with_cookies

and your parse() method will be run again with a Response object produced from requesting that URL with those cookies.

http://doc.scrapy.org/en/latest/topics/spiders.html?highlight=parse#scrapy.spider.Spider.parse
http://doc.scrapy.org/en/latest/topics/request-response.html
