I need to set the referer URL before scraping a site. The site uses referer-based authentication, so it does not let me log in if the referer is not valid.
Could someone tell me how to do this in Scrapy?
Just set the Referer URL in the Request headers.
Example:
return Request(url=your_url, headers={'Referer':'http://your_referer_url'})
Override BaseSpider.start_requests and create your custom Request there, passing it your Referer header.
If you want to change the referer in your spider's requests, you can change DEFAULT_REQUEST_HEADERS in the settings.py file:
You should do exactly as @warwaruk indicated. Below is my example elaboration for a crawl spider:
This should generate the following logs in your terminal:
This works the same way with BaseSpider. After all, start_requests is a BaseSpider method, and CrawlSpider inherits from BaseSpider.
The documentation explains more options that can be set on a Request apart from headers, such as cookies, the callback function, and the priority of the request.