I have been trying to understand the concepts of BaseSpider and CrawlSpider in web scraping. I have read the docs, but there is no mention of BaseSpider. It would be really helpful if someone could explain the differences between BaseSpider and CrawlSpider.
`BaseSpider` existed in older versions of Scrapy and has been deprecated since 0.22 - use `scrapy.Spider` instead.

`scrapy.Spider` is the simplest spider: it basically just visits the URLs defined in `start_urls` (or returned by `start_requests()`) and passes the responses to your `parse()` callback.
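A minimal sketch of such a spider (the spider name, site, and CSS selectors below are illustrative, not part of the original answer):

```python
import scrapy


class QuotesSpider(scrapy.Spider):
    # scrapy.Spider: Scrapy requests each URL in start_urls
    # and passes every response to parse().
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]

    def parse(self, response):
        # Extract the text of each quote block on the page.
        for quote in response.css("div.quote"):
            yield {"text": quote.css("span.text::text").get()}
```

It does not follow any links on its own; it only processes the pages you explicitly request.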
Use `CrawlSpider` when you need "crawling" behavior - extracting the links from each page and following them:
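A rough sketch of a `CrawlSpider`, assuming a site with `/page/N/` pagination links (the name, URL, and pattern are just examples):

```python
import scrapy
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor


class FollowLinksSpider(CrawlSpider):
    # CrawlSpider automatically follows links matched by its rules.
    name = "follow_links"
    start_urls = ["https://quotes.toscrape.com/"]

    rules = (
        # Follow pagination links and parse every page they lead to.
        Rule(LinkExtractor(allow=r"/page/\d+/"), callback="parse_page", follow=True),
    )

    def parse_page(self, response):
        # Note: CrawlSpider uses parse() internally for its crawling logic,
        # so the callback must have a different name.
        yield {"url": response.url, "title": response.css("title::text").get()}
```

The key difference is the `rules` attribute: each `Rule` pairs a `LinkExtractor` (which links to extract) with an optional callback and a `follow` flag, so the crawl keeps expanding to matching links without you issuing requests manually.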