Scrapy spider not found error

Posted 2020-05-21 07:57

This is Windows 7 with Python 2.7.

I have a Scrapy project in a directory called caps (this is where scrapy.cfg is).

My spider is located in caps\caps\spiders\campSpider.py

I cd into the Scrapy project directory and try to run:

scrapy crawl campSpider -o items.json -t json

I get an error saying that the spider can't be found. The class name is campSpider.

...
    spider = self.crawler.spiders.create(spname, **opts.spargs)
  File "c:\Python27\lib\site-packages\scrapy-0.14.0.2841-py2.7-win32.egg\scrapy\spidermanager.py", line 43, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: campSpider'

Am I missing some configuration item?

Tags: python scrapy
19 answers
家丑人穷心不美
#2 · 2020-05-21 08:39

Make sure you have set the "name" property of the spider. Example:

class campSpider(BaseSpider):
   name = 'campSpider'

Without the name property, Scrapy's spider manager will not be able to find your spider, because it looks spiders up by this name and not by the class name or file name.
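
To illustrate why the attribute matters, here is a minimal sketch of a name-keyed lookup. It is a simplified stand-in written for this answer, not Scrapy's actual spider manager code; FakeBaseSpider, unnamedSpider, build_registry and create are hypothetical names.

```python
class FakeBaseSpider(object):
    """Hypothetical stand-in for Scrapy's base spider class."""
    name = None


class campSpider(FakeBaseSpider):
    name = 'campSpider'  # the string that `scrapy crawl <name>` is matched against


class unnamedSpider(FakeBaseSpider):
    pass  # no name set, so it can never be looked up


def build_registry(spider_classes):
    """Index spider classes by their `name` attribute, skipping unnamed ones."""
    return {cls.name: cls for cls in spider_classes if getattr(cls, 'name', None)}


def create(registry, spider_name):
    """Return a spider instance, or raise the same kind of error as in the question."""
    try:
        return registry[spider_name]()
    except KeyError:
        raise KeyError("Spider not found: %s" % spider_name)


registry = build_registry([campSpider, unnamedSpider])
```

In this sketch, create(registry, 'campSpider') succeeds, while create(registry, 'unnamedSpider') raises KeyError even though a class with that name exists.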

戒情不戒烟
#3 · 2020-05-21 08:39

Also make sure that your project is not called scrapy! I made that mistake and renaming it fixed the problem.
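
A quick way to check a candidate project name for this kind of clash is sketched below. It assumes Python 3 (the question uses 2.7, where importlib.util is not available), and collides_with_importable_module is a hypothetical helper name. Run it from outside the project directory, otherwise the project itself may be what gets found.

```python
import importlib.util


def collides_with_importable_module(project_name):
    """Return True if `project_name` is already importable as a top-level module."""
    try:
        return importlib.util.find_spec(project_name) is not None
    except (ImportError, ValueError):
        # Raised for some malformed or partially initialised names; treat as no clash.
        return False
```

For example, collides_with_importable_module('scrapy') would be expected to return True on a machine with Scrapy installed, which signals that the name should not be reused for a project.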

我欲成王,谁敢阻挡
#4 · 2020-05-21 08:40

I also had this problem, and the cause turned out to be rather small. Be sure your class inherits from scrapy.Spider:

class my_class(scrapy.Spider):
我欲成王,谁敢阻挡
#5 · 2020-05-21 08:41

For anyone who might have the same problem: besides setting the name of the spider and checking SPIDER_MODULES and NEWSPIDER_MODULE in your Scrapy settings, if you are running a scrapyd service you also need to restart it for any changes you have made to take effect.

Ridiculous、
#6 · 2020-05-21 08:43

You have to give a name to your spider.

However, BaseSpider is deprecated; use Spider instead.

from scrapy.spiders import Spider
class campSpider(Spider):
   name = 'campSpider'

The project should have been created by the startproject command:

scrapy startproject project_name

This gives you the following directory tree:

project_name/
    scrapy.cfg            # deploy configuration file

    project_name/             # project's Python module, you'll import your code from here
        __init__.py

        items.py          # project items file

        pipelines.py      # project pipelines file

        settings.py       # project settings file

        spiders/          # a directory where you'll later put your spiders
            __init__.py
            ...

Make sure that settings.py declares your spider module, e.g.:

BOT_NAME = 'bot_name' # Usually the same as your project_name

SPIDER_MODULES = ['project_name.spiders']
NEWSPIDER_MODULE = 'project_name.spiders'

You should then have no problems running your spider locally or on Scrapinghub.
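
Scrapy's own "scrapy list" command prints the spider names it can see. If you want to inspect a package by hand, the sketch below imports every module in it and collects the class-level name strings. It assumes Python 3.6 or later, list_spider_names is a hypothetical helper, and it is only a rough heuristic: it does not check that a class actually subclasses Spider, so any class with a string name attribute is counted.

```python
import importlib
import inspect
import pkgutil


def list_spider_names(package_name):
    """Import every module in `package_name` and collect class-level `name` strings."""
    package = importlib.import_module(package_name)
    names = set()
    for info in pkgutil.iter_modules(package.__path__, package.__name__ + '.'):
        module = importlib.import_module(info.name)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            # Only count classes defined in this module, not ones it merely imported.
            defined_here = cls.__module__ == module.__name__
            if defined_here and isinstance(getattr(cls, 'name', None), str):
                names.add(cls.name)
    return sorted(names)
```

Run from the directory that contains scrapy.cfg, a call such as list_spider_names('project_name.spiders') should include the name you pass to "scrapy crawl"; if it does not, the name attribute or the package path is the likely culprit.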

聊天终结者
#7 · 2020-05-21 08:43

Also, it is possible that you have not deployed your spider. First run "scrapyd" to start the server, then use "scrapyd-deploy" to deploy the project, and then run the command.
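
To confirm that a deployment actually registered the spider, you can ask scrapyd which spiders it knows about through its listspiders.json endpoint. The sketch below assumes Python 3, a scrapyd server on the default http://localhost:6800, and a project deployed under the name caps; listspiders_url and spiders_from_response are hypothetical helper names.

```python
import json
from urllib.parse import urlencode


def listspiders_url(project, base_url='http://localhost:6800'):
    """Build the URL for scrapyd's listspiders.json endpoint (default port assumed)."""
    return '%s/listspiders.json?%s' % (base_url.rstrip('/'), urlencode({'project': project}))


def spiders_from_response(body):
    """Extract the spider names from a listspiders.json response body."""
    payload = json.loads(body)
    if payload.get('status') != 'ok':
        raise RuntimeError('scrapyd reported an error: %r' % (payload,))
    return payload.get('spiders', [])
```

With the server running, something like spiders_from_response(urllib.request.urlopen(listspiders_url('caps')).read()) should return a list containing campSpider if the deploy succeeded.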
