Scrapy spider not found error

Posted 2020-05-21 07:57

This is Windows 7 with Python 2.7.

I have a scrapy project in a directory called caps (this is where scrapy.cfg is)

My spider is located in caps\caps\spiders\campSpider.py

I cd into the scrapy project and try to run

scrapy crawl campSpider -o items.json -t json

I get an error that the spider can't be found. The class name is campSpider

...
    spider = self.crawler.spiders.create(spname, **opts.spargs)
  File "c:\Python27\lib\site-packages\scrapy-0.14.0.2841-py2.7-win32.egg\scrapy\spidermanager.py", line 43, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: campSpider'

Am I missing some configuration item?

Tags: python scrapy
19 answers
Animai°情兽
#2 · 2020-05-21 08:23

If you are following the tutorial from https://docs.scrapy.org/en/latest/intro/tutorial.html

Then do something like:

$ sudo apt install python-pip
$ pip install Scrapy
(logout, login)
$ cd
$ scrapy startproject tutorial
$ vi ~/tutorial/tutorial/spiders/quotes_spider.py
$ cd ~/tutorial/tutorial
$ scrapy crawl quotes

The error happens if you create the spiders directory yourself directly under ~/tutorial instead of letting scrapy startproject generate it inside the project package.

混吃等死
#3 · 2020-05-21 08:24

Check indentation too: my spider class was accidentally indented one tab. A class that isn't at the module's top level can't be discovered by Scrapy's spider scan, so the spider is effectively invisible.
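This matches plain Python scoping: a class indented inside another block becomes a local name rather than a module-level attribute, so anything that scans the module's attributes never sees it. A minimal illustration, without Scrapy (the class names here are made up):

```python
import sys

# At module top level: discoverable as an attribute of the module.
class TopLevelSpider:
    name = 'toplevel'

def _accidental_container():
    # Indented inside another block: only a local name, invisible
    # to code that inspects the module's attributes.
    class NestedSpider:
        name = 'nested'

_accidental_container()

module = sys.modules[__name__]
assert hasattr(module, 'TopLevelSpider')
assert not hasattr(module, 'NestedSpider')
```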

别忘想泡老子
#4 · 2020-05-21 08:27

I had the same issue. When I ran scrapy list in cmd, the spider name I was getting the error for appeared in the list, but when I tried to run it with scrapy crawl SpiderName.py I still got the "spider not found" error. I had used this spider before and everything was fine with it, so I used the secret weapon: I restarted my system, and the issue was resolved. :D

chillily
#5 · 2020-05-21 08:31

Have you set up the SPIDER_MODULES setting?

SPIDER_MODULES

Default: []

A list of modules where Scrapy will look for spiders.

Example:

SPIDER_MODULES = ['mybot.spiders_prod', 'mybot.spiders_dev']
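For the project in the question (created as caps), the settings.py generated by scrapy startproject would normally contain something like the sketch below; if SPIDER_MODULES has been emptied or points at the wrong package, scrapy crawl cannot find any spiders:

```python
# settings.py as generated by `scrapy startproject caps` (sketch):
BOT_NAME = 'caps'

SPIDER_MODULES = ['caps.spiders']   # packages Scrapy scans for spider classes
NEWSPIDER_MODULE = 'caps.spiders'   # where `scrapy genspider` puts new spiders
```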

beautiful°
#6 · 2020-05-21 08:31

The name attribute of the spider class defines the spider's name, and that name is what you pass on the command line to run the spider.

from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor

class NameSpider(CrawlSpider):
    name = 'name of spider'
    allowed_domains = ['allowed domains of web portal to be scraped']
    start_urls = ['start url of web portal to be scraped']

    custom_settings = {
        'DOWNLOAD_DELAY': 1,
        'USER_AGENT': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'
    }

    product_css = ['.main-menu']
    rules = [
        # CrawlSpider uses parse() internally, so the callback
        # must not be named 'parse'.
        Rule(LinkExtractor(restrict_css=product_css), callback='parse_item'),
    ]

    def parse_item(self, response):
        # implementation of business logic
        pass
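The traceback in the question matches this: scrapy crawl resolves its argument against each spider class's name attribute, not the class or file name, and raises KeyError when nothing matches. A rough pure-Python sketch of that lookup (not Scrapy's actual code; the class names are made up):

```python
class CampSpider:            # the class name is irrelevant to the lookup
    name = 'campSpider'      # this is what `scrapy crawl campSpider` matches

class OtherSpider:
    name = 'other'

def find_spider(spider_classes, requested_name):
    """Mimic of the spider-manager lookup seen in the traceback."""
    for cls in spider_classes:
        if getattr(cls, 'name', None) == requested_name:
            return cls
    raise KeyError("Spider not found: %s" % requested_name)

assert find_spider([CampSpider, OtherSpider], 'campSpider') is CampSpider

try:
    find_spider([OtherSpider], 'campSpider')
except KeyError as exc:
    assert 'Spider not found' in str(exc)
```

So for the question's setup, campSpider.py must contain a spider class whose name attribute is set to 'campSpider' for the crawl command to find it.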
劫难
#7 · 2020-05-21 08:31

In my case, I had set LOG_STDOUT = True, and with that setting scrapyd could not return the spider names in the JSON response when querying /listspiders.json; instead, the results were being printed to the log files configured in scrapyd's default_scrapyd.conf. So I changed the setting as follows, and it worked well.

LOG_STDOUT = False