I am writing my own item pipeline for Scrapy (Python):
from scrapy.exceptions import NotConfigured
from scrapy.exceptions import DropItem
import pymssql
from slybot.item import create_item_version
class SQLStore(object):
    def __init__(self):
        self.conn = pymssql.connect(host='XXXXXX', user='sa', password='1timep', database='DBSample')
        self.cursor = self.conn.cursor()

#log data to json file
def process_item(self, item, spider):
    try:
        self.cursor.execute("INSERT INTO Movie(Description, Location, Title) VALUES (%s, %s, %s)", (item['Description'], item['Location'], item['Title']))
        self.conn.commit()
    except pymssql.Error, e:
        print ("error")

    return item
I am trying to insert the scraped values into SQL Server.
Below is my spider setting:
ITEM_PIPELINES = {'slybot.dupefilter.SQLStore' : 100}
It appears to run fine, and when I submit my spider to Scrapyd I see the line below in the log file:
2015-01-19 16:07:57+0530 [scrapy] INFO: Enabled item pipelines: SQLStore
From the log file I can see that my spider is using the SQLStore pipeline.
But the values are never inserted into SQL Server. I can see the scraped content in the log files in JSON format.
What went wrong, and where is the problem?
Can anyone please help me? Thanks.
The code is not properly indented. process_item is at the same level as the SQLStore class definition, so it is not a method of the class and is never called by Scrapy. Indent it so that it becomes a method:
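class SQLStore(object):
    def __init__(self):
        self.conn = pymssql.connect(host='XXXXXX', user='sa', password='1timep', database='DBSample')
        self.cursor = self.conn.cursor()

    # now an actual method of the pipeline class, called once per scraped item
    def process_item(self, item, spider):
        try:
            self.cursor.execute("INSERT INTO Movie(Description, Location, Title) VALUES (%s, %s, %s)",
                                (item['Description'], item['Location'], item['Title']))
            self.conn.commit()
        except pymssql.Error, e:
            # printing a bare "error" string hides the cause; include the exception
            print ("error", e)
        return item

As a side note, swallowing the exception with a bare print makes failures like this hard to diagnose; logging e (or raising DropItem from scrapy.exceptions, which you already import) would surface the real problem much sooner.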