I am writing my own item pipeline for Scrapy (Python):
from scrapy.exceptions import NotConfigured
from scrapy.exceptions import DropItem
import pymssql
from slybot.item import create_item_version
class SQLStore(object):
    def __init__(self):
        self.conn = pymssql.connect(host='XXXXXX', user='sa', password='1timep', database='DBSample')
        self.cursor = self.conn.cursor()

#log data to json file
def process_item(self, item, spider):
    try:
        self.cursor.execute("INSERT INTO Movie(Description, Location, Title) VALUES (%s, %s, %s)", (item['Description'], item['Location'], item['Title']))
        self.conn.commit()
    except pymssql.Error, e:
        print ("error")

    return item
I am trying to insert the scraped values into SQL Server.
Below is my spider setting:
ITEM_PIPELINES = {'slybot.dupefilter.SQLStore' : 100}
It appears to run fine, and when I submit my spider to Scrapyd I see the line below in the log file:
2015-01-19 16:07:57+0530 [scrapy] INFO: Enabled item pipelines: SQLStore
From the log file I can see that my spider is using the SQLStore pipeline.
But the values are never inserted into SQL Server. I can see the scraped content in the log files in JSON format.
What went wrong, and where is the problem?
Can anyone please help me? Thanks.
The code is not properly indented. process_item is at the same level as the SQLStore class definition, so it is not a method of the class and is never called by Scrapy. Indent it so that it becomes a method:
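class SQLStore(object):
    def __init__(self):
        self.conn = pymssql.connect(host='XXXXXX', user='sa', password='1timep', database='DBSample')
        self.cursor = self.conn.cursor()

    # now an actual method of the pipeline class, called once per scraped item
    def process_item(self, item, spider):
        try:
            self.cursor.execute("INSERT INTO Movie(Description, Location, Title) VALUES (%s, %s, %s)",
                                (item['Description'], item['Location'], item['Title']))
            self.conn.commit()
        except pymssql.Error, e:
            # printing a bare "error" string hides the cause; include the exception
            print ("error", e)
        return item

As a side note, swallowing the exception with a bare print makes failures like this hard to diagnose; logging e (or raising DropItem from scrapy.exceptions, which you already import) would surface the real problem much sooner.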