I'm trying to extract content from a website created by our company. I've created a table in MSSQL Server for the Scrapy data, and I've set up Scrapy with Python to crawl and extract the webpage data. My question is: how do I export the data crawled by Scrapy into my local MSSQL Server database?
This is the Scrapy spider code for extracting the data:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://quotes.toscrape.com/page/1/',
        'http://quotes.toscrape.com/page/2/',
    ]

    def parse(self, response):
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').extract_first(),
                'author': quote.css('small.author::text').extract_first(),
                'tags': quote.css('div.tags a.tag::text').extract(),
            }
One option is to save the data to a CSV file using Scrapy's built-in feed exports, then load the CSV into your SQL Server table.
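A rough sketch of that route. First export the items with Scrapy's feed exporter:

scrapy crawl quotes -o quotes.csv

Then bulk-load the file from T-SQL. Here dbo.Quotes and the file path are placeholders for your own table and location, and FORMAT = 'CSV' needs SQL Server 2017 or later (the quote text contains commas, so the quoted-field handling matters):

BULK INSERT dbo.Quotes
FROM 'C:\data\quotes.csv'
WITH (FORMAT = 'CSV', FIRSTROW = 2);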
OR
You can use the pymssql module to send data to SQL Server from a Scrapy item pipeline, something like this:
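(A minimal sketch for pipelines.py; the connection details and the dbo.Quotes table/column names are placeholders for your own server and schema.)

import pymssql


class DataPipeline:
    def open_spider(self, spider):
        # Open one connection for the whole crawl; replace the
        # server, credentials, and database with your own.
        self.conn = pymssql.connect(
            server='localhost',
            user='sa',
            password='your_password',
            database='your_database',
        )
        self.cursor = self.conn.cursor()

    def close_spider(self, spider):
        self.conn.close()

    def process_item(self, item, spider):
        # pymssql uses %s placeholders; 'tags' is a list, so join it
        # into a single string for one column.
        self.cursor.execute(
            "INSERT INTO dbo.Quotes ([text], author, tags) VALUES (%s, %s, %s)",
            (item['text'], item['author'], ', '.join(item['tags'])),
        )
        self.conn.commit()
        return item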
Also, you will need to add 'spider_name.pipelines.DataPipeline': 300 to the ITEM_PIPELINES dict in settings.py:
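# settings.py — 'spider_name' is your Scrapy project's package name
ITEM_PIPELINES = {
    'spider_name.pipelines.DataPipeline': 300,
}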