I output the scraped data in JSON format. The default Scrapy exporter writes a list of dicts as JSON. The items look like this:
[{"Product Name":"Product1", "Categories":["Clothing","Top"], "Price":"20.5", "Currency":"USD"},
{"Product Name":"Product2", "Categories":["Clothing","Top"], "Price":"21.5", "Currency":"USD"},
{"Product Name":"Product3", "Categories":["Clothing","Top"], "Price":"22.5", "Currency":"USD"},
{"Product Name":"Product4", "Categories":["Clothing","Top"], "Price":"23.5", "Currency":"USD"}, ...]
But I want to export the data in a specific format like this:
{
"Shop Name":"Shop 1",
"Location":"XXXXXXXXX",
"Contact":"XXXX-XXXXX",
"Products":
[{"Product Name":"Product1", "Categories":["Clothing","Top"], "Price":"20.5", "Currency":"USD"},
{"Product Name":"Product2", "Categories":["Clothing","Top"], "Price":"21.5", "Currency":"USD"},
{"Product Name":"Product3", "Categories":["Clothing","Top"], "Price":"22.5", "Currency":"USD"},
{"Product Name":"Product4", "Categories":["Clothing","Top"], "Price":"23.5", "Currency":"USD"}, ...]
}
Please advise me on a solution. Thank you.
I was trying to export pretty printed JSON and this is what worked for me.
I created an item pipeline for it. It's similar to the example in the Scrapy docs (https://doc.scrapy.org/en/latest/topics/item-pipeline.html), except that it prints each JSON property indented and on its own line.
See the part about pretty printing in the Python json docs: https://docs.python.org/2/library/json.html
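For illustration, here is a minimal sketch of such a pretty-printing pipeline. It is not the exact code from this answer: the class name PrettyJsonWriterPipeline and the file name items.json are assumptions, and it collects the items in memory and writes them once when the spider closes.

```python
import json


class PrettyJsonWriterPipeline:
    """Collect every item and write them as one indented JSON array."""

    # Hypothetical output file name; change it to suit your project.
    output_path = "items.json"

    def open_spider(self, spider):
        # Start with an empty list each time a spider is opened.
        self.items = []

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for scrapy.Item objects.
        self.items.append(dict(item))
        return item

    def close_spider(self, spider):
        # indent=4 puts each JSON property on its own indented line.
        with open(self.output_path, "w", encoding="utf-8") as f:
            json.dump(self.items, f, indent=4, ensure_ascii=False)
```

To try it, the class would be enabled through the ITEM_PIPELINES setting, for example {"myproject.pipelines.PrettyJsonWriterPipeline": 300}, where myproject.pipelines is a placeholder for your own module path.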
This is well documented in the Scrapy documentation.
This will create a json file containing your items.
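To get the nested layout from the question (shop details on top, items under "Products"), one option is a pipeline that wraps the collected items before writing. This is only a sketch under assumptions: the class name ShopJsonPipeline, the file name shop.json, and the hard-coded shop details are placeholders, since the question does not say where the shop data comes from.

```python
import json


class ShopJsonPipeline:
    """Write one JSON object: shop details plus a "Products" list."""

    # Placeholder shop details and file name (assumptions, not real data).
    shop_info = {
        "Shop Name": "Shop 1",
        "Location": "XXXXXXXXX",
        "Contact": "XXXX-XXXXX",
    }
    output_path = "shop.json"

    def open_spider(self, spider):
        self.products = []

    def process_item(self, item, spider):
        # Keep each scraped product; it is written out when the spider closes.
        self.products.append(dict(item))
        return item

    def close_spider(self, spider):
        # Copy the shop details and attach the scraped products to them.
        document = dict(self.shop_info)
        document["Products"] = self.products
        with open(self.output_path, "w", encoding="utf-8") as f:
            json.dump(document, f, indent=4, ensure_ascii=False)
```

Because everything is held in memory until the spider closes, this suits a single shop with a modest number of products; for very large crawls a streaming approach would be a better fit.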