Scrapy with a nested array

Posted 2019-04-09 01:36

I'm new to Scrapy and would like to understand how to scrape an object for output into nested JSON. Right now, I'm producing JSON that looks like:

[
{'a' : 1, 
'b' : '2',
'c' : 3},
]

And I'd like it more like this:

[
{ 'a' : '1',
'_junk' : {
     'b' : 2,
     'c' : 3}},
]

---where I put some stuff in _junk subfields to post-process later.

The current code in the parser definition in my scrapername.py is:

item['a'] = x
item['b'] = y
item['c'] = z

And it seemed like

item['a'] = x
item['_junk']['b'] = y
item['_junk']['c'] = z

---might fix that, but I'm getting an error about the _junk key:

  File "/usr/local/lib/python2.7/dist-packages/scrapy/item.py", line 49, in __getitem__
    return self._values[key]
exceptions.KeyError: '_junk'

Does this mean I need to change my items.py somehow? Currently I have:

class Website(Item):
    a = Field()
    _junk = Field()
    b = Field()
    c = Field()

1 answer
手持菜刀,她持情操
#2 · 2019-04-09 02:35

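You need to create the _junk dictionary before storing items in it. The line item['_junk']['b'] = y first looks up item['_junk'], and that lookup raises the KeyError because nothing has been assigned to that field yet. Your items.py does not need to change, since _junk is already declared as a Field.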

item['a'] = x
item['_junk'] = {}
item['_junk']['b'] = y
item['_junk']['c'] = z
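
An equivalent approach is to build the sub-dictionary in a single assignment, so there is no moment where '_junk' is read before it exists. The sketch below uses a plain dict and a hypothetical helper named build_item (not part of Scrapy or the original code) so it runs without Scrapy installed. With the Website item from the question, the same assignment would be item['_junk'] = {'b': y, 'c': z}.

```python
import json


def build_item(x, y, z):
    """Hypothetical helper: return one record shaped like the desired output."""
    # Assign the whole sub-dictionary at once, so '_junk' exists
    # before anything is stored under it.
    return {
        'a': x,
        '_junk': {'b': y, 'c': z},
    }


# Placeholder values stand in for the scraped x, y and z.
item = build_item(1, 2, 3)

# Serialize a list of records, which is the overall shape of the JSON in the question.
print(json.dumps([item], indent=2))
```

Once b and c live under _junk, the separate b and c fields in items.py are probably no longer needed, assuming nothing else in the project reads them.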