The Scrapy documentation says:
Activating an Item Pipeline component
To activate an Item Pipeline component you must add its class to the ITEM_PIPELINES setting, like in the following example:
ITEM_PIPELINES = {
    'myproject.pipelines.PricePipeline': 300,
    'myproject.pipelines.JsonWriterPipeline': 800,
}
The integer values you assign to classes in this setting determine the order they run in- items go through pipelines from order number low to high. It’s customary to define these numbers in the 0-1000 range.
I do not understand the last paragraph, mainly "determine the order they run in- items go through pipelines from order number low to high". Can you explain it in other words? What are these numbers chosen based on, and how should I choose values within the 0-1000 range?
A dictionary in Python does not express a priority order (and before Python 3.7 it did not even guarantee insertion order), and ITEM_PIPELINES has to be a dictionary (like a lot of other settings, for example SPIDER_MIDDLEWARES), so you need some other way to define the order in which pipelines are applied. This is why you assign a number, customarily from 0 to 1000, to each pipeline you define. The exact values do not matter, only their relative order: a pipeline with a lower number runs before one with a higher number.
FYI, if you look into the Scrapy source, you'll find the build_component_list() function, which is called for each setting like ITEM_PIPELINES. It makes a list (an ordered collection) out of the dictionary you define in ITEM_PIPELINES, using the dictionary values for sorting.
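To make the idea concrete, here is a minimal sketch of sorting such a dictionary by its values. It is not Scrapy's actual build_component_list() code; the helper name ordered_components and the DuplicatesPipeline entry are hypothetical and exist only for illustration.

```python
from operator import itemgetter

# Same shape as the ITEM_PIPELINES setting from the question,
# deliberately written "out of order" to show that only the numbers matter.
ITEM_PIPELINES = {
    'myproject.pipelines.JsonWriterPipeline': 800,
    'myproject.pipelines.PricePipeline': 300,
}


def ordered_components(setting):
    """Hypothetical helper: return class paths sorted by order number, lowest first."""
    return [path for path, _ in sorted(setting.items(), key=itemgetter(1))]


pipeline_order = ordered_components(ITEM_PIPELINES)
print(pipeline_order)

# Any value between 300 and 800 places a new (hypothetical) pipeline between the two.
extended = {**ITEM_PIPELINES, 'myproject.pipelines.DuplicatesPipeline': 500}
extended_order = ordered_components(extended)
print(extended_order)
```

Leaving gaps between the numbers (300 and 800 rather than 1 and 2) means a pipeline added later can be slotted in without renumbering the existing ones.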