MongoEngine - Pull a reference from a ListField, b

Posted 2020-07-18 07:38

Question:

I would like to remove some references from a ListField(ReferenceField), solely based on their value.

I store information about images in the following Model:

from mongoengine import Document, URLField, IntField, BooleanField

class ImageUrl(Document):
    src = URLField()
    counter = IntField()
    deleted = BooleanField()

We store the ids of the images encountered on a page in an EmbeddedDocument called Webpage:

class Webpage(EmbeddedDocument):
    image_list = ListField(ReferenceField(ImageUrl))
    ...

Finally, the Webpage model is embedded into a RawData model:

class RawData(Document):
    ...
    webpage = EmbeddedDocumentField(Webpage)

I would like to remove references to ImageUrl records from RawData records, based on some of their attributes (e.g. a counter value exceeding 1), and then set the deleted attribute of these ImageUrl records to True.

I'm doing:

from mongoengine import Q

images = ImageUrl.objects(Q(deleted=False) & Q(counter__gt=1)).all()
for image in images:
    # all RawData records containing the image in their image list
    for rdata in RawData.objects(webpage__image_list__in=[image.id]):
        # remove image from the image_list
        RawData.objects(id=rdata.id).update_one(pull__webpage__image_list=image.id)
    # set 'deleted=True' on the ImageUrl record
    ImageUrl.objects(id=image.id).update_one(set__deleted=True)

The pull operation raises the following error: OperationError: Update failed [Cannot apply $pull/$pullAll modifier to non-array].

As I understood it from http://docs.mongodb.org/manual/reference/operator/pull/#_S_pull or How to remove a item from a list(ListField) by id in MongoEngine?, I need to specify the key of the array from which I want to remove the value. However, in my case, I'd like to remove a value from a list... How should I do that?

Thank you very much for your time!

Answer 1:

The positional operator lets you query for a value in a list and then act on the first matching element, usually with an update. $pull, by contrast, removes every matching instance from the list, and this is what you want.
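To make that behaviour concrete, here is a small pure-Python model of what $pull does to a single document. It is only a simplified illustration on plain dicts, not MongoDB's implementation; the function name `pull` and the sample document are made up for this sketch.

```python
def pull(document, dotted_path, value):
    """Mimic $pull on a plain dict: drop every occurrence of `value`
    from the array found at `dotted_path` (e.g. "webpage.image_list")."""
    *parents, last = dotted_path.split(".")
    node = document
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return document  # path does not exist: nothing to pull
    if last not in node:
        return document  # missing field: nothing to pull
    if not isinstance(node[last], list):
        # Simplified stand-in for the "non-array" OperationError
        raise TypeError("Cannot apply $pull to non-array field %r" % dotted_path)
    node[last] = [item for item in node[last] if item != value]
    return document


doc = {"webpage": {"image_list": ["a", "b", "a"]}}
pull(doc, "webpage.image_list", "a")
print(doc)  # {'webpage': {'image_list': ['b']}}
```

Both occurrences of "a" are removed in one call, and a field that holds anything other than a list is rejected rather than silently modified.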

With references in MongoEngine, you can simply pass the instance object, e.g.:

for rdata in RawData.objects(webpage__image_list=image):
    # remove image from the image_list
    rdata.update(pull__webpage__image_list=image)

I cleaned up the code and removed the duplicate query: since you already have rdata, there is no need to look that document up again.

The error OperationError: Update failed [Cannot apply $pull/$pullAll modifier to non-array] means that $pull, which requires an array, was applied to a document whose image_list is not actually an array. The most likely cause is a document stored on disk with image_list holding something other than a list. If you wrap the update in a try/except block, you can inspect the document that fails to confirm this; if that is the case, the document will need to be migrated manually.
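One way to locate the offending records is to scan the raw stored documents and report those whose webpage.image_list is present but not a list. The sketch below works on plain dicts so it can be tried without a database; the helper name `find_non_list_image_lists` and the sample data are hypothetical.

```python
def find_non_list_image_lists(raw_docs):
    """Return the _id of every raw document whose webpage.image_list
    exists but is not stored as a list."""
    bad_ids = []
    for doc in raw_docs:
        webpage = doc.get("webpage")
        if not isinstance(webpage, dict) or "image_list" not in webpage:
            continue  # field absent: not what triggers the non-array error
        if not isinstance(webpage["image_list"], list):
            bad_ids.append(doc["_id"])
    return bad_ids


# Hypothetical raw documents, shaped like what the collection might hold
sample = [
    {"_id": 1, "webpage": {"image_list": ["img1", "img2"]}},
    {"_id": 2, "webpage": {"image_list": "img1"}},  # scalar instead of an array
    {"_id": 3, "webpage": {}},
]
print(find_non_list_image_lists(sample))  # [2]
```

Against a real database, the raw documents could presumably be obtained with `RawData._get_collection().find()`, which returns what is actually stored rather than values converted by MongoEngine's fields. Treat that as an assumption to verify against your MongoEngine version.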