Why don't duplicates in a relationship violate the association table's UniqueConstraint?

Asked 2019-07-15 06:10

With the models below, why does the following interactive session succeed in adding duplicate associations to a relationship within the same transaction? I expected (and need) it to fail because of the UniqueConstraint placed on the association table.

Models:

from app import db  # the Flask-SQLAlchemy instance

# The association table has to be defined before User so it can be
# passed to secondary= below.
LinkUserSizeShirtDressSleeve = db.Table(
    'link_user_size_shirt_dress_sleeve',
    db.Column(
        'size_id',
        db.Integer,
        db.ForeignKey('size_key_shirt_dress_sleeve.id'), primary_key=True),
    db.Column(
        'user_id',
        db.Integer,
        db.ForeignKey('user.id'), primary_key=True),
    db.UniqueConstraint('size_id', 'user_id', name='uq_association')
)


class User(db.Model):

    id = db.Column(db.Integer, primary_key=True)

    sz_shirt_dress_sleeve = db.relationship(
        'SizeKeyShirtDressSleeve',
        secondary=LinkUserSizeShirtDressSleeve,
        backref=db.backref('users', lazy='dynamic'),
        order_by="asc(SizeKeyShirtDressSleeve.id)")


class SizeKeyShirtDressSleeve(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    size = db.Column(db.Integer)

    def __repr__(self):
        return 'Dress shirt sleeve size: %r' % self.size

Because of the UniqueConstraint on the association table, I expected this interactive session to raise an IntegrityError. It doesn't, and it lets me associate the same size twice:

>>> from app.models import User, SizeKeyShirtDressSleeve
>>> db.session.add(User(id=8))
>>> db.session.commit()
>>> u = User.query.filter(User.id==8).one()
>>> u
<User id: 8, email: None, password_hash: None>
>>> u.sz_shirt_dress_sleeve
[]
>>> should_cause_error = SizeKeyShirtDressSleeve.query.first()
>>> should_cause_error
Dress shirt sleeve size: 3000
>>> u.sz_shirt_dress_sleeve.append(should_cause_error)
>>> u.sz_shirt_dress_sleeve.append(should_cause_error)
>>> u.sz_shirt_dress_sleeve
[Dress shirt sleeve size: 3000, Dress shirt sleeve size: 3000]
>>> db.session.commit()
>>> 

Wait, what? Isn't that relationship representative of what is in my association table? I guess I should verify that:

(immediately after, same session)

>>> from app.models import LinkUserSizeShirtDressSleeve as Sleeve
>>> db.session.query(Sleeve).filter(Sleeve.c.user_id==8).all()
[(1, 8)]
>>>

So u.sz_shirt_dress_sleeve wasn't accurately representing the state of the association table. ...Okay. But I need it to. In fact, I know it will fail if I try to append should_cause_error to the relationship one more time:

>>> u.sz_shirt_dress_sleeve.append(should_cause_error)
>>> db.session.commit()
# huge stack trace
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: link_user_size_shirt_dress_sleeve.size_id, link_user_size_shirt_dress_sleeve.user_id [SQL: 'INSERT INTO link_user_size_shirt_dress_sleeve (size_id, user_id) VALUES (?, ?)'] [parameters: (1, 8)] (Background on this error at: http://sqlalche.me/e/gkpj)
>>> 

Great! Okay, so things I'm inferring:

1) It's possible to have duplicate items in the relationship list.
2) It's possible for the relationship list to not accurately reflect the state of the association table it is responsible for.
3) The UniqueConstraint works, as long as I continue interacting with the relationship in separate transactions (punctuated by session.commit()).

Questions: Are 1), 2), or 3) incorrect? And how can I prevent duplicate items from ending up in my relationship list within the same transaction?

1 Answer

一夜七次 · answered 2019-07-15 06:56

Those three things are all correct. 3) should be qualified: the UniqueConstraint always works in the sense that your database will never be inconsistent; it just doesn't raise an error unless the association you're adding has already been flushed to the database.

The fundamental reason this happens is an impedance mismatch between an association table in SQL and its representation in SQLAlchemy. A table in SQL is a multiset of tuples, so with that UNIQUE constraint your LinkUserSizeShirtDressSleeve table is a set of (size_id, user_id) tuples. The default representation of a relationship in SQLAlchemy, on the other hand, is an ordered list of objects, but it imposes some limitations on how it maintains that list and how it expects you to interact with it, so in some ways it behaves more like a set. In particular, it silently ignores duplicate entries in your association table (if you happen to not have a UNIQUE constraint), and it assumes that you never add duplicate objects to the list in the first place.

If this is a problem for you, just make the behavior more in line with SQL by using collection_class=set on your relationship. If you want an error to be raised when you add duplicate entries into the relationship, create a custom collection class based on set that fails on duplicate adds. In some of my projects, I've resorted to monkey-patching the relationship constructor to set collection_class=set on all of my relationships to make this less verbose.
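For what it's worth, here is a rough sketch of that monkey-patching idea, assuming the Flask-SQLAlchemy db object from the question; the wrapper name _set_relationship is made up for this example, and it should run once at import time, before any models are defined:

# Rough sketch only: wrap db.relationship so every relationship defaults
# to a set-based collection instead of a list.
from app import db

_original_relationship = db.relationship

def _set_relationship(*args, **kwargs):
    # Only supply a default; an explicit collection_class still wins.
    kwargs.setdefault('collection_class', set)
    return _original_relationship(*args, **kwargs)

db.relationship = _set_relationship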

Here's how I would write such a custom collection class:

class UniqueSet(set):
    """A set-based collection that refuses duplicate adds."""
    def add(self, el):
        if el in self:
            raise ValueError("Value already exists")
        super().add(el)
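One way to plug that into the question's model is to declare the relationship with collection_class=UniqueSet. A minimal sketch, with the rest of the User model unchanged (note that a set-based collection uses add() rather than append(), and order_by has no visible effect on a set):

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)

    sz_shirt_dress_sleeve = db.relationship(
        'SizeKeyShirtDressSleeve',
        secondary=LinkUserSizeShirtDressSleeve,
        collection_class=UniqueSet,  # duplicate adds now raise ValueError immediately
        backref=db.backref('users', lazy='dynamic'))

With that in place, the second u.sz_shirt_dress_sleeve.add(should_cause_error) in the interactive session above would raise ValueError on the spot, instead of leaving a duplicate sitting in the collection until the next flush.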