I have an additional question related to post:
Composite primary keys in databases
Please check out the post over there (otherwise I just need to repeat it).
Now my question is:
If I go for an ID autoincrement as primary key (as I accepted and which allows me to again reference the current table through this key), how can I assure that a combination between User_ID and Admin_ID (both FK's) can only exist once (is unique) on insert?
It is a many-to-many relationship.
It could be done in the programming of the front end (check for existing record by select), however my feelings tell me that this is not the best way and I wonder if I can directly put the restriction in the back end.
I would logically add the FK's to the primary key, but then I'm back to the composite key and this is what I was generally advised rather not to use.
What is the proper way of doing this?
Thanks for your help.
how can I assure that a combination between User_ID and Admin_ID (both FK's) can only exist once (is unique) on insert?
Create a composite key.
It could be done in the programming of the front end (check for existing record by select)
It couldn't, unless you are the only client accessing the table.¹
In a real, concurrent environment, you can never know if another transaction inserted the same value (as you are trying to insert) after your SELECT but before your INSERT.
And even if you were the only one accessing the database, you'd need an index for efficient execution of SELECT anyway. So why not let the DBMS utilize this index for a key?
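A minimal sketch of letting the key do the check, instead of SELECT-then-INSERT. It uses Python's built-in sqlite3 module only so that it runs anywhere. The table name `user_admin` and the helper `link` are hypothetical, and the error raised on a duplicate will differ in other DBMSs and drivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_admin (
        user_id  INTEGER NOT NULL,
        admin_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, admin_id)  -- the key itself enforces uniqueness
    )
""")


def link(user_id, admin_id):
    """Try the INSERT directly; return False if the DBMS rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_admin (user_id, admin_id) VALUES (?, ?)",
                (user_id, admin_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by the key, regardless of what other clients did in between.
        return False


first = link(1, 10)    # new pair: accepted
second = link(1, 10)   # same pair again: rejected by the key
other = link(1, 11)    # different pair: accepted
row_count = conn.execute("SELECT COUNT(*) FROM user_admin").fetchone()[0]
```

There is no window between a check and the insert here: the duplicate test and the write happen in one statement, inside the DBMS.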
I would logically add the FK's to the primary key, but then I'm back to the composite key and this is what I was generally advised rather not to use.
Wrong advice. If a column or a combination of columns has to be unique, you have to create a key.² You cannot skip creating a key that enforces the correctness of your data just because you have another (surrogate) key.
Surrogates cannot generally replace natural keys, they can only be added.³ So the question becomes: is the additional overhead of the surrogate worth it? Sometimes it is, sometimes it isn't, but there are no cut-and-dried rules here.
¹ Or are willing to lock the whole table, destroying the scalability in the process.
² Though it doesn't necessarily need to be primary.
³ What they typically "replace" is the natural key's role as a primary key, but the natural key still continues to live as an alternate key.
I would go for a composite key.
If I really needed an auto-increment primary key, I would create a unique index on the two foreign key columns.
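A sketch of that second option, again with Python's sqlite3 module so it is self-contained. All table and index names (`users`, `admins`, `user_admin`, `user_admin_pair`, `user_admin_note`) are made up for illustration, and the DDL syntax for auto-increment columns varies between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY);
    CREATE TABLE admins (id INTEGER PRIMARY KEY);

    -- Surrogate primary key, plus a unique index on the natural key.
    CREATE TABLE user_admin (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id  INTEGER NOT NULL REFERENCES users(id),
        admin_id INTEGER NOT NULL REFERENCES admins(id)
    );
    CREATE UNIQUE INDEX user_admin_pair ON user_admin (user_id, admin_id);

    -- Another table can now reference the link row through the single id.
    CREATE TABLE user_admin_note (
        user_admin_id INTEGER NOT NULL REFERENCES user_admin(id),
        note          TEXT
    );

    INSERT INTO users  (id) VALUES (1);
    INSERT INTO admins (id) VALUES (10);
    INSERT INTO user_admin (user_id, admin_id) VALUES (1, 10);
    INSERT INTO user_admin_note (user_admin_id, note)
        SELECT id, 'example' FROM user_admin WHERE user_id = 1 AND admin_id = 10;
""")

try:
    conn.execute("INSERT INTO user_admin (user_id, admin_id) VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the unique index refused the second (1, 10)

pair_count = conn.execute("SELECT COUNT(*) FROM user_admin").fetchone()[0]
```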
Side note: The advantage of using a composite key is that an ORM tool like Entity Framework automatically recognizes it as a many-to-many relationship and abstracts the additional intersection table away into just a relationship.
Consider the following (hypothetical) schema. Would you add a surrogate key to the "chessboard" table? The values for {xxx,yyy,pc} are constrained, either by "restricted" domains, or by the PK+FK constraints.
In which cases would adding a surrogate key (for {xxx,yyy}) help?
(Additional constraints, such as no more than one King per color, would be necessary for a real chess game, but part of the business rules, such as valid moves, would be handled by the "application logic" anyway.)
-- this is Postgres-specific:
-- create a schema to play in
DROP SCHEMA IF EXISTS chess CASCADE ;
CREATE SCHEMA chess ;
SET search_path='chess' ;
-- Domain with only values A-H allowed.
CREATE DOMAIN chess_column
AS CHAR(1) NOT NULL
check (value >= 'A' AND value <= 'H')
;
-- Domain with only values 1-8 allowed.
CREATE DOMAIN chess_row
AS INTEGER NOT NULL
check (value >= 1 AND value <= 8)
;
-- Table with only valid pieces
CREATE TABLE chess_piece
( id INTEGER NOT NULL PRIMARY KEY
, pname varchar
) ;
INSERT INTO chess_piece(id,pname) VALUES
( -6, 'Black King' ) , ( -5, 'Black Queen' ) , ( -4, 'Black Rook' )
, ( -3, 'Black Bishop' ) , ( -2, 'Black Knight' ) , ( -1, 'Black Pawn' )
, ( 6, 'White King' ) , ( 5, 'White Queen' ) , ( 4, 'White Rook' )
, ( 3, 'White Bishop' ) , ( 2, 'White Knight' ) , ( 1, 'White Pawn' )
;
CREATE TABLE chessboard
( xxx chess_column
, yyy chess_row
, pc INTEGER NOT NULL REFERENCES chess_piece(id)
, PRIMARY KEY (xxx,yyy)
);
-- Too lazy to enter the entire board;
-- only put a White Pawn at E2
INSERT INTO chessboard(xxx,yyy,pc)
SELECT 'E', 2, p.id
FROM chess_piece p
WHERE p.pname = 'White Pawn'
;
-- Shift the pawn
UPDATE chessboard b
SET yyy = 4
FROM chess_piece p
WHERE b.pc = p.id
AND p.pname = 'White Pawn'
AND b.xxx = 'E' AND b.yyy = 2
;
-- Try to put a piece outside the board
\echo Try to put a piece outside the board
INSERT INTO chessboard(xxx,yyy,pc)
SELECT 'I', 2, p.id
FROM chess_piece p
WHERE p.pname = 'Black Pawn'
;
-- add a non-existing piece
\echo add a non-existing piece
INSERT INTO chessboard(xxx,yyy,pc)
VALUES( 'H', 1, 42)
;
-- Position is already occupied
\echo Position is already occupied
INSERT INTO chessboard(xxx,yyy,pc)
SELECT 'E', 4, p.id
FROM chess_piece p
WHERE p.pname = 'Black Pawn'
;
Go with the ID. I strongly agree with the answer on the other page. For quick-and-dirty applications, a composite key is fine. However, I generally put auto-incremented ids into new tables that I create, unless they are static tables used for reporting.
For your specific question, there are at least four answers that I can think of off-hand:
1. Implement a constraint so that the two fields are never duplicated.
2. Create a unique index on the two columns.
3. Implement a trigger on the table to check for duplicates.
4. Do inserts via a stored procedure that checks the validity of the data.
My preference is (4), in combination with (1) or (2). I find that controlling inserts through stored procedures gives me a lot of flexibility, particularly when I want to log or debug problems. That said, I am not usually working with high-volume transactional systems, where reducing overhead is paramount.
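A rough sketch of that combination: one routine that every insert goes through, with a unique constraint as the backstop. SQLite has no stored procedures, so a Python function stands in for one here. The name `add_user_admin`, the validity rule, and the log messages are all assumptions made for illustration.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("user_admin")

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_admin (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id  INTEGER NOT NULL,
        admin_id INTEGER NOT NULL,
        UNIQUE (user_id, admin_id)  -- option (1): the constraint is the backstop
    )
""")


def add_user_admin(user_id, admin_id):
    """Hypothetical single entry point for inserts; returns the new id or None."""
    if user_id <= 0 or admin_id <= 0:  # illustrative validity check only
        log.warning("rejected invalid pair (%s, %s)", user_id, admin_id)
        return None
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO user_admin (user_id, admin_id) VALUES (?, ?)",
                (user_id, admin_id),
            )
    except sqlite3.IntegrityError:
        log.info("pair (%s, %s) already exists", user_id, admin_id)
        return None
    return cur.lastrowid


new_id = add_user_admin(1, 10)   # inserted, returns the generated id
dup_id = add_user_admin(1, 10)   # duplicate, logged and refused
bad_id = add_user_admin(0, 10)   # fails the validity check before any SQL runs
```

The routine is the place to log and debug. The constraint is what actually guarantees uniqueness, including for inserts that bypass the routine.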
There is one advantage of an auto-incrementing id that the other answer missed. The following query:
select *
from t
order by 1 desc
returns the most recently added records first, assuming the id is the first column (as it is in all my tables). Just the ability to see the most recently inserted records is reason enough for me to use an id.
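A small illustration of that query, using Python's sqlite3 module with a made-up table `t` and a hypothetical `label` column. It assumes ids are handed out in insertion order, which holds for this single-connection example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT)")
conn.executemany(
    "INSERT INTO t (label) VALUES (?)",
    [("first",), ("second",), ("third",)],
)

# ORDER BY 1 sorts on the first result column, which is the id here.
newest_first = conn.execute("SELECT * FROM t ORDER BY 1 DESC").fetchall()
print(newest_first)  # [(3, 'third'), (2, 'second'), (1, 'first')]
```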