i'd like to add a tag to a blogpost with a single sql statement.
say my tables would look as follows:
tags
+-------+-----------+
| tagid | tag |
+-------+-----------+
| 1 | news |
| 2 | top-story |
+-------+-----------+
tag2post
+----+--------+-------+
| id | postid | tagid |
+----+--------+-------+
| 0 | 322 | 1 |
+----+--------+-------+
the problem i'd like to solve is inserting a new tag, retrieve it's id and then inset this new id into the relation table in a single sql statement.
INSERT INTO tag2post (postid, tagid)
VALUES
(
332, # the post
IF (
(SELECT tagid FROM tags WHERE tag = 'new_tag'),
(SELECT tagid FROM tags WHERE tag = 'new_tag'),
# here is where i'd like to insert
# the new_tag and return it's id
'i am lost here'
)
)
You could try this, effectively doing an insert if it couldn't select it, and then retrieving the LAST_INSERT_ID() from the inserted row, but I seriously doubt it will work.
You can't do this as a single insert because inserts are atomic--that is, the ID isn't determined until the statement completes.
Wrap both statements in a transaction and you will get your ID, and atomicity.
I think you'd have to break it up. One way to do this is to add the
UNIQUE
constraint totag
and proceed as follows:and then
INSERT IGNORE
may not scale well, however. So an alternative is to doSELECT
, if the id is not there,INSERT
, then be sure to catch an exception here if another thread/process/service modified the table between yourSELECT
and now. In theINSERT
you doLAST_INSERT_ID()
and int he catch you repeat the initialSELECT
..Not awesome, but if concurrency is a requirement, you need to do this. Transactions will not alleviate the issue and if you have no
UNIQUE
constraint ontag
, you may end up with duplicateINSERT
s.If you use a non-auto increment field and do your own increment calculation, you can determine what the id will be before you insert the field, and specify the new ids manually.
Then you can build a string of statements for each tag2post entry using these ids seperated by ;s and you can send it to msyql to execute.
Instead of using an auto-increment ID column, use a GUID column. Then you can generate the GUID before you run the statement and do everything at once.
Even Jeff Atwood likes this approach, and you don't get a significant speed penalty for using 32-char strings instead of integers.
Also, to prevent tag duplication you should use the MD5 sum of the tag name as the tag ID.