How to 'insert if not exists' in MySQL?

2018-12-31 00:17发布

I started by googling, and found this article which talks about mutex tables.

I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (ie, one query to check and one to insert is the result set is empty)?

Does a unique constraint on a field guarantee the insert will fail if it's already there?

It seems that with merely a constraint, when I issue the insert via php, the script croaks.

9条回答
低头抚发
2楼-- · 2018-12-31 00:25
REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

If the record exists, it will be overwritten; if it does not yet exist, it will be created.

查看更多
无与为乐者.
3楼-- · 2018-12-31 00:28
INSERT INTO `table` (value1, value2) 
SELECT 'stuff for value1', 'stuff for value2' FROM `table` 
WHERE NOT EXISTS (SELECT * FROM `table` 
      WHERE value1='stuff for value1' AND value2='stuff for value2') 
LIMIT 1 

Alternatively, the outer SELECT statement can refer to DUAL in order to handle the case where the table is initially empty:

INSERT INTO `table` (value1, value2) 
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL
WHERE NOT EXISTS (SELECT * FROM `table` 
      WHERE value1='stuff for value1' AND value2='stuff for value2') 
LIMIT 1 
查看更多
浅入江南
4楼-- · 2018-12-31 00:30

use INSERT IGNORE INTO table

see http://bogdan.org.ua/2007/10/18/mysql-insert-if-not-exists-syntax.html

there's also INSERT … ON DUPLICATE KEY UPDATE syntax, you can find explanations on dev.mysql.com


Post from bogdan.org.ua according to Google's webcache:

18th October 2007

To start: as of the latest MySQL, syntax presented in the title is not possible. But there are several very easy ways to accomplish what is expected using existing functionality.

There are 3 possible solutions: using INSERT IGNORE, REPLACE, or INSERT … ON DUPLICATE KEY UPDATE.

Imagine we have a table:

CREATE TABLE `transcripts` (
`ensembl_transcript_id` varchar(20) NOT NULL,
`transcript_chrom_start` int(10) unsigned NOT NULL,
`transcript_chrom_end` int(10) unsigned NOT NULL,
PRIMARY KEY (`ensembl_transcript_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Now imagine that we have an automatic pipeline importing transcripts meta-data from Ensembl, and that due to various reasons the pipeline might be broken at any step of execution. Thus, we need to ensure two things: 1) repeated executions of the pipeline will not destroy our database, and 2) repeated executions will not die due to ‘duplicate primary key’ errors.

Method 1: using REPLACE

It’s very simple:

REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = ‘ENSORGT00000000001′,
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

If the record exists, it will be overwritten; if it does not yet exist, it will be created. However, using this method isn’t efficient for our case: we do not need to overwrite existing records, it’s fine just to skip them.

Method 2: using INSERT IGNORE Also very simple:

INSERT IGNORE INTO `transcripts`
SET `ensembl_transcript_id` = ‘ENSORGT00000000001′,
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

Here, if the ‘ensembl_transcript_id’ is already present in the database, it will be silently skipped (ignored). (To be more precise, here’s a quote from MySQL reference manual: “If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted.”.) If the record doesn’t yet exist, it will be created.

This second method has several potential weaknesses, including non-abortion of the query in case any other problem occurs (see the manual). Thus it should be used if previously tested without the IGNORE keyword.

There is one more option: to use INSERT … ON DUPLICATE KEY UPDATE syntax, and in the UPDATE part just do nothing do some meaningless (empty) operation, like calculating 0+0 (Geoffray suggests doing the id=id assignment for the MySQL optimization engine to ignore this operation). Advantage of this method is that it only ignores duplicate key events, and still aborts on other errors.

As a final notice: this post was inspired by Xaprb. I’d also advise to consult his other post on writing flexible SQL queries.

查看更多
呛了眼睛熬了心
5楼-- · 2018-12-31 00:40

Here is a PHP function that will insert a row only if all the specified columns values don't already exist in the table.

  • If one of the columns differ, the row will be added.

  • If the table is empty, the row will be added.

  • If a row exists where all the specified columns have the specified values, the row won't be added.

    function insert_unique($table, $vars)
    {
      if (count($vars)) {
        $table = mysql_real_escape_string($table);
        $vars = array_map('mysql_real_escape_string', $vars);
    
        $req = "INSERT INTO `$table` (`". join('`, `', array_keys($vars)) ."`) ";
        $req .= "SELECT '". join("', '", $vars) ."' FROM DUAL ";
        $req .= "WHERE NOT EXISTS (SELECT 1 FROM `$table` WHERE ";
    
        foreach ($vars AS $col => $val)
          $req .= "`$col`='$val' AND ";
    
        $req = substr($req, 0, -5) . ") LIMIT 1";
    
        $res = mysql_query($req) OR die();
        return mysql_insert_id();
      }
    
      return False;
    }
    

Example usage :

<?php
insert_unique('mytable', array(
  'mycolumn1' => 'myvalue1',
  'mycolumn2' => 'myvalue2',
  'mycolumn3' => 'myvalue3'
  )
);
?>
查看更多
唯独是你
6楼-- · 2018-12-31 00:46

Any simple constraint should do the job, if an exception is acceptable. Examples :

  • primary key if not surrogate
  • unique constraint on a column
  • multi-column unique constraint

Sorry is this seems deceptively simple. I know it looks bad confronted to the link you share with us. ;-(

But I neverleless give this answer, because it seem to fill your need. (If not, it may trigger your updating your requirements, which would be "a Good Thing"(TM) also).

Edited: If an insert would break the database unique constraint, an exception is throw at the database level, relayed by the driver. It will certainly stop your script, with a failure. It must be possible in PHP to adress that case ...

查看更多
临风纵饮
7楼-- · 2018-12-31 00:46

There are several answers that cover how to solve this if you have a UNIQUE index that you can check against with ON DUPLICATE KEY or INSERT IGNORE. That is not always the case, and as UNIQUE has a length constraint (1000 bytes) you might not be able to change that. For example, I had to work with metadata in WordPress (wp_postmeta).

I finally solved it with two queries:

UPDATE wp_postmeta SET meta_value = ? WHERE meta_key = ? AND post_id = ?;
INSERT INTO wp_postmeta (post_id, meta_key, meta_value) SELECT DISTINCT ?, ?, ? FROM wp_postmeta WHERE NOT EXISTS(SELECT * FROM wp_postmeta WHERE meta_key = ? AND post_id = ?);

Query 1 is a regular UPDATE query with no effect when the dataset in question is not there. Query 2 is an INSERT which depends on a NOT EXISTS, i.e. the INSERT is only executed when the dataset doesn't exist.

查看更多
登录 后发表回答