How to 'insert if not exists' in MySQL?

I started by googling, and found this article which talks about mutex tables.

I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (ie, one query to check and one to insert is the result set is empty)?

Does a unique constraint on a field guarantee the insert will fail if it’s already there?

It seems that with merely a constraint, when I issue the insert via php, the script croaks.




Answer

use INSERT IGNORE INTO table

see http://bogdan.org.ua/2007/10/18/mysql-insert-if-not-exists-syntax.html

there’s also INSERT … ON DUPLICATE KEY UPDATE syntax, you can find explanations on dev.mysql.com


Post from bogdan.org.ua according to Google’s webcache :

18th October 2007 To start: as of the latest MySQL, syntax presented in the title is not possible. But there are several very easy ways to accomplish what is expected using existing functionality. There are 3 possible solutions: using INSERT IGNORE, REPLACE, or INSERT … ON DUPLICATE KEY UPDATE. Imagine we have a table: ``` CREATE TABLE `transcripts` ( `ensembl_transcript_id` varchar(20) NOT NULL, `transcript_chrom_start` int(10) unsigned NOT NULL, `transcript_chrom_end` int(10) unsigned NOT NULL, PRIMARY KEY (`ensembl_transcript_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; ``` Now imagine that we have an automatic pipeline importing transcripts meta-data from Ensembl, and that due to various reasons the pipeline might be broken at any step of execution. Thus, we need to ensure two things: 1) repeated executions of the pipeline will not destroy our database, and 2) repeated executions will not die due to ‘duplicate primary key’ errors. Method 1: using REPLACE It’s very simple: ``` REPLACE INTO `transcripts` SET `ensembl_transcript_id` = ‘ENSORGT00000000001′, `transcript_chrom_start` = 12345, `transcript_chrom_end` = 12678; ``` If the record exists, it will be overwritten; if it does not yet exist, it will be created. However, using this method isn’t efficient for our case: we do not need to overwrite existing records, it’s fine just to skip them. Method 2: using INSERT IGNORE Also very simple: ``` INSERT IGNORE INTO `transcripts` SET `ensembl_transcript_id` = ‘ENSORGT00000000001′, `transcript_chrom_start` = 12345, `transcript_chrom_end` = 12678; ``` Here, if the ‘ensembl_transcript_id’ is already present in the database, it will be silently skipped (ignored). (To be more precise, here’s a quote from MySQL reference manual: “If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted.”.) If the record doesn’t yet exist, it will be created. This second method has several potential weaknesses, including non-abortion of the query in case any other problem occurs (see the manual). Thus it should be used if previously tested without the IGNORE keyword. There is one more option: to use INSERT … ON DUPLICATE KEY UPDATE syntax, and in the UPDATE part just do nothing do some meaningless (empty) operation, like calculating 0+0 (Geoffray suggests doing the id=id assignment for the MySQL optimization engine to ignore this operation). Advantage of this method is that it only ignores duplicate key events, and still aborts on other errors. As a final notice: this post was inspired by Xaprb. I’d also advise to consult his other post on writing flexible SQL queries.