It's very easy to have an auto_increment column in a table and maintain primary keys. Many people think so, and also they think that MySQL will do the best job for them. With InnoDB, the case is a little different as supposed to be. Basic assumption is that since InnoDB works with row-level locking, combining it with auto_increment should ward off worries. But it is just the beginning.
Let me clarify how auto_increment works. InnoDB maintains the auto_increment value in memory. Good approach. When an insert statement requests the auto_increment value, InnoDB takes the stored value, increment it by one and return. The value in memory is not incremented at this time. This is not bad, because who knows how many rows the insert statement has.
But, this means that InnoDB cannot provide auto_increment values to other threads until this finishes, hence we cannot have concurrent inserts even in InnoDB when using an auto_increment value. This is assured by a special AUTO-INC lock by InnoDB, which is an exclusive table level lock. We will see how this can affect your application and some workarounds.
Application may suffer
If you have concurrent inserts into an InnoDB table, start worrying. Single row inserts won't affect the performance noticeably. But, if you are doing bulk inserts and your inserts are taking enough time, one of your insert threads may exit because of lock wait timeout. Also, this will cause problems for other components of your application (even if they are read only) as these insert statements will keep connected and you may hit your max_connections.
- As you might think, moving to MyISAM won't help. MyISAM by default takes a table lock while writing.
- Do some magic and make sure your inserts do not execute concurrently. This is also not much possible in the case of web applications, where one doesn't have the control over the inserts, but neither web applications do bulk inserts. This can be managed where one is using a cron job to do bulk inserts, say from a log file.
- Get rid of auto_increment. If your application actually needs to do bulk inserts from multiple sources at the same time and the above solution doesn't make sense, figure out your own primary key. Something like a combination of MD5 checksum over data combined with the length of the data may be a good option in most cases, but do not rely on it. Also, remember a multi-column primary key is not always a good option.
Hope this blog was helpful to you. Keep posting your comments.