Thursday, July 12, 2007

InnoDB Tablespace Size

Unlike MyISAM where data for individual tables is stored in their respective files, InnoDB stores data in a tablespace. By default, there is one single tablespace and data of all the databases is stored in one file. This file has data dictionary, tables, as well as indexes in it. There is a global parameter innodb_data_file_path that defines this tablespace file. It has a syntax like ibdata1:256M:autoextend, this means at the beginning a file of size 256 MB will be created and then whenever the data size exceeds this, the file will be auto-extended. The innodb_autoextend_increment variable defines in MB's that by how much each increment should be.

Let's see how well can we play around:

  1. Inserts: Suppose you have too many inserts and InnoDB is extending the file too frequently. It makes sense to increase the value of innodb_autoextend_increment. Say we increase it to 16MB, then obviously the number of attempts to autoextend tablespace comes down by a factor of 2, hence performance. But beware before you take it too easy and increase the value too much. There is a big trap, we will come to it shortly.
  2. Deletes: Here is the trap. You have a 10 GB tablespace (after too many autoextends), delete some 5 GB data (data + indexes) and think now the tablespace is 5 GB. Wrong, InnoDB doesn't have the notion of giving back space to the file system. Though, it will make sure to use the freed up space for further inserts. So, this method directly cannot be used to free disk space. So, in case you have data which you can get rid of, get rid of quickly before the next autoextend is done. One thing that can be done to reclaim space is to use OPTIMIZE TABLE frequently enough on tables that have high volume of inserts and deletes. But again remember, MySQL locks a table during the time OPTIMIZE TABLE is running. Another Gotcha, right? OPTIMIZE TABLE does several other things for which it makes sense to run it, though not that frequently. I will be posting a blog soon on it. 
  3. Separate Files per Table: InnoDB provides this option where data (data + indexes) for each table can be stored in a separate file through a global variable innodb_file_per_table. Though still a shared tablespace will be created for storing the likes of data dictionary et al. But still this approach makes sense as having data in small chunks (separate files) will improve the scope of managing them well and may increase performance in general.
  4. Fixed Tablespace size: One way to work around with the tablespace file size problems is to fix the tablespace size (remove autoextend) to an extrapolated value. So, when you hit the limit, you know it is time to cleanup. This is not that viable with all the applications, as extrapolation is not always possible. And also it increases the complexity of the application, which then needs to take care of all such error conditions and not lose any data.

So, where does this end? You need to figure out what your data is, how critical it is, what all you want to do with it, what all you want your data to do. Then take some of the following steps.

  1. Move to MyISAM: For all the tables (or even databases), for which you feel data is not that critical to have transactions et al, move them to MyISAM. So, for the problem we can't solve completely, we destroy the problem.
  2. Separate Tablespace: Its a lot easier to maintain 10 small problems than a single big one.
  3. Delete data/OPTIMIZE TABLE: Figure out how soon you can get rid of data. You actually don't need to delete data as it is. Transfer it to a MyISAM table, compress the file and archive it somewhere else and then delete it from the main table. Likewise there are many ways to do it. Run OPTIMIZE TABLE frequently enough so that it doesn't bother your reads and writes too much and also it doesn't take too much time to run.

Hope this blog was helpful to you. Keep posting your comments.


Abu said...

Thanks bro,
This gave me a good understanding of the limitation of InnoDB in this respect.

KurasakiIchigo said...

Really helpful. Keep bloggin.

Manuel said...

Hi, I know this post is quite old, but I need to know how MySql is structured. I mean, in Oracle there is a redo buffer, an index area and other stuff in memory. And also, there are tablespaces and indexes in hard drives. I need to know if there's something like that in MySql.

Anurag said...

This is fantastic, Great article really helped me out

Greg said...

This is fantastic, Great article really helped me out