Showing posts with label Performance. Show all posts

Thursday, August 28, 2008

Variable's Day Out #16: innodb_log_file_size

Properties: 

Applicable To InnoDB
Server Startup Option --innodb_log_file_size=<value>
Scope Global
Dynamic No
Possible Values Integer: Range: 1M - 4G
<1M will be adjusted to 1M
Default Value 5M
Category Performance, Maintenance

Description:

This variable sets the size of each log file in a log group. When setting it, note that the combined size of all log files must be less than 4GB.

InnoDB requires these logs for recovery in case of a crash. So how does the size of these logs affect server performance? As the MySQL manual states, "The larger the value, the less checkpoint flush activity is needed in the buffer pool, saving disk I/O." Larger logs let InnoDB run more confidently: even if data is not flushed to persistent storage often, it can still be recovered from the logs.

Best Value:

A larger value helps performance, but only up to a point. Beyond a certain value, the gain becomes minimal or even negative. Another consideration is that too large a value slows down crash recovery, as there are more logs to scan. The default, however, is definitely too small.

My usual recommendation is to set it to 256M, or 128M if that feels too big (say, because you crash, and hence crash-recover, often). Anything beyond this range should be properly tested and justified.

How to set?

If you simply change this variable, MySQL will complain about the changed log file size and start without the InnoDB engine. The safe way to resize it is:
  1. Stop the MySQL server (make sure it shuts down cleanly, so there is nothing left to recover)
  2. Back up your data and log files
  3. Delete the log files
  4. Set the new value for innodb_log_file_size in my.cnf
  5. Start the MySQL server
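In my.cnf, the relevant line (step 4) would look something like this; the 256M figure is the recommendation above, not a universal value:

```
[mysqld]
# Size of each log file in the group; with the default of 2 log files
# this gives 2 x 256M of redo log (combined size must stay under 4GB).
innodb_log_file_size = 256M
```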

Tuesday, July 15, 2008

Variable's Day Out #15: bulk_insert_buffer_size

Properties:

Applicable To MyISAM
Server Startup Option --bulk_insert_buffer_size=<value>
Scope Both
Dynamic Yes
Possible Values Integer:

Range: 0 - 4294967295 (4G)

Default Value 8388608 (8M)
Category Performance

Description:

MyISAM uses this cache, a special tree-like structure, to optimize bulk inserts. Bulk inserts include statements like LOAD DATA INFILE ..., INSERT ... SELECT, and INSERT ... VALUES (...), (...), ..., (...).

Bulk inserts are often used to minimize disk writes and are very common in applications that insert lots of data. I often bulk insert using the following technique (pseudocode):

  • CREATE TEMPORARY TABLE tmp_table LIKE main_table
  • LOOP:
    • INSERT ROWS INTO TEMPORARY TABLE
  • INSERT INTO main_table SELECT * FROM tmp_table
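The pseudocode above might look like this in SQL; the table and column names are made up for illustration:

```sql
-- Hypothetical tables: main_table is the real table, tmp_table is a
-- session-scoped staging area with the same structure.
CREATE TEMPORARY TABLE tmp_table LIKE main_table;

-- Loop in the application: accumulate rows in the temporary table.
INSERT INTO tmp_table (id, payload) VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- One bulk insert into the main table, then reset the staging area.
INSERT INTO main_table SELECT * FROM tmp_table;
TRUNCATE TABLE tmp_table;
```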

If you use any bulk insertion technique, you should set this parameter carefully.

Optimal Value:

This depends on how your bulk inserts are structured. If your bulk inserts run to around 10k rows or more, this buffer really comes in handy.
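Since the variable has session scope, it can be raised just for a bulk-loading session; the 64M figure and file/table names here are only examples, not recommendations:

```sql
-- Raise the bulk insert buffer for this session only, then load.
SET SESSION bulk_insert_buffer_size = 64 * 1024 * 1024;
LOAD DATA INFILE '/tmp/data.csv' INTO TABLE main_table;  -- hypothetical file/table
```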


Hope you enjoyed reading this.

Friday, May 30, 2008

Variable's Day Out #14: log_queries_not_using_indexes

Properties:

Applicable To MySQL Server
Server Startup Option --log-queries-not-using-indexes
Scope Global
Dynamic Yes
Possible Values Boolean
Default False
Category Performance, Monitoring, Best Practices

Description:

If you have the slow query log enabled (with --log-slow-queries), this variable causes queries that do not use indexes to be logged as slow queries, regardless of their execution time.

Usage:

While designing a MySQL-oriented application, I generally design my schema first without introducing any indexes (the only exception being PRIMARY keys) and get the application ready. Next, I enable log_queries_not_using_indexes and start index-usage testing: analyze all the logged queries, introduce indexes one by one, and you are all set. This helps a lot in rewriting queries, figuring out the best possible index, and so on. Many times I have seen that swapping the conditions in a WHERE clause makes it possible for a query to use an existing index.
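Since the variable is dynamic, the testing phase described above can be switched on without a restart. A minimal sketch (the slow log itself must already be enabled at startup; the table in the EXPLAIN is hypothetical):

```sql
-- Turn on logging of index-less queries for the test run...
SET GLOBAL log_queries_not_using_indexes = ON;

-- ...exercise the application, then check how a suspect query uses indexes.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Switch it back off when done.
SET GLOBAL log_queries_not_using_indexes = OFF;
```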

For performance glitches and production outages on already-deployed systems, this flag serves as a good starting point.

Though, as always, the usefulness of a variable depends on circumstances. For my conditions and usage patterns, this variable comes in very handy.


Hope you enjoyed reading this.

Tuesday, May 20, 2008

Variable's Day Out #13: binlog_format

Properties:

Applicable To MySQL Server
Introduced In 5.1.5
Server Startup Option --binlog-format=<value>
Scope Both
Dynamic Yes
Possible Values enum(ROW, STATEMENT, MIXED)
Default < 5.1.12: STATEMENT
>= 5.1.12: MIXED
Categories Replication, Performance

Description:

Starting with 5.1.5, MySQL implements a ROW-based replication format, which logs the physical changes to individual rows. To many users this looks like the optimal approach, but often it is not. Consider, for example, a statement that bulk inserts thousands of rows: with ROW-based logging there will be that many entries in the binlog, whereas otherwise it would have been one single statement.

STATEMENT-based replication, the good old historical way, propagates SQL statements from master to slave and has worked well all these years, except for a few cases like statements using non-deterministic UDFs.

In MIXED format, MySQL uses STATEMENT-based replication by default, switching to ROW-based logging in a few cases, such as:

  • when USER(), CURRENT_USER() are used
  • when a call to UDF is involved
  • when 2 or more tables with AUTO_INCREMENT columns are updated.

For the full list of such cases, refer to the official documentation.
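Since the variable is dynamic at both scopes, the modes above can be inspected and switched at runtime, for example:

```sql
-- Check the current logging format.
SHOW GLOBAL VARIABLES LIKE 'binlog_format';

-- Switch the server default (takes effect for new sessions).
SET GLOBAL binlog_format = 'MIXED';

-- Or override for one session, e.g. to force row logging around a risky procedure.
SET SESSION binlog_format = 'ROW';
```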

What about UDFs?

A user-defined function or stored procedure is very hard to predict, and statement-based replication can create inconsistencies in such cases. A few days back, I saw a case where a table included in the Replicate_Ignore_Table list was still receiving changes propagated by statements from a procedure. If you have procedures and cases like this, consider using ROW or MIXED mode replication.

What to use?

Though ROW-based replication is the safest with respect to inconsistencies, it may lead to sheer performance degradation owing to binlog size. STATEMENT-based replication, on the other hand, has drawbacks with UDFs and the like. IMHO, MIXED mode is the best way out unless you have some very special case. The only concern is that there might be more cases that need to be handled by MIXED mode than are currently served. We need time's stamp on it. :)


Hope you enjoyed reading this.

Monday, May 19, 2008

Variable's Day Out #12: innodb_flush_method

Properties:

Applicable To InnoDB on Unix like OS
Server Startup Option --innodb_flush_method=<value>
Scope Global
Dynamic No
Possible Values enum(O_DSYNC, O_DIRECT, <<none>> )
Default Value <<none>>
Category Performance

Description:

This variable changes the way InnoDB opens files and flushes data to disk, and should be considered very important for InnoDB performance. By default, InnoDB uses fsync() (without O_DSYNC) to flush both log and data files.

Setting this variable to O_DIRECT makes InnoDB use O_DIRECT when opening files and fsync() to flush both data and log files. O_DIRECT is useful when an application maintains its own caching mechanism, which is very much true for MySQL/InnoDB. O_DIRECT is the option to use in most cases, as it avoids the overhead of double buffering and reduces swap pressure. Be careful, though, if you are not using a battery-backed RAID write cache, as a crash may then lead to data loss. Also, the MySQL documentation warns against using O_DIRECT when files are located on a SAN; I have no experience with that.

O_DSYNC makes InnoDB use O_SYNC to open and flush the log files, and fsync() to flush the data files.

For Windows, the flush method is always async_unbuffered.

Recommendation:

Unless you are doing something unusual like SAN storage (which you should reconsider anyway), always use O_DIRECT. Removing the double buffering leads to a significant improvement in InnoDB performance.
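The recommendation boils down to one line in my.cnf; remember the variable is not dynamic, so a server restart is needed:

```
[mysqld]
# Bypass the OS page cache for data files; InnoDB's buffer pool does the caching.
innodb_flush_method = O_DIRECT
```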


Hope you enjoyed reading this.

Sunday, May 18, 2008

Variable's Day Out #11: large_pages

Properties:

Applicable To MySQL/Linux
Server Startup Option --large-pages
Scope Global
Dynamic No way
Possible Values True|False flag
Default Value False
Category Performance, Administration

Description:

This option, currently available only on Linux, enables the use of large pages by MySQL. Many operating systems/system architectures support optional memory pages bigger than the default size (4 KB). Typically the large page size is 2 MB; the actual value can be checked in MySQL's large_page_size variable.

For applications that do a lot of memory accesses, as MySQL does, fewer TLB (Translation Lookaside Buffer) misses mean better performance. Large pages also make MySQL buffers less susceptible to being swapped out.

When to Use?

Try to keep it enabled; I have never heard of any side effects (in other words, if you find any, do let me know and I will update the post :)). Other things to consider:

  1. How much of swapping is happening on the system?
  2. What other applications run on that system? It is a bad idea to run other heavy applications on the same box as MySQL (at least in production), but if you can't get rid of them, using large pages will help MySQL avoid swapping.

How to configure?

Before large pages can be used on Linux, it is necessary to configure the HugeTLB memory pool.

  1. Check your total memory (MemTotal) and huge page size (Hugepagesize) in /proc/meminfo, and decide how many huge pages you want to run with.
  2. Set the number of huge pages for the system: sysctl -w vm.nr_hugepages=<value>
  3. Set memlock for the mysql user in /etc/security/limits.conf so this user is allowed to lock large amounts of memory.
  4. Stop MySQL.
  5. Restart your system.
  6. Start the MySQL server with the large-pages option.

If nothing goes wrong, you should be all set and going.
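As a rough sizing sketch for steps 1-2, assuming the usual 2 MB huge page size and a made-up 512 MB InnoDB buffer pool with some headroom:

```shell
# Hypothetical sizing: huge pages needed to back a 512 MB buffer pool,
# plus ~5% headroom for other locked InnoDB allocations.
BUFFER_POOL_MB=512
HUGEPAGE_SIZE_MB=2          # check Hugepagesize in /proc/meminfo
PAGES=$(( (BUFFER_POOL_MB + BUFFER_POOL_MB / 20) / HUGEPAGE_SIZE_MB ))
echo "vm.nr_hugepages = $PAGES"
# Apply (as root):  sysctl -w vm.nr_hugepages=$PAGES
```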


Hope you enjoyed reading this.

Thursday, May 15, 2008

Memcached, but do you need it?

With all due respect to the technology and its advocates (myself included), after a surge in articles describing the merits of memcached, I'm just inserting a thought breakpoint for developers: do you actually need it?

Recently, I ran into cases where developers decided to use a memcached-over-MySQL style architecture after reading some good articles about it, without giving a damn about their own requirements. Here are a few things developers can use as a checklist when deciding on their architecture. There is still no precise answer, but sometimes a few options can simply be ruled out :).

  1. What is the total size of your data? It may well be that you can keep the data in memory on each node, or that MySQL can keep the whole thing (data + indexes) in a buffer.
  2. How frequently is your data updated? Very frequent updates may lead to a low cache hit ratio for memcached data, and refreshing memcached too often adds unnecessary overhead. Remember: [get, get, set] vs [get].
  3. What is the peak load on your system? Consider whether MySQL itself can handle the peak load, or conversely whether even memcached could handle it with the given infrastructure.

I generally ask people a simple question: why do they think they should be using memcached (or anything else, for that matter)? To shock them, I even ask, "Why do you think you should be using MySQL?" And believe me, these are questions developers should be asking themselves.

There is only one good argument against this: what if tomorrow you suddenly need to scale, or your projections call for memcached? For such cases, I suggest designing your data layer flexibly, flexible enough to let things in and out.

Wednesday, April 16, 2008

Variable's Day Out #8: innodb_additional_mem_pool_size

Properties:

Applicable To InnoDB
Server Startup Option --innodb-additional-mem-pool-size=<value>
Scope Global
Dynamic No
Possible Values Integer: Range: 524288 (512K) - 4294967295 (4G)
Default Value 1048576 (1M)
Category Performance

Description:

As per the MySQL documentation, this pool is used to store data dictionary information and other internal data structures. If InnoDB runs out of memory in this pool, it starts allocating from the OS.

Most of the additional memory pool usage goes to data dictionary entries for tables and to connections. Normally, you may need to allocate more memory to this pool as the number of tables grows. But as modern OSes have good, fast memory-allocation functions, this variable is not performance-critical.

Anyhow, to see how much of the additional memory pool is being used, check SHOW ENGINE INNODB STATUS:

SHOW ENGINE INNODB STATUS \G
.
.
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated XXXXXXXXXXXXX; in additional pool allocated XXXXXXX
Dictionary memory allocated XXXXX
.
.
additional pool allocated shows the amount of memory allocated in the additional memory pool, i.e. the usage. Of this usage, Dictionary memory allocated shows the amount being used for the data dictionary.
Best Value:

A smaller value hardly has any impact. Still, it is better to use a value around 20% more than what SHOW ENGINE INNODB STATUS shows for additional pool allocated. The requirement is roughly proportional to (number of tables + number of concurrent connections).

Hope you enjoyed reading this post.

Tuesday, April 15, 2008

Variable's Day Out #7: innodb_autoinc_lock_mode

Properties:

Applicable To InnoDB
Introduced In 5.1.22
Server Startup Option --innodb-autoinc-lock-mode=<value>
Scope Global
Dynamic No
Possible Values enum(0,1,2)
Interpretation:
Value Meaning
0 Traditional
1 Consecutive
2 Interleaved
Default Value 1 (consecutive)
Categories Scalability, Performance

Description:

This variable was introduced in 5.1.22 as a result of [Bug 16979] and comes in very handy when you are stuck with the auto-increment scalability issue, also mentioned in my previous post. So what do traditional, consecutive, and interleaved mean?

Traditional is just that: it takes InnoDB back to pre-innodb_autoinc_lock_mode behavior, where a table-level AUTO-INC lock is obtained and held until the statement completes. This ensures that a single statement gets consecutive auto-increment values. Remember, this lock is scoped to a statement, not a transaction, and hence is not equivalent to serializing transactions, as someone asked me recently.

Consecutive, the default lock mode, switches between behaviors. For inserts where the number of rows is not known in advance (bulk inserts), i.e. INSERT ... SELECT, REPLACE ... SELECT, and LOAD DATA, it takes the table-level AUTO-INC lock. For inserts where the number of rows is known in advance (simple inserts), it instead uses a lightweight mutex during the allocation of auto-increment values; the mutex is used only if no other transaction holds the AUTO-INC lock. However, for inserts where the user provides auto-increment values for some rows (mixed-mode inserts), InnoDB may allocate more values than it uses and lose the excess.

Interleaved mode only guarantees that each generated auto-increment value is unique. It never takes an AUTO-INC lock, and multiple statements can generate values simultaneously.

How to use?

My overall recommendation is to leave this variable at its default. And if you have mixed-mode insert statements that contradict that usage, better take a look at them. Otherwise, the constraints on usage are:

  • Use interleaved only when your tables don't have auto-increment columns. Also if you don't know/don't care if they have, then you have more issues to resolve. :)
  • "mixed mode inserts" can lead to losing values with consecutive mode.
  • It's not safe to use statement based or mixed replication with interleaved mode.
  • Traditional mode has scalability issues, but is safe when used with mixed mode inserts.
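A mixed-mode insert of the kind that can lose values under consecutive mode looks like this; the table is hypothetical:

```sql
CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, v CHAR(1)) ENGINE=InnoDB;

-- Mixed-mode insert: some rows supply an explicit id, others rely on
-- auto-increment. Under lock mode 1 InnoDB may reserve more values than
-- it finally uses, leaving gaps in the sequence.
INSERT INTO t (id, v) VALUES (1, 'a'), (NULL, 'b'), (5, 'c'), (NULL, 'd');
```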


Hope you enjoyed reading this.

Friday, April 11, 2008

Variable's Day Out #6: Innodb_buffer_pool_reads

Properties:

Applicable To InnoDB
Type Status Variable
Scope GLOBAL | SESSION
Dynamic NA
Possible Values Integer
Default Value NA
Category Performance

Description:

According to the official documentation, this variable counts "The number of logical reads that InnoDB could not satisfy from the buffer pool and had to do a single-page read."

But we can go a little further and see what happens in and around this variable. As per the MySQL Internals documentation, a buffer is a memory area holding copies of pages that ordinarily live in the main data file (read: InnoDB tablespace), and the "buffer pool" is simply the set of all buffers. When InnoDB needs to access a page, it first looks in the buffer pool; if the page isn't there, InnoDB reads it from disk into a new buffer, evicting old buffers (based on an LRU algorithm) when it has to make space. It is at this point that this variable gets incremented.

Best value:

Without doubt, 0 (zero). But that is only possible if you are not using InnoDB. :)

So, while using InnoDB, you want this variable to stay as small as possible compared to Innodb_buffer_pool_read_requests. If you see a high value, you might want to increase innodb_buffer_pool_size and/or take a hard look at the size of your data and indexes.
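A quick way to compare the two counters; the sample numbers below are made up:

```sql
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- buffer pool hit ratio =
--   (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100
-- e.g. reads = 12,000 and read_requests = 5,000,000 gives 99.76%,
-- which would normally be considered healthy.
```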

What else:

I always used to wonder the difference between Innodb_buffer_pool_reads and Innodb_pages_read, but never too seriously until I read Jay Janssen's query about it. And then thought of figuring out myself. As I stated there also "Innodb_pages_read are the pages read from buffer pool and Innodb_buffer_pool_reads are the pages read for the buffer pool", so no confusion.


Hope you enjoyed reading this.

Tuesday, April 8, 2008

Variable's Day Out #5: innodb_thread_concurrency

Properties:

Applicable To InnoDB
Server Startup Option --innodb_thread_concurrency=<value>
Scope Global
Dynamic Yes
Possible Values Integer:
Range: 0 - 1000
Interpretation:
MySQL Version Value
4.x 0 - sets it to 1
>500 - Infinite Concurrency
< 5.0.19 0 - sets it to 1
>= 20 - Infinite Concurrency
>= 5.0.19 0 - Infinite Concurrency
Default Value
MySQL Version Default Value
< 5.0.8 8 (Finite)
5.0.8 - 5.0.18 20 (Infinite)
5.0.19 - 5.0.20 0 (Infinite)
>= 5.0.21 8 (Finite)
Category Performance

Description:

innodb_thread_concurrency limits the number of operating system threads that can run concurrently inside the InnoDB engine. The rest of the threads have to wait in a FIFO queue for execution. Threads waiting for locks are not counted in the number of concurrently executing threads.

At the start of a thread's execution, InnoDB checks the count of already-running threads. If it is greater than or equal to the limit set by this variable, the thread waits for some time and tries again. If the second try also fails, the thread enters a FIFO queue. The sleep time was originally 50 ms, later changed to 10 ms, and from 5.0.3 onwards is managed by innodb_thread_sleep_delay. On entering InnoDB, the thread is given a number of "free tickets", as defined by innodb_concurrency_tickets.

Optimal Setting:

It depends on your environment: the MySQL version, the number of CPUs and disks, and the kind of load you are running. As a general principle, set it to twice the number of CPUs and disks; consider only disks for disk-I/O-intensive loads and only CPUs for CPU-intensive loads. Also, if you have 4 or more CPUs, consider lowering this variable on un-patched MySQL versions, courtesy of [Bug 15815] (read the bug for the discussion). For machines with 1 or 2 CPUs, go for the kill: set it to infinite.
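The "twice the number of CPUs and disks" rule of thumb, sketched with made-up hardware counts:

```shell
# Hypothetical box: 4 CPUs and 4 data disks, mixed workload.
CPUS=4
DISKS=4
CONCURRENCY=$(( 2 * (CPUS + DISKS) ))
echo "innodb_thread_concurrency = $CONCURRENCY"
```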

Hyperthreading is a different problem from multi-CPU scaling: it tries to make a single processor work like two while the two instances share pipelines and cache. You might consider turning hyperthreading off when running MySQL.

In my experience, a big enough innodb_buffer_pool_size complements this variable well on a multi-CPU machine, with the bug resolved of course. I will be publishing some numbers on it soon.


Hope this post was helpful to you.

Thursday, April 3, 2008

Variable's Day Out #4: innodb_buffer_pool_size

Properties:

Applicable To InnoDB
Server Startup Option --innodb_buffer_pool_size=<value>
Scope Global
Dynamic No
Possible Values Integer
Range: 1048576 (1MB) - 4294967295 (4GB)
(more for 64 bit machines)
Default Value 8388608 (8MB)
Category Performance

Description:

One of Peter's "what to tune?" parameters, the InnoDB buffer pool is a cache for both indexes and data and should be considered the most important option for InnoDB performance. If your server hosts an InnoDB-heavy database, you can allocate up to 70-80% of physical memory to this buffer; anything more might cause paging in the operating system. Obviously, if your server is not InnoDB-heavy and you use a mix of engines, dividing memory becomes a lot more complicated. Take that as an argument against mixing engine types. :)

Optimal Setting:

As always, there is no magic number; the optimal value depends on your requirements. If your data size is very large, simply give this buffer pool as much as you can. While doing so, remember that InnoDB will also use some extra space for data structures related to the buffer pool. On the other hand, if your projected database size (data + indexes) is small enough to fit in memory, allocate around 10-15% more than your data size.
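The two sizing cases above can be sketched as arithmetic; the machine and data sizes are invented for illustration:

```shell
# Case 1: InnoDB-heavy server -- give the pool ~75% of physical RAM.
RAM_MB=8192
POOL_MB=$(( RAM_MB * 75 / 100 ))

# Case 2: data + indexes fit in memory -- data size plus ~15% headroom.
DATA_MB=2000
SMALL_POOL_MB=$(( DATA_MB + DATA_MB * 15 / 100 ))
echo "$POOL_MB $SMALL_POOL_MB"
```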


Hope this post was helpful to you. Keep posting your comments.

Friday, March 28, 2008

Variable's Day Out #2: key_buffer_size

Properties:

Engine(s) MyISAM
Server Startup Option --key_buffer_size=<value>
Scope Global
Dynamic Yes
Possible Values Integer
Range: 8 - 4294967295 (4 GB)
Default Value 131072 (128 KB)
Category Performance

Description:

This is a global buffer where MySQL caches frequently used blocks of index data for MyISAM tables. The maximum allowed size is 4GB on 32-bit platforms; from MySQL 5.0.52, greater values are permitted on 64-bit platforms.

Keeping this buffer to an optimal value (neither too low nor too high) contributes heavily to the performance of your MySQL server. As given in the MySQL Documentation, "using a value that is 25% of total memory on a machine that mainly runs MySQL is quite common". As suggested by Peter, "Set up to 30-40% of available memory if you use MyISAM tables exclusively. Right size depends on amount of indexes, data size and workload".

Starting at 25%, one can keep monitoring the key cache hit ratio and increase the buffer over time; a hit ratio of more than 95% should be good enough. To avoid allocating too much memory to this buffer, monitoring key buffer usage also helps: I feel one should target more than 80% buffer usage.

Even if you use very few MyISAM tables, the key buffer is still useful for MySQL's internal system tables and for temporary tables.

Calculations:

  • key_cache_hit_ratio = (1 - Key_reads/Key_read_requests) * 100
  • key_buffer_usage    = (1 - (Key_blocks_unused * key_cache_block_size)/key_buffer_size) * 100
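With made-up counter values, the two formulas work out like this (awk is used for the floating-point math; the inputs would come from SHOW GLOBAL STATUS and SHOW VARIABLES):

```shell
# Hypothetical counter values.
KEY_READS=20000
KEY_READ_REQUESTS=1000000
KEY_BLOCKS_UNUSED=1000
KEY_CACHE_BLOCK_SIZE=1024
KEY_BUFFER_SIZE=$(( 16 * 1024 * 1024 ))

HIT_RATIO=$(awk -v r=$KEY_READS -v q=$KEY_READ_REQUESTS \
    'BEGIN { printf "%.1f", (1 - r / q) * 100 }')
USAGE=$(awk -v u=$KEY_BLOCKS_UNUSED -v s=$KEY_CACHE_BLOCK_SIZE -v b=$KEY_BUFFER_SIZE \
    'BEGIN { printf "%.1f", (1 - (u * s) / b) * 100 }')
echo "hit ratio: $HIT_RATIO%  buffer usage: $USAGE%"
```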

Read More:

MySQL manual entry on key_buffer_size

What to tune in MySQL server after installation

MyISAM Key Cache - MySQL Manual

 

Hope this post was helpful to you.

Thursday, March 27, 2008

Variable's Day Out #1: delay_key_write

After seeing some very good "of-the-day" series, I thought: why should MySQL variables be left behind? So here is my contribution to the race, "Variable's Day Out". I won't brand it an "of-the-day" series, as that calls for too much dedication, and given my track record I shouldn't promise that. Henceforth, instead of calling a variable lucky enough to get a day, let's call a day lucky enough to get a variable. Though I will try my best to keep the lucky days rolling.

Today's variable is delay_key_write. Properties:

Engines MyISAM
Server Startup Option --delay-key-write[=<OFF|ON|ALL>]
Scope Global
Dynamic Yes
Possible Values enum('ON', 'OFF', 'ALL')
Default Value ON
Category Performance

Description:

For MyISAM tables, delayed key writing prevents key buffers from being flushed to disk between writes.

Values:

OFF Disables delayed key writing.
ON Enables delayed key writing for tables created with the DELAY_KEY_WRITE option.
ALL Enables delayed key writing for all MyISAM tables.

Pros:

Enabling delayed key writing is good for heavily used tables (read- or write-heavy) as it decreases disk accesses.

Cons:

If the server crashes, the indexes may be left corrupted and have to be rebuilt.

Read more:

MySQL manual entry on delay_key_write

Hope this post was helpful to you. Keep posting your comments.

MyEye update

For ages I have been busy with many projects, including my marriage. It's nice to be back on the blog with an update on MyEye, the project I announced in my very first post. MyEye is (or is going to be) an open source monitoring tool for MySQL installations, with a handy knowledge base and advisories for MySQL DBAs. I'm done with the first round of analysis and design for MyEye. As currently planned, it will be a framework providing an interface for people to write their own monitoring rules. Meanwhile, the MyEye web site is also under construction, and I will publish the documentation there soon. I look forward to your contributions in creating an open knowledge base for MyEye.

Thursday, November 22, 2007

GET_LOCK & FAMILY

Once again, I have a well-neglected yet documented feature of MySQL. As we all often need and use locks in MySQL, we tend to forget (or not bother) about MySQL internals and how they cause trouble. For example, try something like LOCK TABLE ... WRITE on an InnoDB table inside a transaction and watch the same transaction get timed out while waiting for a lock on one of the rows, ref: [Bug 5998]. All these problems occur when the semantics of a statement differ between the MySQL and engine levels. Recently, though, we figured out a good technique for keeping the logic with ourselves and not relying on MySQL too much. Even this technique is not foolproof in all cases.

The secret is MySQL's GET_LOCK function. GET_LOCK(str, timeout) tries to get an exclusive lock with the name str, using a timeout of timeout seconds. The return value is tri-state: 1 if the lock was obtained successfully, 0 on timeout, and NULL in case of an error. Applications that only care about whether they got the lock can use the following:

SELECT COALESCE(GET_LOCK('lock_string_name', 2), 0)
This is guaranteed to return either 1 or 0, meaning the lock was taken or not, respectively. Locks taken with GET_LOCK are released under the following conditions:
  • The lock is released explicitly using the RELEASE_LOCK function.
  • A new GET_LOCK request is issued.
  • The connection terminates.
Other functions in this family are IS_FREE_LOCK and IS_USED_LOCK, which check a lock string's status. IS_FREE_LOCK returns 1 if the string is free, 0 if it is in use, and NULL in case of an error. IS_USED_LOCK returns the connection identifier of the client that holds the lock, otherwise NULL.
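Putting the family together, a typical round trip might look like this; the lock name and timeout are arbitrary:

```sql
-- Try to become the active worker; give up after 2 seconds.
SELECT COALESCE(GET_LOCK('nightly_job', 2), 0) AS got_it;

-- Optional diagnostics: who holds it, is it free?
SELECT IS_USED_LOCK('nightly_job');   -- connection id of the holder, or NULL
SELECT IS_FREE_LOCK('nightly_job');   -- 1 if free

-- ... do the mutually exclusive work, then release.
SELECT RELEASE_LOCK('nightly_job');
```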

USEFULNESS:

  • This is one way of maintaining mutual exclusivity for a process in a distributed environment while also ensuring BCP: have a named lock that can be obtained by only one process at a time, while the rest of the processes time out and retry for the lock after a sound sleep.
  • The lock is very light weight and doesn't eat up too many resources at the MySQL server end.
  • The lock lives at the MySQL level; engines are not involved at all, which saves you from reading bundles of documentation before using it.
CONS:
  • It requires due diligence in the application to ensure that all conflicting transactions are taken care of. Even one single left-out transaction can spoil the whole game. So, I would suggest such techniques are not good for heavy (read: lines of code) applications, unless you have a good QA team.
  • It cannot fully replace other locks, as a connection can hold only one such lock at a time.
MISC:
  • These locks live at the connection level and do not respect transactions; committing a transaction has no effect on them. Whether that is fair depends on the use case, hence this point sits under MISC :)

Hope this blog was helpful to you. Keep posting your comments.

Wednesday, September 12, 2007

InnoDB Deadlock - Next Key Locking

So you use InnoDB, have indexes on your tables, think of row-level locking and concurrent queries, feel good, and go to sleep, forgetting all the while that UPDATE and SELECT ... FOR UPDATE statements will (or may) also use those same indexes for scanning or updating. Then what, you may ask?

Well, InnoDB row-level locking works somewhat differently when indexes are used: InnoDB locks index records in shared or exclusive mode when searching or scanning an index. Therefore, as the MySQL documentation mentions, the row-level locks are actually index-record locks.

To complicate matters (or resolve issues) further, the lock is a gap lock: a lock on the gap before an index record.

As per the example in the MySQL documentation, let's say we have a query like this:

SELECT * FROM child WHERE id > 100 FOR UPDATE;
There is an index on the id column. The query locks all index records where id is greater than 100 (103, 107, and so on), and it also locks the gaps, blocking inserts of values such as 101, 102, 104, 105, 106, 108, and above. If this were not done, new rows could be inserted while the transaction is under way.

Now consider a slightly tricky problem. Suppose we have a table table_name with an index on colx, and two queries running in parallel. Each query works on only one row, and as far as the application is concerned, they always touch different rows.

The queries we have are:
mysql> SELECT * FROM table_name WHERE
    -> colx IS NULL
    -> LIMIT 1
    -> FOR UPDATE;
and
mysql> UPDATE table_name SET
    -> colx = NULL, coly = NULL
    -> WHERE colz = 'something';
The first query starts locking index records, but not in a single shot: it goes record by record. Meanwhile, the second query acquires a lock on the record holding the current value of colx and proceeds towards the NULL-value records. By the time the second query reaches the NULL-value record, that record has been locked by the first query, while the first query in turn waits for the lock already grabbed by the second one. Deadlock!

You need to be very (un)lucky to achieve this.

Workarounds
  1. Figure out all the indexes and queries where such a deadlock can happen. In most cases, purely theoretical investigation leads to unnecessary paranoia, so it's better to run a regression test on your system and monitor for deadlocks. If you can, do away with the offending indexes.
  2. If you feel you cannot live without your indexes, and the queries involved in the deadlocks are mutually exclusive, you can go ahead and enable innodb_locks_unsafe_for_binlog, without citing this blog as a reference. It's all your choice.
  3. The tests I conducted were on MySQL 4.1.x. From MySQL 5.0.2 onwards, UPDATE and DELETE are claimed to lock only the rows that are actually affected, which should reduce the probability of such deadlocks. I haven't tested this myself, though.
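Whichever workaround you pick, deadlocks can still slip through, and the standard application-side answer is to retry the transaction. A minimal sketch, assuming a DB-API style driver; `DeadlockError` here is a stand-in for your driver's deadlock exception (MySQL reports error 1213, ER_LOCK_DEADLOCK):

```python
import time

class DeadlockError(Exception):
    """Stand-in for your driver's deadlock exception (MySQL error 1213)."""

def run_with_retry(txn, retries=3, backoff=0.05):
    """Call txn(); on deadlock, wait briefly and try again."""
    for attempt in range(retries):
        try:
            return txn()
        except DeadlockError:
            if attempt == retries - 1:
                raise  # out of attempts, surface the deadlock
            time.sleep(backoff * (attempt + 1))

# Usage with a stub transaction that deadlocks twice, then succeeds:
calls = {"n": 0}
def txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError()
    return "committed"

print(run_with_retry(txn))  # → committed
```

The backoff between attempts is optional, but it reduces the chance of the same two transactions colliding again immediately.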
Hope this blog was helpful to you. Keep posting your comments.

Sunday, July 22, 2007

OLAP Paradise - WITH ROLLUP

Though this is an age-old concept, after seeing how many people are unaware of it, I thought of writing a bit about it. ROLLUP is a GROUP BY modifier that adds extra rows to the output containing summary data. At each level of aggregation, the aggregated column is filled with a NULL value.

Let's see a small example. I know the visit counts are unrealistically small for the data presented, but that keeps the output readable.

   mysql> CREATE TABLE `rollup_1` (
     `url` varchar(255) DEFAULT NULL,
     `year` int(11) DEFAULT NULL,
     `country` char(2) DEFAULT NULL,
     `visit` int(11) DEFAULT NULL
   ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
   Query OK, 0 rows affected (0.01 sec)

   mysql> SELECT * FROM rollup_1;
   +----------------------+------+---------+-------+
   | url                  | year | country | visit |
   +----------------------+------+---------+-------+
   | http://www.yahoo.com | 2005 | US      | 12000 |
   | http://www.yahoo.com | 2005 | IN      | 10000 |
   | http://www.yahoo.com | 2006 | US      | 16000 |
   | http://www.yahoo.com | 2006 | IN      | 10700 |
   | http://www.yahoo.com | 2007 | US      | 21000 |
   | http://www.yahoo.com | 2007 | IN      | 13200 |
   +----------------------+------+---------+-------+
   6 rows in set (0.00 sec)

   mysql> SELECT IFNULL(url, 'TOTAL') AS url, SUM(visit) AS visit FROM rollup_1 GROUP BY url WITH ROLLUP;
   +----------------------+-------+
   | url                  | visit |
   +----------------------+-------+
   | http://www.yahoo.com | 82900 |
   | TOTAL                | 82900 |
   +----------------------+-------+
   2 rows in set, 1 warning (0.00 sec)

Complete analysis:
   mysql> SELECT IFNULL(url, 'ALL_URLS') AS url,
       -> IFNULL(year, 'ALL_YEARS') AS year,
       -> IFNULL(country, 'ALL_COUNTRIES') AS country,
       -> SUM(visit) FROM rollup_1
       -> GROUP BY url, year, country WITH ROLLUP;
   +----------------------+-----------+---------------+------------+
   | url                  | year      | country       | SUM(visit) |
   +----------------------+-----------+---------------+------------+
   | http://www.yahoo.com | 2005      | IN            |      10000 |
   | http://www.yahoo.com | 2005      | US            |      12000 |
   | http://www.yahoo.com | 2005      | ALL_COUNTRIES |      22000 |
   | http://www.yahoo.com | 2006      | IN            |      10700 |
   | http://www.yahoo.com | 2006      | US            |      16000 |
   | http://www.yahoo.com | 2006      | ALL_COUNTRIES |      26700 |
   | http://www.yahoo.com | 2007      | IN            |      13200 |
   | http://www.yahoo.com | 2007      | US            |      21000 |
   | http://www.yahoo.com | 2007      | ALL_COUNTRIES |      34200 |
   | http://www.yahoo.com | ALL_YEARS | ALL_COUNTRIES |      82900 |
   | ALL_URLS             | ALL_YEARS | ALL_COUNTRIES |      82900 |
   +----------------------+-----------+---------------+------------+
   11 rows in set, 3 warnings (0.00 sec)
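What ROLLUP computes can be emulated in application code: one super-aggregate per grouping prefix, keyed by NULL in the rolled-up columns. A sketch in Python over the six sample rows above, using `None` the way MySQL uses NULL (the `rollup` function and its tuple layout are mine for illustration, not anything MySQL provides):

```python
rows = [  # (url, year, country, visit): the six rows from the table
    ("http://www.yahoo.com", 2005, "US", 12000),
    ("http://www.yahoo.com", 2005, "IN", 10000),
    ("http://www.yahoo.com", 2006, "US", 16000),
    ("http://www.yahoo.com", 2006, "IN", 10700),
    ("http://www.yahoo.com", 2007, "US", 21000),
    ("http://www.yahoo.com", 2007, "IN", 13200),
]

def rollup(rows, ncols):
    """GROUP BY the first ncols columns WITH ROLLUP, summing the last column."""
    totals = {}
    for row in rows:
        key, visit = row[:ncols], row[-1]
        # Aggregate at every prefix level: (url, year, country),
        # (url, year, None), (url, None, None), (None, None, None).
        for level in range(ncols, -1, -1):
            k = key[:level] + (None,) * (ncols - level)
            totals[k] = totals.get(k, 0) + visit
    return totals

t = rollup(rows, 3)
print(t[("http://www.yahoo.com", 2005, None)])  # → 22000
print(t[(None, None, None)])                    # → 82900
```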
There is a small gotcha while using ROLLUP. As you can see, my table definition does not put a NOT NULL constraint on any of the columns. Though you might have guessed what I'm pointing at, let's see an example.
   mysql> INSERT INTO rollup_1 values ('http://www.yahoo.com', 2007, null, 12000);
   Query OK, 1 row affected (0.00 sec)

   mysql> SELECT IFNULL(url, 'ALL_URLS') AS url,
       -> IFNULL(year, 'ALL_YEARS') AS year,
       -> IFNULL(country, 'ALL_COUNTRIES') AS country,
       -> SUM(visit) FROM rollup_1
       -> GROUP BY url, year, country WITH ROLLUP;
   +----------------------+-----------+---------------+------------+
   | url                  | year      | country       | SUM(visit) |
   +----------------------+-----------+---------------+------------+
   | http://www.yahoo.com | 2005      | IN            |      10000 |
   | http://www.yahoo.com | 2005      | US            |      12000 |
   | http://www.yahoo.com | 2005      | ALL_COUNTRIES |      22000 |
   | http://www.yahoo.com | 2006      | IN            |      10700 |
   | http://www.yahoo.com | 2006      | US            |      16000 |
   | http://www.yahoo.com | 2006      | ALL_COUNTRIES |      26700 |
   | http://www.yahoo.com | 2007      | ALL_COUNTRIES |      12000 |
   | http://www.yahoo.com | 2007      | IN            |      13200 |
   | http://www.yahoo.com | 2007      | US            |      21000 |
   | http://www.yahoo.com | 2007      | ALL_COUNTRIES |      46200 |
   | http://www.yahoo.com | ALL_YEARS | ALL_COUNTRIES |      94900 |
   | ALL_URLS             | ALL_YEARS | ALL_COUNTRIES |      94900 |
   +----------------------+-----------+---------------+------------+
   12 rows in set, 3 warnings (0.00 sec)
As you can see in the output, two rows claim to provide data for url='http://www.yahoo.com' and year=2007 with country ALL_COUNTRIES; the question is which one is which. A few workable answers:
  1. The one that appears later in the result set is the super-aggregate row.
  2. The one with the larger SUM(visit). This is debatable in general, since visit could hold negative values; for our example we can say that's impossible.
The best answer, of course, is to avoid the ambiguity altogether: when using ROLLUP with data like this, declare the columns NOT NULL and use a different DEFAULT such as 'not set'.

While using ROLLUP, you may want to consider a few more points.

  • ROLLUP and ORDER BY are mutually exclusive; they cannot be used in the same statement.
  • If you are using a LIMIT clause along with ROLLUP, remember that LIMIT is applied after ROLLUP, so the rows it keeps may include the super-aggregate rows.
  • The MySQL documentation mentions that "Because the NULL values in the super-aggregate rows are placed into the result set at such a late stage in query processing, you cannot test them as NULL values within the query itself." Yet as you can see in the queries above, we are in fact testing this value in the select list and replacing it with the ALL_*/TOTAL strings. And if your data itself contains no NULL values, this works wonders. Look at the output below; I have removed the row inserted above.
   mysql> SELECT IFNULL(url, 'ALL_URLS') AS url,
       -> IFNULL(year, 'ALL_YEARS') AS year,
       -> IFNULL(country, 'ALL_COUNTRIES') AS country,
       -> SUM(visit) FROM rollup_1
       -> GROUP BY url, year, country WITH ROLLUP HAVING country is null;
   +----------------------+-----------+---------------+------------+
   | url                  | year      | country       | SUM(visit) |
   +----------------------+-----------+---------------+------------+
   | http://www.yahoo.com | 2005      | ALL_COUNTRIES |      22000 |
   | http://www.yahoo.com | 2006      | ALL_COUNTRIES |      26700 |
   | http://www.yahoo.com | 2007      | ALL_COUNTRIES |      34200 |
   | http://www.yahoo.com | ALL_YEARS | ALL_COUNTRIES |      82900 |
   | ALL_URLS             | ALL_YEARS | ALL_COUNTRIES |      82900 |
   +----------------------+-----------+---------------+------------+
   5 rows in set, 4 warnings (0.00 sec)
Hope this blog was helpful to you. Keep posting your comments.