<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1508669603650457962</id><updated>2012-03-15T18:16:02.184+01:00</updated><title type='text'>Øystein on MySQL Optimizer</title><subtitle type='html'>and other things he thinks is of interest ...</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://oysteing.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://oysteing.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Øystein</name><uri>http://www.blogger.com/profile/11287151295676613909</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://1.bp.blogspot.com/-6jY-m3jYD8Q/TaMIWUFidPI/AAAAAAAAAAQ/AYrUPQePJQQ/s220/bilde.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>3</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1508669603650457962.post-1806241750750192314</id><published>2011-10-03T18:13:00.000+02:00</published><updated>2011-10-05T15:31:09.698+02:00</updated><title type='text'>Bacthed Key Access Speeds Up Disk-Bound Join Queries</title><content type='html'>&lt;style type="text/css"&gt;&lt;!--div.codebox {height: 100%;width: 100%;border: 1px solid;background-color: #EEEEEE;padding: 4px;}--&gt;&lt;/style&gt; A new feature in &lt;a href="http://blogs.oracle.com/MySQL/entry/more_early_access_features_in"&gt;MySQL 5.6.3 Development Milestone Release&lt;/a&gt; is &lt;a href="http://dev.mysql.com/doc/refman/5.6/en/bka-optimization.html"&gt;Batched Key Access (BKA)&lt;/a&gt;. BKA can be applied when an index lookup can be used to execute a join query. One example of such a query is:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; SELECT * FROM customer JOIN orders ON c_custkey = o_custkey;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;Given that the &lt;tt&gt;customer&lt;/tt&gt; table is significantly smaller than the &lt;tt&gt;orders&lt;/tt&gt; table, and assuming that we have an index on the &lt;tt&gt;o_custkey&lt;/tt&gt; column of &lt;tt&gt;orders&lt;/tt&gt;, MySQL have traditionally executed this join as follows: Scan the entire &lt;tt&gt;customer&lt;/tt&gt; table, and for each row in the &lt;tt&gt;customer&lt;/tt&gt; table, do an index look-up into the &lt;tt&gt;orders&lt;/tt&gt; table to find matching rows.  This strategy is know as &lt;i&gt;Index Nested Loops Join&lt;/i&gt;, or as MySQL EXPLAIN puts it:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox" style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;+----+-------------+----------+------+---------------+-------------+---------+-------------------------+--------+-------+&lt;br /&gt;| id | select_type | table    | type | possible_keys | key         | key_len | ref                     | rows   | Extra |&lt;br /&gt;+----+-------------+----------+------+---------------+-------------+---------+-------------------------+--------+-------+&lt;br /&gt;|  1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                    | 150000 |       |&lt;br /&gt;|  1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3.customer.c_custkey |      7 |       |&lt;br /&gt;+----+-------------+----------+------+---------------+-------------+---------+-------------------------+--------+-------+&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;If BKA is applied, MySQL will instead buffer keys to be used for index look-up in the &lt;i&gt;join buffer&lt;/i&gt;, and each time the join buffer is full, it will do a batched look-up on the index.  That is, the whole set of keys from the join buffer is sent to the storage engine in one batch. For this purpose, MySQL Server uses the &lt;a href="http://dev.mysql.com/doc/refman/5.6/en/mrr-optimization.html"&gt;Multi-Range Read interface&lt;/a&gt; of the Storage Engine API.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Multi-Range Read (MRR)&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The advantage of requesting a batch of rows from the storage engine, is that the storage engine may be able to deliver the rows more efficiently that way. The MRR interface has for some time been &lt;a href="http://www.clusterdb.com/mysql-cluster/mysql-cluster-multi-range-read-using-ndb-api/"&gt;used in MySQL Cluster&lt;/a&gt; since it reduces communication between nodes by making it possible to request more than one row per round-trip. &lt;br /&gt;&lt;br /&gt;A new feature in &lt;a href="http://blogs.oracle.com/mysql/2011/04/top_features_in_mysql_562_development_milestone_release.html"&gt;MySQL 5.6.2&lt;/a&gt; was Disk-Sweep Multi-Range Read (DS-MRR) optimizations for both InnoDB and MyISAM.  When DS-MRR is applied, the storage engine will access the data rows in the order given by the base table; instead of in the order given by the index. For a disk-bound query this has two advantages:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;All interesting rows on the same disk page will be accessed together.  Hence, one do not risk that a page has been removed from the buffer pool between two accesses to the page.&lt;/li&gt;&lt;li&gt;Even if modern disks/file systems do not give any guarantees, accessing the pages in table order, will normally give close to sequential access to the disk.  As we know, that should give much better performance compared to random access.  (If multiple sessions access the disks simultaneously, the advantage of sequential access within a session will of course be less significant.)&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Turning On Batched Key Access&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;While BKA is advantageous in a disk-bound system, it may actually reduce performance in a CPU-bound setting. (More on this below.)  Since the MySQL Optimizer currently has no information on whether a table resides in memory or on disk, BKA is off by default.&lt;br /&gt;&lt;br /&gt;You can turn on BKA by setting an &lt;tt&gt;optimizer_switch&lt;/tt&gt; flag.  Since BKA depends on MRR, the MRR flag also needs to be on. (This is default.)  In addition, since the cost model for MRR is currently way too conservative, in order to use MRR, cost-based MRR optimization needs to be turned off.  In other words, the following settings must be done in order for the Optimizer to consider BKA:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; SET optimizer_switch='mrr=on,mrr_cost_based=off,batched_key_access=on';&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;With BKA turned on, EXPLAIN for the query presented earlier, will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox" style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;+----+-------------+----------+------+---------------+-------------+---------+-------------------------+--------+----------------------------------------+&lt;br /&gt;| id | select_type | table    | type | possible_keys | key         | key_len | ref                     | rows   | Extra                                  |&lt;br /&gt;+----+-------------+----------+------+---------------+-------------+---------+-------------------------+--------+----------------------------------------+&lt;br /&gt;|  1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                    | 150000 |                                        |&lt;br /&gt;|  1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3.customer.c_custkey |      7 | Using join buffer (Batched Key Access) |&lt;br /&gt;+----+-------------+----------+------+---------------+-------------+---------+-------------------------+--------+----------------------------------------+&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Performance Impact of BKA on DBT-3 Queries&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The chart below shows query execution times, with and without BKA, for all the queries of the &lt;a href="http://osdldbt.sourceforge.net/#dbt3"&gt;DBT-3 benchmark&lt;/a&gt; where BKA can be used.  The size of the database is DBT-3 scale 1 (4 GB), and in order to get disk bound queries, the size of the InnoDB buffer pool is just 50 MB. We see that, for most of the queries, the run time with BKA is around half the run time without BKA .&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-U408wL-cX8M/TomseziTiPI/AAAAAAAAABg/FcrBheu4Brs/s1600/bka-dbt3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="236" src="http://2.bp.blogspot.com/-U408wL-cX8M/TomseziTiPI/AAAAAAAAABg/FcrBheu4Brs/s400/bka-dbt3.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Note that Query 17 is an example of a query where it is not beneficial to use BKA even when the query is disk-bound.  In that particular case, locality of accesses are very good without BKA, and applying BKA actually spreads accesses to related rows in time, decreasing the buffer cache hit ratio.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Effects of Increasing the Join Buffer Size&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As mention above, the size of the batches used by BKA is determined by the size of the join buffer.  It follows from the discussion above that the larger the batch, the more efficiently the storage engine may arrange its disk accesses.  The results presented above were achieved with a default setting of &lt;tt&gt;join_buffer_size&lt;/tt&gt; (128 kB).  We will now look closer at what happens if we increase the size of the join buffer.  For this purpose, we will use Query 13 (&lt;i&gt;Customer Distribution Query&lt;/i&gt;) from the DBT-3 benchmark:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;SELECT c_count, COUNT(*) AS custdist&lt;br /&gt;FROM (&lt;br /&gt;      SELECT c_custkey, COUNT(o_orderkey) AS c_count&lt;br /&gt;      FROM customer LEFT OUTER JOIN orders ON&lt;br /&gt;        c_custkey = o_custkey AND o_comment NOT LIKE '%express%requests%'&lt;br /&gt;      GROUP BY c_custkey&lt;br /&gt;     ) AS c_orders &lt;br /&gt;GROUP BY c_count&lt;br /&gt;ORDER BY custdist DESC, c_count DESC;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;The heart of this query is the join between &lt;tt&gt;customer&lt;/tt&gt; table and &lt;tt&gt;orders&lt;/tt&gt; table:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox" style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;+----+-------------+------------+-------+---------------+---------------+---------+-------------------------+--------+-----------------------------------------------------+&lt;br /&gt;| id | select_type | table      | type  | possible_keys | key           | key_len | ref                     | rows   | Extra                                               |&lt;br /&gt;+----+-------------+------------+-------+---------------+---------------+---------+-------------------------+--------+-----------------------------------------------------+&lt;br /&gt;|  1 | PRIMARY     | &amp;lt;derived2&amp;gt; | ALL   | NULL          | NULL          | NULL    | NULL                    | 1050000 | Using temporary; Using filesort                     |&lt;br /&gt;|  2 | DERIVED     | customer   | index | NULL          | i_c_nationkey | 5       | NULL                    |  150000 | Using index; Using temporary; Using filesort        |&lt;br /&gt;|  2 | DERIVED     | orders     | ref   | i_o_custkey   | i_o_custkey   | 5       | dbt3.customer.c_custkey |       7 | Using where; Using join buffer (Batched Key Access) |&lt;br /&gt;+----+-------------+------------+-------+---------------+---------------+---------+-------------------------+--------+-----------------------------------------------------+&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;Using BKA for this query, MySQL will fill the join buffer with customer keys and send a batch of keys to the storage engine. The storage engine will find the primary keys for the requested rows in the &lt;tt&gt;i_o_custkey&lt;/tt&gt; index, sort the primary keys, and access the &lt;tt&gt;orders&lt;/tt&gt; table in primary key order. In other words, there will be one pass over the &lt;tt&gt;orders&lt;/tt&gt; table for each batch.  The graph below shows the effect of increasing the join buffer on Query 13. (Note the logarithmic scale of the y-axis.)  The query execution time goes down from 20 minutes with a default join buffer to less than 10 secs with a 32 MB join buffer!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-20uvXuiAPLk/Tom7NhCV6LI/AAAAAAAAABw/4P1FdafunIk/s1600/bka-slide2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="225" src="http://3.bp.blogspot.com/-20uvXuiAPLk/Tom7NhCV6LI/AAAAAAAAABw/4P1FdafunIk/s400/bka-slide2.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;BKA Performance When Queries are CPU-bound&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As discussed above, BKA is designed to speed up disk-bound queries.  Now we will look at what happens if BKA is used in a system where all the data in the database reside in main memory.  The chart below presents the results of running the same DBT-3 queries with a InnoDB buffer pool of 5 GB.  While the difference is not very significant for most queries, we see there are a few queries where the impact of BKA is quite negative.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-kabmXHPBMRc/TonCY6EJ46I/AAAAAAAAAB4/Nueaes-isAk/s1600/bka-mem.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="236" src="http://4.bp.blogspot.com/-kabmXHPBMRc/TonCY6EJ46I/AAAAAAAAAB4/Nueaes-isAk/s400/bka-mem.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The negative impact of BKA is not just the overhead of sorting the keys before accessing the table.  If you look at the execution plan for QUERY 13 as presented above, it shows that the result of the join needs to be sorted in order to perform the group by operation of the subquery.  This would not have been necessary, had not BKA been used.  Without BKA, the result from the join would have been delivered in customer key order and grouping could be done without sorting.  Hence, in this experiment, the execution time for Query 13 increased from 3.6 to 4.8 seconds when using BKA.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Use with Care&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As I have shown, join query execution can benefit significantly from the use of Batched Key Access.  However, there is also examples of queries where BKA do more harm than good.  Hence, it should be used with care. My advice is to study the BKA performance for your typical queries and to only turn on BKA for queries that are known to benefit from it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1508669603650457962-1806241750750192314?l=oysteing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://oysteing.blogspot.com/feeds/1806241750750192314/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://oysteing.blogspot.com/2011/10/bacthed-key-access-speeds-up-disk-bound.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default/1806241750750192314'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default/1806241750750192314'/><link rel='alternate' type='text/html' href='http://oysteing.blogspot.com/2011/10/bacthed-key-access-speeds-up-disk-bound.html' title='Bacthed Key Access Speeds Up Disk-Bound Join Queries'/><author><name>Øystein</name><uri>http://www.blogger.com/profile/11287151295676613909</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://1.bp.blogspot.com/-6jY-m3jYD8Q/TaMIWUFidPI/AAAAAAAAAAQ/AYrUPQePJQQ/s220/bilde.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-U408wL-cX8M/TomseziTiPI/AAAAAAAAABg/FcrBheu4Brs/s72-c/bka-dbt3.jpg' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1508669603650457962.post-2126423813645872715</id><published>2011-05-05T13:10:00.000+02:00</published><updated>2011-10-05T15:29:15.423+02:00</updated><title type='text'>InnoDB Persistent Statistics Save the Day</title><content type='html'>&lt;style type="text/css"&gt;&lt;!--div.codebox {height: 100%;width: 100%;border: 1px solid;background-color: #EEEEEE;padding: 4px;}--&gt;&lt;/style&gt; In my previous &lt;a href="http://oysteing.blogspot.com/2011/04/more-stable-query-execution-time-by.html"&gt;blog posting&lt;/a&gt;, I explained how I was able to get more stable query execution times by increasing the amount of sampling used to by InnoDB to calculate statistics.  However, for my example query, Query 8 of the DBT-3 benchmark, the MySQL Optimizer still toggled between three different indexes to use when accessing one of the 8 tables. &lt;br /&gt;&lt;br /&gt;I decided to try out &lt;a href="http://blogs.innodb.com/wp/2011/04/innodb-persistent-statistics-at-last/"&gt;InnoDB Persistent Statistics&lt;/a&gt; that is one of the new features in the recent &lt;a href="http://blogs.oracle.com/mysql/2011/04/top_features_in_mysql_562_development_milestone_release.html"&gt;MySQL 5.6.2 Development Milestone Release&lt;/a&gt;. According to the advertisement, this should give both more accurate and more stable statistics. The improved accuracy is achieved by using a more precise sampling algorithm, while the increased stability is achieved by giving the user full control over when statistics are recalculated. That is, the persistent statistics are not recalculated automatically.&lt;br /&gt;&lt;br /&gt;In order to activate persistent statistics, you first need to run a script to create the tables to be used to store the statistics. Then, you can enable the feature by setting the system variable &lt;tt&gt;innodb_analyze_is_persistent&lt;/tt&gt;. For the details, see the &lt;a href="http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-persistent-stats"&gt;MySQL 5.6 Manual&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;More Accurate Statistics&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To investigate whether the new sampling algorithm gives more accurate statistics, I looked at the average and variance in the estimated cardinality of indexed columns over 100 runs of ANALYZE on the DBT-3 tables. The average of the estimates did not change much as it is actually pretty spot-on with the old sampling algorithm.  It is the great variance that causes the observed inaccuracies.&lt;br /&gt;&lt;br /&gt;The below chart shows the &lt;a href="http://en.wikipedia.org/wiki/Coefficient_of_variation"&gt;coefficient of variation&lt;/a&gt; for the estimated cardinality. (Columns that are listed more than once, are part of multiple indexes. If a bar in the bar chart is not visible, it means that there was no or very little variance between runs for this column.)  The bar chart clearly shows that for most indexes, the sampling algorithm for persistent statistics gives less variance than the old sampling algorithm used with transient statistics. (The number of sampled index pages was 100 in both cases.)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-XMJQoCj5FtQ/TcE-9K1tbHI/AAAAAAAAAAw/so-You4bMMM/s1600/stat100_cv.jpg" imageanchor="1"&gt;&lt;img border="0" height="351" src="http://1.bp.blogspot.com/-XMJQoCj5FtQ/TcE-9K1tbHI/AAAAAAAAAAw/so-You4bMMM/s400/stat100_cv.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;More Stable Statistics&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As mentioned above, the persistent statistics will be more stable because it is not updated automatically like the old transient statistics.  This way, for our performance regression test, we can run ANALYZE once for each table, and all later runs on this database will use the same statistics.  Another advantage of being able to control when statistics are recalculated, is that it can be scheduled at times when one can afford to use a higher sampling rate.  The disadvantage is that it becomes the burden of the DBA to make sure to regularly initiate a recalculation of statistics to prevent that the statistics become outdated.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;InnoDB Statistics Tables&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The InnoDB statistics tables are ordinary tables that are created in a database called &lt;tt&gt;innodb&lt;/tt&gt;. There are two tables: &lt;tt&gt;table_stats&lt;/tt&gt; and &lt;tt&gt;index_stats&lt;/tt&gt; that contain per-table and per-index statistics, respectively:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox" style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; describe innodb.table_stats;&lt;br /&gt;+--------------------------+---------------------+------+-----+-------------------+-----------------------------+&lt;br /&gt;| Field                    | Type                | Null | Key | Default           | Extra                       |&lt;br /&gt;+--------------------------+---------------------+------+-----+-------------------+-----------------------------+&lt;br /&gt;| database_name            | varchar(64)         | NO   | PRI | NULL              |                             |&lt;br /&gt;| table_name               | varchar(64)         | NO   | PRI | NULL              |                             |&lt;br /&gt;| stats_timestamp          | timestamp           | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |&lt;br /&gt;| n_rows                   | bigint(20) unsigned | NO   |     | NULL              |                             |&lt;br /&gt;| clustered_index_size     | bigint(20) unsigned | NO   |     | NULL              |                             |&lt;br /&gt;| sum_of_other_index_sizes | bigint(20) unsigned | NO   |     | NULL              |                             |&lt;br /&gt;+--------------------------+---------------------+------+-----+-------------------+-----------------------------+&lt;br /&gt;6 rows in set (0.01 sec)&lt;br /&gt;&lt;br /&gt;mysql&amp;gt; describe innodb.index_stats;&lt;br /&gt;+------------------+---------------------+------+-----+-------------------+-----------------------------+&lt;br /&gt;| Field            | Type                | Null | Key | Default           | Extra                       |&lt;br /&gt;+------------------+---------------------+------+-----+-------------------+-----------------------------+&lt;br /&gt;| database_name    | varchar(64)         | NO   | PRI | NULL              |                             |&lt;br /&gt;| table_name       | varchar(64)         | NO   | PRI | NULL              |                             |&lt;br /&gt;| index_name       | varchar(64)         | NO   | PRI | NULL              |                             |&lt;br /&gt;| stat_timestamp   | timestamp           | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |&lt;br /&gt;| stat_name        | varchar(64)         | NO   | PRI | NULL              |                             |&lt;br /&gt;| stat_value       | bigint(20) unsigned | NO   |     | NULL              |                             |&lt;br /&gt;| sample_size      | bigint(20) unsigned | YES  |     | NULL              |                             |&lt;br /&gt;| stat_description | varchar(1024)       | NO   |     | NULL              |                             |&lt;br /&gt;+------------------+---------------------+------+-----+-------------------+-----------------------------+&lt;br /&gt;8 rows in set (0.00 sec)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;&lt;tt&gt;table_stats&lt;/tt&gt; contains one row per table:&lt;br /&gt;&lt;div class="codebox" style="overflow: scroll"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; select * from innodb.table_stats where table_name='customer';&lt;br /&gt;+---------------+------------+---------------------+--------+----------------------+--------------------------+&lt;br /&gt;| database_name | table_name | stats_timestamp     | n_rows | clustered_index_size | sum_of_other_index_sizes |&lt;br /&gt;+---------------+------------+---------------------+--------+----------------------+--------------------------+&lt;br /&gt;| dbt3          | customer   | 2011-04-01 20:53:30 | 149911 |                 1764 |                      225 |&lt;br /&gt;+---------------+------------+---------------------+--------+----------------------+--------------------------+&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;&lt;tt&gt;index_stats&lt;/tt&gt; contains several rows per index, and the column &lt;tt&gt;stat_name&lt;/tt&gt; identifies the type of statistics as described by the content of the &lt;tt&gt;stat_description&lt;/tt&gt; column.  The &lt;tt&gt;n_diff_pfx_&lt;/tt&gt; rows contain the cardinality statistics:  For rows with &lt;tt&gt;stat_name = 'n_diff_pfx_01'&lt;/tt&gt;, &lt;tt&gt;stat_value&lt;/tt&gt; contains the number of distinct values for the first column of the given index. For rows with &lt;tt&gt;stat_name = 'n_diff_pfx_02'&lt;/tt&gt;, &lt;tt&gt;stat_value&lt;/tt&gt; contains the number of unique combinations of the two first columns of the given index, and so on.  (Note that InnoDB indexes, in addition to the columns that were specified when the index was created, contain all columns of the primary key.) &lt;br /&gt;&lt;div class="codebox" style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; select * from innodb.index_stats where table_name='customer';&lt;br /&gt;+---------------+------------+---------------+---------------------+--------------+------------+-------------+-----------------------------------+&lt;br /&gt;| database_name | table_name | index_name    | stat_timestamp      | stat_name    | stat_value | sample_size | stat_description                  |&lt;br /&gt;+---------------+------------+---------------+---------------------+--------------+------------+-------------+-----------------------------------+&lt;br /&gt;| dbt3          | customer   | i_c_nationkey | 2011-04-01 11:40:29 | n_diff_pfx01 |         25 |         171 | c_nationkey                       |&lt;br /&gt;| dbt3          | customer   | i_c_nationkey | 2011-04-01 11:40:29 | n_diff_pfx02 |     150000 |         171 | c_nationkey,c_custkey             |&lt;br /&gt;| dbt3          | customer   | i_c_nationkey | 2011-04-01 11:40:29 | n_leaf_pages |        171 |        NULL | Number of leaf pages in the index |&lt;br /&gt;| dbt3          | customer   | i_c_nationkey | 2011-04-01 11:40:29 | size         |        225 |        NULL | Number of pages in the index      |&lt;br /&gt;| dbt3          | customer   | PRIMARY       | 2011-04-01 11:40:29 | n_diff_pfx01 |     149911 |         100 | c_custkey                         |&lt;br /&gt;| dbt3          | customer   | PRIMARY       | 2011-04-01 11:40:29 | n_leaf_pages |       1746 |        NULL | Number of leaf pages in the index |&lt;br /&gt;| dbt3          | customer   | PRIMARY       | 2011-04-01 11:40:29 | size         |       1764 |        NULL | Number of pages in the index      |&lt;br /&gt;+---------------+------------+---------------+---------------------+--------------+------------+-------------+-----------------------------------+&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Manually Updating InnoDB Statistics Tables&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Since the statistics tables are normal tables, they can be updated like any other MySQL table.  For our performance regression tests, we have decided to record our own numbers in these tables.  That way, we will be able to get the exact same numbers even if we for some reason need to recreate the statistics tables (e.g, in another database instance).  &lt;br /&gt;&lt;br /&gt;When we manually update the statistics tables for the DBT-3 benchmarks, we first turn on persistent statistics and run ANALYZE once for each table.  This inserts all necessary rows in the statistics tables, and we will only need to update the relevant rows.  Here are an example UPDATE statement for each table:&lt;br /&gt;&lt;div class="codebox"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;UPDATE innodb.table_stats SET n_rows=150000 &lt;br /&gt;  WHERE database_name='dbt3' AND table_name='customer' ;&lt;br /&gt;UPDATE innodb.index_stats SET stat_value=25, sample_size=NULL &lt;br /&gt;  WHERE database_name='dbt3' AND table_name='customer' &lt;br /&gt;    AND index_name='i_c_nationkey' AND stat_name='n_diff_pfx01';&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;By convention, I set &lt;tt&gt;index_stats.sample_size&lt;/tt&gt; to &lt;tt&gt;NULL&lt;/tt&gt; to indicate that the value is recorded and not computed by sampling.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;A World of New Possibilities&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As discussed above, InnoDB Persistent Statistics will be very useful in order to ensure that the same query execution plan is used every time a query is executed.  It also gives the users the ability to control when the statistics are recalculated.  However, persistent statistics opens up for even more new possibilities.  &lt;br /&gt;&lt;br /&gt;We will be able to use this feature to extend our testing of the query optimizer.  By "faking" the statistics, we will be able to explore and test the execution of different query plans without having to actually modify the database.  &lt;br /&gt;&lt;br /&gt;It will also be possible for users to change the statistics in order to force a specific query plan. However, one risks introducing side-effects on other queries so this will have to be done with caution.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1508669603650457962-2126423813645872715?l=oysteing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://oysteing.blogspot.com/feeds/2126423813645872715/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://oysteing.blogspot.com/2011/05/innodb-persistent-statistics-save-day.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default/2126423813645872715'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default/2126423813645872715'/><link rel='alternate' type='text/html' href='http://oysteing.blogspot.com/2011/05/innodb-persistent-statistics-save-day.html' title='InnoDB Persistent Statistics Save the Day'/><author><name>Øystein</name><uri>http://www.blogger.com/profile/11287151295676613909</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://1.bp.blogspot.com/-6jY-m3jYD8Q/TaMIWUFidPI/AAAAAAAAAAQ/AYrUPQePJQQ/s220/bilde.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-XMJQoCj5FtQ/TcE-9K1tbHI/AAAAAAAAAAw/so-You4bMMM/s72-c/stat100_cv.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1508669603650457962.post-8091655527100412193</id><published>2011-04-14T09:21:00.000+02:00</published><updated>2011-10-05T14:50:35.444+02:00</updated><title type='text'>More Stable Query Execution Times by Improving InnoDB Statistics</title><content type='html'>&lt;style type="text/css"&gt;&lt;!--div.codebox {height: 100%;width: 100%;border: 1px solid;background-color: #EEEEEE;padding: 4px;}--&gt;&lt;/style&gt; As part of ensuring that changes to the MySQL Optimizer does not introduce performance regressions, we have started a project to create a new performance regression test suite. One component in this test suite will be the &lt;a href="http://osdldbt.sourceforge.net/#dbt3"&gt;DBT-3 test suite&lt;/a&gt;. However, we observed that the execution times for DBT-3 varied so much that, in its present form, it was not usable for detecting performance regressions.&lt;br /&gt;&lt;br /&gt;In order to get a better understanding of what was going on, I looked closer at one of the queries that were run, Query 8.  For this particular query, which contains an 8-table join, the execution times varied from 1 minute to 5 hours!  Looking at 100 runs of this query, I detected 8 different query execution plans.  Four of these plans represented differences in which sequence the tables were joined, while the last four differed from the first four with respect to which indexes were used.&lt;br /&gt;&lt;br /&gt;Since the MySQL Optimizer is cost based, when the execution plans vary, this is usually because the underlying statistics reported from the storage engine varies.  One way to investigate the statistics that the optimizer bases its decisions on, is to use the SHOW INDEX command:&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox" style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; show index from customer;&lt;br /&gt;+----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+&lt;br /&gt;| Table    | Non_unique | Key_name      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |&lt;br /&gt;+----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+&lt;br /&gt;| customer |          0 | PRIMARY       |            1 | c_custkey   | A         |      150000 |     NULL | NULL   |      | BTREE      |         |               |&lt;br /&gt;| customer |          1 | i_c_nationkey |            1 | c_nationkey | A         |          47 |     NULL | NULL   | YES  | BTREE      |         |               |&lt;br /&gt;+----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;In the above example, the &lt;tt&gt;customer&lt;/tt&gt; table has two indexes, a primary index on &lt;tt&gt;c_custkey&lt;/tt&gt; and a secondary index on &lt;tt&gt;c_nationkey&lt;/tt&gt;.  The most important thing to note from this output, is the estimated number of different key values, &lt;it&gt;cardinality&lt;/it&gt;. For the two indexed columns of the &lt;tt&gt;customer&lt;/tt&gt; table, the cardinality is 150,000 and 47, respectively.&lt;br /&gt;&lt;br /&gt;The InnoDB statistics are calculated on-the-fly the first time a table is used after the server has been started.  The statistics may be automatically recalculated at various times, and ANALYZE TABLE can be used to force a recalculation. For a description of how InnoDB calculates its statistics, see the &lt;a href="http://dev.mysql.com/doc/refman/5.5/en/innodb-other-changes-statistics-estimation.html"&gt; MySQL 5.5 Reference Manual&lt;/a&gt;.  The important thing to note is that there is a system variable, &lt;a href="http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_stats_sample_pages"&gt;&lt;tt&gt;innodb_stats_sample_pages&lt;/tt&gt;&lt;/a&gt;, that controls the accuracy of the statistics.  It determines the number of index pages that are sampled in order to calculate the statistics.  The default value is 8.&lt;br /&gt;&lt;br /&gt;Continuing with the example above, I ran ANALYZE on the &lt;tt&gt;customer&lt;/tt&gt; table, and here is what I got:&lt;br /&gt;&lt;br /&gt;&lt;div class=codebox style="overflow: scroll;"&gt;&lt;pre&gt;&lt;span style="font-size: small;"&gt;mysql&amp;gt; analyze table customer;&lt;br /&gt;+---------------+---------+----------+----------+&lt;br /&gt;| Table         | Op      | Msg_type | Msg_text |&lt;br /&gt;+---------------+---------+----------+----------+&lt;br /&gt;| dbt3.customer | analyze | status   | OK       |&lt;br /&gt;+---------------+---------+----------+----------+&lt;br /&gt;1 row in set (0.03 sec)&lt;br /&gt;&lt;br /&gt;mysql&amp;gt; show index from customer;&lt;br /&gt;+----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+&lt;br /&gt;| Table    | Non_unique | Key_name      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |&lt;br /&gt;+----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+&lt;br /&gt;| customer |          0 | PRIMARY       |            1 | c_custkey   | A         |      150000 |     NULL | NULL   |      | BTREE      |         |               |&lt;br /&gt;| customer |          1 | i_c_nationkey |            1 | c_nationkey | A         |         134 |     NULL | NULL   | YES  | BTREE      |         |               |&lt;br /&gt;+----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+&lt;br /&gt;2 rows in set (0.01 sec)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;As the observant reader has already noticed, the estimated cardinality of the column &lt;tt&gt;c_nationkey&lt;/tt&gt; has changed significantly.  Running ANALYZE several times, I saw numbers as low as 5 and as high as 135 for this column.&lt;br /&gt;&lt;br /&gt;The question is then whether we can get better and more stable statistics by increasing the setting of &lt;tt&gt;innodb_stats_sample_pages&lt;/tt&gt;.  I increased the value to 100, and ran ANALYZE 100 times.  This time, the estimated cardinality was between 22 and 84, and 80% of the time it was between 30 and 60.&lt;br /&gt;&lt;br /&gt;So how did this affect the execution plan for query 8?  Running the query another 100 times with this new setting of innodb_stats_sample_pages, gave only 3 different execution plans, all with the same join ordering.  The remaining difference was in which index was used to access one of the tables. I will discuss how to further reduce the number of plans in a future blog post.&lt;br /&gt;&lt;br /&gt;Note that increasing &lt;tt&gt;innodb_stats_sample_pages&lt;/tt&gt; will cause the calculation of statistics to take longer time.  In my case, sampling 100 pages did not seem to take a noticeable amount of time, and this should normally not cause any problems when ANALYZE TABLE is run explicitly.  However, if InnoDB decides to do an automatic recalculation, the increased time may have some unwanted impact.  Hence, this variable should be used with care, and its value should not be set higher than necessary to get reasonably accurate estimates.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1508669603650457962-8091655527100412193?l=oysteing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://oysteing.blogspot.com/feeds/8091655527100412193/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://oysteing.blogspot.com/2011/04/more-stable-query-execution-time-by.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default/8091655527100412193'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1508669603650457962/posts/default/8091655527100412193'/><link rel='alternate' type='text/html' href='http://oysteing.blogspot.com/2011/04/more-stable-query-execution-time-by.html' title='More Stable Query Execution Times by Improving InnoDB Statistics'/><author><name>Øystein</name><uri>http://www.blogger.com/profile/11287151295676613909</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://1.bp.blogspot.com/-6jY-m3jYD8Q/TaMIWUFidPI/AAAAAAAAAAQ/AYrUPQePJQQ/s220/bilde.jpg'/></author><thr:total>5</thr:total></entry></feed>
