MySQL’s query cache explained
An often misused and misunderstood aspect of MySQL is the query cache. I’ve seen blog post after blog post online talking about query caching as the most integral and important feature in MySQL. Many of these same posts advocate cranking the variables to the max to give you “ultimate performance.” One of the worst things you can do to a MySQL server is crank your variables up and hope for the best. I’ll try to clear some things up here.
The MySQL query cache is available in MySQL 4.0, 4.1, 5.0, 5.1, and 6.0 (3.23 has no query cache). The goal of the query cache is to hold result sets that are retrieved repeatedly. Since the data is held in memory, MySQL only feeds the data from memory (which is fast) into your application without digging into the tables themselves (which is slow). The result set from the query you’re running and the query in the query cache must be completely identical, or MySQL will pull the data as it traditionally does from the tables.
Queries and result sets must meet certain criteria to make it into the query cache:
- Must not be prepared statements (See 12.7. SQL Syntax for Prepared Statements)
- Subqueries are not cached, only the outer query is cached
- Queries that are run from stored procedures, functions, or triggers are not cached (applies to versions 5.0+ only)
- The result set must be equal to or smaller than the query_cache_limit (more on this below)
- The query cannot refer to the mysql database
- Queries cannot use user variables, user-defined functions, temporary tables or tables with column-level privileges
Besides these rules, all other queries are approved to enter the query cache. This includes wild things such as views, joins, and queries with subqueries.
The MySQL query cache is controlled by several variables:
- query_alloc_block_size (defaults to 8192): the actual size of the memory blocks created for result sets in the query cache (don’t adjust)
- query_cache_limit (defaults to 1048576): queries with result sets larger than this won’t make it into the query cache
- query_cache_min_res_unit (defaults to 4096): the smallest size (in bytes) for blocks in the query cache (don’t adjust)
- query_cache_size (defaults to 0): the total size of the query cache (disables query cache if equal to 0)
- query_cache_type (defaults to 1): 0 means don’t cache, 1 means cache everything, 2 means only cache result sets on demand
- query_cache_wlock_invalidate (defaults to FALSE): allows SELECTS to run from query cache even though the MyISAM table is locked for writing
Explaining the query_cache_type is a little rough. If the query_cache_type is 0:
- and the query_cache_size is 0: no memory is allocated and the cache is disabled
- and the query_cache_size is greater than 0: the memory is allocated but the cache is disabled
If the query_cache_type is 1:
- and the query_cache_size is 0: no memory is allocated and the cache is disabled
- and the query_cache_size is greater than 0: the cache is enabled and all queries that don’t use SQL_NO_CACHE will be cached automatically
If the query_cache_type is 2:
- and the query_cache_size is 0: no memory is allocated and the cache is disabled
- and the query_cache_size is greater than 0: the cache is enabled and queries must use SQL_CACHE to be cached
Now that we have the variables behind us, how can we tell if we’re using the query cache appropriately? Each time a query runs against the query cache, the server will increment the Qcache_hits status variable instead of Com_select (which is incremented when a normal SELECT runs). If the table changes for any reason, its data is rendered invalid and is dropped from the query cache.
It’s vital to understand the performance implications of the query cache:
Purging the cache
If the query cache fills completely, it will be flushed entirely – this is a significant performance hit as many memory addresses will have to be adjusted. Check your Qcache_lowmem_prunes in your status variables and increase the query_cache_size if you find yourself pruning the query cache more than a few times per hour.
Query cache utilization
There’s a simple formula to calculate your query cache efficiency in percentage form:
Qcache_hits / (Com_select + Qcache_hits) x 100
A query cache efficiency percentage of 20% or less points to a performance problem. You may want to shrink your result sets by building more restrictive queries. If that isn’t possible, then you can increase your query_cache_limit so that more of your larger result sets actually make it into the cache. Keep in mind, however, that this will increase your prunes (see the previous paragraph) and can reduce performance. Increasing the query_cache_limit by small amounts and then recalculating your efficiency is a good idea.
Fighting fragmentation
As queries move in and out of the query cache, the memory may become fragmented. This is normally signified by an increase in slow queries, but your query cache efficiency percentage still remains high. In this situation, run FLUSH QUERY CACHE from the MySQL client and keep monitoring your efficiency. If this doesn’t help, you may be better off flushing the cache entirely with RESET QUERY CACHE.
I’ve tried to piece quite a bit of documentation and DBA knowledge into this article, but you may benefit from reviewing the following documentation sections on MySQL.com: 5.2.3. System Variables, 5.2.5 Status Variables, and 6.5.4. The MySQL Query Cache.
Texas1EMT
Texas1EMT
majorh at gmail
rackerhacker / +1-830-584-1564
rackerhacker
We make medical systems for physicians which store a fair amount of data in MySQL. However, only a small portion of the data (<1%) is used on a given day.
So here is my question related to the article. Is only the results of the query stored using the query cache or is MySQL aware that the data is in memory and will seek it there first. For example, when someone selects a patient and goes into their record, we run a query to look up that patients information specifying the patient exactly.
Could I run a larger query resulting in all the patients for a given day, and then when I run the query for that specific patient it would give the response I am looking for?
My second question is, what happens if the underlying data changes, does the query cache up date itself?
Thanks
Moe.
Moe,
The query cache only holds the results of the queries, but it is very picky. For example, these two queries are seen as completely different by the query cache:
SELECT column FROM table;
select column from table;
If you make a large select, and then make a smaller select later, the second query will not be able to utilize the cached result from the first. When tables are removed or rows are updated, the related data in the query cache is removed, which can be an expensive operation when multiplied over thousands of queries.
I know this is off topic, but I was wondering if you have ever seen the following:
[MySQL][ODBC 3.51 Driver]Can’t create a new thread (errno 12); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug
The server is a W2k3 server, with Dual Xeon Duo Core with 12 gb ram. Memory usage is at 10gb available when it crashes. I don’t think this is a hardware limitation.
I have attached our my.ini file. Currently, the system crashes at 1280 connections. Is there any way we can adjust the file so we can maximize the connections until we re-develop our software?
The My.ini has the following settings:
max_connections=2047
well writen.
thanks!