MySQL Manual | 5.2.1 EXPLAIN Syntax (Get Information About a SELECT)

5 MySQL Optimisation
- 5.2 Optimising SELECTs and Other Queries

Previous / Next / Up / Table of Contents

5.2.1 `EXPLAIN` Syntax (Get Information About a `SELECT`)

    EXPLAIN tbl_name
or  EXPLAIN SELECT select_options

EXPLAIN tbl_name is a synonym for DESCRIBE tbl_name or SHOW COLUMNS FROM tbl_name.

When you precede a SELECT statement with the keyword EXPLAIN, MySQL explains how it would process the SELECT, providing information about how tables are joined and in which order.

With the help of EXPLAIN, you can see when you must add indexes to tables to get a faster SELECT that uses indexes to find the records.

You should frequently run ANALYZE TABLE to update table statistics such as cardinality of keys which can affect the choices the optimiser makes. See section 4.5.2 ANALYZE TABLE Syntax.

You can also see if the optimiser joins the tables in an optimal order. To force the optimiser to use a specific join order for a SELECT statement, add a STRAIGHT_JOIN clause.

For non-simple joins, EXPLAIN returns a row of information for each table used in the SELECT statement. The tables are listed in the order they would be read. MySQL resolves all joins using a single-sweep multi-join method. This means that MySQL reads a row from the first table, then finds a matching row in the second table, then in the third table and so on. When all tables are processed, it outputs the selected columns and backtracks through the table list until a table is found for which there are more matching rows. The next row is read from this table and the process continues with the next table.

In MySQL version 4.1 the EXPLAIN output was changed to work better with constructs like UNIONs, subqueries and derived tables. Most notable is the addition of two new columns: id and select_type.

Output from EXPLAIN consists of the following columns:

id

SELECT identifier, the sequential number of this SELECT within the query.

select_type

Type of SELECT clause, which can be any of the following:

SIMPLE: Simple SELECT (without UNIONs or subqueries).
PRIMARY: Outermost SELECT.
UNION: Second and further UNION SELECTs.
DEPENDENT UNION: Second and further UNION SELECTSs, dependent on outer subquery.
SUBQUERY: First SELECT in subquery.
DEPENDENT SUBQUERY: First SELECT, dependent on outer subquery.
DERIVED: Derived table SELECT.

table

The table to which the row of output refers.

type

The join type. The different join types are listed here, ordered from best to worst type:

system

The table has only one row (= system table). This is a special case of the const join type.

const

The table has at most one matching row, which will be read at the start of the query. Because there is only one row, values from the column in this row can be regarded as constants by the rest of the optimiser. const tables are very fast as they are read only once! const is used when you compare all parts of a PRIMARY/UNIQUE key with constants:

SELECT * FROM const_table WHERE primary_key=1;

SELECT * FROM const_table WHERE primary_key_part1=1 and primary_key_part2=2;

eq_ref

One row will be read from this table for each combination of rows from the previous tables. This is the best possible join type, other than the const types. It is used when all parts of an index are used by the join and the index is UNIQUE or a PRIMARY KEY. eq_ref can be used for indexed columns that is compared with =. The compared item may be a constant or an expression that uses columns from tables that are read before this table. In the following examples, ref_table will be able to use eq_ref

SELECT * FROM ref_table,other_table WHERE
ref_table.key_column=other_table.column;

SELECT * FROM ref_table,other_table WHERE
ref_table.key_column_part1=other_table.column AND
ref_table.key_column_part2=1;

ref

All rows with matching index values will be read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key, or if the key is not UNIQUE or a PRIMARY KEY (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this join type is good. ref can be used for indexed columns that is compared with =. In the following examples, ref_table will be able to use ref

SELECT * FROM ref_table WHERE key_column=expr;

SELECT * FROM ref_table,other_table WHERE
ref_table.key_column=other_table.column;

SELECT * FROM ref_table,other_table WHERE
ref_table.key_column_part1=other_table.column AND
ref_table.key_column_part2=1;

ref_or_null

Like ref, but with the addition that we will do an extra search for rows with NULL. See section 5.2.5 How MySQL Optimises IS NULL.

SELECT * FROM ref_table WHERE key_column=expr OR key_column IS NULL;

This optimization is new for MySQL 4.1.1 and is mostly used when resolving sub queries.

range

Only rows that are in a given range will be retrieved, using an index to select the rows. The key column indicates which index is used. The key_len contains the longest key part that was used. The ref column will be NULL for this type. range can be used for when an key column is compared to a constant with =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN and IN.

SELECT * FROM range_table WHERE key_column = 10;

SELECT * FROM range_table WHERE key_column BETWEEN 10 and 20;

SELECT * FROM range_table WHERE key_column IN (10,20,30);

SELECT * FROM range_table WHERE key_part1= 10 and key_part2 IN (10,20,30);

index

This is the same as ALL, except that only the index tree is scanned. This is usually faster than ALL, as the index file is usually smaller than the datafile. This can be used when the query only uses columns that are part of one index.

ALL

A full table scan will be done for each combination of rows from the previous tables. This is normally not good if the table is the first table not marked const, and usually very bad in all other cases. You normally can avoid ALL by adding more indexes, so that the row can be retrieved based on constant values or column values from earlier tables.

possible_keys

The possible_keys column indicates which indexes MySQL could use to find the rows in this table. Note that this column is totally independent of the order of the tables. That means that some of the keys in possible_keys may not be usable in practice with the generated table order. If this column is empty, there are no relevant indexes. In this case, you may be able to improve the performance of your query by examining the WHERE clause to see if it refers to some column or columns that would be suitable for indexing. If so, create an appropriate index and check the query with EXPLAIN again. See section 6.5.4 ALTER TABLE Syntax. To see what indexes a table has, use SHOW INDEX FROM tbl_name.

key

The key column indicates the key (index) that MySQL actually decided to use. The key is NULL if no index was chosen. To force MySQL to use an key listed in the possible_keys column, use USE KEY/IGNORE KEY in your query. See section 6.4.1 SELECT Syntax. Also, running myisamchk --analyze (see section 4.4.6.1 myisamchk Invocation Syntax) or ANALYZE TABLE (see section 4.5.2 ANALYZE TABLE Syntax) on the table will help the optimiser choose better indexes.

key_len

The key_len column indicates the length of the key that MySQL decided to use. The length is NULL if the key is NULL. Note that this tells us how many parts of a multi-part key MySQL will actually use.

ref

The ref column shows which columns or constants are used with the key to select rows from the table.

rows

The rows column indicates the number of rows MySQL believes it must examine to execute the query.

Extra

This column contains additional information of how MySQL will resolve the query. Here is an explanation of the different text strings that can be found in this column:

Distinct

MySQL will not continue searching for more rows for the current row combination after it has found the first matching row.

Not exists

MySQL was able to do a LEFT JOIN optimisation on the query and will not examine more rows in this table for the previous row combination after it finds one row that matches the LEFT JOIN criteria. Here is an example for this:

SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;

Assume that t2.id is defined with NOT NULL. In this case MySQL will scan t1 and look up the rows in t2 through t1.id. If MySQL finds a matching row in t2, it knows that t2.id can never be NULL, and will not scan through the rest of the rows in t2 that has the same id. In other words, for each row in t1, MySQL only needs to do a single lookup in t2, independent of how many matching rows there are in t2.

range checked for each record (index map: #)

MySQL didn't find a real good index to use. It will, instead, for each row combination in the preceding tables, do a check on which index to use (if any), and use this index to retrieve the rows from the table. This isn't very fast but is faster than having to do a join without an index.

Using filesort

MySQL will need to do an extra pass to find out how to retrieve the rows in sorted order. The sort is done by going through all rows according to the join type and storing the sort key + pointer to the row for all rows that match the WHERE. Then the keys are sorted. Finally the rows are retrieved in sorted order.

Using index

The column information is retrieved from the table using only information in the index tree without having to do an additional seek to read the actual row. This can be done when all the used columns for the table are part of the same index.

Using temporary

To resolve the query MySQL will need to create a temporary table to hold the result. This typically happens if you do an ORDER BY on a different column set than you did a

GROUP
BY

on.

Using where

A WHERE clause will be used to restrict which rows will be matched against the next table or sent to the client. If you don't have this information and the table is of type ALL or index, you may have something wrong in your query (if you don't intend to fetch/examine all rows from the table).

If you want to get your queries as fast as possible, you should look out for Using filesort and Using temporary.

You can get a good indication of how good a join is by multiplying all values in the rows column of the EXPLAIN output. This should tell you roughly how many rows MySQL must examine to execute the query. This number is also used when you restrict queries with the max_join_size variable. See section 5.5.2 Tuning Server Parameters.

The following example shows how a JOIN can be optimised progressively using the information provided by EXPLAIN.

Suppose you have the SELECT statement shown here, that you examine using EXPLAIN:

EXPLAIN SELECT tt.TicketNumber, tt.TimeIn,
            tt.ProjectReference, tt.EstimatedShipDate,
            tt.ActualShipDate, tt.ClientID,
            tt.ServiceCodes, tt.RepetitiveID,
            tt.CurrentProcess, tt.CurrentDPPerson,
            tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
            et_1.COUNTRY, do.CUSTNAME
        FROM tt, et, et AS et_1, do
        WHERE tt.SubmitTime IS NULL
            AND tt.ActualPC = et.EMPLOYID
            AND tt.AssignedPC = et_1.EMPLOYID
            AND tt.ClientID = do.CUSTNMBR;

For this example, assume that:

The columns being compared have been declared as follows:

Table Column Column type

tt ActualPC CHAR(10)

tt AssignedPC CHAR(10)

tt ClientID CHAR(10)

et EMPLOYID CHAR(15)

do CUSTNMBR CHAR(15)
The tables have the indexes shown here:

Table Index

tt ActualPC

tt AssignedPC

tt ClientID

et EMPLOYID (primary key)

do CUSTNMBR (primary key)
The tt.ActualPC values aren't evenly distributed.

Initially, before any optimisations have been performed, the EXPLAIN statement produces the following information:

table type possible_keys                key  key_len ref  rows  Extra
et    ALL  PRIMARY                      NULL NULL    NULL 74
do    ALL  PRIMARY                      NULL NULL    NULL 2135
et_1  ALL  PRIMARY                      NULL NULL    NULL 74
tt    ALL  AssignedPC,ClientID,ActualPC NULL NULL    NULL 3872
      range checked for each record (key map: 35)

Because type is ALL for each table, this output indicates that MySQL is doing a full join for all tables! This will take quite a long time, as the product of the number of rows in each table must be examined! For the case at hand, this is 74 * 2135 * 74 * 3872 = 45,268,558,720 rows. If the tables were bigger, you can only imagine how long it would take.

One problem here is that MySQL can't (yet) use indexes on columns efficiently if they are declared differently. In this context, VARCHAR and CHAR are the same unless they are declared as different lengths. Because tt.ActualPC is declared as CHAR(10) and et.EMPLOYID is declared as CHAR(15), there is a length mismatch.

To fix this disparity between column lengths, use ALTER TABLE to lengthen ActualPC from 10 characters to 15 characters:

mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);

Now tt.ActualPC and et.EMPLOYID are both VARCHAR(15). Executing the EXPLAIN statement again produces this result:

table type   possible_keys   key     key_len ref         rows    Extra
tt    ALL    AssignedPC,ClientID,ActualPC NULL NULL NULL 3872    Using where
do    ALL    PRIMARY         NULL    NULL    NULL        2135
      range checked for each record (key map: 1)
et_1  ALL    PRIMARY         NULL    NULL    NULL        74
      range checked for each record (key map: 1)
et    eq_ref PRIMARY         PRIMARY 15      tt.ActualPC 1

This is not perfect, but is much better (the product of the rows values is now less by a factor of 74). This version is executed in a couple of seconds.

A second alteration can be made to eliminate the column length mismatches for the tt.AssignedPC = et_1.EMPLOYID and tt.ClientID = do.CUSTNMBR comparisons:

mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),
    ->                MODIFY ClientID   VARCHAR(15);

Now EXPLAIN produces the output shown here:

table type   possible_keys   key      key_len ref           rows Extra
et    ALL    PRIMARY         NULL     NULL    NULL          74
tt    ref    AssignedPC,     ActualPC 15      et.EMPLOYID   52   Using where
             ClientID,
             ActualPC
et_1  eq_ref PRIMARY         PRIMARY  15      tt.AssignedPC 1
do    eq_ref PRIMARY         PRIMARY  15      tt.ClientID   1

This is almost as good as it can get.

The remaining problem is that, by default, MySQL assumes that values in the tt.ActualPC column are evenly distributed, and that isn't the case for the tt table. Fortunately, it is easy to tell MySQL about this:

shell> myisamchk --analyze PATH_TO_MYSQL_DATABASE/tt
shell> mysqladmin refresh

Now the join is perfect, and EXPLAIN produces this result:

table type   possible_keys key     key_len ref           rows Extra
tt    ALL    AssignedPC    NULL    NULL    NULL          3872 Using where
             ClientID,
             ActualPC
et    eq_ref PRIMARY       PRIMARY 15      tt.ActualPC   1
et_1  eq_ref PRIMARY       PRIMARY 15      tt.AssignedPC 1
do    eq_ref PRIMARY       PRIMARY 15      tt.ClientID   1

Note that the rows column in the output from EXPLAIN is an educated guess from the MySQL join optimiser. To optimise a query, you should check if the numbers are even close to the truth. If not, you may get better performance by using STRAIGHT_JOIN in your SELECT statement and trying to list the tables in a different order in the FROM clause.

User Comments

Posted by John Manko on Tuesday July 8 2003, @5:26pm

[Delete] [Edit]

I like the output that the EXPLAIN command produces; however, there doesn't seem to be functionality to print that to a file. I request the following feature:

EXPLAIN tbl_name [INTO file_name]

where, if the INTO clause is used, the command will output to file_name instead of stdout;

Posted by Partap Davis on Monday July 14 2003, @11:32am

[Delete] [Edit]

You can do that from the command line easily enough:
shell> mysql my_database -e " explain select ..." > output_file

Posted by Andres on Monday July 14 2003, @2:45pm

[Delete] [Edit]

I think you should say that if the query returns no data, the explain won't work, and will say 'Impossible WHERE noticed after reading const tables'.

Add your own comment.

Top / Previous / Next / Up / Table of Contents

Table	Column	Column type
`tt`	`ActualPC`	`CHAR(10)`
`tt`	`AssignedPC`	`CHAR(10)`
`tt`	`ClientID`	`CHAR(10)`
`et`	`EMPLOYID`	`CHAR(15)`
`do`	`CUSTNMBR`	`CHAR(15)`

Table	Index
`tt`	`ActualPC`
`tt`	`AssignedPC`
`tt`	`ClientID`
`et`	`EMPLOYID` (primary key)
`do`	`CUSTNMBR` (primary key)