Indexes are used to find rows with a specific value of one column fast. Without an index MySQL has to start with the first record and then read through the whole table until it finds the relevant rows. The bigger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly get a position to seek to in the middle of the datafile without having to look at all the data. If a table has 1000 rows, this is at least 100 times faster than reading sequentially. Note that if you need to access almost all 1000 rows it is faster to read sequentially because we then avoid disk seeks.
All MySQL indexes (PRIMARY
, UNIQUE
, and
INDEX
) are stored in B-trees. Strings are automatically prefix-
and end-space compressed. See section 6.5.7 CREATE INDEX
Syntax.
Indexes are used to:
WHERE
clause.
MAX()
or MIN()
value for a specific indexed
column. This is optimised by a preprocessor that checks if you are
using WHERE
key_part_# = constant on all key parts < N. In this case
MySQL will do a single key lookup and replace the MIN()
expression with a constant. If all expressions are replaced with
constants, the query will return at once:
SELECT MIN(key_part2),MAX(key_part2) FROM table_name where key_part1=10
ORDER BY
key_part_1,key_part_2
). The key is read in reverse order if all key
parts are followed by DESC
. See section 5.2.8 How MySQL Optimises ORDER BY
.
SELECT key_part3 FROM table_name WHERE key_part1=1
Suppose you issue the following SELECT
statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
If a multiple-column index exists on col1
and col2
, the
appropriate rows can be fetched directly. If separate single-column
indexes exist on col1
and col2
, the optimiser tries to
find the most restrictive index by deciding which index will find fewer
rows and using that index to fetch the rows.
If the table has a multiple-column index, any leftmost prefix of the
index can be used by the optimiser to find rows. For example, if you
have a three-column index on (col1,col2,col3)
, you have indexed
search capabilities on (col1)
, (col1,col2)
, and
(col1,col2,col3)
.
MySQL can't use a partial index if the columns don't form a
leftmost prefix of the index. Suppose you have the SELECT
statements shown here:
mysql> SELECT * FROM tbl_name WHERE col1=val1; mysql> SELECT * FROM tbl_name WHERE col2=val2; mysql> SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;
If an index exists on (col1,col2,col3)
, only the first query
shown above uses the index. The second and third queries do involve
indexed columns, but (col2)
and (col2,col3)
are not
leftmost prefixes of (col1,col2,col3)
.
MySQL also uses indexes for LIKE
comparisons if the argument
to LIKE
is a constant string that doesn't start with a wildcard
character. For example, the following SELECT
statements use indexes:
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Patrick%"; mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Pat%_ck%";
In the first statement, only rows with "Patrick" <= key_col <
"Patricl"
are considered. In the second statement, only rows with
"Pat" <= key_col < "Pau"
are considered.
The following SELECT
statements will not use indexes:
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "%Patrick%"; mysql> SELECT * FROM tbl_name WHERE key_col LIKE other_col;
In the first statement, the LIKE
value begins with a wildcard
character. In the second statement, the LIKE
value is not a
constant.
MySQL 4.0 does another optimisation on LIKE
. If you use
... LIKE "%string%"
and string
is longer than 3 characters,
MySQL will use the Turbo Boyer-Moore
algorithm to initialise the
pattern for the string and then use this pattern to perform the search
quicker.
Searching using column_name IS NULL
will use indexes if column_name
is an index.
MySQL normally uses the index that finds the least number of rows. An
index is used for columns that you compare with the following operators:
=
, >
, >=
, <
, <=
, BETWEEN
, and a
LIKE
with a non-wildcard prefix like 'something%'
.
Any index that doesn't span all AND
levels in the WHERE
clause
is not used to optimise the query. In other words: To be able to use an
index, a prefix of the index must be used in every AND
group.
The following WHERE
clauses use indexes:
... WHERE index_part1=1 AND index_part2=2 AND other_column=3 ... WHERE index=1 OR A=10 AND index=2 /* index = 1 OR index = 2 */ ... WHERE index_part1='hello' AND index_part_3=5 /* optimised like "index_part1='hello'" */ ... WHERE index1=1 and index2=2 or index1=3 and index3=3; /* Can use index on index1 but not on index2 or index 3 */
These WHERE
clauses do NOT use indexes:
... WHERE index_part2=1 AND index_part3=2 /* index_part_1 is not used */ ... WHERE index=1 OR A=10 /* Index is not used in both AND parts */ ... WHERE index_part1=1 OR index_part2=10 /* No index spans all rows */
Note that in some cases MySQL will not use an index, even if one would be available. Some of the cases where this happens are:
LIMIT
to only retrieve
part of the rows, MySQL will use an index anyway, as it can
much more quickly find the few rows to return in the result.
Posted by Michael Schröpl on Tuesday July 23 2002, @3:42pm | [Delete] [Edit] |
One case when mySQL 3.23 does not use an index is
if it has to implicitly convert types.
Imagine you have a column of a VARCHAR type but
query this in the form "SELECT * FROM tablename
WHERE columnname = 123"; mySQL implicitly converts
123 to "123" but then does a full table scan.
Posted by Ed Soniat on Friday November 1 2002, @8:03am | [Delete] [Edit] |
This section should include information about the cost of indexes. The size of an index and the cost of keeping an index current. An over view of how to determine when to use an index would be good too.
Posted by Henning Schulzrinne on Friday February 28 2003, @3:31pm | [Delete] [Edit] |
It appears that queries with OR don't use indices:
describe SELECT person FROM person WHERE email1='foo' OR email2='foo';
+--------+------+---------------+------+---------+------+-------+-------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+--------+------+---------------+------+---------+------+-------+-------------+
| person | ALL | email1,email2 | NULL | NULL | NULL | 57051 | Using where |
+--------+------+---------------+------+---------+------+-------+-------------+
Posted by Stefan Silke on Thursday March 27 2003, @5:39am | [Delete] [Edit] |
I had a problem with indexes not being used on table which I was doing a LEFT JOIN on. The table which was causing the slow selects only had 2 rows in it, so I added a few dummy rows, and viola! Queries which previously took about 2 ½ mins now take under ½ a second. Nice optimizing, MySQL.
Posted by petern on Wednesday June 4 2003, @4:21am | [Delete] [Edit] |
The 'not using indexes when using OR' feature is rather annoying... surely it's quicker to use the index than scan a large table? In the example below, MemberID is the primary key, MembershipNumber is an index and Member is a table with over 450,000 rows.
select MembershipNumber, MemberID from Member where MemberID = 123896920 and MembershipNumber = 1029540;
1 row in set (0.00 sec)
select MembershipNumber, MemberID from Member where MemberID = 123896920 or MembershipNumber = 1029540;
1 row in set (11.49 sec)
Posted by Scott James on Wednesday June 18 2003, @5:13am | [Delete] [Edit] |
With 3.28.56 I was able to use a composite index on two columns, in one special case:
mysql> create table t(a int, b int, c int);
mysql> create index t_idx on t (a,b,c);
mysql> insert into t values (1, 1, 1);
... (load 10 more test rows like this) ...
mysql> explain select * from t where a=1 or b=1;
| table | type | possible_keys | key | key_len | ref
| t | index | t_idx | t_idx | 15 | NULL
However, notice that the composite index is on ALL columns of the table (rarely practical). So, lets create an index on only the first two columns:
mysql> drop index t_idx from t;
mysql> create index t_idx on t(a,b);
mysql> explain select * from t where a=1 or b=1;
| table | type | possible_keys | key | key_len | ref
| t | ALL | t_idx | NULL | NULL | NULL
So, if you absolutely need to use OR to join multiple rows, consider creating a smaller in-memory table (CREATE ... SELECT ... TYPE=HEAP). This could help if you do this kind of join multiple times in a single transaction.