LOAD DATA INFILE
SyntaxLOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt' [REPLACE | IGNORE] INTO TABLE tbl_name [FIELDS [TERMINATED BY '\t'] [[OPTIONALLY] ENCLOSED BY ''] [ESCAPED BY '\\' ] ] [LINES [STARTING BY ''] [TERMINATED BY '\n'] ] [IGNORE number LINES] [(col_name,...)]
The LOAD DATA INFILE
statement reads rows from a text file into a
table at a very high speed. If the LOCAL
keyword is specified, it is
interpreted with respect to the client end of the connection. When
LOCAL
is specified, the file is read by the client program on the client
host and sent to the server. If LOCAL
is not specified, the
file must be located on the server host and is read directly by the server.
(LOCAL
is available in MySQL Version 3.22.6 or later.)
For security reasons, when reading text files located on the server, the
files must either reside in the database directory or be readable by all.
Also, to use LOAD DATA INFILE
on server files, you must have the
FILE
privilege on the server host.
See section 4.2.7 Privileges Provided by MySQL.
In MySQL 3.23.49 and MySQL 4.0.2 LOCAL
will only work if you have
not started mysqld
with --local-infile=0
or if you
have not enabled your client to support LOCAL
. See section 4.2.4 Security issues with LOAD DATA LOCAL
.
If you specify the keyword LOW_PRIORITY
, execution of the
LOAD DATA
statement is delayed until no other clients are reading
from the table.
If you specify the keyword CONCURRENT
with a MyISAM
table,
then other threads can retrieve data from the table while LOAD
DATA
is executing. Using this option will of course affect the
performance of LOAD DATA
a bit even if no other thread is using
the table at the same time.
Using LOCAL
will be a bit slower than letting the server access the
files directly, because the contents of the file must be sent over the
connection by the client
to the server. On the other hand, you do not need the
FILE
privilege to load local files.
If you are using MySQL before Version 3.23.24 you can't read from a
FIFO with LOAD DATA INFILE
. If you need to read from a FIFO (for
example the output from gunzip), use LOAD DATA LOCAL INFILE
instead.
You can also load datafiles by using the mysqlimport
utility; it
operates by sending a LOAD DATA INFILE
command to the server. The
--local
option causes mysqlimport
to read datafiles from the
client host. You can specify the --compress
option to get better
performance over slow networks if the client and server support the
compressed protocol.
When locating files on the server host, the server uses the following rules:
Note that these rules mean a file named as `./myfile.txt' is read from
the server's data directory, whereas the same file named as `myfile.txt' is
read from the database directory of the current database. For example,
the following LOAD DATA
statement reads the file `data.txt'
from the database directory for db1
because db1
is the current
database, even though the statement explicitly loads the file into a
table in the db2
database:
mysql> USE db1; mysql> LOAD DATA INFILE "data.txt" INTO TABLE db2.my_table;
The REPLACE
and IGNORE
keywords control handling of input
records that duplicate existing records on unique key values.
If you specify REPLACE
, input rows replace existing rows (in other
words rows that has the same value for a primary or unique index as an
existing row). See section 6.4.8 REPLACE
Syntax.
If you specify IGNORE
, input rows that duplicate an existing row
on a unique key value are skipped. If you don't specify either option,
the behaviour depends on whether or not the LOCAL
keyword is specified.
Without LOCAL
, an error occurs when a duplicate key value is
found, and the rest of the text file is ignored. With LOCAL
,
the default behaviour is the same as if IGNORE
is specified;
this is because the server has no way to stop transmission of the file
in the middle of the operation.
If you want to ignore foreign key constraints during load you can do
SET FOREIGN_KEY_CHECKS=0
before executing LOAD DATA
.
If you use LOAD DATA INFILE
on an empty MyISAM
table, all
non-unique indexes are created in a separate batch (like in
REPAIR
). This normally makes LOAD DATA INFILE
much faster
when you have many indexes. Normally this is very fast, but in some
extreme cases you can create the indexes even faster by turning them off
with ALTER TABLE .. DISABLE KEYS
and use ALTER TABLE .. ENABLE
KEYS
to recreate the indexes.
See section 4.4.6 Using myisamchk
for Table Maintenance and Crash Recovery.
LOAD DATA INFILE
is the complement of SELECT ... INTO OUTFILE
.
See section 6.4.1 SELECT
Syntax.
To write data from a table to a file, use SELECT ... INTO OUTFILE
.
To read the file back into a table, use LOAD DATA INFILE
.
The syntax of the FIELDS
and LINES
clauses is the same for
both commands. Both clauses are optional, but FIELDS
must precede LINES
if both are specified.
If you specify a FIELDS
clause,
each of its subclauses (TERMINATED BY
, [OPTIONALLY] ENCLOSED
BY
, and ESCAPED BY
) is also optional, except that you must
specify at least one of them.
If you don't specify a FIELDS
clause, the defaults are the
same as if you had written this:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
If you don't specify a LINES
clause, the default
is the same as if you had written this:
LINES TERMINATED BY '\n'
Note: If you have generated the text file on a Windows system
you may have to change the above to: LINES TERMINATED BY '\r\n'
as Windows uses two characters as a line terminator. Some programs, like
wordpad
, may use \r
as a line terminator.
If all the lines you want to read in has a common prefix that you want
to skip, you can use LINES STARTING BY prefix_string
for this.
In other words, the defaults cause LOAD DATA INFILE
to act as follows
when reading input:
LINES STARTING BY prefix
is used, read until prefix is found
and start reading at character after prefix. If line doesn't include prefix
it will be skipped.
Conversely, the defaults cause SELECT ... INTO OUTFILE
to act as
follows when writing output:
Note that to write FIELDS ESCAPED BY '\\'
, you must specify two
backslashes for the value to be read as a single backslash.
The IGNORE number LINES
option can be used to ignore lines at
the start of the file. For example, you can use IGNORE 1 LINES
to skip over an initial header line containing column names:
mysql> LOAD DATA INFILE "/tmp/file_name" INTO TABLE test IGNORE 1 LINES;
When you use SELECT ... INTO OUTFILE
in tandem with LOAD
DATA INFILE
to write data from a database into a file and then read
the file back into the database later, the field and line handling
options for both commands must match. Otherwise, LOAD DATA
INFILE
will not interpret the contents of the file properly. Suppose
you use SELECT ... INTO OUTFILE
to write a file with
fields delimited by commas:
mysql> SELECT * INTO OUTFILE 'data.txt' -> FIELDS TERMINATED BY ',' -> FROM ...;
To read the comma-delimited file back in, the correct statement would be:
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2 -> FIELDS TERMINATED BY ',';
If instead you tried to read in the file with the statement shown here, it
wouldn't work because it instructs LOAD DATA INFILE
to look for
tabs between fields:
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2 -> FIELDS TERMINATED BY '\t';
The likely result is that each input line would be interpreted as a single field.
LOAD DATA INFILE
can be used to read files obtained from
external sources, too. For example, a file in dBASE format will have
fields separated by commas and enclosed in double quotes. If lines in
the file are terminated by newlines, the command shown here
illustrates the field and line handling options you would use to load
the file:
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name -> FIELDS TERMINATED BY ',' ENCLOSED BY '"' -> LINES TERMINATED BY '\n';
Any of the field or line handling options may specify an empty string
(''
). If not empty, the FIELDS [OPTIONALLY] ENCLOSED BY
and FIELDS ESCAPED BY
values must be a single character. The
FIELDS TERMINATED BY
and LINES TERMINATED BY
values may
be more than one character. For example, to write lines that are
terminated by carriage return-linefeed pairs, or to read a file
containing such lines, specify a LINES TERMINATED BY '\r\n'
clause.
For example, to read a file of jokes, that are separated with a line
of %%
, into an SQL table you can do:
CREATE TABLE jokes (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY, joke TEXT NOT NULL); LOAD DATA INFILE "/tmp/jokes.txt" INTO TABLE jokes FIELDS TERMINATED BY "" LINES TERMINATED BY "\n%%\n" (joke);
FIELDS [OPTIONALLY] ENCLOSED BY
controls quoting of fields. For
output (SELECT ... INTO OUTFILE
), if you omit the word
OPTIONALLY
, all fields are enclosed by the ENCLOSED BY
character. An example of such output (using a comma as the field
delimiter) is shown here:
"1","a string","100.20" "2","a string containing a , comma","102.20" "3","a string containing a \" quote","102.20" "4","a string containing a \", quote and comma","102.20"
If you specify OPTIONALLY
, the ENCLOSED BY
character is
used only to enclose CHAR
and VARCHAR
fields:
1,"a string",100.20 2,"a string containing a , comma",102.20 3,"a string containing a \" quote",102.20 4,"a string containing a \", quote and comma",102.20
Note that occurrences of the ENCLOSED BY
character within a
field value are escaped by prefixing them with the ESCAPED BY
character. Also note that if you specify an empty ESCAPED BY
value, it is possible to generate output that cannot be read properly by
LOAD DATA INFILE
. For example, the output just shown above would
appear as shown here if the escape character is empty. Observe that the
second field in the fourth line contains a comma following the quote, which
(erroneously) appears to terminate the field:
1,"a string",100.20 2,"a string containing a , comma",102.20 3,"a string containing a " quote",102.20 4,"a string containing a ", quote and comma",102.20
For input, the ENCLOSED BY
character, if present, is stripped from the
ends of field values. (This is true whether OPTIONALLY
is
specified; OPTIONALLY
has no effect on input interpretation.)
Occurrences of the ENCLOSED BY
character preceded by the
ESCAPED BY
character are interpreted as part of the current field
value. In addition, duplicated ENCLOSED BY
characters occurring
within fields are interpreted as single ENCLOSED BY
characters if the
field itself starts with that character. For example, if ENCLOSED BY
'"'
is specified, quotes are handled as shown here:
"The ""BIG"" boss" -> The "BIG" boss The "BIG" boss -> The "BIG" boss The ""BIG"" boss -> The ""BIG"" boss
FIELDS ESCAPED BY
controls how to write or read special characters.
If the FIELDS ESCAPED BY
character is not empty, it is used to prefix
the following characters on output:
FIELDS ESCAPED BY
character
FIELDS [OPTIONALLY] ENCLOSED BY
character
FIELDS TERMINATED BY
and
LINES TERMINATED BY
values
0
(what is actually written following the escape character is
ASCII '0'
, not a zero-valued byte)
If the FIELDS ESCAPED BY
character is empty, no characters are escaped.
It is probably not a good idea to specify an empty escape character,
particularly if field values in your data contain any of the characters in
the list just given.
For input, if the FIELDS ESCAPED BY
character is not empty, occurrences
of that character are stripped and the following character is taken literally
as part of a field value. The exceptions are an escaped `0' or
`N' (for example, \0
or \N
if the escape character is
`\'). These sequences are interpreted as ASCII 0
(a zero-valued
byte) and NULL
. See below for the rules on NULL
handling.
For more information about `\'-escape syntax, see section 6.1.1 Literals: How to Write Strings and Numbers.
In certain cases, field and line handling options interact:
LINES TERMINATED BY
is an empty string and FIELDS
TERMINATED BY
is non-empty, lines are also terminated with
FIELDS TERMINATED BY
.
FIELDS TERMINATED BY
and FIELDS ENCLOSED BY
values
are both empty (''
), a fixed-row (non-delimited) format is used.
With fixed-row format, no delimiters are used between fields (but you
can still have a line terminator). Instead, column values are written
and read using the ``display'' widths of the columns. For example, if a
column is declared as INT(7)
, values for the column are written
using 7-character fields. On input, values for the column are obtained
by reading 7 characters.
LINES TERMINATED BY
is still used to separate lines. If a line
don't contain all fields, the rest of the fields will be set to their
default values. If you don't have a line terminator, you should set this
to ''
. In this case the text file must contain all fields for
each row.
Fixed-row format also affects handling of NULL
values; see below.
Note that fixed-size format will not work if you are using a multi-byte
character set.
Handling of NULL
values varies, depending on the FIELDS
and
LINES
options you use:
FIELDS
and LINES
values,
NULL
is written as \N
for output and \N
is read
as NULL
for input (assuming the ESCAPED BY
character
is `\').
FIELDS ENCLOSED BY
is not empty, a field containing the literal
word NULL
as its value is read as a NULL
value (this differs
from the word NULL
enclosed within FIELDS ENCLOSED BY
characters, which is read as the string 'NULL'
).
FIELDS ESCAPED BY
is empty, NULL
is written as the word
NULL
.
FIELDS TERMINATED BY
and
FIELDS ENCLOSED BY
are both empty), NULL
is written as an empty
string. Note that this causes both NULL
values and empty strings in
the table to be indistinguishable when written to the file because they are
both written as empty strings. If you need to be able to tell the two apart
when reading the file back in, you should not use fixed-row format.
Some cases are not supported by LOAD DATA INFILE
:
FIELDS TERMINATED BY
and FIELDS ENCLOSED
BY
both empty) and BLOB
or TEXT
columns.
LOAD DATA INFILE
won't be able to interpret the input properly.
For example, the following FIELDS
clause would cause problems:
FIELDS TERMINATED BY '"' ENCLOSED BY '"'
FIELDS ESCAPED BY
is empty, a field value that contains an occurrence
of FIELDS ENCLOSED BY
or LINES TERMINATED BY
followed by the FIELDS TERMINATED BY
value will cause LOAD
DATA INFILE
to stop reading a field or line too early.
This happens because LOAD DATA INFILE
cannot properly determine
where the field or line value ends.
The following example loads all columns of the persondata
table:
mysql> LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;
No field list is specified, so LOAD DATA INFILE
expects input rows
to contain a field for each table column. The default FIELDS
and
LINES
values are used.
If you wish to load only some of a table's columns, specify a field list:
mysql> LOAD DATA INFILE 'persondata.txt' -> INTO TABLE persondata (col1,col2,...);
You must also specify a field list if the order of the fields in the input file differs from the order of the columns in the table. Otherwise, MySQL cannot tell how to match up input fields with table columns.
If a row has too few fields, the columns for which no input field is present
are set to default values. Default value assignment is described in
section 6.5.3 CREATE TABLE
Syntax.
An empty field value is interpreted differently than if the field value is missing:
0
.
Note that these are the same values that result if you assign an empty
string explicitly to a string, numeric, or date or time type explicitly
in an INSERT
or UPDATE
statement.
TIMESTAMP
columns are only set to the current date and time if there
is a NULL
value for the column (that is, \N
), or (for the
first TIMESTAMP
column only) if the TIMESTAMP
column is
omitted from the field list when a field list is specified.
If an input row has too many fields, the extra fields are ignored and
the number of warnings is incremented. Note that before MySQL 4.1.1 the
warnings is just a number to indicate that something went wrong.
In MySQL 4.1.1 you can do SHOW WARNINGS
to get more information for
what went wrong.
LOAD DATA INFILE
regards all input as strings, so you can't use
numeric values for ENUM
or SET
columns the way you can with
INSERT
statements. All ENUM
and SET
values must be
specified as strings!
If you are using the C API, you can get information about the query by
calling the API function mysql_info()
when the LOAD DATA INFILE
query finishes. The format of the information string is shown here:
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
Warnings occur under the same circumstances as when values are inserted
via the INSERT
statement (see section 6.4.3 INSERT
Syntax), except
that LOAD DATA INFILE
also generates warnings when there are too few
or too many fields in the input row. The warnings are not stored anywhere;
the number of warnings can only be used as an indication if everything went
well.
If you get warnings and want to know exactly why you got them, one way
to do this is to use SELECT ... INTO OUTFILE
into another file
and compare this to your original input file.
If you need LOAD DATA
to read from a pipe, you can use the
following trick:
mkfifo /mysql/db/x/x chmod 666 /mysql/db/x/x cat < /dev/tcp/10.1.1.12/4711 > /nt/mysql/db/x/x mysql -e "LOAD DATA INFILE 'x' INTO TABLE x" x
If you are using a version of MySQL older than 3.23.25
you can only do the above with LOAD DATA LOCAL INFILE
.
In MySQL 4.1.1 you can use SHOW WARNINGS
to get a list of the first
max_error_count
warnings. See section 4.5.7.9 SHOW WARNINGS | ERRORS
.
For more information about the efficiency of INSERT
versus
LOAD DATA INFILE
and speeding up LOAD DATA INFILE
,
See section 5.2.10 Speed of INSERT
Queries.
Posted by Chs Merriam on Wednesday December 4 2002, @10:13am | [Delete] [Edit] |
Because the LOAD DATA syntax is also used by the
SELECT .. OUTFILE command, it would be useful to
have some sort of "WITH COLUMN_NAMES" clause.
This would simply exporting into spreadsheets and
other databases. If you wanted to get radical, there
could also be a "WITH COLUMN_INFORMATION"
clause that would allow self describing text files by
having the first two lines describing the column
names and the the types, respectively.
Posted by Ryan Bates on Wednesday February 19 2003, @3:21pm | [Delete] [Edit] |
LOAD DATA ... REPLACE seems to replace the entire row (every column) even when specifying to load data into only one of those columns--this sets every column which you did not specify to its default value. I think a more useful feature would be to replace only the specified columns. I have yet to find a decent work around to update only a couple columns in a few rows using data from a text file. If someone knows of a quick and easy way to do this please let me know at: rbates at artbeats dot com.
Posted by Burlen Loring on Monday April 21 2003, @11:42am | [Delete] [Edit] |
The manual would be greatly improved by describing in detail how to load the contents of a binary file(such as a word doc, or pdf doc) into a table, using load data infile.
Posted by Arcady Glush on Tuesday May 20 2003, @7:22am | [Delete] [Edit] |
Transfer data from MS Access into MySQL. Easy way:
If your MySQL table structure is identical to MS Access one, then create ODBC DSN pointing to MySQL: (Use MySQL ODBC 3.51 Driver DSN, if you don't have one download it from www.mysql.com). After that open Access Database, attach MySQL table using File/Get External Data/Link Tables. Choose file of type: ODBC Databases (last option) then follow the rules for attaching ODBC tables. After that it is quite simple, you can use Access query to insert data into the attached table(if you wish to skip some fileds), or, if you want to transfer entire table - simply highlight all the records in the opened Access source table, copy them to clipboard (Ctrl+C), then open attached MySQL table and select paste append option. The entire process can be done even quicker then reading the above instructions.
If your attached MySQL table has all the data marked #Deleted - just ignore that.
I have tested it with MySQL 4.0 and MS Access97, but I am pretty sure it will work with Access2000/2002
If you have questions or problems, just drop me an email.
glush@optushome.com.au or arcady_glush@national.com.au (Australia)