Beginners Guide to Exporting and Importing Data in MySQL
Exporting and Importing Data
The two most common ways to accomplish export and import operations are:
- Using SELECT … INTO OUTFILE to export data to files
- Using the LOAD DATA INFILE statement to import data from files
Reasons to Export Data
Database exports are useful for copying your databases to another server. You can transfer a database to a server running on another host, which is the most typical task when exporting data. You can also transfer data to a different server running on the same host. You might do this if you are testing a server for a new release of MySQL and want to use it with some real data from your production server. You can also load data into external applications; a data export is also useful for transferring data from one RDBMS to another.
Exporting Data by Using SELECT with INTO OUTFILE
You can use a SELECT statement with the INTO OUTFILE clause to write the results set directly into a file. To use SELECT in this way, place the INTO OUTFILE clause before the FROM clause. SELECT with INTO OUTFILE writes result set directly into a file. It assumes the file path is in the database data directory unless otherwise specified
SELECT * INTO OUTFILE '/tmp/City.txt' FROM City;
The output file has the following characteristics:
- The file is written to the server host, instead of sending the file over the network to the client. The file must not already exist. - The server writes a new file on the server host. To run the SELECT … INTO OUTFILE statement, you must connect to the server by using an account that has the FILE privilege. - MySQL creates the file with permissions as follows:
- The account running the MySQL process owns the file.
- The file is world-readable.
- The file contains one line per row selected by the statement. By default, column values are delimited by tab characters, and lines are terminated with newlines.
Using Data File Format Specifiers
SELECT…INTO OUTFILE assumes a default data file format in which column values are separated by tab characters and records are terminated by newlines. To use SELECT…INTO OUTFILE to write a file with different separators or terminators, specify the format output format with FIELDS and LINES clauses.
You can choose alternative delimiters:
FIELDS TERMINATED BY 'string' ENCLOSED BY 'char' ESCAPED BY 'char' LINES TERMINATED BY 'string'
The FIELDS clause specifies how columns are represented.
- TERMINATED BY specifies the field delimiter and is the tab character by default.
- ENCLOSED BY specifies how column values are quoted. The default setting uses no quotes (that is, the default value is the empty string).
- ESCAPED BY indicates the escape character used when indicating non-printing characters such as a newline or tab character. The default escape character is the backslash (\) character.
The LINES TERMINATED BY clause specifies the row delimiter, which is a newline by default.
MySQL uses a backslash character to escape special characters, so you must represent characters such as newline and tab as ‘\n’ and ‘\t’, respectively. Similarly, to represent a literal backslash, you must escape it as follows: ‘\\’.
You can use all of the sequences shown in the table below alone or within a longer string, with the exception of \N, which is understood as NULL only when it appears alone.
|\0||NULL (zero) byte|
Common line terminators include the newline and the carriage return/newline pair. The default newline terminators are common on Linux systems, and carriage return/newline pairs are common on Windows systems.
The ESCAPED BY clause controls only the output of values in the data file; it does not change how MySQL interprets special characters in the statement. For example, if you specify a data file escape character of ‘@’ by writing ESCAPED BY ‘@’, that does not mean you must use ‘@’to escape special characters elsewhere in the statement. You must use MySQL’s escape character (the backslash: \) to escape special characters in the statement by using syntax such as LINES TERMINATED BY ‘\r\n’ rather than LINES TERMINATED BY ‘@[email protected]’.
Importing Data by Using LOAD DATA INFILE
The LOAD DATA INFILE statement reads the values from a data file into a table. LOAD DATA INFILE is the reverse operation of SELECT … INTO OUTFILE. For example:
LOAD DATA INFILE '/tmp/City.txt' INTO TABLE City FIELDS TERMINATED BY ',‘;
The above example, assumes the file is located on the server host in the database data directory.
**Importing Tab-Delimited or Comma-Separated Files **To import a data file containing tab-delimited or comma-separated table data, use the LOAD DATA INFILEcommand. The most important characteristics of the file are:
- The column value separators
- The line separator
- The characters that enclose the values(for example,quotation marks)
- Whether the column names are specified in the file
- Whether there is aheader indicating rows of the table to skip before importing
- The location of the file
- Whether privileges are required to access the file
- The order of the columns
- Whether the number of columns in the file and the table match
Importing Data from the Client Host
Importing data from the client host is slower than importing from the server’s file system because the import file is first uploaded to the server. The server creates a copy of the file in its temporary directory. Once the transmission has started the server cannot stop it, so data interpretation and duplicate key errors that usually terminate the import are flagged as warnings instead and processing continues.
You need to take steps to avoid security risks by ensuring the server cannot read privileged files from the client. The LOAD DATA INFILE statement allows the server process to copy and read the contents of the file named in the statement. A malicious administrator could create a modified version of the server code that could read any file from the client, not just the file specified in the statement. You can disable all LOAD DATA LOCAL statements from the server side by starting mysqld with the –local-infile=0 option. Limit file import and export operations to a specific directory by adding the directory path to the secure_file_priv system variable.
Skipping or Transforming Input Data
Ignoring Data File Lines
To ignore the initial part of the data file, you can use the IGNORE n LINES clause, where n is an integer that indicates the number of input lines to ignore. Use this clause when a file begins with a row of column names rather than data values.
To ignore lines in the data file, use IGNORE n LINES:
mysql> LOAD DATA INFILE /tmp/City.txt' -> INTO TABLE City IGNORE 2 LINES; Query OK, 2231 rows affected (0.01 sec)
There are 2233 rows in the file, but only 2231 are loaded.
Ignoring or Transforming Column Values
You can provide user variables in the column list and in the optional SET clause, which has a syntax similar to the SET clause of the UPDATE statement. LOAD DATA INFILEtransforms data values that it reads from the file by processing the values contained in the user variables before inserting them into the table. To assign an input data column to a user variable rather than to a table column, provide the name of a user variable in the column list. If the column is assigned to a user variable that is not used in a SET expression, the statement ignores the value in that column and does not insert it into the table.
LOAD DATA INFILE /tmp/City.txt' INTO TABLE City ( @skip, @Name, CountryCode, @District, Population) SET name=CONCAT(@Name,' ',@District);
The statement ignores the values of variables that are not used in a SET expression.
When you add new rows to a table by using the INSERT or REPLACE statements, you can control how the statement handles rows that contain duplicate keys already present in the table. You can allow the statement to raise an error, you can use the IGNORE clause to discard the row, or you can use the ON DUPLICATE KEY UPDATE clause to modify the existing row.
LOAD DATA INFILE provides the same degree of control over duplicate rows through the use of two modifier keywords, IGNORE and REPLACE. However, its duplicate-handling behavior differs slightly depending on whether the data file is on the server host or on the client host, so you must consider the data file location when using LOAD DATA INFILE.
Loading a File from the Server Host
When loading a file that is located on the server host, LOAD DATA INFILE handles rows that contain duplicate unique keys as follows:
- By default, an input record that causes a duplicate-key violation results in an error; the rest of the data file is not loaded. Records processed up to that point are loaded into the table.
- If you provide the IGNORE keyword after the file name, new records that cause duplicate-key violations are ignored, and the statement does not raise an error. LOAD DATA INFILEprocesses the entire file loads all records not containing duplicate keys, and discards the rest.
- If you provide the REPLACE keyword after the file name, new records that cause duplicate-key violations replace any records already in the table that contain the duplicated key values. LOAD DATA INFILE processes the entire file and loads all its records into the table.
Loading a File from the Client Host
When you load a file from the client host, LOAD DATA INFILE ignores records that contain duplicate keys by default. That is, the default behavior is the same as if you specify the IGNOREoption. The reason for this is that the client/server protocol does not allow interrupting the transfer of the data file from client host to the server after the transfer has started, so there is no convenient way to abort the operation in the middle.