•Write newlines at the ends of lines.
Backslash is the MySQL escape character within strings, so to write FIELDS ESCAPED BY '\\', you must specify two backslashes for the value to be interpreted as a single backslash.
Note: If you have generated the text file on a Windows system, you might have to use LINES TERMINATED BY '\r\n' to read the file properly, because Windows programs typically use two characters as a line terminator. Some programs, such as WordPad, might use \r as a line terminator when writing files. To read such files, use LINES TERMINATED BY '\r'.
If all the lines you want to read in have a common prefix that you want to ignore, you can use LINES STARTING BY 'prefix_string' to skip over the prefix, and anything before it. If a line does not include the prefix, the entire line is skipped. Suppose that you issue the following statement:
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test FIELDS TERMINATED BY ',' LINES STARTING BY 'xxx'; If the data file looks like this:
xxx"abc",1 something xxx"def",2 "ghi",3 The resulting rows will be ("abc",1) and ("def",2). The third row in the file is skipped because it does not contain the prefix.
The IGNORE number LINES option can be used to ignore lines at the start of the file. For example, you can use IGNORE 1 LINES to skip over an initial header line containing column names:
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test IGNORE 1 LINES; When you use SELECT ... INTO OUTFILE in tandem with LOAD DATA INFILE to write data from a database into a file and then read the file back into the database later, the field- and line-handling options for both statements must match. Otherwise, LOAD DATA INFILE will not interpret the contents of the file properly. Suppose that you use SELECT ... INTO OUTFILE to write a file with fields delimited by commas:
SELECT * INTO OUTFILE 'data.txt' FIELDS TERMINATED BY ',' FROM table2; To read the comma-delimited file back in, the correct statement would be:
LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY ','; If instead you tried to read in the file with the statement shown following, it wouldn't work because it instructs LOAD DATA INFILE to look for tabs between fields:
LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY '\t'; The likely result is that each input line would be interpreted as a single field.
LOAD DATA INFILE can be used to read files obtained from external sources. For example, many programs can export data in comma-separate values (CSV) format, such that lines have fields separated by commas and enclosed within double quotes. If lines in such a file are terminated by newlines, the statement shown here illustrates the field- and line-handling options you would use to load the file:
LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n'; Any of the field- or line-handling options can specify an empty string (''). If not empty, the FIELDS [OPTIONALLY] ENCLOSED BY and FIELDS ESCAPED BY values must be a single character. The FIELDS TERMINATED BY, LINES STARTING BY, and LINES TERMINATED BY values can be more than one character. For example, to write lines that are terminated by carriage return/linefeed pairs, or to read a file containing such lines, specify a LINES TERMINATED BY '\r\n' clause.
To read a file containing jokes that are separated by lines consisting of %%, you can do this
CREATE TABLE jokes (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY, joke TEXT NOT NULL); LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes FIELDS TERMINATED BY '' LINES TERMINATED BY '\n%%\n' (joke); FIELDS [OPTIONALLY] ENCLOSED BY controls quoting of fields. For output (SELECT ... INTO OUTFILE), if you omit the word OPTIONALLY, all fields are enclosed by the ENCLOSED BY character. An example of such output (using a comma as the field delimiter) is shown here:
"1","a string","100.20" "2","a string containing a , comma","102.20" "3","a string containing a \" quote","102.20" "4","a string containing a \", quote and comma","102.20" If you specify OPTIONALLY, the ENCLOSED BY character is used only to enclose values from columns that have a string data type (such as CHAR, BINARY, TEXT, or ENUM):
1,"a string",100.20 2,"a string containing a , comma",102.20 3,"a string containing a \" quote",102.20 4,"a string containing a \", quote and comma",102.20 Note that occurrences of the ENCLOSED BY character within a field value are escaped by prefixing them with the ESCAPED BY character. Also note that if you specify an empty ESCAPED BY value, it is possible to inadvertently generate output that cannot be read properly by LOAD DATA INFILE. For example, the preceding output just shown would appear as follows if the escape character is empty. Observe that the second field in the fourth line contains a comma following the quote, which (erroneously) appears to terminate the field:
1,"a string",100.20 2,"a string containing a , comma",102.20 3,"a string containing a " quote",102.20 4,"a string containing a ", quote and comma",102.20 For input, the ENCLOSED BY character, if present, is stripped from the ends of field values. (This is true regardless of whether OPTIONALLY is specified; OPTIONALLY has no effect on input interpretation.) Occurrences of the ENCLOSED BY character preceded by the ESCAPED BY character are interpreted as part of the current field value.
If the field begins with the ENCLOSED BY character, instances of that character are recognized as terminating a field value only if followed by the field or line TERMINATED BY sequence. To avoid ambiguity, occurrences of the ENCLOSED BY character within a field value can be doubled and are interpreted as a single instance of the character. For example, if ENCLOSED BY '"' is specified, quotes are handled as shown here:
"The ""BIG"" boss" -> The "BIG" boss The "BIG" boss -> The "BIG" boss The ""BIG"" boss -> The ""BIG"" boss FIELDS ESCAPED BY controls how to write or read special characters. If the FIELDS ESCAPED BY character is not empty, it is used to prefix the following characters on output:
•The FIELDS ESCAPED BY character
•The FIELDS [OPTIONALLY] ENCLOSED BY character
•The first character of the FIELDS TERMINATED BY and LINES TERMINATED BY values
•ASCII 0 (what is actually written following the escape character is ASCII ‘0’, not a zero-valued byte)
If the FIELDS ESCAPED BY character is empty, no characters are escaped and NULL is output as NULL, not \N. It is probably not a good idea to specify an empty escape character, particularly if field values in your data contain any of the characters in the list just given.
For input, if the FIELDS ESCAPED BY character is not empty, occurrences of that character are stripped and the following character is taken literally as part of a field value. The exceptions are an escaped ‘0’ or ‘N’ (for example, \0 or \N if the escape character is ‘\’). These sequences are interpreted as ASCII NUL (a zero-valued byte) and NULL. The rules for NULL handling are described later in this section.
For more information about ‘\’-escape syntax, see Section 9.1, “Literal Values”.
In certain cases, field- and line-handling options interact:
•If LINES TERMINATED BY is an empty string and FIELDS TERMINATED BY is non-empty, lines are also terminated with FIELDS TERMINATED BY.
•If the FIELDS TERMINATED BY and FIELDS ENCLOSED BY values are both empty (''), a fixed-row (non-delimited) format is used. With fixed-row format, no delimiters are used between fields (but you can still have a line terminator). Instead, column values are read and written using a field width wide enough to hold all values in the field. For TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT, the field widths are 4, 6, 8, 11, and 20, respectively, no matter what the declared display width is.