Strategies for Speeding Up MySQL Inserts

Article link: https://blog.youkuaiyun.com/yonghengzhimi/article/details/108512316

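Insert speed in MySQL can be improved in several ways: batching rows into one INSERT, INSERT DELAYED, LOAD DATA INFILE, disabling indexes and rebuilding them afterwards, locking the table, and so on. LOAD DATA INFILE is usually about 20 times faster than a large number of separate INSERT statements. For bulk loads, you can drop the indexes first and rebuild them once the data is in, which saves the time spent on index insertion. Increasing key_buffer_size also speeds up inserts into MyISAM tables.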


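The MySQL 5 manual states that the time needed to insert a record breaks down roughly into the following proportions:

Connecting: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting record: (1 x size of record)
Inserting indexes: (1 x number of indexes)
Closing: (1)

In addition, the size of the table slows down index insertion by log N (assuming B-tree indexes). There are roughly seven ways to speed up inserts: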

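1. Put multiple VALUES lists in a single INSERT statement;

2. Use INSERT DELAYED (deprecated as of MySQL 5.6);

3. Use INSERT INTO ... SELECT ... FROM, so that rows are inserted as they are selected;

4. Use LOAD DATA INFILE;

5. Disable the indexes first, then re-create them after the insert;

6. Lock the table for writing, insert, then unlock. This helps because the index buffer is flushed to disk only once, after all INSERT statements have completed;

7. Increase key_buffer_size to enlarge the key cache.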
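
As an illustration of method 1, here is a minimal Python sketch that groups rows into multi-row INSERT statements. It only builds the statement text and a flat parameter list; the `%s` placeholder style is the one used by common MySQL drivers for Python. The table name `a` and the sample rows come from the manual excerpt further down, while the column names `id` and `val` and the batch size of 3 are assumptions made up for this example.

```python
from typing import Iterator, List, Sequence, Tuple


def chunked(rows: Sequence[Tuple], size: int) -> Iterator[Sequence[Tuple]]:
    """Yield consecutive slices of `rows`, each holding at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def multi_row_insert(table: str, columns: Sequence[str],
                     rows: Sequence[Tuple]) -> Tuple[str, List]:
    """Build one INSERT with a VALUES list per row, plus the flattened parameters.

    `table` and `columns` are interpolated as-is, so they must be trusted
    identifiers; only the row values go through placeholders.
    """
    if not rows:
        raise ValueError("need at least one row")
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([one_row] * len(rows)))
    params = [value for row in rows for value in row]
    return sql, params


# Hypothetical columns `id` and `val`; five rows split into batches of three.
rows = [(1, 23), (2, 34), (4, 33), (8, 26), (6, 29)]
statements = [multi_row_insert("a", ["id", "val"], batch)
              for batch in chunked(rows, 3)]
```

Each `(sql, params)` pair can be passed to a driver cursor's `execute` call. In practice the batch size is limited by the server's maximum packet size, so a few hundred to a few thousand rows per statement is a common starting point rather than a rule.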

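Using LOAD DATA INFILE: it imports data from a text file. With LOCAL, the file is looked up on the client's file system; without it, the file is read on the server host.
For example, to generate a text file:
SELECT * INTO OUTFILE '/home/tmp.txt' FIELDS TERMINATED BY ',' FROM table_a;
To import the file:
LOAD DATA INFILE '/home/tmp.txt' INTO TABLE table_b FIELDS TERMINATED BY ',';
Syntax: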
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [FIELDS
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char' ]
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number LINES]
    [(col_name_or_user_var,...)]
    [SET col_name = expr,...]
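
To show how the optional clauses fit together, the following Python sketch assembles a LOAD DATA statement in the clause order given by the syntax above. It covers only a subset of the options (LOCAL, FIELDS TERMINATED BY, OPTIONALLY ENCLOSED BY, IGNORE n LINES, and a column list). The helper names `quote` and `load_data_statement` and the column names `id` and `name` are assumptions for illustration; the file path and table name are taken from the example above.

```python
def quote(text):
    """Return `text` as a single-quoted SQL string literal (backslash-escaped)."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def load_data_statement(file_name, table, local=False, terminated_by=",",
                        enclosed_by=None, ignore_lines=0, columns=None):
    """Assemble a LOAD DATA statement from a subset of the documented clauses.

    `table` and `columns` are inserted unquoted, so they must be trusted
    identifiers.
    """
    parts = ["LOAD DATA"]
    if local:
        parts.append("LOCAL")            # read the file on the client side
    parts.append("INFILE " + quote(file_name))
    parts.append("INTO TABLE " + table)
    fields = "FIELDS TERMINATED BY " + quote(terminated_by)
    if enclosed_by:
        fields += " OPTIONALLY ENCLOSED BY " + quote(enclosed_by)
    parts.append(fields)
    if ignore_lines:
        parts.append("IGNORE {} LINES".format(ignore_lines))  # e.g. skip a header
    if columns:
        parts.append("(" + ", ".join(columns) + ")")
    return " ".join(parts) + ";"


# Same statement as the import example in the text.
basic = load_data_statement("/home/tmp.txt", "table_b")

# Hypothetical variant: client-side file with a header line and quoted fields.
detailed = load_data_statement("/home/tmp.txt", "table_b", local=True,
                               enclosed_by='"', ignore_lines=1,
                               columns=["id", "name"])
```

The `basic` call reproduces the import statement shown earlier; the `detailed` call only demonstrates where the extra clauses go and is not taken from the original article.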

=============================================================================

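When inserting a large amount of data into a table that already has many indexes, you can drop the indexes first, insert the data, and then rebuild the indexes. Why is this faster? The manual section quoted below gives the reason: the rebuild creates the index tree in memory before writing it to disk, which avoids a large number of disk seeks.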

7.2.14 Speed of INSERT Statements

The time to insert a record is determined by the following factors, where the numbers indicate approximate proportions:

    * Connecting: (3)
    * Sending query to server: (2)
    * Parsing query: (2)
    * Inserting record: (1 x size of record)
    * Inserting indexes: (1 x number of indexes)
    * Closing: (1)

This does not take into consideration the initial overhead to open tables, which is done once for each concurrently running query.

The size of the table slows down the insertion of indexes by log N, assuming B-tree indexes.
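
The proportions above can be turned into a rough back-of-the-envelope comparison. The Python sketch below is only an illustration under simplifying assumptions that are not in the manual: the record size is treated as a unitless weight of 1, the table has one index, the log N slowdown is ignored, and a multi-row statement is assumed to pay the connect, send, parse, and close costs only once.

```python
# Fixed per-statement proportions quoted from the manual.
CONNECT, SEND, PARSE, CLOSE = 3, 2, 2, 1


def per_row_work(record_size=1.0, num_indexes=1):
    """Row-dependent cost: 1 x size of record + 1 x number of indexes."""
    return 1 * record_size + 1 * num_indexes


def separate_connections(n_rows, record_size=1.0, num_indexes=1):
    """Every row pays the full connect/send/parse/close overhead."""
    per_insert = (CONNECT + SEND + PARSE
                  + per_row_work(record_size, num_indexes) + CLOSE)
    return n_rows * per_insert


def one_multi_row_statement(n_rows, record_size=1.0, num_indexes=1):
    """Assumption: the fixed overhead is paid once, the row work n times."""
    return (CONNECT + SEND + PARSE
            + n_rows * per_row_work(record_size, num_indexes) + CLOSE)


worst = separate_connections(100)      # 100 * (3 + 2 + 2 + 2 + 1) = 1000.0
best = one_multi_row_statement(100)    # 3 + 2 + 2 + 100 * 2 + 1 = 208.0
```

Under these assumptions, batching cuts the relative cost from 1000 to 208 units, which matches the direction of the manual's advice. The exact ratio should not be read as a measurement.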

You can use the following methods to speed up inserts:

    * If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is much faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a non-empty table, you may tune the bulk_insert_buffer_size variable to make it even faster. See section 5.2.3 Server System Variables.
    * If you are inserting a lot of rows from different clients, you can get higher speed by using the INSERT DELAYED statement. See section 14.1.4 INSERT Syntax.
    * With MyISAM tables you can insert rows at the same time that SELECT statements are running if there are no deleted rows in the tables.
    * When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using a lot of INSERT statements. See section 14.1.5 LOAD DATA INFILE Syntax.
    * With some extra work, it is possible to make LOAD DATA INFILE run even faster when the table has many indexes. Use the following procedure:
         1. Optionally create the table with CREATE TABLE.
         2. Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.
         3. Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name. This will remove all use of all indexes for the table.
         4. Insert data into the table with LOAD DATA INFILE. This will not update any indexes and will therefore be very fast.
         5. If you are going to only read the table in the future, use myisampack to make it smaller. See section 15.1.3.3 Compressed Table Characteristics.
         6. Re-create the indexes with myisamchk -r -q /path/to/db/tbl_name. This will create the index tree in memory before writing it to disk, which is much faster because it avoids lots of disk seeks. The resulting index tree is also perfectly balanced.
         7. Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.
      Note that LOAD DATA INFILE also performs the preceding optimization if you insert into an empty MyISAM table; the main difference is that you can let myisamchk allocate much more temporary memory for the index creation than you might want the server to allocate for index re-creation when it executes the LOAD DATA INFILE statement. As of MySQL 4.0, you can also use ALTER TABLE tbl_name DISABLE KEYS instead of myisamchk --keys-used=0 -rq /path/to/db/tbl_name and ALTER TABLE tbl_name ENABLE KEYS instead of myisamchk -r -q /path/to/db/tbl_name. This way you can also skip the FLUSH TABLES steps.
    * You can speed up INSERT operations that are done with multiple statements by locking your tables:

LOCK TABLES a WRITE;
INSERT INTO a VALUES (1,23),(2,34),(4,33);
INSERT INTO a VALUES (8,26),(6,29);
UNLOCK TABLES;

A performance benefit occurs because the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally there would be as many index buffer flushes as there are different INSERT statements. Explicit locking statements are not needed if you can insert all rows with a single statement. For transactional tables, you should use BEGIN/COMMIT instead of LOCK TABLES to get a speedup. Locking also lowers the total time of multiple-connection tests, although the maximum wait time for individual connections might go up because they wait for locks. For example:

Connection 1 does 1000 inserts
Connections 2, 3, and 4 do 1 insert
Connection 5 does 1000 inserts

If you don't use locking, connections 2, 3, and 4 will finish before 1 and 5. If you use locking, connections 2, 3, and 4 probably will not finish before 1 or 5, but the total time should be about 40% faster. INSERT, UPDATE, and DELETE operations are very fast in MySQL, but you will obtain better overall performance by adding locks around everything that does more than about five inserts or updates in a row. If you do very many inserts in a row, you could do a LOCK TABLES followed by an UNLOCK TABLES once in a while (about each 1,000 rows) to allow other threads access to the table. This would still result in a nice performance gain. INSERT is still much slower for loading data than LOAD DATA INFILE, even when using the strategies just outlined.
    * To get some more speed for MyISAM tables, for both LOAD DATA INFILE and INSERT, enlarge the key cache by increasing the key_buffer_size system variable. See section 7.5.2 Tuning Server Parameters.
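
The locking advice above (wrap inserts in LOCK TABLES and UNLOCK TABLES, and release the lock roughly every 1,000 rows so other threads get access) can be sketched as a small script generator. This Python sketch only emits SQL text and handles integer columns only. The function name and the `rows_per_lock` and `rows_per_insert` parameters are assumptions for illustration; the table name `a` and the sample rows come from the manual's example.

```python
def locked_insert_script(table, rows, rows_per_lock=1000, rows_per_insert=100):
    """Return SQL statements that insert `rows` under a periodically released lock.

    The write lock is taken for at most `rows_per_lock` rows at a time, and
    inside each locked section the rows are sent as multi-row INSERTs.
    Values are forced through int(), so only integer columns are supported.
    """
    statements = []
    for lock_start in range(0, len(rows), rows_per_lock):
        locked_rows = rows[lock_start:lock_start + rows_per_lock]
        statements.append("LOCK TABLES {} WRITE;".format(table))
        for start in range(0, len(locked_rows), rows_per_insert):
            batch = locked_rows[start:start + rows_per_insert]
            values = ",".join(
                "({})".format(",".join(str(int(v)) for v in row))
                for row in batch)
            statements.append("INSERT INTO {} VALUES {};".format(table, values))
        # Releasing the lock here lets waiting connections run.
        statements.append("UNLOCK TABLES;")
    return statements


# Reproduces the manual's example: one locked section, two INSERT statements.
small = locked_insert_script("a", [(1, 23), (2, 34), (4, 33), (8, 26), (6, 29)],
                             rows_per_lock=5, rows_per_insert=3)

# 2,500 rows with the defaults: the lock is released after every 1,000 rows.
large = locked_insert_script("a", [(i, i) for i in range(2500)])
```

For transactional tables the manual recommends BEGIN/COMMIT instead of LOCK TABLES; the same chunking idea applies, with a commit taking the place of each unlock.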

This article comes from a CSDN blog; please cite the source when reposting: http://blog.youkuaiyun.com/fangzhuster/archive/2009/09/18/4567340.aspx