mysql infile ignore,使用MySQL中的LOAD DATA INFILE过滤空或空输入

本文介绍了一种在MySQL中加载大型文件时忽略特定格式记录的方法。通过使用grep或awk进行预处理,配合LOAD DATA INFILE命令,实现仅加载包含特定格式的数据行。此外,还提供了一个使用触发器来过滤空白字段的替代方案。

I have some very large files (millions of records) that I need to load into a database. They are of the form:

word1\tblahblahblah

word2\tblahblah

word3\tblahblah

word4

word5\tblahblah

...

My problem, is that I want to ignore the lines that have no second record (the 'blahblah's'), like word4.

I'm currently using the following query to load the file:

LOAD DATA LOCAL INFILE 'file'

IGNORE INTO TABLE tablename

COLUMNS TERMINATED BY '\t'

LINES TERMINATED BY '\n'

(col1, col2);

This has the functionality I want, except that it still accepts the null values. Is there a way to skip the word4 type lines and not add them to the DB?

Intuitively, I thought using a WHEN or WHERE clause would be perfect, but looking through documentation and other info online, I can't find any examples where this would work. Am I forced to filter these records out before hand, or can I do it within this query?

Any help is grealy appreciated!

解决方案

I would do this by filtering the file with grep or awk and then piping it into MySQL (via /dev/stdin). Something like this:

cat your_file.txt |

awk '/\t.+/' |

mysql -u your_username -pyour_password \

-e "LOAD DATA LOCAL INFILE '/dev/stdin' \

IGNORE INTO TABLE tablename \

COLUMNS TERMINATED BY '\t' \

LINES TERMINATED BY '\n' \

(col1, col2);" \

your_database_name

The regular expression given to awk on the second line just matches any line that has a tab character followed by one or more of any character. You might want to tweak it to fit your needs.

Edit: One other possibility occurred to me. You could use SET to set some magic value on columns that are blank and put a BEFORE INSERT trigger on the table that will bail on a row when it sees that value. I don't have much experience with triggers but I think something like this should work:

CREATE TRIGGER skip_magic_rows

BEFORE INSERT ON tablename

FOR EACH ROW

BEGIN

IF NEW.col2 = 'IDSPISPOPD4815162342' THEN # Some unlikely magic string

# Trigger an error, which will cause the INSERT to fail†

# If you have MySQL < 5.5 this is kludgy -- see Note 1

DROP TABLE `Skipped row`

# OR

# In MySQL >= 5.5 you can send a signal--'45000' is a generic error

SIGNAL SQLSTATE '45000' SET message_text = 'Skipped row'; # See Note 2

END IF

END

;

An error during either a BEFORE or AFTER trigger results in failure of the entire statement that caused trigger invocation.

Then...

LOAD DATA LOCAL INFILE 'file'

IGNORE INTO TABLE tablename

COLUMNS TERMINATED BY '\t'

LINES TERMINATED BY '\n'

(col1, @var2)

SET col2 = IF(@var2 IN (NULL, ''), 'IDSPISPOPD4815162342', @var2)

;

Hope it's helpful!

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值