perl mysql utf8,如何在Perl脚本中使用UTF8连接到MySQL?

本文详细描述了在Perl脚本中尝试将四字节Unicode字符(如U+1F61C)存储到MySQL数据库时遇到的问题。尽管数据库、表和列都设置为utf8mb4字符集,错误`Incorrect string value`仍然出现。问题在于,即使使用`mysql_enable_utf8=1`,MySQL连接仍需启用utf8mb4支持。解决方案是在Perl脚本中使用`mysql_enable_utf8mb4=1`参数来正确设置连接,以允许存储四字节Unicode字符。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

In a nutshell:

Within a Perl-Script: How do I connect to MySQL in a way that allows to transmit the four-byte unicode character U+1F61C ("😜") from the perl script to a MySQL-Table where this character should be stored?

Using {mysql_enable_utf8 => 1} doesn't solve the problem.

In detail:

I have exactly the same problem as described in the Question ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x9C' for column 'comment' at row 1 and even with the same Unicode character (😜 = U+1F61C = FACE WITH STUCK-OUT TONGUE AND WINKING EYE) which produces the error message

DBD::mysql::st execute failed: Incorrect string value: '\xF0\x9F\x98\x9C' for column ...

But I don't use PHP, I use Perl.

The accepted answer in the other question says:

Run MySQL 5.5 or later.

I check the version:

mysql> select version();

+-------------------------+

| version() |

+-------------------------+

| 5.7.13-0ubuntu0.16.04.2 |

+-------------------------+

So it is 5.7 which is later than 5.5.

✅checked

Set table's character to utf8mb4.

I check the character set of my database, my table and even of the reported column:

mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA

-> WHERE schema_name = "myDatabase";

+----------------------------+

| default_character_set_name |

+----------------------------+

| utf8mb4 |

+----------------------------+

mysql> SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,

-> information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA

-> WHERE CCSA.collation_name = T.table_collation

-> AND T.table_schema = "myDatabase"

-> AND T.table_name = "myTable";

+--------------------+

| character_set_name |

+--------------------+

| utf8mb4 |

+--------------------+

mysql> SELECT character_set_name FROM information_schema.`COLUMNS`

-> WHERE table_schema = "myDatabase"

-> AND table_name = "myTable"

-> AND column_name = "myColumn";

+--------------------+

| character_set_name |

+--------------------+

| utf8mb4 |

+--------------------+

So my database, my table and the reported column all use the character set utf8mb4.

✅checked

Enable UTF8 on your MySQL connection.

This seems to be the problem. The answer to the other question says:

SET NAMES utf8, or use an option when connecting that similarly enables it.

I don't know how to SET NAMES utf8 within a perl script, so I did it how I did it over the last years. I think that this is "an option when connecting that similarly enables it".

It's at the end of the long line that begins with my $dbh = DBI->connect:

#!/usr/bin/perl -w

use strict;

use warnings;

use utf8;

use Encode;

use DBI;

binmode STDOUT, ":utf8";

#Here I connect using the parameter mysql_enable_utf8 (create database handle):

my $dbh = DBI->connect('DBI:mysql:database=myDatabase;host=localhost','aUser','aPassword',{mysql_enable_utf8 => 1});

#Prepare the statement (create statement handle):

my $sth = $dbh->prepare('INSERT INTO `myTable` (`myColumn`) VALUES(?);');

#This doesn't work:

$sth->execute('😜');

#This doesn't work either:

$sth->execute(encode_utf8('😜'));

#end processing:

$dbh->disconnect();

exit(0);

Both executes throw the same error (only the line number at the end changes):

DBD::mysql::st execute failed: Incorrect string value: '\xF0\x9F\x98\x9C' for column 'myColumn' at row 1 at myTestScript.pl line 16.

What am I doing wrong?

How can I do it better?

解决方案

The problem is with the SET NAMES utf8 command. In MySQL the utf8 character set is not truly utf8, it supports characters up 3 bytes only and the character in question has 4 bytes:

The utf8 character set in MySQL has these characteristics:

• No support for supplementary characters (BMP characters only).

• A maximum of three bytes per multibyte character.

The true utf8 is the utf8mb4 that you use as character set in the field itself. So, use SET NAMES utf8mb4

So from Perl you should use {mysql_enable_utf8mb4 => 1} instead of {mysql_enable_utf8 => 1}.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值