Java and MySQL latin1 (cp1252 West European, latin1_swedish_ci): recovering Chinese text

What a damned mess: a single encoding problem tormented me for a week before I finally found a fix.

Reposted from: http://www.2cto.com/kf/201202/120875.html

The solution is as follows:

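MySQL's latin1 is identical neither to standard latin1 (iso-8859-1) nor to cp1252. Compared with iso-8859-1, it adds the characters in the range 0x80-0x9f; compared with cp1252, it adds 0x81, 0x8d, 0x8f, 0x90, and 0x9d, five characters in total.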

http://dev.mysql.com/doc/refman/5.0/en/charset-we-sets.html

latin1 is the default character set. MySQL's latin1 is the same as the Windows cp1252 character set. This means it is the same as the official ISO 8859-1 or IANA (Internet Assigned Numbers Authority) latin1, except that IANA latin1 treats the code points between 0x80 and 0x9f as “undefined,” whereas cp1252, and therefore MySQL's latin1, assign characters for those positions. For example, 0x80 is the Euro sign. For the “undefined” entries in cp1252, MySQL translates 0x81 to Unicode 0x0081, 0x8d to 0x008d, 0x8f to 0x008f, 0x90 to 0x0090, and 0x9d to 0x009d.
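To make the difference concrete, here is a minimal sketch that decodes the bytes in question with Java's two built-in charsets. The class and method names are made up for illustration, and it assumes the running JDK ships `windows-1252` (Java's name for cp1252), which is common but not guaranteed by the language specification.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original article.
class Latin1ByteComparison {
    static final Charset ISO_8859_1 = StandardCharsets.ISO_8859_1;
    // Assumption: windows-1252 is available in this JDK.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Decodes one byte value (0..255) with the given single-byte charset.
    static char decode(int byteValue, Charset charset) {
        return new String(new byte[] {(byte) byteValue}, charset).charAt(0);
    }

    public static void main(String[] args) {
        // 0x80 plus the five bytes that cp1252 leaves undefined.
        int[] samples = {0x80, 0x81, 0x8d, 0x8f, 0x90, 0x9d};
        for (int b : samples) {
            System.out.printf("0x%02x: ISO-8859-1 -> U+%04X, cp1252 -> U+%04X%n",
                    b, (int) decode(b, ISO_8859_1), (int) decode(b, CP1252));
        }
    }
}
```

Neither column should match the MySQL behaviour quoted above for all six bytes: ISO-8859-1 does not turn 0x80 into the Euro sign, and Java's cp1252 decoder does not map the five undefined bytes to U+0081, U+008D, U+008F, U+0090, and U+009D.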

 

As a result, converting such a string back to bytes in Java with the standard iso-8859-1 or cp1252 charset can produce garbled text:
s.getBytes("iso-8859-1") or s.getBytes("cp1252");
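The failure can be reproduced without a database. The sketch below hardcodes the six characters that, according to the mapping quoted from the MySQL manual, result from reading the UTF-8 bytes of a two-character Chinese string through a latin1 column. It then tries to recover the text with each standard charset. The class name, the sample string, and the `recover` helper are illustrative assumptions, not code from the original article.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original article.
class StandardCharsetFailureDemo {
    // The Chinese sample text U+4E01 U+6587, whose UTF-8 bytes are E4 B8 81 E6 96 87.
    static final String EXPECTED = "\u4e01\u6587";
    // Those six bytes as MySQL latin1 would decode them:
    // E4, B8, E6 map like cp1252; 81 becomes U+0081; 96 and 87 become U+2013 and U+2021.
    static final String GARBLED = "\u00e4\u00b8\u0081\u00e6\u2013\u2021";

    // Re-encodes the garbled string with the given charset, then decodes the bytes as UTF-8.
    static String recover(String garbled, Charset charset) {
        return new String(garbled.getBytes(charset), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String viaIso = recover(GARBLED, StandardCharsets.ISO_8859_1);
        // Assumption: windows-1252 is available in this JDK.
        String viaCp1252 = recover(GARBLED, Charset.forName("windows-1252"));
        System.out.println("iso-8859-1 recovers the text: " + EXPECTED.equals(viaIso));
        System.out.println("cp1252 recovers the text: " + EXPECTED.equals(viaCp1252));
    }
}
```

With iso-8859-1, the characters U+2013 and U+2021 cannot be encoded and are replaced, so the second Chinese character is lost. With cp1252, U+0081 cannot be encoded, so the first one is lost. Neither charset alone restores the whole string.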

I wrote the following code to solve the problem:

// Needs: import java.io.UnsupportedEncodingException; "logger" is a field of the enclosing class.
// Re-encodes a string read from a MySQL latin1 column back into its original bytes,
// then decodes those bytes as UTF-8. Returns null for null input or on error.
private String convertCharset(String s){
        if(s!=null){
             try {
                 int length = s.length();
                 byte[] buffer = new byte[length];
                 // MySQL maps 0x81 to Unicode 0x0081, 0x8d to 0x008d, 0x8f to 0x008f, 0x90 to 0x0090, and 0x9d to 0x009d;
                 // cp1252 cannot encode these five characters, so copy their values straight into the buffer.
                 for(int i=0;i<length;++i){
                     char c = s.charAt(i);
                     if(c==0x0081||c==0x008d||c==0x008f||c==0x0090||c==0x009d){
                         buffer[i]=(byte)c;
                     }
                     else{
                         // all other characters go through cp1252 (unencodable ones become '?')
                         buffer[i] = Character.toString(c).getBytes("cp1252")[0];
                     }
                 }
                 // the buffer now holds the bytes originally stored in MySQL
                 return new String(buffer,"utf-8");
             } catch (UnsupportedEncodingException e) {
                 logger.error("charset convert error", e);
             }
         }
         return null;
     }
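One way to gain confidence in this approach is a round-trip check. The sketch below is a static variant of the same per-character logic, paired with a helper that simulates the MySQL latin1 decoding described in the quoted manual text. All names here (`ConvertCharsetRoundTrip`, `decodeAsMysqlLatin1`, `encodeAsMysqlLatin1`, `isPassThrough`) are hypothetical, the simulation is an assumption based only on that quote, and `windows-1252` is assumed to be available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical test harness; not part of the original article.
class ConvertCharsetRoundTrip {
    // Assumption: windows-1252 is available in this JDK.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // The five values that MySQL latin1 maps to the same-valued code point.
    static boolean isPassThrough(int v) {
        return v == 0x81 || v == 0x8d || v == 0x8f || v == 0x90 || v == 0x9d;
    }

    // Simulates how MySQL latin1 turns stored bytes into characters.
    static String decodeAsMysqlLatin1(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int v = b & 0xff;
            sb.append(isPassThrough(v) ? (char) v : new String(new byte[] {b}, CP1252).charAt(0));
        }
        return sb.toString();
    }

    // Inverse direction: the same per-character rule as convertCharset.
    static byte[] encodeAsMysqlLatin1(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out[i] = isPassThrough(c) ? (byte) c : String.valueOf(c).getBytes(CP1252)[0];
        }
        return out;
    }

    public static void main(String[] args) {
        String original = "\u4e01\u6587"; // sample Chinese text, UTF-8 bytes E4 B8 81 E6 96 87
        String garbled = decodeAsMysqlLatin1(original.getBytes(StandardCharsets.UTF_8));
        String restored = new String(encodeAsMysqlLatin1(garbled), StandardCharsets.UTF_8);
        System.out.println(original.equals(restored));
    }
}
```

Because each of the 256 byte values maps to a distinct character under this scheme, the encoding step can reverse the decoding step exactly, which is what lets the UTF-8 text survive the trip.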


 

Excerpted from 小明思考

 
