Code unit and code point

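This post explains how Unicode relates to UTF-16 encoding in Java and how to handle Unicode characters with the Java API, including how to count the real number of characters and how to locate a particular character.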


http://www.blogjava.net/default.aspx?id=-10&cateid=3891

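A complete Unicode character is called a code point, while a single Java char is called a code unit.
A String stores Unicode text as UTF-16, so a character outside the Basic Multilingual Plane (BMP), such as a Chinese character from the extended (supplementary) set, needs two chars. This representation is called a surrogate pair: the first char is the high surrogate and the second is the low surrogate.
To test whether a char falls in the surrogate range, use Character.isHighSurrogate()/isLowSurrogate().
To get a complete Unicode code point back from a high/low surrogate pair, use Character.toCodePoint() or codePointAt().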
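As a minimal sketch of the surrogate handling described above, the following combines a pair by hand and compares the result with String.codePointAt(). The class name SurrogateDemo, the helper codePointAtManually, and the sample character U+20BB7 are illustrative choices, not something from the original post.

```java
class SurrogateDemo {
    // Hypothetical helper: returns the code point that starts at index i,
    // combining a high/low surrogate pair by hand when one is present.
    static int codePointAtManually(String s, int i) {
        char first = s.charAt(i);
        if (Character.isHighSurrogate(first) && i + 1 < s.length()) {
            char second = s.charAt(i + 1);
            if (Character.isLowSurrogate(second)) {
                return Character.toCodePoint(first, second);
            }
        }
        // A BMP character (or an unpaired surrogate) is returned as-is.
        return first;
    }

    public static void main(String[] args) {
        // U+20BB7 lies outside the BMP, so UTF-16 stores it as two chars.
        String s = "\uD842\uDFB7";
        System.out.println(s.length());                                      // 2
        System.out.println(Integer.toHexString(codePointAtManually(s, 0)));  // 20bb7
        System.out.println(s.codePointAt(0) == codePointAtManually(s, 0));   // true
    }
}
```

In everyday code String.codePointAt() is enough; the manual version is only there to show what the pair looks like.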
A code point may take one char or two, so CharSequence.length() cannot tell you how many characters (Chinese characters, for example) a string really contains; use String.codePointCount() or Character.codePointCount() instead.
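To make the difference between the two counts concrete, here is a small sketch. The class name CountDemo, the helper countCodePoints, and the sample string are assumptions made for illustration.

```java
class CountDemo {
    // Hypothetical helper: number of code points in the whole string.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a', then U+20BB7 (one code point, two chars), then 'b'.
        String s = "a\uD842\uDFB7b";
        System.out.println(s.length());          // 4 code units
        System.out.println(countCodePoints(s));  // 3 code points
        // The static variant accepts any CharSequence.
        System.out.println(Character.codePointCount(s, 0, s.length())); // 3
    }
}
```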
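To locate the Nth character of a string, n cannot be used directly as the offset; the offset has to be found by walking from the start of the string, which is what String.offsetByCodePoints() does.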
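A short sketch of that lookup, assuming a zero-based n; the class name NthDemo and the helper nthCodePoint are hypothetical names used only here.

```java
class NthDemo {
    // Hypothetical helper: code point of the n-th character (zero-based),
    // counted in code points rather than in chars.
    static int nthCodePoint(String s, int n) {
        int index = s.offsetByCodePoints(0, n); // char index of the n-th code point
        return s.codePointAt(index);
    }

    public static void main(String[] args) {
        String s = "a\uD842\uDFB7b";
        // The third character is 'b', but it sits at char index 3, not 2.
        System.out.println(s.offsetByCodePoints(0, 2));   // 3
        System.out.println((char) nthCodePoint(s, 2));    // b
        // charAt(2) lands on the low surrogate of the second character.
        System.out.println(Character.isLowSurrogate(s.charAt(2))); // true
    }
}
```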
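To step from the current character to the previous one, plain offset arithmetic is not enough; use String.codePointBefore() or String.offsetByCodePoints().
To step from the current character to the next one, the length of the current code point has to be taken into account before the new offset is computed; String.offsetByCodePoints() does this.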
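The forward and backward steps above can be sketched as two loops. The class name WalkDemo and the helpers forward and backward are illustrative, and Character.charCount() is used here to get the length of the current code point.

```java
import java.util.ArrayList;
import java.util.List;

class WalkDemo {
    // Hypothetical helper: visit code points from the start of the string.
    static List<Integer> forward(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.add(cp);
            i += Character.charCount(cp); // 1 for BMP, 2 for a surrogate pair
        }
        return out;
    }

    // Hypothetical helper: visit code points from the end of the string.
    static List<Integer> backward(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = s.length(); i > 0; ) {
            int cp = s.codePointBefore(i);
            out.add(cp);
            i -= Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "a\uD842\uDFB7b";
        System.out.println(forward(s));   // [97, 134071, 98]
        System.out.println(backward(s));  // [98, 134071, 97]
        // offsetByCodePoints moves one whole character in either direction.
        System.out.println(s.offsetByCodePoints(1, 1));   // 3
        System.out.println(s.offsetByCodePoints(3, -1));  // 1
    }
}
```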

After reading this I still did not quite understand it, so I tried a few things.

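String str = "web开发";

int len = str.length();

System.out.println(len);

The output is 5.

Changing it to int len = str.codePointCount(0, str.length());

The output is also 5, because every character in this string is in the BMP and takes exactly one char.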

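Locating a character

With char cp = str.charAt(4);

the output is "发".

With int index = str.offsetByCodePoints(0, 4);

       int cp = str.codePointAt(index);

the output is 21457. What does that mean? It turns out that codePointAt() returns the Unicode value (the code point) of the character at the given position; 21457 is U+53D1, the code point of 发. For a character in the BMP this is the same as (int) charAt(i); the two differ only when the position holds a surrogate pair.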
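To check when codePointAt(i) and (int) charAt(i) agree, here is a small sketch. The class name CharAtVsCodePoint, the helper sameAsCharAt, and the supplementary sample U+20BB7 are assumptions added for illustration.

```java
class CharAtVsCodePoint {
    // Hypothetical helper: true when the code point at i fits in one char.
    static boolean sameAsCharAt(String s, int i) {
        return s.codePointAt(i) == s.charAt(i);
    }

    public static void main(String[] args) {
        String bmp = "web开发";
        System.out.println(bmp.codePointAt(4));    // 21457, i.e. U+53D1
        System.out.println(sameAsCharAt(bmp, 4));  // true

        // A supplementary character: charAt(0) only sees the high surrogate.
        String supplementary = "\uD842\uDFB7";
        System.out.println(sameAsCharAt(supplementary, 0)); // false
    }
}
```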

An introduction to the new String methods in JDK 1.5:

http://www.javayou.com/html/diary/showlog.vm?sid=2&log_id=557

 