IBM MQ Chinese Character Issue

License: CC 4.0 BY-SA

Tags:

#character #websphere #encoding #jms #server #application

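This article looks at why Chinese messages sent through IBM MQ can arrive garbled, and proposes putting UTF-8-encoded text into a BytesMessage as an effective way to avoid the problem.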

I recently ran into a problem: Chinese messages we received from an IBM MQ server arrived garbled.

This is the message that was sent:
    <message>繁体中文</message>
Naturally, we expected the received message to be identical to the one sent. What actually arrived was:
    <message>?????</message>

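Where did it go wrong? A message travels from the sender to the MQ Server, and then from the MQ Server to the receiver. A character-encoding problem at any step of that path can garble the text.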
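As a minimal sketch of how one bad hop can damage the text, the Java snippet below encodes the sample message with one charset and decodes it with another. It uses only the JDK; the class name `MojibakeDemo` and the helper `roundTrip` are made up for illustration, and the choice of ISO-8859-1 as the "wrong" charset is an assumption, not a diagnosis of the actual incident.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Hypothetical helper: encode with one charset, decode with another,
    // the way a hop with mismatched settings would.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] wire = text.getBytes(encodeWith);
        return new String(wire, decodeWith);
    }

    public static void main(String[] args) {
        // The sample message; the four CJK characters are written as Unicode
        // escapes so the source compiles regardless of file encoding.
        String sent = "<message>\u7E41\u4F53\u4E2D\u6587</message>";

        // ISO-8859-1 has no CJK characters, so each one is replaced by '?'.
        System.out.println(roundTrip(sent,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));
        // prints <message>????</message>

        // Correct bytes, wrong decoder: the text comes back as unrelated symbols.
        System.out.println(roundTrip(sent,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));

        // Matching charsets on both sides preserve the text.
        System.out.println(sent.equals(roundTrip(sent,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8)));
        // prints true
    }
}
```

Once characters have been replaced by `?`, the original text cannot be recovered downstream, which is why every hop matters.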

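Anything involving garbled Chinese text is a headache to solve. I will skip the painful parts and just note the references I found and a few lessons learned; they may come in handy again :)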

CCSID settings on the client side
This page (http://middleware.its.state.nc.us/middleware/Documentation/en_US/htm/amqaac04/amqaac0418.htm) lists the encodings the MQ Server supports and their corresponding CCSIDs.
Some commonly used CCSIDs:
819     ISO 8859-1 (default)
1208    UTF-8
950     Big5
1386    GBK

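When an MQ Client connects to the MQ Server, it supplies a CCSID as a connection parameter. It is recommended that this value match the server's setting, but a match is not required to connect. If the two differ, the MQ Server takes on the extra work of converting between character sets. In addition, some character sets cannot be converted to one another, in which case sending a message fails with an error.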
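Before picking a CCSID, it can help to check whether the text you plan to send is representable in the matching character set at all. The sketch below does that with the JDK only. The CCSID-to-charset-name pairs are an assumption (the closest Java charset names for the four CCSIDs listed above, not an IBM-provided mapping), and `CcsidCheck` and `canCarry` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class CcsidCheck {
    // Assumed pairing of CCSIDs with Java charset names (illustrative only).
    static final Map<Integer, String> CCSID_TO_CHARSET = new LinkedHashMap<>();
    static {
        CCSID_TO_CHARSET.put(819, "ISO-8859-1");
        CCSID_TO_CHARSET.put(1208, "UTF-8");
        CCSID_TO_CHARSET.put(950, "Big5");
        CCSID_TO_CHARSET.put(1386, "GBK");
    }

    // True only if this JVM supports the charset and every character
    // of the text can be encoded in it.
    static boolean canCarry(int ccsid, String text) {
        String name = CCSID_TO_CHARSET.get(ccsid);
        if (name == null || !Charset.isSupported(name)) {
            return false;
        }
        return Charset.forName(name).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The four CJK characters from the sample message, as Unicode escapes.
        String text = "\u7E41\u4F53\u4E2D\u6587";
        for (int ccsid : CCSID_TO_CHARSET.keySet()) {
            System.out.println(ccsid + " -> " + canCarry(ccsid, text));
        }
    }
}
```

With the sample text, 819 reports `false` and 1208 reports `true`; the results for 950 and 1386 depend on the charsets shipped with your JVM and on which characters the text contains.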

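Choosing a message type
The IBM MQ online help (http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/index.jsp) lists the supported message types (which are simply the message types defined by JMS). The most commonly used are probably TextMessage and BytesMessage. Below is the documentation's description of these two message types.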

TextMessage

is an encoded string. For an outgoing message, the string is encoded in the character set given by the destination object. This defaults to UTF8 encoding (the UTF8 encoding starts with the first character of the message; there is no length field at the start). It is, however, possible to specify any other character set supported by WebSphere MQ Java. Such character sets are used mainly when you send a message to a non-JMS application.

If the character set is a double-byte set (including UTF16), the destination object's integer encoding specification determines the order of the bytes.

An incoming message is interpreted using the character set and encoding that are specified in the message itself. These specifications are in the last WebSphere® MQ header (or MQMD if there are no headers). For JMS messages, the last header is usually the MQRFH2.

BytesMessage

is, by default, a sequence of bytes as defined by the JMS 1.0.2 specification and associated Java documentation.

For an outgoing message that was assembled by the application itself, the destination object's encoding property can be used to override the encodings of integer and floating point fields contained in the message. (For example, you can request that floating point values are stored in S/390® rather than IEEE format.)

An incoming message is interpreted using the numeric encoding specified in the message itself. This specification is in the rightmost WebSphere MQ header (or MQMD if there are no headers). For JMS messages, the rightmost header is usually the MQRFH2.

If a BytesMessage is received, and is re-sent without modification, its body is transmitted byte for byte, as it was received. The destination object's encoding property has no effect on the body. The only string-like entity that can be sent explicitly in a BytesMessage is a UTF8 string. This is encoded in Java UTF8 format, and starts with a 2-byte length field. The destination object's character set property has no effect on the encoding of an outgoing BytesMessage. The character set value in an incoming WebSphere MQ message has no effect on the interpretation of that message as a JMS BytesMessage.

Non-Java applications are unlikely to recognize the Java UTF8 encoding. Therefore, for a JMS application to send a BytesMessage that contains text data, the application itself must convert its strings to byte arrays, and write these byte arrays into the BytesMessage.
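To make the "2-byte length field" concrete, the sketch below compares plain UTF-8 bytes with the framing produced by the JDK's `DataOutputStream.writeUTF`. It assumes that `BytesMessage.writeUTF` uses the same Java UTF format, which is what the quoted description suggests; the class and method names are hypothetical and no MQ or JMS library is involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8Framing {
    // Java UTF framing: a 2-byte big-endian length followed by the encoded text.
    static byte[] javaUtf(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Plain UTF-8 bytes with no length prefix, as a non-Java reader would expect.
    static byte[] plainUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample message; the four CJK characters are Unicode escapes.
        String text = "<message>\u7E41\u4F53\u4E2D\u6587</message>";
        System.out.println(plainUtf8(text).length); // 31
        System.out.println(javaUtf(text).length);   // 33: two extra length bytes
    }
}
```

A non-Java consumer that reads the 33-byte form as plain text would see two stray bytes at the start, which is the reason the documentation advises converting strings to byte arrays yourself.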

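In practice, when passing Chinese messages through MQ, whoever provides the MQ, using BytesMessage makes it relatively easy to avoid garbling. The recommended approach is to encode the message as UTF-8 on the sending side and put the resulting bytes into a BytesMessage. Because the body is raw bytes, it is not re-encoded in transit, which avoids the garbling that might otherwise occur. The receiver only has to decode the bytes as UTF-8 to get the correct message content.
Passing Chinese messages as TextMessage is more prone to garbling than using BytesMessage. If you must use it, take care in how the message object is handled when it is sent: besides making the string's encoding match the setting on the MQ Server, you may also need to set the message's character set explicitly in the message header. The former lets the MQ Server recognize the content correctly when it processes the message; the latter lets the receiver decode the content correctly according to the character set given in the header.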
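The BytesMessage approach recommended above can be sketched as a pair of helpers. The code below runs with the JDK alone; the JMS calls appear only as comments showing where the helpers would plug in, and `BytesPayload`, `encode`, and `decode` are hypothetical names rather than part of any MQ API.

```java
import java.nio.charset.StandardCharsets;

public class BytesPayload {
    // Sender side: turn the text into UTF-8 bytes before it goes into the message.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Receiver side: interpret the message body as UTF-8.
    static String decode(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample message; the four CJK characters are Unicode escapes.
        String sent = "<message>\u7E41\u4F53\u4E2D\u6587</message>";

        // With JMS, the sender would do roughly this (not executed here):
        //   BytesMessage out = session.createBytesMessage();
        //   out.writeBytes(encode(sent));
        byte[] wire = encode(sent);

        // And the receiver, given a BytesMessage named in:
        //   byte[] body = new byte[(int) in.getBodyLength()];
        //   in.readBytes(body);
        //   String received = decode(body);
        String received = decode(wire);

        System.out.println(sent.equals(received)); // true
    }
}
```

Naming the charset explicitly on both sides matters: `getBytes()` and `new String(bytes)` without a charset argument fall back to the platform default, which may differ between sender and receiver.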

--END