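In the previous article we saw that when the server invokes the processing layer, it calls the process method of that layer's Processor, whose parameters include TProtocol objects; likewise, the client-side client object takes a TProtocol in its constructor. Within the processing layer, everything the client does to write an RPC call's request parameters and read its result, and everything the server does to read the request parameters and write back the method result, is ultimately delegated to TProtocol. As described earlier, thrift's protocol layer is responsible for serializing and deserializing request parameters and return results according to the protocol the user selects, so that different systems can read each other's data. In this article we dig into the TProtocol source code to see how the protocol layer works.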
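1. thrift message structure
Before looking at the protocol layer, we first need to understand how thrift, as an RPC framework, structures the data it transmits.
In thrift, transmitted data falls into four categories:
- Control messages: the TMessage class, which mainly holds the name of the method being called, the message type, and the sequence number
- Request parameters: the xx_args classes generated for the processing layer
- Return results: the xx_result classes generated for the processing layer
- Exceptions: the TApplicationException class, which holds an exception type and an exception message
1.1 Control messages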
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//
package org.apache.thrift.protocol;
public final class TMessage {
    public final String name; // message name; in almost all cases the name of the RPC method being called
    public final byte type; // message type
    public final int seqid; // sequence number
    public TMessage() {
        this("", (byte)0, 0);
    }
    public TMessage(String n, byte t, int s) {
        this.name = n;
        this.type = t;
        this.seqid = s;
    }
    public String toString() {
        return "<TMessage name:'" + this.name + "' type: " + this.type + " seqid:" + this.seqid + ">";
    }
    public boolean equals(Object other) {
        return other instanceof TMessage ? this.equals((TMessage)other) : false;
    }
    public boolean equals(TMessage other) {
        return this.name.equals(other.name) && this.type == other.type && this.seqid == other.seqid;
    }
}
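1.2 Data messages
All data that thrift transmits is defined in the IDL. thrift maps each IDL type to a Java type and builds the data message from it. Besides the primitive types, the IDL supports struct and the container types map, list, and set. To describe them, thrift provides the following four classes (TSet, which mirrors TList, is not shown here):
TStruct: struct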
package org.apache.thrift.protocol;
public final class TStruct {
    public final String name;
    public TStruct() {
        this("");
    }
    public TStruct(String n) {
        this.name = n;
    }
}
TField: struct field
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.thrift.protocol;
/**
 * Helper class that encapsulates field metadata.
 *
 */
public class TField {
    public TField() {
        this("", TType.STOP, (short)0);
    }
    public TField(String n, byte t, short i) {
        name = n;
        type = t;
        id = i;
    }
    public final String name;
    public final byte type;
    public final short id;
    public String toString() {
        return "<TField name:'" + name + "' type:" + type + " field-id:" + id + ">";
    }
    public boolean equals(TField otherField) {
        return type == otherField.type && id == otherField.id;
    }
}
TList:list
package org.apache.thrift.protocol;
/**
 * Helper class that encapsulates list metadata.
 *
 */
public final class TList {
    public TList() {
        this(TType.STOP, 0);
    }
    public TList(byte t, int s) {
        elemType = t;
        size = s;
    }
    public final byte elemType;
    public final int size;
}
TMap:map
package org.apache.thrift.protocol;
/**
 * Helper class that encapsulates map metadata.
 *
 */
public final class TMap {
    public TMap() {
        this(TType.STOP, TType.STOP, 0);
    }
    public TMap(byte k, byte v, int s) {
        keyType = k;
        valueType = v;
        size = s;
    }
    public final byte keyType;
    public final byte valueType;
    public final int size;
}
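1.3 Exception messages
TApplicationException is the message structure that wraps exceptions: when a control message has the exception type, thrift carries the data message as a TApplicationException. Its read/write methods play the same role as the ones covered in the previous article on the processing layer: they call the read/write methods of the TProtocol passed in to serialize and deserialize the data message.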
package org.apache.thrift;
import org.apache.thrift.protocol.TField;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.protocol.TProtocolUtil;
import org.apache.thrift.protocol.TStruct;
public class TApplicationException extends TException {
    private static final TStruct TAPPLICATION_EXCEPTION_STRUCT = new TStruct("TApplicationException");
    private static final TField MESSAGE_FIELD = new TField("message", (byte)11, (short)1); // 11 = TType.STRING
    private static final TField TYPE_FIELD = new TField("type", (byte)8, (short)2); // 8 = TType.I32
    private static final long serialVersionUID = 1L;
    public static final int UNKNOWN = 0;
    public static final int UNKNOWN_METHOD = 1;
    public static final int INVALID_MESSAGE_TYPE = 2;
    public static final int WRONG_METHOD_NAME = 3;
    public static final int BAD_SEQUENCE_ID = 4;
    public static final int MISSING_RESULT = 5;
    public static final int INTERNAL_ERROR = 6;
    public static final int PROTOCOL_ERROR = 7;
    protected int type_ = 0;
    public TApplicationException() {
    }
    public TApplicationException(int type) {
        this.type_ = type;
    }
    public TApplicationException(int type, String message) {
        super(message);
        this.type_ = type;
    }
    public TApplicationException(String message) {
        super(message);
    }
    public int getType() {
        return this.type_;
    }
    public static TApplicationException read(TProtocol iprot) throws TException {
        iprot.readStructBegin();
        String message = null;
        int type = 0;
        while(true) {
            TField field = iprot.readFieldBegin();
            if (field.type == 0) { // 0 = TType.STOP: no more fields
                iprot.readStructEnd();
                return new TApplicationException(type, message);
            }
            switch(field.id) {
            case 1: // message
                if (field.type == 11) { // TType.STRING
                    message = iprot.readString();
                } else {
                    TProtocolUtil.skip(iprot, field.type);
                }
                break;
            case 2: // type
                if (field.type == 8) { // TType.I32
                    type = iprot.readI32();
                } else {
                    TProtocolUtil.skip(iprot, field.type);
                }
                break;
            default: // unknown field id: skip its value
                TProtocolUtil.skip(iprot, field.type);
            }
            iprot.readFieldEnd();
        }
    }
    public void write(TProtocol oprot) throws TException {
        oprot.writeStructBegin(TAPPLICATION_EXCEPTION_STRUCT);
        if (this.getMessage() != null) {
            oprot.writeFieldBegin(MESSAGE_FIELD);
            oprot.writeString(this.getMessage());
            oprot.writeFieldEnd();
        }
        oprot.writeFieldBegin(TYPE_FIELD);
        oprot.writeI32(this.type_);
        oprot.writeFieldEnd();
        oprot.writeFieldStop();
        oprot.writeStructEnd();
    }
}
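2. thrift protocols
TProtocol is the core class of the protocol layer; all serialization and deserialization in thrift goes through it. Let's first look at TProtocol's class hierarchy:
TProtocol is an abstract class. Four subclasses extend it, one for each of the four protocols thrift supports (this article is based on version 0.8.0):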
- JSON
- simpleJSON
- Binary
- Compact
The previous section covered thrift's message structures; now let's look at the overall structure of a complete thrift message:
<message> ::= <message-begin> <struct> <message-end>
<message-begin> ::= <method-name> <message-type> <message-seqid>
<method-name> ::= STRING
<message-type> ::= T_CALL | T_REPLY | T_EXCEPTION | T_ONEWAY
<message-seqid> ::= I32
<struct> ::= <struct-begin> <field>* <field-stop> <struct-end>
<struct-begin> ::= <struct-name>
<struct-name> ::= STRING
<field-stop> ::= T_STOP
<field> ::= <field-begin> <field-data> <field-end>
<field-begin> ::= <field-name> <field-type> <field-id>
<field-name> ::= STRING
<field-type> ::= T_BOOL | T_BYTE | T_I8 | T_I16 | T_I32 | T_I64 | T_DOUBLE
| T_STRING | T_BINARY | T_STRUCT | T_MAP | T_SET | T_LIST
<field-id> ::= I16
<field-data> ::= I8 | I16 | I32 | I64 | DOUBLE | STRING | BINARY
| <struct> | <map> | <list> | <set>
<map> ::= <map-begin> <field-datum>* <map-end>
<map-begin> ::= <map-key-type> <map-value-type> <map-size>
<map-key-type> ::= <field-type>
<map-value-type> ::= <field-type>
<map-size> ::= I32
<list> ::= <list-begin> <field-data>* <list-end>
<list-begin> ::= <list-elem-type> <list-size>
<list-elem-type> ::= <field-type>
<list-size> ::= I32
<set> ::= <set-begin> <field-data>* <set-end>
<set-begin> ::= <set-elem-type> <set-size>
<set-elem-type> ::= <field-type>
<set-size> ::= I32
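The grammar above describes the structure of a thrift message clearly. To summarize: at every level and for every kind of data (struct/map/set/list/message/field), a piece of thrift data can be described by the same three-part structure of begin, contents, end.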
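To make the three-part pattern concrete, here is a minimal sketch in plain Java. It does not use the thrift library; the class BeginContentEndTrace and its helpers are hypothetical names invented for illustration. It only builds a text trace of the order in which begin, contents, and end would appear for a message whose struct has a single list<i32> field.

```java
import java.util.Arrays;
import java.util.List;

class BeginContentEndTrace {
    // Hypothetical helper: surrounds a body with "<kind>Begin" and "<kind>End".
    static String wrap(String kind, String body) {
        return kind + "Begin " + body + kind + "End ";
    }

    // A list<i32>: listBegin, the elements, listEnd.
    static String traceList(List<Integer> values) {
        StringBuilder body = new StringBuilder();
        for (int v : values) {
            body.append("i32(").append(v).append(") ");
        }
        return wrap("list", body.toString());
    }

    // A message holding one struct with one list field and the stop marker.
    static String traceMessage(List<Integer> values) {
        String field = wrap("field", traceList(values));
        String struct = wrap("struct", field + "fieldStop ");
        return wrap("message", struct).trim();
    }

    public static void main(String[] args) {
        System.out.println(traceMessage(Arrays.asList(1, 2)));
    }
}
```

Running main prints "messageBegin structBegin fieldBegin listBegin i32(1) i32(2) listEnd fieldEnd fieldStop structEnd messageEnd": every level nests inside the begin and end of the level above it.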
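2.1 Binary protocol
The binary protocol is thrift's default wire format. Its core idea is to convert both the data and its descriptive metadata to binary for transmission, with a different encoding rule for each data type: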
Data type | Encoding rule |
integer | Encoded big-endian: int8 takes 1 byte, int16 2 bytes, int32 4 bytes, int64 8 bytes |
enum | The enum's ordinal value, encoded as an int32 |
binary | Binary protocol, binary data, 4+ bytes: byte length: length of the byte array; bytes: the byte array |
string | First encoded as UTF-8, then sent as bytes |
double | Converted to its 64-bit IEEE 754 representation as an int64, then encoded like an integer, big-endian |
boolean | Encoded as an int8: 1 for true, 0 for false |
Message | [diagram: Binary protocol Message, strict encoding, 12+ bytes] |
struct | struct ::= ( field-header field-value )* stop-field. A struct consists of zero or more fields followed by a stop field; the field-header is the field's header and consists of field-type and field-id. Binary protocol field header and field value: tttttttt: field type, 8 bits (values listed below); field id: the id defined in the IDL, 16 bits; field value: the field's value |
list, set | [diagram: Binary protocol list (5+ bytes) and elements] |
map | [diagram: Binary protocol map (6+ bytes) and key value pairs] |
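As a rough illustration of the strict Message encoding listed in the table, the sketch below builds the header by hand with java.nio.ByteBuffer instead of going through TBinaryProtocol. The class and method names are hypothetical. The layout assumed here is a 4-byte version word combined with the message type, a length-prefixed UTF-8 method name, and a 4-byte seqid, which is where the figure of 12+ bytes comes from.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BinaryMessageHeaderSketch {
    static final int VERSION_1 = 0x80010000; // strict-encoding version marker
    static final byte CALL = 1;              // message type for a request

    static byte[] encodeMessageBegin(String name, byte type, int seqid) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(12 + utf8.length) // ByteBuffer is big-endian by default
                .putInt(VERSION_1 | (type & 0xFF))   // 4 bytes: version and message type
                .putInt(utf8.length)                 // 4 bytes: length of the method name
                .put(utf8)                           // n bytes: method name as UTF-8
                .putInt(seqid)                       // 4 bytes: sequence id
                .array();
    }

    public static void main(String[] args) {
        byte[] header = encodeMessageBegin("sayHello", CALL, 7);
        System.out.println(header.length + " bytes");
    }
}
```

For the 8-character method name sayHello the header is 12 + 8 = 20 bytes; an empty name would give the 12-byte minimum.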
The encoded values of each type are as follows:
package org.apache.thrift.protocol;
public final class TType {
    public static final byte STOP = 0;
    public static final byte VOID = 1;
    public static final byte BOOL = 2;
    public static final byte BYTE = 3;
    public static final byte DOUBLE = 4;
    public static final byte I16 = 6;
    public static final byte I32 = 8;
    public static final byte I64 = 10;
    public static final byte STRING = 11;
    public static final byte STRUCT = 12;
    public static final byte MAP = 13;
    public static final byte SET = 14;
    public static final byte LIST = 15;
    public static final byte ENUM = 16;
    public TType() {
    }
}
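It is worth noting that thrift's struct encoding depends on neither field order nor field names, only on field ids. So even if we reorder fields or rename them in the IDL, data is still transmitted correctly, as long as the ids stay the same.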
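The sketch below illustrates the renaming half of this claim for the binary protocol. It hand-encodes one i32 field as a 1-byte type, a 2-byte id, and a 4-byte value; the class and method are hypothetical, and the name parameter exists only to mirror TField and is deliberately never written.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class FieldNameNotOnWireSketch {
    static final byte TYPE_I32 = 8; // same value as TType.I32

    // Encodes one i32 field; the name is accepted but never serialized.
    static byte[] encodeI32Field(String name, short id, int value) {
        return ByteBuffer.allocate(7)
                .put(TYPE_I32)   // 1 byte: field type
                .putShort(id)    // 2 bytes: field id, big-endian
                .putInt(value)   // 4 bytes: field value, big-endian
                .array();
    }

    public static void main(String[] args) {
        byte[] before = encodeI32Field("age", (short) 1, 30);
        byte[] after = encodeI32Field("years", (short) 1, 30); // renamed in the IDL, same id
        System.out.println(Arrays.equals(before, after));
    }
}
```

Both calls produce the same seven bytes, so a peer compiled against the old field name still decodes the value. Changing the id, by contrast, changes the bytes.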
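2.2 Compact protocol
To reduce the amount of data sent over the network, thrift provides a compact protocol built on zigzag encoding and variable-length integers (varint). Zigzag encoding maps signed integers to unsigned ones so that values of small magnitude, negative ones included, become small numbers; varint encoding then stores small numbers in fewer bytes. This blog post explains both algorithms very clearly: https://blog.youkuaiyun.com/zgwangbo/article/details/51590186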
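The two algorithms are short enough to sketch directly. The following is a standalone illustration for 32-bit values, not the code of TCompactProtocol; the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;

class ZigZagVarintSketch {
    // Zigzag: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static int zigzag32(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Inverse of zigzag32.
    static int unzigzag32(int z) {
        return (z >>> 1) ^ -(z & 1);
    }

    // Varint: 7 payload bits per byte, high bit set on every byte except the last.
    static byte[] varint32(int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((n & ~0x7F) != 0) {
            out.write((n & 0x7F) | 0x80);
            n >>>= 7;
        }
        out.write(n);
        return out.toByteArray();
    }

    // An int32 value: zigzag first, then varint.
    static byte[] encodeI32(int value) {
        return varint32(zigzag32(value));
    }

    public static void main(String[] args) {
        System.out.println(encodeI32(-1).length);                // small magnitude, few bytes
        System.out.println(encodeI32(Integer.MIN_VALUE).length); // worst case for an int32
    }
}
```

With this scheme -1 takes a single byte while Integer.MIN_VALUE takes five, which matches the 1 to 5 byte range for int32 in the compact protocol. The per-type rules of the compact protocol build on these two primitives.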
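Data type | Encoding rule |
integer | Encoded with zigzag and varint: an int32 takes 1 to 5 bytes and an int64 1 to 10 bytes; an int16 is first widened to int32 and then encoded like an int32; an int8 is sent as a single byte, uncompressed |
enum | The enum's ordinal value as an int32, then encoded like an int32 |
binary | Compact protocol, binary data, 1+ bytes: byte length: length of the byte array, as a varint; bytes: the byte array |
string | First encoded as UTF-8, then sent as bytes |
double | Converted to its 64-bit IEEE 754 representation as an int64 and written as 8 fixed bytes; unlike the binary protocol, the bytes are in little-endian order because of an early implementation bug |
boolean | As a struct field, the value is carried in the type nibble of the field header, with no separate value byte; as a collection element, sent as an int8, with 1 for true |
Message | [diagram: Compact protocol Message (4+ bytes)] |
struct | struct ::= ( field-header field-value )* stop-field. A struct consists of zero or more fields followed by a stop field; the field-header is the field's header and describes field-type and field-id. [diagram: Compact protocol field header (short form) and field value] dddd: field id delta, 4-bit unsigned integer, strictly positive; tttt: field type; field value: the field's value. [diagram: Compact protocol field header (1 to 3 bytes, long form) and field value] |
list, set | [diagram: Compact protocol list header (1 byte, short form) and elements] [diagram: Compact protocol list header (2+ bytes, long form) and elements] |
map | map ::= empty-map | non-empty-map. [diagram: Compact protocol map header (1 byte, empty map)] [diagram: Compact protocol map header (2+ bytes, non empty map) and key value pairs] |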
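The short and long forms of the compact field header can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the class and method names are hypothetical, the compact type id 5 for i32 is assumed from the compact protocol's own type table (which differs from TType), and the long form is assumed to write the field id as a zigzag varint.

```java
import java.io.ByteArrayOutputStream;

class CompactFieldHeaderSketch {
    static final int COMPACT_I32 = 5; // assumed compact-protocol type id for i32

    static void writeVarint(ByteArrayOutputStream out, int n) {
        while ((n & ~0x7F) != 0) {
            out.write((n & 0x7F) | 0x80);
            n >>>= 7;
        }
        out.write(n);
    }

    // Short form when 1 <= delta <= 15: one byte "ddddtttt".
    // Long form otherwise: a type byte followed by the zigzag-varint field id.
    static void writeFieldHeader(ByteArrayOutputStream out, int compactType, int id, int lastId) {
        int delta = id - lastId;
        if (delta > 0 && delta <= 15) {
            out.write((delta << 4) | compactType);
        } else {
            out.write(compactType);
            writeVarint(out, (id << 1) ^ (id >> 31));
        }
    }

    // Headers for a run of i32 fields with the given ids (values omitted).
    static byte[] headersFor(int... ids) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lastId = 0;
        for (int id : ids) {
            writeFieldHeader(out, COMPACT_I32, id, lastId);
            lastId = id;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : headersFor(1, 2, 20)) {
            System.out.printf("%02x ", b);
        }
        System.out.println();
    }
}
```

For field ids 1, 2, and 20 the headers are 15, 15, and 05 28 in hex: the first two fit the one-byte short form because each id is one more than the previous, while the jump from 2 to 20 exceeds 15 and falls back to the long form.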
2.3 JSON protocol
Serializes data as JSON structures.
3. How the protocol layer works
First, let's look at the class structure of TProtocol:
package org.apache.thrift.protocol;
import java.nio.ByteBuffer;
import org.apache.thrift.TException;
import org.apache.thrift.scheme.IScheme;
import org.apache.thrift.scheme.StandardScheme;
import org.apache.thrift.transport.TTransport;
public abstract class TProtocol {
    protected TTransport trans_;
    private TProtocol() {
    }
    protected TProtocol(TTransport trans) {
        this.trans_ = trans;
    }
    public TTransport getTransport() {
        return this.trans_;
    }
    public abstract void writeMessageBegin(TMessage var1) throws TException;
    public abstract void writeMessageEnd() throws TException;
    public abstract void writeStructBegin(TStruct var1) throws TException;
    public abstract void writeStructEnd() throws TException;
    public abstract void writeFieldBegin(TField var1) throws TException;
    public abstract void writeFieldEnd() throws TException;
    public abstract void writeFieldStop() throws TException;
    public abstract void writeMapBegin(TMap var1) throws TException;
    public abstract void writeMapEnd() throws TException;
    public abstract void writeListBegin(TList var1) throws TException;
    public abstract void writeListEnd() throws TException;
    public abstract void writeSetBegin(TSet var1) throws TException;
    public abstract void writeSetEnd() throws TException;
    public abstract void writeBool(boolean var1) throws TException;
    public abstract void writeByte(byte var1) throws TException;
    public abstract void writeI16(short var1) throws TException;
    public abstract void writeI32(int var1) throws TException;
    public abstract void writeI64(long var1) throws TException;
    public abstract void writeDouble(double var1) throws TException;
    public abstract void writeString(String var1) throws TException;
    public abstract void writeBinary(ByteBuffer var1) throws TException;
    public abstract TMessage readMessageBegin() throws TException;
    public abstract void readMessageEnd() throws TException;
    public abstract TStruct readStructBegin() throws TException;
    public abstract void readStructEnd() throws TException;
    public abstract TField readFieldBegin() throws TException;
    public abstract void readFieldEnd() throws TException;
    public abstract TMap readMapBegin() throws TException;
    public abstract void readMapEnd() throws TException;
    public abstract TList readListBegin() throws TException;
    public abstract void readListEnd() throws TException;
    public abstract TSet readSetBegin() throws TException;
    public abstract void readSetEnd() throws TException;
    public abstract boolean readBool() throws TException;
    public abstract byte readByte() throws TException;
    public abstract short readI16() throws TException;
    public abstract int readI32() throws TException;
    public abstract long readI64() throws TException;
    public abstract double readDouble() throws TException;
    public abstract String readString() throws TException;
    public abstract ByteBuffer readBinary() throws TException;
    public void reset() {
    }
    public Class<? extends IScheme> getScheme() {
        return StandardScheme.class;
    }
}
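Each protocol implements serialization and deserialization for all the data structures and data types introduced above. All protocols work the same way; take the read method of HelloService.sayHello_argsStandardScheme as an example: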
public void read(org.apache.thrift.protocol.TProtocol iprot, sayHello_args struct) throws org.apache.thrift.TException {
    org.apache.thrift.protocol.TField schemeField;
    iprot.readStructBegin(); // begin the struct (returns a TStruct)
    while (true)
    {
        schemeField = iprot.readFieldBegin(); // read the next field header (returns a TField)
        if (schemeField.type == org.apache.thrift.protocol.TType.STOP) {
            break; // stop field reached: end the loop
        }
        switch (schemeField.id) {
            case 1: // WORD
                if (schemeField.type == org.apache.thrift.protocol.TType.STRING) {
                    struct.word = iprot.readString(); // type matches: deserialize the string
                    struct.setWordIsSet(true);
                } else {
                    org.apache.thrift.protocol.TProtocolUtil.skip(iprot, schemeField.type); // type mismatch: skip the value according to its wire type
                }
                break;
            default:
                org.apache.thrift.protocol.TProtocolUtil.skip(iprot, schemeField.type); // unknown field id: skip the value
        }
        iprot.readFieldEnd(); // finish deserializing the field
    }
    iprot.readStructEnd(); // finish deserializing the struct
    // check for required fields of primitive type, which can't be checked in the validate method
    struct.validate();
}
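Overall, the serialized data delivered by the underlying transport is read from the TTransport held by the TProtocol object and then deserialized in structural order: structBegin -> fieldBegin -> read field -> fieldEnd -> structEnd.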
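The same loop can be reproduced without the thrift library to see what happens at the byte level. The sketch below assumes the binary protocol's field layout (1-byte type, 2-byte id, then the value, terminated by a stop byte) and reads from a java.nio.ByteBuffer standing in for the transport. All names are hypothetical, and skip handles only the two types used here, whereas TProtocolUtil.skip covers every type.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BinaryReadLoopSketch {
    static final byte STOP = 0, I32 = 8, STRING = 11; // same values as TType

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[in.getInt()]; // 4-byte length prefix
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Skips one value by wire type; only the types used in this sketch are handled.
    static void skip(ByteBuffer in, byte type) {
        switch (type) {
            case I32:
                in.position(in.position() + 4);
                break;
            case STRING:
                int length = in.getInt();
                in.position(in.position() + length);
                break;
            default:
                throw new IllegalArgumentException("unsupported type " + type);
        }
    }

    // Mirrors the generated read loop for a struct whose field 1 is a string "word".
    static String readWord(ByteBuffer in) {
        String word = null;
        while (true) {
            byte type = in.get();          // field header: type
            if (type == STOP) {
                break;                     // stop field ends the struct
            }
            short id = in.getShort();      // field header: id
            if (id == 1 && type == STRING) {
                word = readString(in);
            } else {
                skip(in, type);            // unknown id or mismatched type
            }
        }
        return word;
    }

    // A struct with an unknown i32 field (id 2) followed by word = "hi" (id 1).
    static byte[] sample() {
        byte[] word = "hi".getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(3 + 4 + 3 + 4 + word.length + 1)
                .put(I32).putShort((short) 2).putInt(42)
                .put(STRING).putShort((short) 1).putInt(word.length).put(word)
                .put(STOP)
                .array();
    }

    public static void main(String[] args) {
        System.out.println(readWord(ByteBuffer.wrap(sample())));
    }
}
```

The reader recovers "hi" even though an unknown field precedes it, because the type byte in each field header tells skip how many bytes to pass over.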
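4. Summary
This article covered thrift's protocol layer: first the structure of the data thrift transmits, then the protocols thrift supports, and finally the execution flow of the protocol layer.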