4. Protocol Layer

 

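Contents

1. Thrift message structure
1.1 Control messages
1.2 Data messages
1.3 Exception messages
2. Thrift protocols
2.1 Binary protocol
2.2 Compact protocol
2.3 JSON protocol
3. How the protocol layer works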


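In the previous article we saw that when the server invokes the processing layer, it calls the process method of the Processor, whose parameters include a TProtocol object. Likewise, the client object requires a TProtocol argument in its constructor. In the processing layer, the client writes the RPC request parameters and reads the result, while the server reads the request parameters and writes the method result; all of these steps are ultimately delegated to TProtocol. As described earlier, Thrift's protocol layer serializes and deserializes request parameters and results according to the protocol the user selects, so that different systems can interpret the data unambiguously. This article digs into the TProtocol source code to see how the protocol layer works.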

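1. Thrift message structure

Before looking at the protocol layer, we need to understand how Thrift, as an RPC framework, structures the data it transmits.

In Thrift, transmitted data falls into four categories:

  • Control message: the TMessage class, which holds the name of the method being called, the message type, and the sequence id
  • Request parameters: the xx_args class generated in the processing layer
  • Return result: the xx_result class generated in the processing layer
  • Exception: the TApplicationException class, which holds the exception type and the exception message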

1.1 Control messages

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package org.apache.thrift.protocol;

public final class TMessage {
    public final String name;// message name; in most cases the name of the RPC method being called
    public final byte type;// message type (CALL, REPLY, EXCEPTION, or ONEWAY)
    public final int seqid;// sequence id

    public TMessage() {
        this("", (byte)0, 0);
    }

    public TMessage(String n, byte t, int s) {
        this.name = n;
        this.type = t;
        this.seqid = s;
    }

    public String toString() {
        return "<TMessage name:'" + this.name + "' type: " + this.type + " seqid:" + this.seqid + ">";
    }

    public boolean equals(Object other) {
        return other instanceof TMessage ? this.equals((TMessage)other) : false;
    }

    public boolean equals(TMessage other) {
        return this.name.equals(other.name) && this.type == other.type && this.seqid == other.seqid;
    }
}

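1.2 Data messages

All data that Thrift transmits is defined in the IDL. Thrift maps each IDL type to a Java type and wraps it as a data message. Besides the primitive types, the IDL also supports the struct, map, list, and set types. Thrift describes them with the following four metadata classes (TSet has the same shape as TList and is not listed separately):

TStruct: struct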

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package org.apache.thrift.protocol;

public final class TStruct {
    public final String name;

    public TStruct() {
        this("");
    }

    public TStruct(String n) {
        this.name = n;
    }
}

TField: struct field

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.thrift.protocol;

/**
 * Helper class that encapsulates field metadata.
 *
 */
public class TField {
  public TField() {
    this("", TType.STOP, (short)0);
  }

  public TField(String n, byte t, short i) {
    name = n;
    type = t;
    id = i;
  }

  public final String name;
  public final byte   type;
  public final short  id;

  public String toString() {
    return "<TField name:'" + name + "' type:" + type + " field-id:" + id + ">";
  }

  public boolean equals(TField otherField) {
    return type == otherField.type && id == otherField.id;
  }
}

TList:list

package org.apache.thrift.protocol;

/**
 * Helper class that encapsulates list metadata.
 *
 */
public final class TList {
  public TList() {
    this(TType.STOP, 0);
  }

  public TList(byte t, int s) {
    elemType = t;
    size = s;
  }

  public final byte elemType;
  public final int  size;
}

TMap:map

package org.apache.thrift.protocol;

/**
 * Helper class that encapsulates map metadata.
 *
 */
public final class TMap {
  public TMap() {
    this(TType.STOP, TType.STOP, 0);
  }

  public TMap(byte k, byte v, int s) {
    keyType = k;
    valueType = v;
    size = s;
  }

  public final byte  keyType;
  public final byte  valueType;
  public final int   size;
}

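1.3 Exception messages

The application exception is the message structure that wraps exceptions. When the type of the control message is EXCEPTION, Thrift wraps the payload in an application exception; in Java this structure is implemented by the TApplicationException class. Its read/write methods play a role similar to the ones in the processing layer covered in the previous article: they call the read/write methods implemented by the TProtocol that is passed in to serialize and deserialize the message.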

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package org.apache.thrift;

import org.apache.thrift.protocol.TField;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.protocol.TProtocolUtil;
import org.apache.thrift.protocol.TStruct;

public class TApplicationException extends TException {
    private static final TStruct TAPPLICATION_EXCEPTION_STRUCT = new TStruct("TApplicationException");
    private static final TField MESSAGE_FIELD = new TField("message", (byte)11, (short)1);
    private static final TField TYPE_FIELD = new TField("type", (byte)8, (short)2);
    private static final long serialVersionUID = 1L;
    public static final int UNKNOWN = 0;
    public static final int UNKNOWN_METHOD = 1;
    public static final int INVALID_MESSAGE_TYPE = 2;
    public static final int WRONG_METHOD_NAME = 3;
    public static final int BAD_SEQUENCE_ID = 4;
    public static final int MISSING_RESULT = 5;
    public static final int INTERNAL_ERROR = 6;
    public static final int PROTOCOL_ERROR = 7;
    protected int type_ = 0;

    public TApplicationException() {
    }

    public TApplicationException(int type) {
        this.type_ = type;
    }

    public TApplicationException(int type, String message) {
        super(message);
        this.type_ = type;
    }

    public TApplicationException(String message) {
        super(message);
    }

    public int getType() {
        return this.type_;
    }

    public static TApplicationException read(TProtocol iprot) throws TException {
        iprot.readStructBegin();
        String message = null;
        int type = 0;

        while(true) {
            TField field = iprot.readFieldBegin();
            if (field.type == 0) {
                iprot.readStructEnd();
                return new TApplicationException(type, message);
            }

            switch(field.id) {
            case 1:
                if (field.type == 11) {
                    message = iprot.readString();
                } else {
                    TProtocolUtil.skip(iprot, field.type);
                }
                break;
            case 2:
                if (field.type == 8) {
                    type = iprot.readI32();
                } else {
                    TProtocolUtil.skip(iprot, field.type);
                }
                break;
            default:
                TProtocolUtil.skip(iprot, field.type);
            }

            iprot.readFieldEnd();
        }
    }

    public void write(TProtocol oprot) throws TException {
        oprot.writeStructBegin(TAPPLICATION_EXCEPTION_STRUCT);
        if (this.getMessage() != null) {
            oprot.writeFieldBegin(MESSAGE_FIELD);
            oprot.writeString(this.getMessage());
            oprot.writeFieldEnd();
        }

        oprot.writeFieldBegin(TYPE_FIELD);
        oprot.writeI32(this.type_);
        oprot.writeFieldEnd();
        oprot.writeFieldStop();
        oprot.writeStructEnd();
    }
}

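2. Thrift protocols

TProtocol is the core class of the protocol layer; all serialization and deserialization in Thrift goes through it. Let's first look at the class diagram of TProtocol:

[Figure: TProtocol class diagram]

As the diagram shows, TProtocol is an abstract class with four subclasses, one for each of the four protocols Thrift supports (this article is based on version 0.8.0):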

  • JSON
  • simpleJSON
  • Binary
  • Compact

The previous section introduced the individual message structures. Now let's look at the overall structure of a complete Thrift message:

       <message> ::= <message-begin> <struct> <message-end>
 <message-begin> ::= <method-name> <message-type> <message-seqid>
   <method-name> ::= STRING
  <message-type> ::= T_CALL | T_REPLY | T_EXCEPTION | T_ONEWAY
 <message-seqid> ::= I32
        <struct> ::= <struct-begin> <field>* <field-stop> <struct-end>
  <struct-begin> ::= <struct-name>
   <struct-name> ::= STRING
    <field-stop> ::= T_STOP
         <field> ::= <field-begin> <field-data> <field-end>
   <field-begin> ::= <field-name> <field-type> <field-id>
    <field-name> ::= STRING
    <field-type> ::= T_BOOL | T_BYTE | T_I8 | T_I16 | T_I32 | T_I64 | T_DOUBLE
                     | T_STRING | T_BINARY | T_STRUCT | T_MAP | T_SET | T_LIST
      <field-id> ::= I16
    <field-data> ::= I8 | I16 | I32 | I64 | DOUBLE | STRING | BINARY
                     | <struct> | <map> | <list> | <set>
           <map> ::= <map-begin> <field-datum>* <map-end>
     <map-begin> ::= <map-key-type> <map-value-type> <map-size>
  <map-key-type> ::= <field-type>
<map-value-type> ::= <field-type>
      <map-size> ::= I32
          <list> ::= <list-begin> <field-data>* <list-end>
    <list-begin> ::= <list-elem-type> <list-size>
<list-elem-type> ::= <field-type>
     <list-size> ::= I32
           <set> ::= <set-begin> <field-data>* <set-end>
     <set-begin> ::= <set-elem-type> <set-size>
 <set-elem-type> ::= <field-type>
      <set-size> ::= I32

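The grammar above describes the structure of a Thrift message quite clearly. To summarize: at every level and for every kind of data (struct, map, set, list, message, field), a Thrift message can be described with the same three-part pattern of begin, content, end.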
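
The begin, content, end nesting can be made concrete with a small sketch. The class below is purely illustrative: NestingTraceSketch and its method names are hypothetical and are not part of libthrift. It only records the order in which the grammar's productions would be emitted for a call such as sayHello(word).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer: records the order of begin/content/end events.
final class NestingTraceSketch {
    private final List<String> events = new ArrayList<>();

    // message ::= message-begin struct message-end
    void writeMessage(String method, String argValue) {
        events.add("messageBegin(" + method + ")");
        writeArgsStruct(argValue);
        events.add("messageEnd");
    }

    // struct ::= struct-begin field* field-stop struct-end
    private void writeArgsStruct(String argValue) {
        events.add("structBegin");
        writeStringField(1, argValue);
        events.add("fieldStop");
        events.add("structEnd");
    }

    // field ::= field-begin field-data field-end
    private void writeStringField(int id, String value) {
        events.add("fieldBegin(id=" + id + ")");
        events.add("string(" + value + ")");
        events.add("fieldEnd");
    }

    List<String> events() {
        return events;
    }

    public static void main(String[] args) {
        NestingTraceSketch trace = new NestingTraceSketch();
        trace.writeMessage("sayHello", "world");
        trace.events().forEach(System.out::println);
    }
}
```

Each method opens its level, delegates to the level below, and then closes its own level, which is the same shape the real protocol classes expose through their writeXxxBegin/writeXxxEnd pairs.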

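2.1 Binary protocol

The binary protocol is Thrift's default protocol format. Its core idea is to convert both the data and its descriptive metadata into binary for transmission. Different data types have different encoding rules: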

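Encoding rules by data type:

integer: encoded in big-endian byte order; int8 takes 1 byte, int16 2 bytes, int32 4 bytes, and int64 8 bytes
enum: encoded as an int32 holding the enum's numeric value
binary:

Binary protocol, binary data, 4+ bytes:
+--------+--------+--------+--------+--------+...+--------+
| byte length                       | bytes                |
+--------+--------+--------+--------+--------+...+--------+

  • byte length: length of the byte array, an int32
  • bytes: the byte array itself

string: first encoded as UTF-8, then sent as binary (length prefix followed by the bytes)
double: converted to an int64 according to the IEEE 754 bit layout, then encoded like any other integer, in big-endian order
boolean: encoded as an int8; 1 is true, 0 is false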
Message:

Binary protocol Message, strict encoding, 12+ bytes:
+--------+--------+--------+--------+--------+--------+--------+--------+--------+...+--------+--------+--------+--------+--------+
|1vvvvvvv|vvvvvvvv|unused  |00000mmm| name length                       | name                | seq id                            |
+--------+--------+--------+--------+--------+--------+--------+--------+--------+...+--------+--------+--------+--------+--------+

  • 1vvvvvvvvvvvvvvv: version; apart from the leading 1 bit, a 15-bit value
  • unused: 1 reserved byte
  • mmm: message type
  • name length: byte length of the name, an int32 (4 bytes)
  • name: method name, a UTF-8 string
  • seq id: sequence id, an int32
struct:

struct       ::= ( field-header field-value )* stop-field
field-header ::= field-type field-id

A struct consists of zero or more fields followed by a stop field. The field-header is the header of each field and is made up of the field-type and the field-id.

  • field:

Binary protocol field header and field value:
+--------+--------+--------+--------+...+--------+
|tttttttt| field id        | field value         |
+--------+--------+--------+--------+...+--------+

    tttttttt: field type, 8 bits (the values are listed below)
    field id: the field id defined in the IDL, 16 bits
    field value: the field's value

  • stop-field: end marker

+--------+
|00000000|
+--------+

list, set:

Binary protocol list (5+ bytes) and elements:
+--------+--------+--------+--------+--------+--------+...+--------+
|tttttttt| size                              | elements            |
+--------+--------+--------+--------+--------+--------+...+--------+

  • tttttttt: element type
  • size: number of elements, an int32
  • elements: the element values
map:

Binary protocol map (6+ bytes) and key value pairs:
+--------+--------+--------+--------+--------+--------+--------+...+--------+
|kkkkkkkk|vvvvvvvv| size                              | key value pairs     |
+--------+--------+--------+--------+--------+--------+--------+...+--------+

  • kkkkkkkk: key type
  • vvvvvvvv: value type
  • size: number of key-value pairs, an int32
  • key value pairs: the encoded keys and values
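
To make the layouts above concrete, here is a minimal sketch that builds the strict message header and a double by hand with java.nio.ByteBuffer. BinaryHeaderSketch and its methods are made up for illustration and are not the libthrift API. The constant 0x80010000 assumes version 1 with the leading bit set, and the CALL value of 1 is likewise an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: hand-rolled binary protocol layout, not the libthrift API.
final class BinaryHeaderSketch {
    static final int VERSION_1 = 0x80010000; // leading 1 bit + version 1; low byte holds the type
    static final byte CALL = 1;              // assumed message type value for a call

    // Strict message begin: version|type, name length, name, seq id.
    static byte[] messageBegin(String name, byte type, int seqId) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        // ByteBuffer is big-endian by default, which matches the binary protocol.
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + utf8.length + 4);
        buf.putInt(VERSION_1 | type);
        buf.putInt(utf8.length);
        buf.put(utf8);
        buf.putInt(seqId);
        return buf.array();
    }

    // double: IEEE 754 bit pattern written as a big-endian int64.
    static byte[] doubleValue(double d) {
        return ByteBuffer.allocate(8).putLong(Double.doubleToLongBits(d)).array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(messageBegin("sayHello", CALL, 1)));
        System.out.println(hex(doubleValue(1.0)));
    }
}
```

For sayHello with type CALL and seq id 1 this yields 20 bytes: 4 for version and type, 4 for the name length, 8 for the name, and 4 for the seq id.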

The type codes are as follows:

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package org.apache.thrift.protocol;

public final class TType {
    public static final byte STOP = 0;
    public static final byte VOID = 1;
    public static final byte BOOL = 2;
    public static final byte BYTE = 3;
    public static final byte DOUBLE = 4;
    public static final byte I16 = 6;
    public static final byte I32 = 8;
    public static final byte I64 = 10;
    public static final byte STRING = 11;
    public static final byte STRUCT = 12;
    public static final byte MAP = 13;
    public static final byte SET = 14;
    public static final byte LIST = 15;
    public static final byte ENUM = 16;

    public TType() {
    }
}

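It is worth noting that Thrift's encoding of a struct does not depend on the order or the names of its fields, only on their ids. So even if we reorder or rename fields in the IDL, data is still transmitted correctly, as long as the ids stay the same.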
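
A minimal sketch of why renaming is harmless in the binary protocol: the hypothetical encode method below accepts a field name but never writes it, so two different names with the same id produce identical bytes. FieldIdSketch is illustrative only and is not the libthrift API; the type codes are the TType values listed above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: a struct with one string field in binary protocol layout.
final class FieldIdSketch {
    static final byte T_STOP = 0;    // TType.STOP
    static final byte T_STRING = 11; // TType.STRING

    // fieldName is accepted only to show that it never reaches the wire.
    static byte[] encode(String fieldName, short fieldId, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(1 + 2 + 4 + utf8.length + 1);
        buf.put(T_STRING);       // field type, 1 byte
        buf.putShort(fieldId);   // field id, 2 bytes, big-endian
        buf.putInt(utf8.length); // string length, 4 bytes
        buf.put(utf8);           // string bytes
        buf.put(T_STOP);         // stop field closes the struct
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] original = encode("word", (short) 1, "hi");
        byte[] renamed = encode("greeting", (short) 1, "hi");
        System.out.println(java.util.Arrays.equals(original, renamed));
    }
}
```

Changing the id, on the other hand, changes the bytes, which is why ids must stay stable across IDL revisions.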

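2.2 Compact protocol

To reduce the amount of data sent over the network, Thrift provides a compact protocol built on ZigZag encoding and variable-length integers (varints). ZigZag maps signed integers to unsigned ones so that values of small magnitude stay small, and the varint encoding then stores them in as few bytes as possible. For a clear explanation of both algorithms, see this blog post: https://blog.youkuaiyun.com/zgwangbo/article/details/51590186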
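
As a rough sketch of the two algorithms for 32-bit values, assuming the usual definitions of ZigZag and varint encoding: ZigZagVarintSketch and its method names are hypothetical helpers written for this article, not libthrift methods.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: ZigZag and varint encoding for 32-bit integers.
final class ZigZagVarintSketch {
    // Interleaves negative and positive values: 0, -1, 1, -2 become 0, 1, 2, 3.
    static int zigZag32(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Inverse of zigZag32.
    static int unZigZag32(int z) {
        return (z >>> 1) ^ -(z & 1);
    }

    // Varint: 7 payload bits per byte, high bit set on every byte except the last.
    static byte[] varint32(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // A small negative number needs a single byte once ZigZag is applied first.
        System.out.println(varint32(zigZag32(-1)).length);
        // Without ZigZag, -1 has all 32 bits set and needs the maximum of five bytes.
        System.out.println(varint32(-1).length);
    }
}
```

The ZigZag step matters because a varint only saves space when the high bits are zero, and two's complement negatives have all their high bits set.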

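Encoding rules by data type:

integer: compressed with ZigZag and varint encoding; an int32 takes 1 to 5 bytes and an int64 takes 1 to 10 bytes; an int16 is first widened to int32 and then encoded like an int32; an int8 is written as a single byte without compression
enum: encoded as an int32 holding the enum's numeric value, then compressed like any int32
binary:

Compact protocol, binary data, 1+ bytes:
+--------+...+--------+--------+...+--------+
| byte length         | bytes               |
+--------+...+--------+--------+...+--------+

  • byte length: length of the byte array, encoded as a varint
  • bytes: the byte array itself

string: first encoded as UTF-8, then sent as binary (length prefix followed by the bytes)
double: converted to an int64 according to the IEEE 754 bit layout and written as 8 bytes; unlike the binary protocol, the compact protocol uses little-endian order, the result of an early implementation bug that became the de facto standard
boolean: encoded as an int8, 1 for true and 0 for false; when the boolean is a struct field, its value is instead carried in the type nibble of the field header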
Message:

Compact protocol Message, 4+ bytes:
+--------+--------+--------+...+--------+--------+...+--------+--------+...+--------+
|pppppppp|mmmvvvvv| seq id              | name length         | name                |
+--------+--------+--------+...+--------+--------+...+--------+--------+...+--------+

  • pppppppp: protocol id, fixed at 0x82 (1000 0010)
  • mmm: message type, 3 bits
  • vvvvv: version, a 5-bit unsigned integer, fixed at 00001
  • seq id: sequence id, an int32 encoded as a varint
  • name length: byte length of the name, encoded as a varint
  • name: method name, a UTF-8 string
struct:

struct       ::= ( field-header field-value )* stop-field
field-header ::= field-type field-id

A struct consists of zero or more fields followed by a stop field. The field-header is the header of each field and is made up of the field-type and the field-id.

  • field:

Compact protocol field header (short form) and field value:
+--------+--------+...+--------+
|ddddtttt| field value         |
+--------+--------+...+--------+

    dddd: field id delta (the difference from the previous field id), a 4-bit unsigned integer, strictly positive
    tttt: field type, 4 bits
    field value: the field's value

Compact protocol field header (2 to 4 bytes, long form) and field value:
+--------+--------+...+--------+--------+...+--------+
|0000tttt| field id            | field value         |
+--------+--------+...+--------+--------+...+--------+

    field id: the field id, an int16 encoded as a ZigZag varint (1 to 3 bytes)

  • stop-field: end marker

+--------+
|00000000|
+--------+

list, set:

Compact protocol list header (1 byte, short form) and elements:
+--------+--------+...+--------+
|sssstttt| elements            |
+--------+--------+...+--------+

  • ssss: number of elements (0 to 14), a 4-bit unsigned integer
  • tttt: element type
  • elements: the element values

Compact protocol list header (2+ bytes, long form) and elements:
+--------+--------+...+--------+--------+...+--------+
|1111tttt| size                | elements            |
+--------+--------+...+--------+--------+...+--------+

  • size: number of elements, encoded as a varint

map:

map           ::= empty-map | non-empty-map
empty-map     ::= `0`
non-empty-map ::= size key-element-type value-element-type (key value)+

Compact protocol map header (1 byte, empty map):
+--------+
|00000000|
+--------+

Compact protocol map header (2+ bytes, non empty map) and key value pairs:
+--------+...+--------+--------+--------+...+--------+
| size                |kkkkvvvv| key value pairs     |
+--------+...+--------+--------+--------+...+--------+

  • size: number of key-value pairs, encoded as a varint
  • kkkk: key type, 4 bits
  • vvvv: value type, 4 bits
  • key value pairs: the encoded keys and values
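
The short and long header forms above can be sketched as follows. CompactHeaderSketch is a hypothetical helper, not the libthrift API. The compact type id 5 for i32 comes from the compact protocol's own type table, which this article does not list, so treat it as an assumption here.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: hand-rolled compact protocol headers, not the libthrift API.
final class CompactHeaderSketch {
    static final int PROTOCOL_ID = 0x82; // fixed protocol id byte
    static final int VERSION = 1;        // 5-bit version
    static final int CT_I32 = 5;         // assumed compact type id for i32

    // Varint: 7 payload bits per byte, high bit set on all but the last byte.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Field header: short form when the id delta is 1..15, long form otherwise.
    static byte[] fieldHeader(int previousId, int fieldId, int compactType) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int delta = fieldId - previousId;
        if (delta > 0 && delta <= 15) {
            out.write((delta << 4) | compactType);              // ddddtttt
        } else {
            out.write(compactType);                             // 0000tttt
            writeVarint(out, (fieldId << 1) ^ (fieldId >> 31)); // ZigZag field id
        }
        return out.toByteArray();
    }

    // List header: short form for up to 14 elements, long form otherwise.
    static byte[] listHeader(int size, int compactElemType) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (size <= 14) {
            out.write((size << 4) | compactElemType);           // sssstttt
        } else {
            out.write(0xF0 | compactElemType);                  // 1111tttt
            writeVarint(out, size);
        }
        return out.toByteArray();
    }

    // First two bytes of a message: protocol id, then mmmvvvvv.
    static byte[] messagePrefix(int messageType) {
        return new byte[] {(byte) PROTOCOL_ID, (byte) ((messageType << 5) | VERSION)};
    }
}
```

With consecutive field ids, which is the common case in an IDL, every field header fits in a single byte; the long form is only needed when ids jump by more than 15 or go backwards.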

2.3 JSON protocol

The JSON protocol serializes data into a JSON structure.

3. How the protocol layer works

First, let's look at the structure of the TProtocol class:

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package org.apache.thrift.protocol;

import java.nio.ByteBuffer;
import org.apache.thrift.TException;
import org.apache.thrift.scheme.IScheme;
import org.apache.thrift.scheme.StandardScheme;
import org.apache.thrift.transport.TTransport;

public abstract class TProtocol {
    protected TTransport trans_;

    private TProtocol() {
    }

    protected TProtocol(TTransport trans) {
        this.trans_ = trans;
    }

    public TTransport getTransport() {
        return this.trans_;
    }

    public abstract void writeMessageBegin(TMessage var1) throws TException;

    public abstract void writeMessageEnd() throws TException;

    public abstract void writeStructBegin(TStruct var1) throws TException;

    public abstract void writeStructEnd() throws TException;

    public abstract void writeFieldBegin(TField var1) throws TException;

    public abstract void writeFieldEnd() throws TException;

    public abstract void writeFieldStop() throws TException;

    public abstract void writeMapBegin(TMap var1) throws TException;

    public abstract void writeMapEnd() throws TException;

    public abstract void writeListBegin(TList var1) throws TException;

    public abstract void writeListEnd() throws TException;

    public abstract void writeSetBegin(TSet var1) throws TException;

    public abstract void writeSetEnd() throws TException;

    public abstract void writeBool(boolean var1) throws TException;

    public abstract void writeByte(byte var1) throws TException;

    public abstract void writeI16(short var1) throws TException;

    public abstract void writeI32(int var1) throws TException;

    public abstract void writeI64(long var1) throws TException;

    public abstract void writeDouble(double var1) throws TException;

    public abstract void writeString(String var1) throws TException;

    public abstract void writeBinary(ByteBuffer var1) throws TException;

    public abstract TMessage readMessageBegin() throws TException;

    public abstract void readMessageEnd() throws TException;

    public abstract TStruct readStructBegin() throws TException;

    public abstract void readStructEnd() throws TException;

    public abstract TField readFieldBegin() throws TException;

    public abstract void readFieldEnd() throws TException;

    public abstract TMap readMapBegin() throws TException;

    public abstract void readMapEnd() throws TException;

    public abstract TList readListBegin() throws TException;

    public abstract void readListEnd() throws TException;

    public abstract TSet readSetBegin() throws TException;

    public abstract void readSetEnd() throws TException;

    public abstract boolean readBool() throws TException;

    public abstract byte readByte() throws TException;

    public abstract short readI16() throws TException;

    public abstract int readI32() throws TException;

    public abstract long readI64() throws TException;

    public abstract double readDouble() throws TException;

    public abstract String readString() throws TException;

    public abstract ByteBuffer readBinary() throws TException;

    public void reset() {
    }

    public Class<? extends IScheme> getScheme() {
        return StandardScheme.class;
    }
}

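Each protocol implements serialization and deserialization for all the data structures and data types introduced above. The protocols all work the same way; take the read method of HelloService.sayHello_argsStandardScheme as an example: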


      public void read(org.apache.thrift.protocol.TProtocol iprot, sayHello_args struct) throws org.apache.thrift.TException {
        org.apache.thrift.protocol.TField schemeField;
        iprot.readStructBegin(); // start reading the struct
        while (true)
        {
          schemeField = iprot.readFieldBegin(); // read the next field header into a TField
          if (schemeField.type == org.apache.thrift.protocol.TType.STOP) {
            break; // stop field reached: leave the loop
          }
          switch (schemeField.id) {
            case 1: // WORD
              if (schemeField.type == org.apache.thrift.protocol.TType.STRING) {
                struct.word = iprot.readString(); // expected type: deserialize the string
                struct.setWordIsSet(true);
              } else {
                org.apache.thrift.protocol.TProtocolUtil.skip(iprot, schemeField.type); // unexpected type: skip the value according to its wire type
              }
              break;
            default:
              org.apache.thrift.protocol.TProtocolUtil.skip(iprot, schemeField.type); // unknown field id: skip the value
          }
          iprot.readFieldEnd(); // finish this field
        }
        iprot.readStructEnd(); // finish the struct

        // check for required fields of primitive type, which can't be checked in the validate method
        struct.validate();
      }

In short, the serialized data is read from the TTransport object held by the TProtocol and then deserialized in structural order: structBegin -> fieldBegin -> read field value -> fieldEnd -> structEnd.
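
The same loop can be sketched without libthrift to show what happens at the byte level, assuming the binary protocol layout from section 2.1. StructReadSketch is a hypothetical stand-in for the generated scheme class, and its skip method handles only the two types used in the example, unlike TProtocolUtil.skip.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: reads a binary-protocol struct by hand, not the libthrift API.
final class StructReadSketch {
    static final byte T_STOP = 0;    // TType.STOP
    static final byte T_I32 = 8;     // TType.I32
    static final byte T_STRING = 11; // TType.STRING

    // Returns the string stored in field id 1, or null if the field is absent.
    static String readWord(byte[] wire) {
        ByteBuffer in = ByteBuffer.wrap(wire); // big-endian by default
        String word = null;
        while (true) {
            byte type = in.get();              // field begin: type
            if (type == T_STOP) {
                break;                         // stop field ends the struct
            }
            short id = in.getShort();          // field begin: id
            if (id == 1 && type == T_STRING) {
                word = readString(in);         // known field with the expected type
            } else {
                skip(in, type);                // unknown id or unexpected type
            }
        }
        return word;
    }

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[in.getInt()];   // 4-byte length prefix
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Consumes a value of the given type so the next field header lines up.
    static void skip(ByteBuffer in, byte type) {
        switch (type) {
            case T_I32:
                in.getInt();
                break;
            case T_STRING:
                readString(in);
                break;
            default:
                throw new IllegalArgumentException("type not handled in this sketch: " + type);
        }
    }
}
```

Skipping by wire type is what lets an older reader tolerate fields added by a newer writer: the unknown value is consumed and the loop continues with the next field header.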

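4. Summary

This article covered Thrift's protocol layer: first the structure of the data Thrift transmits, then the protocols Thrift supports, and finally the execution flow of the protocol layer.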
