[sylar] Framework Series - Chapter 12 - ByteArray Module


License: CC 4.0 BY-SA

Tags: #linux #server #c++

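This article introduces the ByteArray class from a high-performance distributed server framework written in C++: a binary array used for serialization and deserialization. ByteArray stores its data in a linked list of blocks and supports fixed-length and variable-length data types, including integers, floating-point numbers, and strings. The article points out two potential bugs in how ByteArray reads data across blocks and provides the relevant code. It also outlines the serialization schemes involved, such as Zigzag encoding and the TLV structure. Understanding these details helps when using or optimizing ByteArray for data handling in a distributed system.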

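Standing on the shoulders of giants

C++ High-Performance Distributed Server Framework

Rewriting the sylar C++ high-performance distributed server framework from scratch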

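Overview

Binary array (serialization/deserialization) module.
A byte-array container that provides serialization and deserialization of basic types.
ByteArray stores its data in fixed-size blocks organized as a linked list. Each write goes into the last block of the list; if that block cannot hold the data, a new block is allocated, appended to the end of the list, and the write continues there. ByteArray tracks a current operating position that advances by the number of bytes written on every write, so to read data back you must first call setPosition to reset the position.
ByteArray supports serialization and deserialization of basic types, and it can write the serialized result to a file as well as read content back from a file for deserialization.
ByteArray supports serialization and deserialization of the following types:
- Fixed-length signed/unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers.
- Variable-length signed/unsigned 32-bit and 64-bit integers.
- float and double.
- Strings with a length prefix, where the length may be 16-bit, 32-bit, or 64-bit.
- Strings without a length prefix.
- All of the types above support both reading and writing.
ByteArray also supports setting the byte order (big-endian or little-endian) used during serialization.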
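
To make the byte-order option concrete, here is a minimal sketch of writing and reading a fixed-length unsigned integer in a chosen byte order. The names `Endian`, `writeFixed`, and `readFixed` are hypothetical and not part of sylar; the sketch uses shifts on a `std::vector` rather than whatever byte-swapping the framework performs internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper type, not a sylar identifier.
enum class Endian { Big, Little };

// Append an unsigned integer byte by byte in the requested order.
template <typename T>
void writeFixed(std::vector<std::uint8_t>& out, T value, Endian order) {
    static_assert(std::is_unsigned<T>::value, "sketch handles unsigned types only");
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Little-endian emits the least significant byte first.
        const std::size_t shift =
            (order == Endian::Little ? i : sizeof(T) - 1 - i) * 8;
        out.push_back(static_cast<std::uint8_t>(value >> shift));
    }
}

// Read an unsigned integer that was written with the same order.
template <typename T>
T readFixed(const std::vector<std::uint8_t>& in, std::size_t pos, Endian order) {
    static_assert(std::is_unsigned<T>::value, "sketch handles unsigned types only");
    assert(pos <= in.size() && in.size() - pos >= sizeof(T));
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t shift =
            (order == Endian::Little ? i : sizeof(T) - 1 - i) * 8;
        value = static_cast<T>(value | (static_cast<T>(in[pos + i]) << shift));
    }
    return value;
}
```

Writing `0x12345678` as a 32-bit value yields the bytes `12 34 56 78` in big-endian order and `78 56 34 12` in little-endian order; the reader must use the same order as the writer.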

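ByteArray

A binary array that provides serialization and deserialization of basic types.
It contains a Node class that serves as ByteArray's storage node.
The result is effectively a linked-list data structure.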
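
The block-chain storage can be sketched as follows. This is a simplified, append-only stand-in written for illustration, assuming only what the overview states (fixed-size blocks, a new block appended when the last one is full); `ChunkedBuffer` and its members are hypothetical names and this is not the sylar implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical, simplified stand-in for ByteArray's storage.
class ChunkedBuffer {
    struct Node {
        explicit Node(std::size_t s) : ptr(new char[s]), next(nullptr) {}
        ~Node() { delete[] ptr; }
        char* ptr;   // block memory
        Node* next;  // next block in the list
    };

public:
    explicit ChunkedBuffer(std::size_t base_size)
        : m_baseSize(base_size), m_root(new Node(base_size)), m_tail(m_root) {
        assert(base_size > 0);
    }
    ~ChunkedBuffer() {
        for (Node* n = m_root; n != nullptr;) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }
    ChunkedBuffer(const ChunkedBuffer&) = delete;
    ChunkedBuffer& operator=(const ChunkedBuffer&) = delete;

    // Append len bytes, allocating a new block whenever the tail is full.
    void write(const void* buf, std::size_t len) {
        const char* src = static_cast<const char*>(buf);
        while (len > 0) {
            if (m_tailUsed == m_baseSize) {
                m_tail->next = new Node(m_baseSize);
                m_tail = m_tail->next;
                m_tailUsed = 0;
                ++m_blocks;
            }
            const std::size_t n = std::min(len, m_baseSize - m_tailUsed);
            std::memcpy(m_tail->ptr + m_tailUsed, src, n);
            m_tailUsed += n;
            m_size += n;
            src += n;
            len -= n;
        }
    }

    // Copy the whole content out, walking the list block by block.
    std::string toString() const {
        std::string out;
        out.reserve(m_size);
        std::size_t left = m_size;
        for (const Node* n = m_root; n != nullptr && left > 0; n = n->next) {
            const std::size_t take = std::min(left, m_baseSize);
            out.append(n->ptr, take);
            left -= take;
        }
        return out;
    }

    std::size_t size() const { return m_size; }
    std::size_t blockCount() const { return m_blocks; }

private:
    std::size_t m_baseSize;
    Node* m_root;
    Node* m_tail;
    std::size_t m_tailUsed = 0;  // bytes used in the tail block
    std::size_t m_size = 0;      // total bytes stored
    std::size_t m_blocks = 1;    // number of allocated blocks
};
```

With a block size of 4, writing the 11 bytes of "hello world" fills two blocks completely and puts the remaining 3 bytes into a third one. A write that spans a block boundary is split across blocks, which is why the read and file-output paths later in this article have to compute per-block lengths carefully.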

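Other notes

ByteArray uses the zigzag algorithm when serializing variable-length 32-bit and 64-bit integers. Zigzag itself applies to the signed values: it maps them to unsigned ones so that numbers of small magnitude, positive or negative, encode in few bytes. zigzag reference: "A small and clever number compression algorithm: zigzag"
When serializing strings, ByteArray uses the Length and Value parts of the TLV encoding structure. TLV, which is used for serialization and message passing, stands for Tag (type), Length, and Value. Reference: "TLV encoding communication protocol design"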
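
The two notes above can be illustrated together. The sketch below shows zigzag mapping for 32-bit values, a base-128 varint (7 payload bits per byte), and a string stored as a varint length followed by its bytes. It is a generic illustration of these encodings, assuming a `std::vector` as the output buffer; the function names are hypothetical and the code is not taken from sylar.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Zigzag: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
inline std::uint32_t encodeZigzag32(std::int32_t v) {
    const std::uint32_t sign = v < 0 ? 0xFFFFFFFFu : 0u;
    return (static_cast<std::uint32_t>(v) << 1) ^ sign;
}

inline std::int32_t decodeZigzag32(std::uint32_t u) {
    return static_cast<std::int32_t>((u >> 1) ^ (0u - (u & 1u)));
}

// Base-128 varint: a set high bit means "more bytes follow".
inline void putVarint(Bytes& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));
}

inline std::uint64_t getVarint(const Bytes& in, std::size_t& pos) {
    std::uint64_t result = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        if (pos >= in.size()) throw std::out_of_range("truncated varint");
        const std::uint8_t b = in[pos++];
        result |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return result;
    }
    throw std::invalid_argument("varint too long");
}

// Length + Value: varint length prefix, then the raw bytes.
inline void putStringVint(Bytes& out, const std::string& s) {
    putVarint(out, s.size());
    out.insert(out.end(), s.begin(), s.end());
}

inline std::string getStringVint(const Bytes& in, std::size_t& pos) {
    assert(pos <= in.size());
    const std::uint64_t len = getVarint(in, pos);
    if (len > in.size() - pos) throw std::out_of_range("truncated string");
    std::string s(reinterpret_cast<const char*>(in.data()) + pos,
                  static_cast<std::size_t>(len));
    pos += static_cast<std::size_t>(len);
    return s;
}
```

A signed value would be written as `putVarint(out, encodeZigzag32(value))`, so -1 takes one byte instead of the five a plain varint of its two's-complement bit pattern would need. The value 300 encodes as the two bytes `AC 02`.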

To be improved

I suspect this module has bugs in two places.

The ByteArray::getReadBuffers method:

uint64_t ByteArray::getReadBuffers(std::vector<iovec>& buffers, uint64_t len, uint64_t position) const
{
// This clamp seems to have little effect: the position used below is passed in
// by the caller and is unrelated to m_position. bug???
len = len > getReadSize() ? getReadSize() : len;
...
}
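
The concern can be shown with plain arithmetic. The sketch below compares the quoted clamp with a clamp based on the caller's position, under the assumption that getReadSize() returns the total size minus m_position. Both function names are hypothetical, and whether upstream intends the second behavior is not confirmed by the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Mirrors the quoted clamp: limited by the internal cursor (m_position).
// Assumes getReadSize() == total - cursor.
inline std::uint64_t clampByCursor(std::uint64_t total, std::uint64_t cursor,
                                   std::uint64_t len) {
    assert(cursor <= total);
    return std::min(len, total - cursor);
}

// Hypothetical alternative: limited by the position the caller passed in.
inline std::uint64_t clampByPosition(std::uint64_t total, std::uint64_t position,
                                     std::uint64_t len) {
    if (position >= total) return 0;
    return std::min(len, total - position);
}
```

With 100 bytes stored and m_position at 90, a request for 50 bytes starting at position 0 is cut to 10 by the first clamp even though 50 bytes are available, while a request starting at position 95 is allowed 10 bytes even though only 5 remain.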

The ByteArray::writeToFile method:

bool ByteArray::writeToFile(const std::string& name) const
{
...
int64_t read_size = getReadSize();
int64_t pos = m_position;
Node* cur = m_cur;
while(read_size > 0)
{
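    int diff = pos % m_baseSize;    // bytes of the current block that are not readable (already consumed)

    // 1. If more than a full block remains to be read, read whatever is left of this block.
    // 2. If less than a full block remains to be read, read ???
    // Suppose a block is 10 bytes, 9 bytes remain to be read, and 3 bytes are not readable:
    // this reads 9-3=6 bytes although 7 are left in the block, bug???
    // Suppose a block is 10 bytes, 1 byte remains to be read, and 3 bytes are not readable:
    // this reads 1-3=-2 bytes, bug???
    // It should be:
    // int64_t len = 0;
    // if(read_size > (int64_t)m_baseSize - diff)
    //      len = (int64_t)m_baseSize - diff;
    // else
    //      len = read_size;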
    int64_t len = (read_size > (int64_t)m_baseSize ? m_baseSize : read_size) - diff;
    ofs.write(cur->ptr + diff, len);
    cur = cur->next;
    pos += len;
    read_size -= len;
}

return true;
}
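
The two examples in the comments can be checked in isolation. The sketch below extracts the length computation quoted above and the correction proposed in the comments into free functions, then replays the loop with the corrected length. The function names are hypothetical; only the arithmetic comes from the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// The expression from the listing, unchanged.
inline std::int64_t chunkLenQuoted(std::int64_t read_size, std::int64_t base_size,
                                   std::int64_t diff) {
    return (read_size > base_size ? base_size : read_size) - diff;
}

// The correction proposed in the comments: never read past the end of the
// current block, never read more than what remains.
inline std::int64_t chunkLenProposed(std::int64_t read_size, std::int64_t base_size,
                                     std::int64_t diff) {
    return std::min(read_size, base_size - diff);
}

// Replay the loop and collect the length written from each block.
inline std::vector<std::int64_t> splitChunks(std::int64_t pos, std::int64_t read_size,
                                             std::int64_t base_size) {
    assert(base_size > 0 && pos >= 0);
    std::vector<std::int64_t> lens;
    while (read_size > 0) {
        const std::int64_t diff = pos % base_size;  // offset inside current block
        const std::int64_t len = chunkLenProposed(read_size, base_size, diff);
        lens.push_back(len);
        pos += len;
        read_size -= len;
    }
    return lens;
}
```

For a 10-byte block with an offset of 3, the quoted expression gives 6 when 9 bytes remain and -2 when 1 byte remains, whereas the proposed one gives 7 and 1. Starting at position 3 with 25 bytes to write, the corrected loop produces chunks of 7, 10, and 8 bytes, and every chunk after the first starts at a block boundary.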

Selected related code

/**
 * @filename    bytearray.h
 * @brief   ByteArray module
 * @author  L-ge
 * @version 0.1
 * @modify  2022-07-10
 */
#ifndef __SYLAR_BYTEARRAY_H__
#define __SYLAR_BYTEARRAY_H__

#include <memory>
#include <string>
#include <vector>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

namespace sylar
{

/**
 * @brief   Binary array providing serialization and deserialization of basic types
 */
class ByteArray
{
public:
    typedef std::shared_ptr<ByteArray> ptr;

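    /**
     * @brief  Storage node
     */
    struct Node
    {
        /**
         * @brief  Construct a memory block of the given size
         *
         * @param   s   size of the memory block in bytes
         */
        Node(size_t s);
        Node();
        ~Node();

        /// Pointer to the memory block
        char* ptr;
        /// Address of the next memory block
        Node* next;
        /// Size of the memory block
        size_t size;
    };

    /**
     * @brief   Construct with memory blocks of the given size
     *
     * @param   base_size   memory block size
     */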
    ByteArray(size_t base_size = 4096);
    ~ByteArray();

    /**
     * @brief  Write fixed-length integer data
     */
    void writeFint8(int8_t value);
    void writeFuint8(uint8_t value);
    void writeFint16(int16_t value);
    void writeFuint16(uint16_t value);
    void writeFint32(int32_t value);
    void writeFuint32(uint32_t value);
    void writeFint64(int64_t value);
    void writeFuint64(uint64_t value);

    /**
     * @brief  Write compressible (Zigzag algorithm) integer data
     */
    void writeInt32(int32_t value);
    void writeUint32(uint32_t value);
    void writeInt64(int64_t value);
    void writeUint64(uint64_t value);
    
    void writeFloat(float value);
    void writeDouble(double value);

    /**
     * @brief  Write string data prefixed with its length, where the length occupies a fixed number of bytes (length + actual data)
     */
    void writeStringF16(const std::string& value);
    void writeStringF32(const std::string& value);
    void writeStringF64(const std::string& value);
    
    /**
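     * @brief  Write data prefixed with its length (actual length + actual data)
     *         The length is stored as a compressible uint64 and occupies as many bytes as its encoded size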
     */
    void writeStringVint(const std::string& value);

    /**
     * @brief   Write string data without a length prefix
     */
    void writeStringWithoutLength(const std::str