A Chance Encounter with Serialization (Serializer)

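This article describes a hard problem the author met on LeetCode, serializing and deserializing a binary tree, and discusses how to implement it in C++: handling NULL pointers, using data structures such as deque and string, and solving the recursion problem in deserialization. Through practice the author arrives at an efficient serialization and deserialization algorithm and provides the complete program.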



  • MD doCumEnT: 3/13/2016 6:06:17 AM by Jimbowhy

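I really do understand why bearded foreign programmers like to describe that state as “f**king code”: sometimes, when eyes and hands get into the swing of typing, it truly feels that way! - by Jimbowhy, 3/13/2016 7:55:10 AM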

A chance encounter with serialization

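While playing on LeetCode I happened to open the Mock Interview feature, and a 60-minute problem came up. Oh, serialization! Serialization is an essential technique in programming. One place where it really shows its power is remote object transfer: an object in a program running on this machine is sent over the network to another running program. Isn't that great! Serialization is also one of the six core mechanisms of MFC, where it is used for file storage. In short, serialization and deserialization are among the techniques that excite me. Today I will solve a serialization problem on LeetCode. The original problem reads: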

No. 297 Serialize and Deserialize Binary Tree
Total Accepted: 15172 Total Submissions: 56198 Difficulty: Hard

Remaining time: 38 minutes, 29 seconds.

Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

For example, you may serialize the following tree

    1
   / \
  2   3
     / \
    4   5

as “[1,2,3,null,null,4,5]”, just the same as how LeetCode OJ serializes a binary tree. You do not necessarily need to follow this format, so please be creative and come up with different approaches yourself.
Note: Do not use class member/global/static variables to store states. Your serialize and deserialize algorithms should be stateless.

Credits:
Special thanks to @Louis1992 for adding this problem and creating all test cases.

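In short, the problem provides a binary tree object and asks for its serialization and deserialization. Together with the earlier wildcard problem, Cross Self and other interesting problems, these few weeks on LeetCode have been thoroughly satisfying! I still have unfinished articles on Forth and on image processing with Mathematica, so after this LeetCode problem I will take a break for a while.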

As for the CSDN Q&A section, accounts get banned at the drop of a hat, which is really unreasonable. I don't even know why I was banned!

The coding process

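This problem, also rated Hard, took quite some time. I used the opportunity to get familiar with some commonly used basic C++ classes, including string for handling binary data and deque, a double-ended queue, for the problems that came up around deserialization. deque, vector and list are the three main sequence types in the STL. They are essential tools; using them well is not simple, but once the underlying data structures are understood they are not hard either. Textbooks love to call them containers, which I find hard to accept: a linked list as a data structure does not seem to have much to do with a container! Along the way I also tried stringstream to print a diagram of the binary tree, unfortunately without success. Below is the code I used to test string with binary data, and it works quite well: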

void test(){
    char data[] = {'A','b',0x00,'C','D'};
    // a stops at the first 0x00 (size 2); b keeps all 5 bytes; c starts as 5 zero bytes
    string a(data), b(data,sizeof(data)), c(5, 0x00);
    cout<< "string a: " << a << "\t" << a.size() <<endl;
    cout<< "string b: " << b << "\t" << b.size() <<endl;
    const void *a1st = a.data();         // address of the first byte
    const void *a2nd = a.data()+1;       // address of the second byte
    memcpy( &c[0], data, sizeof(data) ); // overwrite c in place; needs <cstring>
    cout << "string b+c: " <<  b+c << "\t" << (b+c).length() << endl;
    cout<< c << "\t" << c.size() << "\taddress: " << a1st << "\t" << a2nd <<endl;
}
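
The same property makes it possible to pack integers into a string. The following is a minimal sketch rather than the code used in the submission; the helper names put_int and get_int are hypothetical. It assumes that the writer and the reader use the same int size and byte order.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: append the raw bytes of v to s.
// std::string tracks its own length, so embedded 0x00 bytes are kept.
void put_int(std::string &s, int v) {
    s.append(reinterpret_cast<const char *>(&v), sizeof v);
}

// Hypothetical helper: read an int back from byte offset pos.
// The caller must guarantee that pos + sizeof(int) <= s.size().
int get_int(const std::string &s, std::size_t pos) {
    int v = 0;
    std::memcpy(&v, s.data() + pos, sizeof v);
    return v;
}
```

Going through memcpy instead of casting the char pointer to int* avoids alignment problems when the value does not start on an int boundary, which is the case in a record that begins with a single flag byte.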

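When I started coding I even ran into NULL being undefined. In the cstdio header I looked at, it is defined as shown below, so even without the header NULL can be defined by hand: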

#ifndef NULL
#ifdef __cplusplus
#define NULL 0
#else
#define NULL ((void *)0)
#endif
#endif

The problem provides the following linked node struct, which makes up the binary tree objects in the input data:

typedef struct TreeNode{
    int val;
    TreeNode *left;
    TreeNode *right;
    TreeNode(int x) : val(x), left( NULL ), right( NULL) {}
} TreeNode;

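For a binary tree, recursion was the first thing that came to mind for serialization. Although my earlier articles criticized recursion, for this kind of problem it really is the most effective approach; serializing a binary tree without recursion is asking for trouble. The tree may be incomplete, so some child nodes will certainly be missing. The serialized format therefore uses a magic number: a single byte is enough to record whether a node has a left child and a right child. The magic number and the node's value are stored together as one data unit, and recursion concatenates these units into one whole. Writing the serialization code went very smoothly; the only slightly unusual part is using memcpy to copy the integer value into the string. The magic number has three nonzero values (0x01, 0x02, 0x03), which are really two bits: the lowest bit is the state of the right child and the second bit is the state of the left child. Two bits can represent four states (no children, right only, left only, both), and a set bit means the corresponding child exists: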

string serialize(TreeNode* root) {
    if( root==NULL ) return string("");
    TreeNode &rt = *root;
    const size_t msize = sizeof( rt.val );
    char magic = 0x00;
    string s(msize+1, 'x');        // placeholder for this node's record: magic byte + value
    if( rt.left != NULL ){
        magic = magic | 0x02;      // bit 1: has a left child
        s += serialize(rt.left);
    }
    if( rt.right!= NULL){
        magic = magic | 0x01;      // bit 0: has a right child
        s += serialize(rt.right);
    }
    s[0] = magic;                      // fill in the record now that the flags are known
    memcpy( &s[1], &rt.val, msize );   // raw bytes of the value; needs <cstring>
    return s;
}
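
To see what the flag byte looks like for a concrete tree, here is a small sketch. It is for illustration only: Node is a stand-in for TreeNode, and child_flags and flag_order are hypothetical helper names that do not appear in the submission.

```cpp
#include <string>

struct Node {            // stand-in for TreeNode, for illustration only
    int val;
    Node *left;
    Node *right;
};

// Flag byte as described above: bit 1 (0x02) = has left, bit 0 (0x01) = has right.
char child_flags(const Node &n) {
    char f = 0x00;
    if (n.left)  f |= 0x02;
    if (n.right) f |= 0x01;
    return f;
}

// Collect the flag bytes in the order the records are written
// (node, then left subtree, then right subtree), as digits '0'..'3'.
void flag_order(const Node *n, std::string &out) {
    if (!n) return;
    out += static_cast<char>('0' + child_flags(*n));
    flag_order(n->left, out);
    flag_order(n->right, out);
}
```

For the tree in the problem statement (1 with children 2 and 3, and 3 with children 4 and 5) the records are written in the order 1, 2, 3, 4, 5 with flags 3, 0, 3, 0, 0. Assuming a 4-byte int, that is five records of five bytes each, 25 bytes in total.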

The trouble came with deserialization. The function signature given by the problem is:

// Decodes your encoded data to tree.
TreeNode* deserialize(string data) {
    //...
}

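It takes only one parameter, which leaves no room to maneuver. The serialized data is built recursively, and the left node is always written and read before the right node, so the decoder has to remember how far into the string it has already read, and the given signature has nowhere to keep that position. I wondered whether it could be done with the given signature alone, thought about it for a long time, got a headache and found no way in. The alternative would be to deserialize level by level, as LeetCode's own format does, but the serializer was already designed recursively, so a non-recursive decoder did not feel right either. Fine: define another function to do the deserialization work and keep the function given by the problem. The simplest effective way is to overload the deserialization function: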

TreeNode* deserialize( string &data, int &index ){
    int val, msize = sizeof(int);   // must match the size written by serialize
    char magic = data[index];       // flag byte: 0x02 has left, 0x01 has right
    memcpy( &val, data.data()+index+1, msize );
    TreeNode *root = new TreeNode( val );
    index += msize + 1;             // step past this node's record
    if( magic & 0x02 ){
        root->left = deserialize( data, index );
    }
    if( magic & 0x01 ){
        root->right = deserialize( data, index );
    }
    return root;
}
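
The overload above trusts its input completely. For data that may be truncated, a bounds check before each read is a natural extension. The following is a sketch under several assumptions, not the submitted code: Node is a stand-in for TreeNode, decode is a hypothetical name, and std::unique_ptr replaces the raw pointers so that nothing leaks when the exception is thrown.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <string>

struct Node {                       // stand-in for TreeNode, for illustration only
    int val;
    std::unique_ptr<Node> left, right;
    explicit Node(int v) : val(v) {}
};

// Decode one record (1 flag byte + raw int) at pos, then recurse into the
// children announced by the flag byte. pos is advanced past everything read.
std::unique_ptr<Node> decode(const std::string &data, std::size_t &pos) {
    const std::size_t record = 1 + sizeof(int);
    if (data.size() < record || pos > data.size() - record)
        throw std::out_of_range("truncated record");
    const char flags = data[pos];
    int val = 0;
    std::memcpy(&val, data.data() + pos + 1, sizeof val);
    pos += record;
    std::unique_ptr<Node> node(new Node(val));
    if (flags & 0x02) node->left = decode(data, pos);   // left subtree comes first
    if (flags & 0x01) node->right = decode(data, pos);
    return node;
}
```

The position is passed by reference for the same reason as index in the overload: each recursive call has to tell its caller how many bytes it consumed.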

With this function the problem's requirements are basically met. The given deserialization function only needs a few lines of setup code:

// Decodes your encoded data to tree.
TreeNode* deserialize(string data) {
    if(data.length()==0) return NULL;
    int index = 0;
    return deserialize(data, index);
}

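While writing the code I also wanted to print the binary tree as a character drawing with lines connecting the child nodes, but after several attempts I still could not manage it. I settled for second best: print the nodes level by level, using left and right angle brackets to indicate whether a node has a left or right child: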

void print(const TreeNode &root){
    deque<const TreeNode*> vn;   // nodes waiting to be printed, in level order

    vn.push_back(&root);

    while( !vn.empty() ){
        int n = vn.size();       // number of nodes on the current level
        while(n--){
            const TreeNode &tn = *vn.front();
            vn.pop_front();
            if( tn.left ) cout << "<";
            // print the value as a character, then in hex; restore decimal afterwards
            cout << (char)tn.val << "[" << hex << tn.val << dec << "]";
            if( tn.right ) cout << ">";
            cout << " ";
            if( tn.left!=NULL ) vn.push_back(tn.left);
            if( tn.right!=NULL ) vn.push_back(tn.right);
        }
        cout << endl;
    }
}

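This method uses the deque double-ended queue. While each level of the tree is scanned, its nodes are printed and their children are queued at the same time, so input and output are handled in a single pass, which is relatively efficient. The test output is as follows: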

Source TreeNode:
<A[41]>
<x[78]> <x[78]>
<y[79]> <y[79]> <y[79]> <y[79]>
z[7a] z[7a] z[7a] z[7a] z[7a] z[7a] z[7a] z[7a]
Serialized:
A   x   y    z    z   y    z    z   x   y    z    z   y    z    z
And this deserialized:
<A[41]>
<x[78]> <x[78]>
<y[79]> <y[79]> <y[79]> <y[79]>
z[7a] z[7a] z[7a] z[7a] z[7a] z[7a] z[7a] z[7a]

Complete program

/*
 * Serialize & deserialize demo by Jimbowhy
 * 3/13/2016 7:19:17 AM
 * compile: cls && g++ -o Serializer Serializer.cpp && Serializer.exe
 */ 

#include <iostream>
#include <string>
#include <deque>
#include <cstdio>
#include <cstring>   // memcpy

using namespace std;

/**
 * Definition for a binary tree node.
 */
typedef struct TreeNode{
    int val;
    TreeNode *left;
    TreeNode *right;
    TreeNode(int x) : val(x), left( NULL ), right( NULL) {}
} TreeNode;

class Codec {
public:

    /*
     * Encodes a tree to a single string.
     * DATA FORMAT:
     * one record per node: Byte+Value, followed by the records of the
     * left subtree and then the records of the right subtree
     * BYTE FORMAT:
     * 0x01 has right, 0x02 has left, 0x3 both left & right
     */
    string serialize(TreeNode* root) {
        if( root==NULL ) return string("");
        TreeNode &rt = *root;
        const size_t msize = sizeof( rt.val );
        char magic = 0x00;
        string s(msize+1, 'x');        // placeholder for this node's record
        if( rt.left != NULL ){
            magic = magic | 0x02;
            s += serialize(rt.left);
        }
        if( rt.right!= NULL){
            magic = magic | 0x01;
            s += serialize(rt.right);
        }
        s[0] = magic;                      // fill in the record now that the flags are known
        memcpy( &s[1], &rt.val, msize );   // raw bytes of the value
        return s;
    }

    // Decodes your encoded data to tree.
    TreeNode* deserialize(string data) {
        if(data.length()==0) return NULL;
        int index = 0;
        return deserialize(data, index);
    }

    TreeNode* deserialize( string &data, int &index ){
        int val, msize = sizeof(int);   // must match the size written by serialize
        char magic = data[index];       // flag byte: 0x02 has left, 0x01 has right
        //int val = (int)(data[index+1]); // it's not working: reads only one byte
        memcpy( &val, data.data()+index+1, msize );
        TreeNode *root = new TreeNode( val );
        index += msize + 1;             // step past this node's record
        if( magic & 0x02 ){
            root->left = deserialize( data, index );
        }
        if( magic & 0x01 ){
            root->right = deserialize( data, index );
        }
        return root;
    }

    // build tree.
    TreeNode * build(int i, int e, TreeNode * root) {
        TreeNode *r = new TreeNode(i);
        TreeNode *l = new TreeNode(i);
        root->left = l;
        root->right = r;
        if(i<e){
            build(i+1, e, l);
            build(i+1, e, r);
        }
        return root;
    }

    void print(const TreeNode &root){
        deque<const TreeNode*> vn;   // nodes waiting to be printed, in level order

        vn.push_back(&root);

        while( !vn.empty() ){
            int n = vn.size();       // number of nodes on the current level
            while(n--){
                const TreeNode &tn = *vn.front();
                vn.pop_front();
                if( tn.left ) cout << "<";
                // print the value as a character, then in hex; restore decimal afterwards
                cout << (char)tn.val << "[" << hex << tn.val << dec << "]";
                if( tn.right ) cout << ">";
                cout << " ";
                if( tn.left!=NULL ) vn.push_back(tn.left);
                if( tn.right!=NULL ) vn.push_back(tn.right);
            }
            cout << endl;
        }
    }
};

int main(){
    TreeNode root('A');

    Codec cd;
    cd.build('x','z', &root);
    string s(cd.serialize(&root));

    cout << "Source TreeNode: \n";
    cd.print(root);
    cout << "Serialized: \n" << s << endl;
    cout << "And this deserialized: \n";
    TreeNode *copy = cd.deserialize(s);   // keep the pointer instead of overwriting root
    cd.print(*copy);

    TreeNode t1(1+'a');
    t1.left = new TreeNode(2+'a');
    t1.left->left = new TreeNode(255+256);
    string t1s( cd.serialize(&t1) );
    TreeNode *t1copy = cd.deserialize(t1s);
    cout << "Test t1:\n" << t1s << endl;
    cd.print(*t1copy);

    return 0;   // the nodes are not freed; the process exits right away
}
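
The serialized string contains zero bytes and other unprintable characters, so sending it straight to cout, as the program above does, shows gaps instead of the real content. A hex dump makes the records visible. This is only a sketch; hex_dump is a hypothetical helper and is not part of the submitted code.

```cpp
#include <cstddef>
#include <string>

// Render every byte of s as two lowercase hex digits separated by spaces.
std::string hex_dump(const std::string &s) {
    const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (i) out += ' ';
        out += digits[b >> 4];      // high nibble
        out += digits[b & 0x0f];    // low nibble
    }
    return out;
}
```

In main it could be used as cout << hex_dump(s) << endl; each group of five bytes is then one node record, the flag byte followed by the four value bytes (assuming a 4-byte int).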

The submission passed all 47 tests and was accepted by LeetCode, with a run time of 36 ms, faster than 98% of submissions.

Serialize and Deserialize Binary Tree, 36 ms, beats 98%
