【Learning】Distributed System - Serialization

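This article looks at the process and mechanisms of serialization and deserialization and their use in distributed data processing, including how to choose among serialization mechanisms, their usage scenarios, and the design of serialized object container file formats. It focuses on the role of serialization in interprocess communication and persistent storage, compares RMI and RPC for distributed object invocation, and stresses the importance of the serialization mechanism for keeping object state consistent.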


Serialization is the process of turning structured objects into a byte stream for transmission over a network or for writing to persistent storage. Deserialization is the reverse process of turning a byte stream back into a series of structured objects. Serialization appears in two quite distinct areas of distributed data processing: interprocess communication and persistent storage.

A serialization mechanism should consider the following:

  1. Primitive type serialization

       a. big-endian or little-endian

       b. one-byte character or two-byte character

       c. binary or textual

       d. etc

  2. Constructed type serialization

       a. the order of the primitive types or objects it contains
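The choices listed above can be made concrete with a small sketch. It uses Python's standard `struct` module purely for illustration; the sample value `1025`, the record layout (a 32-bit integer followed by a 64-bit float), and the helper names `serialize_record` and `deserialize_record` are assumptions made for this example, not part of any of the frameworks discussed here.

```python
import struct

n = 1025  # 0x00000401

# 1a. Byte order: the same 32-bit integer in both layouts.
big = struct.pack(">i", n)     # b'\x00\x00\x04\x01'
little = struct.pack("<i", n)  # b'\x01\x04\x00\x00'

# 1b. Character width: one byte per character vs. two bytes per character.
one_byte = "A".encode("latin-1")    # b'A'
two_byte = "A".encode("utf-16-be")  # b'\x00A'

# 1c. Binary vs. textual: the textual form stores the decimal digits.
text = str(n).encode("ascii")  # b'1025'

# 2a. Constructed type: writer and reader must agree on the field order.
# Hypothetical record layout: int32 id, then float64 score, big-endian.
RECORD = struct.Struct(">id")


def serialize_record(record_id, score):
    """Pack the fields in the agreed order: id first, then score."""
    return RECORD.pack(record_id, score)


def deserialize_record(data):
    """Unpack the fields in the same order they were written."""
    return RECORD.unpack(data)


data = serialize_record(7, 2.5)
print(deserialize_record(data))  # (7, 2.5)

# A reader that assumes the opposite field order decodes the same bytes
# without raising an error, but gets meaningless values.
print(struct.Struct(">di").unpack(data))
```

The last line shows why field order is part of the format: the bytes carry no indication of which field comes first, so a mismatch is silent.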

Typical object serialization mechanisms include Java Serialization, Hadoop Writable, Thrift, Avro, Protocol Buffers, etc.

Serialized Object Container

A primitive value or a constructed type value can be viewed as an object. For interprocess communication, the unit of a byte stream is a single object (an argument or a return value). For persistent storage, however, storing every single object in its own file is not efficient; normally we need to store a sequence of objects in one file. An object container file format is therefore needed.

Typical object container file formats include Avro datafile, SequenceFile, etc.

An object container file format can use different serializers to serialize the objects it holds. For example, SequenceFile normally uses Hadoop Writable for its keys and values, but it can also use another serializer, such as Avro, to serialize them.
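The separation between the container and the serializer can be sketched as follows. This is a toy format invented for illustration, not the Avro data file or SequenceFile layout: the magic bytes `OBJ1`, the length-prefixed framing, and the names `write_container`, `read_container`, and `DECODERS` are all assumptions. Real container formats carry more than this, for example sync markers and optional compression.

```python
import io
import json
import struct

MAGIC = b"OBJ1"  # hypothetical magic bytes identifying the toy format


def write_container(stream, codec_name, encode, objects):
    """Write a header naming the serializer, then length-prefixed records."""
    name = codec_name.encode("ascii")
    stream.write(MAGIC)
    stream.write(struct.pack(">H", len(name)))
    stream.write(name)
    for obj in objects:
        payload = encode(obj)
        stream.write(struct.pack(">I", len(payload)))
        stream.write(payload)


def read_container(stream, decoders):
    """Read the header, pick the matching decoder, and decode every record."""
    if stream.read(4) != MAGIC:
        raise ValueError("not a container file")
    (name_len,) = struct.unpack(">H", stream.read(2))
    codec_name = stream.read(name_len).decode("ascii")
    decode = decoders[codec_name]
    objects = []
    while True:
        header = stream.read(4)
        if not header:  # end of file
            break
        (size,) = struct.unpack(">I", header)
        objects.append(decode(stream.read(size)))
    return codec_name, objects


# Two interchangeable serializers for the records themselves.
def json_encode(obj):
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def json_decode(data):
    return json.loads(data.decode("utf-8"))


PAIR = struct.Struct(">id")  # fixed binary layout: int32 key, float64 value


def pair_encode(obj):
    return PAIR.pack(obj[0], obj[1])


def pair_decode(data):
    return PAIR.unpack(data)


DECODERS = {"json": json_decode, "pair": pair_decode}

# The same container code stores records written by either serializer.
json_file = io.BytesIO()
write_container(json_file, "json", json_encode, [{"k": 1}, {"k": 2}])
json_file.seek(0)
json_result = read_container(json_file, DECODERS)

pair_file = io.BytesIO()
write_container(pair_file, "pair", pair_encode, [(1, 0.5), (2, 1.5)])
pair_file.seek(0)
pair_result = read_container(pair_file, DECODERS)
```

The container only knows how to frame records and which serializer produced them; the record bytes themselves are opaque to it, which is what lets one file format work with several serializers.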

RMI vs RPC

RMI is different from RPC in that it is object-oriented. The core concept is distributed object invocation. In Java RMI, an object that implements the Remote interface and has been exported is regarded as a remote object, which means that it is passed by reference across JVMs: the receiver gets a serializable remote object reference (a stub) rather than a copy of the object. If an object implements only Serializable and not the Remote interface, it is passed by value across JVMs, so the receiver works on its own deserialized copy. The benefit of remote object invocation is that every call is executed on the single object in the server JVM, so the client and the server always see the same object state.
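The difference between the two passing modes can be illustrated with a single-process simulation. This is not Java RMI and involves no network: the `Counter` class, the `RemoteRef` stand-in for a stub, and the `server_objects` dictionary are hypothetical names used only to show the effect on object state.

```python
import json


class Counter:
    """Plain object with mutable state that can be serialized to bytes."""

    def __init__(self, value=0):
        self.value = value

    def increment(self):
        self.value += 1
        return self.value

    def to_bytes(self):
        return json.dumps({"value": self.value}).encode("utf-8")

    @classmethod
    def from_bytes(cls, data):
        return cls(json.loads(data.decode("utf-8"))["value"])


class RemoteRef:
    """Hypothetical stand-in for a stub: it holds no state of its own and
    forwards every call to the single object kept on the 'server' side."""

    def __init__(self, registry, object_id):
        self._registry = registry
        self._object_id = object_id

    def increment(self):
        return self._registry[self._object_id].increment()

    @property
    def value(self):
        return self._registry[self._object_id].value


server_objects = {"counter": Counter()}  # objects living on the "server"
server_counter = server_objects["counter"]

# Pass by value: the client receives a deserialized copy of the object.
client_copy = Counter.from_bytes(server_counter.to_bytes())
client_copy.increment()
value_after_copy_call = server_counter.value  # 0, the server object is unchanged

# Pass by reference: the client receives a reference and calls through it.
client_ref = RemoteRef(server_objects, "counter")
client_ref.increment()
value_after_ref_call = server_counter.value  # 1, the call ran on the server object
```

With the copy, the two sides diverge as soon as either one changes its state. With the reference there is only one object, so there is nothing to keep in sync.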
