Byte order

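Big Endian and Little Endian [repost]
Posted by powerpc on 2005-5-5 15:25:00

Any discussion of byte order inevitably involves two major CPU camps: Motorola's PowerPC family and Intel's x86 family. PowerPC CPUs traditionally store data in big endian order, while x86 CPUs store data in little endian order. So what exactly is big endian, and what is little endian?

Big endian means the most significant byte (MSB) is stored at the lowest address; little endian means the least significant byte (LSB) is stored at the lowest address.

A description in words can be abstract, so here is a picture. The number 0x12345678 is stored as follows on CPUs of the two byte orders:

    Big Endian
    low address                      high address
    ----------------------------------------->
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   12   |   34   |   56   |   78   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Little Endian
    low address                      high address
    ----------------------------------------->
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   78   |   56   |   34   |   12   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

As the two pictures show, storing data big endian matches the way we humans think. As for little endian, !@#$%^&*, to hell with it -_-|||

Why care about byte order at all, you may ask. If your program only ever runs on a single machine and never deals with anyone else's program, you can ignore byte order completely. But what if your program has to interact with other programs? Let me mention two languages here. In a C/C++ program, the storage order of data depends on the CPU of the platform it is compiled for, whereas Java always presents multi-byte data in big endian order regardless of the platform (for example in `DataOutputStream`, and by default in `ByteBuffer`). Imagine what happens when your C/C++ program, written on x86, talks to someone else's Java program. Take 0x12345678 again: if your program sends the in-memory bytes of 0x12345678 as they are, the Java side, reading them as big endian, will naturally interpret your data as 0x78563412. What? It turned into a different number? Yes, that is exactly the consequence. So your C program has to convert the byte order before handing the data to the Java program.

Likewise, the TCP/IP protocols transmit multi-byte header fields in big endian order, which is why big endian is sometimes called network byte order. When two hosts with different byte orders communicate, the data must be converted to network byte order before it is sent. The sockets API (POSIX `<arpa/inet.h>`, rather than ANSI C itself) provides four conversion routines, which may be implemented as macros: `htons`, `htonl`, `ntohs`, and `ntohl`.

· Addendum to the BE and LE article

I wrote about byte order in my August 9 article "Big Endian and Little Endian"; the original is at the hyperlink above. Some readers still ask: when a CPU stores one byte of data, is there also a big endian and little endian distinction in the order of the 8 bits inside that byte? In other words, is there such a thing as a different bit order?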
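In fact, bit order exists in the same way. Here is the number 0xB4 (10110100) as an illustration.

    Big Endian
    msb                                        lsb
    ---------------------------------------------->
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  1 |  0 |  1 |  1 |  0 |  1 |  0 |  0 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Little Endian
    lsb                                        msb
    ---------------------------------------------->
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  0 |  0 |  1 |  0 |  1 |  1 |  0 |  1 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

In practice, because the smallest unit a CPU operates on when storing data is one byte, the bit order inside a byte is a black box to our programs. That is, given a pointer to the number 0xB4, a big endian CPU reads its 8 bits from left to right, while a little endian CPU does the opposite and reads them from right to left. Either way, the value our program gets through the pointer is 0xB4. The bit order inside a byte is invisible to the program, and the same is actually true of byte order as long as the data stays on a single machine.

Someone may then ask: what about network transmission? Can it go wrong? Do we need some function to convert the bit order too? Good question. Suppose a little endian CPU sends one byte to a big endian CPU. The sender first reads the 8-bit value locally and then transmits the 8 bits in the order the network defines, so nothing goes wrong at the receiving end. Now suppose it sends a 32-bit number instead. The number occupies 4 bytes on the little endian side, and network transmission proceeds byte by byte. The little endian CPU reads the first byte and sends it; that byte is the LSB of the original number, but the receiver treats it as the MSB, and confusion results. This is why multi-byte values, unlike single bytes, must be converted to network byte order before they are sent.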
### Byte Order: Host and Network Byte Order

In networking and programming, the term "byte order" refers to the sequence in which bytes are arranged within a multi-byte data type such as integers or floating-point numbers. The two primary byte orders are **big-endian** and **little-endian**, as described in the provided references[^1]. However, when discussing `byte order=host` and `byte order=0`, these terms typically relate to how systems handle byte ordering in network communication.

#### Byte Order = Host

The host byte order refers to the native byte order of the system on which the program is running. This could either be little-endian (e.g., Intel x86 architecture) or big-endian (e.g., SPARC, PowerPC). Programs that operate on a single machine often use the host's native byte order for efficiency because no conversion is required. However, this can lead to issues in networked environments where machines with different byte orders communicate with each other.

For example:

- On a little-endian machine, an integer value `0x12345678` would be stored in memory as `78 56 34 12`.
- On a big-endian machine, the same integer would be stored as `12 34 56 78`.

When working with host byte order, it is important to consider whether the data being processed will eventually need to be transmitted over a network, where a standardized byte order is typically used[^2].

#### Byte Order = 0 (Network Byte Order)

Network byte order is a standard defined for data transmission across networks. It is always **big-endian**, meaning the most significant byte (MSB) comes first in the sequence. This ensures consistency across all devices communicating over the network, regardless of their native byte order.

In many programming languages, especially those commonly used in networking (e.g., C/C++), functions exist to convert between host byte order and network byte order. For instance:

- `htons()` (Host-to-Network Short): Converts a 16-bit integer from host byte order to network byte order.
- `htonl()` (Host-to-Network Long): Converts a 32-bit integer from host byte order to network byte order.
- `ntohs()` (Network-to-Host Short): Converts a 16-bit integer from network byte order to host byte order.
- `ntohl()` (Network-to-Host Long): Converts a 32-bit integer from network byte order to host byte order.

Here is an example of converting between host and network byte orders in C:

```c
#include <stdio.h>
#include <stdint.h>      /* uint16_t, uint32_t */
#include <arpa/inet.h>   /* htons(), htonl() */

int main(void) {
    uint16_t host_short = 0x1234;
    uint32_t host_long = 0x12345678;

    /* Convert from host byte order to network (big-endian) byte order. */
    uint16_t net_short = htons(host_short);
    uint32_t net_long = htonl(host_long);

    /* Cast for printf so the format specifiers match on every platform. */
    printf("Host short: 0x%04x -> Network short: 0x%04x\n",
           (unsigned)host_short, (unsigned)net_short);
    printf("Host long: 0x%08lx -> Network long: 0x%08lx\n",
           (unsigned long)host_long, (unsigned long)net_long);
    return 0;
}
```

This code converts two values from host to network byte order. On a little-endian host the converted values print as `0x3412` and `0x78563412`, because the bytes are swapped; on a big-endian host the functions leave the values unchanged.

#### Key Differences

1. **Definition**: Host byte order refers to the native byte order of the system, while network byte order is always big-endian.
2. **Usage**: Host byte order is used for internal processing within a single machine, whereas network byte order is used for data transmission over networks to ensure compatibility.
3. **Conversion**: Functions like `htons()`, `htonl()`, `ntohs()`, and `ntohl()` are necessary when switching between host and network byte orders.