The source code is as follows:
#include <string.h>
#include <stdlib.h>

#undef strlen

#ifndef STRLEN
# define STRLEN strlen
#endif

/* Return the length of the null-terminated string STR.  Scan for
   the null terminator quickly by testing four bytes at a time.  */
size_t
STRLEN (const char *str)
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */
  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & ~longword & himagic) != 0)
        {
          /* Which of the bytes was the zero?  If none of them were, it was
             a misfire; continue the search.  */

          const char *cp = (const char *) (longword_ptr - 1);

          if (cp[0] == 0)
            return cp - str;
          if (cp[1] == 0)
            return cp - str + 1;
          if (cp[2] == 0)
            return cp - str + 2;
          if (cp[3] == 0)
            return cp - str + 3;
          if (sizeof (longword) > 4)
            {
              if (cp[4] == 0)
                return cp - str + 4;
              if (cp[5] == 0)
                return cp - str + 5;
              if (cp[6] == 0)
                return cp - str + 6;
              if (cp[7] == 0)
                return cp - str + 7;
            }
        }
    }
}
libc_hidden_builtin_def (strlen)
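Analysis:
1. The code below handles two things: memory alignment, and the case where the string terminator \0 appears before an aligned address is reached.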
  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;
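What is memory alignment, and what does it buy us?
A commonly cited example: if an int variable starts at a misaligned address (say, offset 3), the CPU needs two accesses to fetch it. Why?
Suppose the variable is stored starting at address 1. Because it is an int, it occupies 4 bytes of memory (addresses 1 through 4).
Think of the CPU as reading memory through an int-sized template (4 bytes wide) that can only be placed on 4-byte boundaries. The first placement has to start at address 0, covering addresses 0 through 3.
The second placement starts at address 4, covering addresses 4 through 7.
So the CPU has to place the template twice to fetch a single int from memory.
What if the int is stored at an aligned address instead?
Then a single placement of the template is enough to fetch the data.
Memory alignment improves CPU efficiency; 1-byte alignment (no padding at all) can be expected to be the least efficient. The #pragma pack() directive (a compiler directive, not a macro) can be used to specify how struct members are aligned.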
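To make the effect of #pragma pack concrete, here is a minimal sketch that compares the default layout of a struct with a 1-byte-packed one. The struct and function names are made up for illustration, and the sketch assumes a compiler that supports #pragma pack(push, 1) and #pragma pack(pop), as GCC, Clang, and MSVC do.

```c
#include <stddef.h>

/* Default layout: the compiler normally pads after `tag`
   so that `value` starts at an aligned offset. */
struct padded_record {
    char tag;
    int value;
};

/* 1-byte packing: no padding, so `value` starts right after `tag`. */
#pragma pack(push, 1)
struct packed_record {
    char tag;
    int value;
};
#pragma pack(pop)

/* Offset of `value` in the default layout (typically 4 when int is 4 bytes). */
size_t padded_value_offset(void)
{
    return offsetof(struct padded_record, value);
}

/* Offset of `value` in the packed layout (1, i.e. misaligned for an int). */
size_t packed_value_offset(void)
{
    return offsetof(struct packed_record, value);
}
```

The packed struct is smaller, but its int member sits at a misaligned offset, which is exactly the situation described above where the CPU may need more than one access.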
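On 64-bit Windows 10, building with Qt, unsigned long int occupies 4 bytes of memory.
If the starting address of the string str is a multiple of 4, processing is faster, because the word-at-a-time scan can start immediately.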
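Now look at this expression:
((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0
With sizeof (longword) equal to 4, sizeof (longword) - 1 is 3.
In binary, 3 is 00000000 00000000 00000000 00000011.
char_ptr is a pointer, and the address it holds is itself just a binary number. ANDing the address with 3 keeps only its lowest two bits, which are both 0 exactly when the address is a multiple of 4.
The rest of the loop is then easy to follow. While the address is not aligned, the loop checks whether *char_ptr is \0. If it is not the end of the string, char_ptr is incremented and the alignment test runs again on the new address (that is, whether it is now a multiple of 4). If a \0 shows up before an aligned address is reached, the function returns char_ptr - str directly, which is the length of the string.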
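The alignment test can be isolated into two small helpers. This is only a sketch: the names is_word_aligned and bytes_until_aligned are hypothetical, word_size is assumed to be a power of two, and uintptr_t is used in place of the unsigned long int cast from the glibc code so that the pointer-to-integer conversion also works where long is narrower than a pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the address `p` is a multiple of `word_size`, else 0.
   `word_size` must be a power of two, so `word_size - 1` is a mask
   of the low address bits. */
int is_word_aligned(const void *p, size_t word_size)
{
    return ((uintptr_t) p & (word_size - 1)) == 0;
}

/* Number of single-byte steps needed from `p` to reach the next
   aligned address (0 if `p` is already aligned). */
size_t bytes_until_aligned(const void *p, size_t word_size)
{
    size_t misalignment = (size_t) ((uintptr_t) p & (word_size - 1));
    return misalignment == 0 ? 0 : word_size - misalignment;
}
```

For a 4-byte word, an address ending in binary 01 needs 3 single-byte steps before the word loop can start, and the prologue loop in strlen performs at most sizeof (longword) - 1 such steps.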
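2. Operations after alignment
longword_ptr = (unsigned long int *) char_ptr;
Once char_ptr is aligned, the cast on this line is safe as far as alignment is concerned: every load through longword_ptr is an aligned access.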
3. The high and low magic numbers, himagic and lomagic
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();
himagic = 0x80808080L; => 10000000 10000000 10000000 10000000
lomagic = 0x01010101L; => 00000001 00000001 00000001 00000001
himagic = ((himagic << 16) << 16) | himagic;
lomagic = ((lomagic << 16) << 16) | lomagic;
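These two lines extend the magic numbers to platforms where unsigned long int is 8 bytes wide: the 32-bit pattern is copied into the upper 32 bits, giving 0x8080808080808080 and 0x0101010101010101.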
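As an aside, the same two constants can be derived for any width of unsigned long without the sizeof branch. This is a different technique from the one glibc uses here, shown only as a sketch; the function names are hypothetical and 8-bit bytes are assumed. ULONG_MAX is 0xFF repeated in every byte, so dividing it by 0xFF leaves 0x01 in every byte.

```c
#include <limits.h>

/* 0x01 in every byte of an unsigned long:
   0xFFFFFFFF / 0xFF == 0x01010101, and likewise for 8-byte long. */
unsigned long make_lomagic(void)
{
    return ULONG_MAX / 0xFF;
}

/* 0x80 in every byte: each 0x01 byte times 0x80 stays within its byte,
   so nothing carries into the neighbouring byte. */
unsigned long make_himagic(void)
{
    return make_lomagic() * 0x80;
}
```

On a platform with 4-byte long these return 0x01010101 and 0x80808080; with 8-byte long they return the same values as the shifted-and-ORed constants in the glibc code.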
4. If no \0 is found before char_ptr is aligned, execution enters the for (;;) loop that follows
  for (;;)
    {
      longword = *longword_ptr++; /* load the unsigned long int that longword_ptr points to, then advance the pointer */
      if (((longword - lomagic) & ~longword & himagic) != 0)
        {
          ......
        }
    }
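longword - lomagic: this subtraction affects the most significant bit of each byte of longword. If a byte of longword is 0, subtracting 1 turns it into 0xFF, so its most significant bit changes from 0 to 1. For a byte greater than 0 and less than 128, the most significant bit stays 0 (as long as no borrow comes in from a lower byte). So whenever longword contains a zero byte, the most significant bit of the corresponding byte in the difference becomes 1.
The difference is then ANDed with ~longword and with himagic. ~longword has a 1 in the most significant bit only for bytes whose original value was below 128, and himagic keeps only the most significant bit of each byte. A bit therefore survives only where the most significant bit was 0 before the subtraction and 1 after it.
So the test (((longword - lomagic) & ~longword & himagic) != 0) checks whether any of the 4 bytes of longword is all zeros, that is, whether the word contains the string terminator \0.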
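The test can be checked against a plain byte-by-byte scan. The sketch below wraps the expression in a function and pairs it with a slow reference version; both function names are hypothetical, and the magic numbers are built the same way as in the glibc code above.

```c
#include <limits.h>
#include <stddef.h>

/* Word-at-a-time test: nonzero result iff some byte of `word` is 0. */
int has_zero_byte_fast(unsigned long word)
{
    unsigned long himagic = 0x80808080UL;
    unsigned long lomagic = 0x01010101UL;

    if (sizeof word > 4) {
        /* Copy the 32-bit pattern into the upper half, as glibc does. */
        himagic = ((himagic << 16) << 16) | himagic;
        lomagic = ((lomagic << 16) << 16) | lomagic;
    }
    return ((word - lomagic) & ~word & himagic) != 0;
}

/* Reference version: inspect every byte of `word` individually. */
int has_zero_byte_slow(unsigned long word)
{
    size_t i;

    for (i = 0; i < sizeof word; ++i)
        if (((word >> (i * CHAR_BIT)) & 0xFF) == 0)
            return 1;
    return 0;
}
```

Both functions look at every byte of the unsigned long, so on a platform with 8-byte long a value such as 0xF158D367 counts as containing zero bytes (its upper four bytes are 0).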
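Once a word containing an all-zero byte has been found, the code inside the if is easy to follow: it checks the bytes one by one to locate the terminator. If this longword contains no all-zero byte, the statement above the if,
longword = *longword_ptr++;, has already advanced the pointer, so the for loop continues the search in the next 4 bytes, until the string terminator \0 is found.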
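Putting the steps together, the whole algorithm can be condensed into the sketch below. It is not the glibc implementation: the name sketch_strlen is hypothetical, the per-byte checks are written as a loop instead of being unrolled, and each word is loaded with memcpy rather than through a cast pointer. Like the original, it reads a whole aligned word at a time, so it may read a few bytes past the terminator; that goes beyond what ISO C guarantees for an arbitrary string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

size_t sketch_strlen(const char *str)
{
    unsigned long himagic = 0x80808080UL;
    unsigned long lomagic = 0x01010101UL;
    unsigned long word;
    const char *p = str;
    size_t i;

    if (sizeof word > 4) {
        /* Extend the magic numbers to 8-byte words. */
        himagic = ((himagic << 16) << 16) | himagic;
        lomagic = ((lomagic << 16) << 16) | lomagic;
    }

    /* Step 1: go byte by byte until p is aligned on a word boundary. */
    while (((uintptr_t) p & (sizeof word - 1)) != 0) {
        if (*p == '\0')
            return (size_t) (p - str);
        ++p;
    }

    /* Step 2: test one aligned word per iteration. */
    for (;;) {
        memcpy(&word, p, sizeof word);
        if (((word - lomagic) & ~word & himagic) != 0) {
            /* Step 3: some byte in this word is 0; find which one. */
            for (i = 0; i < sizeof word; ++i)
                if (p[i] == '\0')
                    return (size_t) (p - str) + i;
        }
        p += sizeof word;
    }
}
```

When trying it out, a string stored in a generously sized buffer (for example char buf[64] = "hello, world";) keeps every word-sized read inside the buffer, and sketch_strlen(buf) then returns 12.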
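Note: writing the test as if (((longword - lomagic) & ~longword & himagic) != 0) has another benefit. It remains correct when a byte in longword is 128 or greater (its most significant bit is already set), because ~longword masks such bytes out. This makes the code more robust.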
To make this easier to follow, the small program below can be run to see the effect:
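#include <stdio.h>

int main(void)
{
    /* Only the low 32 bits are tested here, so the magic numbers stay 32-bit. */
    unsigned long int himagic = 0x80808080;
    unsigned long int lomagic = 0x01010101;
    unsigned long int longWord1 = 0xF1010067; /* bytes F1 01 00 67: contains a zero byte */
    unsigned long int longWord2 = 0xF1580067; /* bytes F1 58 00 67: contains a zero byte */
    unsigned long int longWord3 = 0xF158D367; /* bytes F1 58 D3 67: no zero byte */
    unsigned long int result1 = (longWord1 - lomagic) & ~longWord1 & himagic;
    unsigned long int result2 = (longWord2 - lomagic) & ~longWord2 & himagic;
    unsigned long int result3 = (longWord3 - lomagic) & ~longWord3 & himagic;
    /* %lx is the conversion for unsigned long; %p is only valid for pointers. */
    printf("result1 = 0x%08lx\n", result1);
    printf("result2 = 0x%08lx\n", result2);
    printf("result3 = 0x%08lx\n", result3);
    return 0;
}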
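The results are as follows: result1 (0x00808000) and result2 (0x00008000) are nonzero because longWord1 and longWord2 each contain a zero byte, while result3 is 0 because longWord3 does not.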
The memory alignment part of this article is based on:
https://zhuanlan.zhihu.com/p/84387738
The rest is based on:
https://blog.youkuaiyun.com/astrotycoon/article/details/8124359