A Gift from God

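This post describes major optimization work on memory routines such as memcpy and memmove in the Linux kernel. By avoiding memory false dependences and cutting instruction decode time, among other techniques, the work improves performance across several CPU architectures, by up to 3X in some cases.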


I hope the CSDN editors will feature this post on the home page; it is a remarkable achievement. The full text below is reposted from Maling.


 The comment from Linus is “The code looks clever and nice”!

 

a.       memcpy in Linux kernel

Patch: https://patchwork.kernel.org/patch/296282/

commit id: 59daa706fbec745684702741b9f5373142dd9fdc

This is the first implementation to completely avoid memory false dependences in the CPU pipeline, a problem that affects all x86 CPUs. Performance improves by up to 3X. The patch was merged into a released Linux kernel and replaced the original implementation, which had been in place for 8 years.
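To make the idea concrete: one common form of memory false dependence on x86 arises when a load follows a store to a different address that looks the same in its low address bits, so the pipeline stalls the load as if it depended on the store. The C sketch below is an illustration only, not the kernel patch (which is hand-written assembly). It copies 32 bytes per iteration and issues all loads of a block before any of its stores, which is one way to keep loads from queuing behind unrelated stores. The name `sketch_memcpy` is hypothetical, and a C compiler is free to reorder these accesses, so this shows the intent rather than a guaranteed instruction schedule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not the kernel's code: copy n bytes between
 * NON-overlapping buffers, grouping the loads of each 32-byte block
 * ahead of its stores. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    while (n >= 32) {
        uint64_t w0, w1, w2, w3;

        /* Four loads first. A fixed-size memcpy is the portable way to
         * express an unaligned 8-byte access in C. */
        memcpy(&w0, s, 8);
        memcpy(&w1, s + 8, 8);
        memcpy(&w2, s + 16, 8);
        memcpy(&w3, s + 24, 8);

        /* Then four stores. */
        memcpy(d, &w0, 8);
        memcpy(d + 8, &w1, 8);
        memcpy(d + 16, &w2, 8);
        memcpy(d + 24, &w3, 8);

        s += 32;
        d += 32;
        n -= 32;
    }

    /* Byte-wise tail for the remaining 0..31 bytes. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

The sketch keeps the standard memcpy contract (returns `dst`, buffers must not overlap), so it can be compared directly against the library routine in a benchmark.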

 

b.      memmove in Linux kernel

Patch: http://lkml.org/lkml/2010/9/16/502

commit id: 3b4b682becdfa9f42321aa024d5cc84f71f06d8c

Avoids the long latency and limitations of the string move (movs) instructions, which spend a lot of time in the decode stage, and avoids memory false dependences in unaligned cases.
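As a rough picture of what replacing string instructions involves, the C sketch below implements memmove with ordinary 8-byte loads and stores and picks the copy direction from the way the buffers overlap. It is a minimal illustration under stated assumptions, not the code from the patch: the name `sketch_memmove` is hypothetical, and the real routine is x86 assembly with additional handling for alignment and size classes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not the kernel's code: overlap-safe copy
 * built from plain word-sized loads and stores. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));
    if (d == s || n == 0)
        return dst;

    if ((uintptr_t)d < (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n) {
        /* dst is below src, or the buffers do not overlap: copy forward. */
        while (n >= 8) {
            uint64_t w;
            memcpy(&w, s, 8);   /* load the word fully before storing */
            memcpy(d, &w, 8);
            s += 8;
            d += 8;
            n -= 8;
        }
        while (n--)
            *d++ = *s++;
    } else {
        /* dst lies inside [src, src + n): copy backward from the end so
         * source bytes are read before they are overwritten. */
        s += n;
        d += n;
        while (n >= 8) {
            uint64_t w;
            s -= 8;
            d -= 8;
            memcpy(&w, s, 8);
            memcpy(d, &w, 8);
            n -= 8;
        }
        while (n--)
            *--d = *--s;
    }

    return dst;
}
```

Each 8-byte word is loaded into a local before it is stored, so the copy stays correct even when the source and destination are fewer than 8 bytes apart.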

   

     H.J. and I contributed the code listed below.

 

a.       64bit memcpy/memmove for Atom, Core2 and Core i7

http://article.gmane.org/gmane.comp.lib.glibc.alpha/15278

This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to 4X on Core 2 and up to 2X on Core i7.

 

b.      64bit memcmp for Core i7

http://sourceware.org/ml/libc-alpha/2010-04/msg00030.html

This is a 64bit SSE4 optimized memcmp. It improves memcmp by up to 3X on Intel Core i7.

c.       64bit strcmp

http://sources.redhat.com/ml/libc-alpha/2009-07/msg00063.html

The code has been checked into glibc and the OpenSolaris library.

 

d.      64bit strcpy

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/strcpy.s

The code has been checked into glibc and the OpenSolaris library.

 

e.       32bit memset/memcpy for Atom, Core2 and Corei7

http://sources.redhat.com/ml/libc-alpha/2010-01/msg00016.html

Performance of both routines improves by up to 3X to 4X, and the patch was successfully merged into the Moblin libc.

 

f.       32bit memcmp/strcmp/strncmp for Atom, Core2 and Corei7.

http://sourceware.org/ml/libc-alpha/2010-02/msg00028.html

The patch provides 32bit memcmp/strcmp/strncmp optimized for SSSE3/SSE4.2. It can improve memcmp by up to 3X and strcmp by up to 7X.

 

This article comes from a CSDN blog; please cite the source when reposting: http://blog.youkuaiyun.com/pennyliang/archive/2011/03/30/6288471.aspx
