SSE4 Instruction Set

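The SSE4 instruction set was formally announced on September 27, 2006, and appeared in Intel and AMD processors in early 2007. SSE4 comes in three versions, SSE4.1, SSE4.2, and SSE4a, with 54 instructions in total: 47 in SSE4.1 and 7 in SSE4.2. SSE4a was proposed by AMD and adds 6 bit-manipulation instructions.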


SSE4 — An Overview

SSE4 was formally announced on September 27th, 2006, and became available in hardware in early 2007 for both Intel and AMD processors. Earlier hints were available, but were incomplete (old versions of this page were based on such reports).

SSE4 now comes in 3 flavors: SSE4.1, SSE4.2, and SSE4a. All together, there are 54 instructions, 47 of which belong to SSE4.1, the remaining 7 belonging to SSE4.2. SSE4a is from AMD (who didn't support all the SSE4 instructions), and adds 6 instructions for bit manipulation.

SSE4 — The Instructions

SSE4.1
mpsadbw - Sum of absolute differences.
phminposuw - minimum+index extraction (16bit word).
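pmuldq - packed signed multiply of dwords, producing full 64-bit products.
pmulld - packed multiply of dwords, keeping the low 32 bits of each product.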
dpps - dot product, single precision.
dppd - dot product, double precision.
blendps - conditional copy.
blendpd - conditional copy.
blendvps - conditional copy.
blendvpd - conditional copy.
pblendvb - conditional copy.
pblendw - conditional copy.
pminsb - packed minimum signed byte.
pmaxsb - packed maximum signed byte.
pminuw - packed minimum unsigned word.
pmaxuw - packed maximum unsigned word.
pminud - packed minimum unsigned dword.
pmaxud - packed maximum unsigned dword.
pminsd - packed minimum signed dword.
pmaxsd - packed maximum signed dword.
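roundps - packed round single precision float to an integral value (result stays floating point).
roundss - scalar round single precision float to an integral value (result stays floating point).
roundpd - packed round double precision float to an integral value (result stays floating point).
roundsd - scalar round double precision float to an integral value (result stays floating point).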
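insertps - insert a single precision element, with optional zeroing of destination elements.
pinsrb - insert a byte from a general register or memory.
pinsrd - insert a dword from a general register or memory.
pinsrq - insert a qword from a general register or memory (64-bit mode only).
extractps - extract a single precision element to a general register or memory.
pextrb - extract a byte to a general register or memory.
pextrw - extract a word to a general register or memory.
pextrd - extract a dword to a general register or memory.
pextrq - extract a qword to a general register or memory (64-bit mode only).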
pmovsxbw - packed sign extension.
pmovzxbw - packed zero extension.
pmovsxbd - packed sign extension.
pmovzxbd - packed zero extension.
pmovsxbq - packed sign extension.
pmovzxbq - packed zero extension.
pmovsxwd - packed sign extension.
pmovzxwd - packed zero extension.
pmovsxwq - packed sign extension.
pmovzxwq - packed zero extension.
pmovsxdq - packed sign extension.
pmovzxdq - packed zero extension.
ptest - same as test, but for SSE registers.
pcmpeqq - quadword compare for equality.
packusdw - pack signed dwords into unsigned words with saturation.
movntdqa - Non-temporal aligned move.
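
To make two of the one-line descriptions above more concrete, here is a minimal scalar sketch in portable C of what phminposuw and pblendw compute. It does not use the real instructions or intrinsics; the names `min_pos_u16`, `model_phminposuw`, and `model_pblendw` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: result of a phminposuw-style search. */
typedef struct {
    uint16_t value;  /* smallest word found */
    uint8_t  index;  /* position of its first occurrence, 0..7 */
} min_pos_u16;

/* Scalar model of phminposuw: minimum of eight unsigned words plus its index. */
static min_pos_u16 model_phminposuw(const uint16_t src[8])
{
    min_pos_u16 r = { src[0], 0 };
    for (uint8_t i = 1; i < 8; ++i) {
        if (src[i] < r.value) {   /* strict '<' keeps the lowest index on ties */
            r.value = src[i];
            r.index = i;
        }
    }
    return r;
}

/* Scalar model of pblendw: if bit i of mask is set, word i is copied
 * from src; otherwise the word already in dst is kept. */
static void model_pblendw(uint16_t dst[8], const uint16_t src[8], uint8_t mask)
{
    for (int i = 0; i < 8; ++i) {
        if (mask & (1u << i)) {
            dst[i] = src[i];
        }
    }
}
```

The hardware versions operate on a whole 128-bit register at once; the loops here only spell out the per-element behaviour.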

SSE4.2
crc32 - CRC32C function (using 0x11edc6f41 as the polynomial).
pcmpestri - Packed compare explicit length string, Index.
pcmpestrm - Packed compare explicit length string, Mask.
pcmpistri - Packed compare implicit length string, Index.
pcmpistrm - Packed compare implicit length string, Mask.
pcmpgtq - Packed compare, greater than.
popcnt - Population count.
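
As a rough software reference for the crc32 entry above, the sketch below computes CRC32C bit by bit in portable C. It assumes the usual reflected form of the polynomial (0x82F63B78 is 0x1EDC6F41 with its bits reversed) and the conventional initial value and final inversion, which the instruction itself leaves to the caller. The function names `crc32c_update` and `crc32c` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulation pass: folds len bytes into crc, one bit at a time.
 * This models the update step only, with no initial or final inversion. */
static uint32_t crc32c_update(uint32_t crc, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int k = 0; k < 8; ++k) {
            /* If the low bit is set, shift and XOR in the reflected polynomial. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return crc;
}

/* Conventional CRC32C: start from all ones, invert the result at the end. */
static uint32_t crc32c(const uint8_t *data, size_t len)
{
    return ~crc32c_update(0xFFFFFFFFu, data, len);
}
```

With these conventions, the nine ASCII bytes "123456789" give 0xE3069283, the standard CRC32C check value, which is a convenient sanity test when comparing against a hardware-accelerated path.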

SSE4a
lzcnt - Leading Zero count.
popcnt - Population count.
extrq - Mask-shift operation.
inserq - Mask-shift operation.
movntsd - Non-temporal double precision move.
movntss - Non-temporal single precision move.
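
For the two bit-counting entries in this list, a minimal portable C sketch of the 32-bit behaviour is given below. The names `model_popcnt32` and `model_lzcnt32` are hypothetical; real code would normally use a compiler builtin or intrinsic instead of these loops.

```c
#include <stdint.h>

/* Scalar model of popcnt: number of set bits in x. */
static unsigned model_popcnt32(uint32_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;   /* clears the lowest set bit */
        ++n;
    }
    return n;
}

/* Scalar model of lzcnt: number of zero bits above the highest set bit.
 * For a zero input the result is the operand width, here 32. */
static unsigned model_lzcnt32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}
```

The zero-input case is the detail worth remembering: lzcnt is defined for it and returns the operand size.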