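I recently started studying applied cryptography and decided to begin with SM4. After about a month of study, and with my teacher's guidance, I finished a first working implementation yesterday. This post is a summary of what I learned; corrections and suggestions are very welcome.
Basic idea:
SM4 is a block cipher with a 128-bit block length and a 128-bit key length. Both the encryption algorithm and the key expansion algorithm use a 32-round nonlinear iterative structure. Decryption has the same structure as encryption; the only difference is that the round keys are used in reverse order.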
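Terms and definitions:
Block length: the number of bits in one block of data.
Key length: the number of bits in the key.
Key expansion algorithm: the operation that transforms the key into round keys.
Rounds: the number of iterations of the round function.
Word: a group (string) of 32 bits.
S-box: a fixed permutation with 8-bit input and 8-bit output, written Sbox(·).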
Notation:
⨁ 32-bit XOR (written ^ in C)
⋘ i (also written <<< i) 32-bit left rotation by i bits
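Key and key parameters:
The encryption key is 128 bits long and is written MK = (MK_0, MK_1, MK_2, MK_3), where each MK_i (i = 0, 1, 2, 3) is a word.
The round keys are written (rk_0, rk_1, ..., rk_31), where each rk_i (i = 0, ..., 31) is a 32-bit word. The round keys are derived from the encryption key.
FK = (FK_0, FK_1, FK_2, FK_3) is the system parameter.
CK = (CK_0, CK_1, ..., CK_31) is the fixed parameter used by the key expansion algorithm. Each FK_i (i = 0, ..., 3) and CK_i (i = 0, ..., 31) is a word.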
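Round function F:
Let the input be (X_0, X_1, X_2, X_3) ∈ (Z_2^32)^4 and the round key be rk ∈ Z_2^32. The round function F is:
F(X_0, X_1, X_2, X_3, rk) = X_0 ⨁ T(X_1 ⨁ X_2 ⨁ X_3 ⨁ rk).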
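Composite permutation T:
T: Z_2^32 → Z_2^32 is an invertible transformation composed of the nonlinear transformation τ and the linear transformation L, i.e. T(·) = L(τ(·)).
(1) Nonlinear transformation τ:
τ consists of 4 S-boxes applied in parallel.
Let the input be A = (a_0, a_1, a_2, a_3) ∈ (Z_2^8)^4 and the output be B = (b_0, b_1, b_2, b_3) ∈ (Z_2^8)^4.
Then (b_0, b_1, b_2, b_3) = τ(A) = (Sbox(a_0), Sbox(a_1), Sbox(a_2), Sbox(a_3)).
The S-box values are: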
//SBox
const unsigned char sBox[256] = {
0xd6,0x90,0xe9,0xfe,0xcc,0xe1,0x3d,0xb7,0x16,0xb6,0x14,0xc2,0x28,0xfb,0x2c,0x05,
0x2b,0x67,0x9a,0x76,0x2a,0xbe,0x04,0xc3,0xaa,0x44,0x13,0x26,0x49,0x86,0x06,0x99,
0x9c,0x42,0x50,0xf4,0x91,0xef,0x98,0x7a,0x33,0x54,0x0b,0x43,0xed,0xcf,0xac,0x62,
0xe4,0xb3,0x1c,0xa9,0xc9,0x08,0xe8,0x95,0x80,0xdf,0x94,0xfa,0x75,0x8f,0x3f,0xa6,
0x47,0x07,0xa7,0xfc,0xf3,0x73,0x17,0xba,0x83,0x59,0x3c,0x19,0xe6,0x85,0x4f,0xa8,
0x68,0x6b,0x81,0xb2,0x71,0x64,0xda,0x8b,0xf8,0xeb,0x0f,0x4b,0x70,0x56,0x9d,0x35,
0x1e,0x24,0x0e,0x5e,0x63,0x58,0xd1,0xa2,0x25,0x22,0x7c,0x3b,0x01,0x21,0x78,0x87,
0xd4,0x00,0x46,0x57,0x9f,0xd3,0x27,0x52,0x4c,0x36,0x02,0xe7,0xa0,0xc4,0xc8,0x9e,
0xea,0xbf,0x8a,0xd2,0x40,0xc7,0x38,0xb5,0xa3,0xf7,0xf2,0xce,0xf9,0x61,0x15,0xa1,
0xe0,0xae,0x5d,0xa4,0x9b,0x34,0x1a,0x55,0xad,0x93,0x32,0x30,0xf5,0x8c,0xb1,0xe3,
0x1d,0xf6,0xe2,0x2e,0x82,0x66,0xca,0x60,0xc0,0x29,0x23,0xab,0x0d,0x53,0x4e,0x6f,
0xd5,0xdb,0x37,0x45,0xde,0xfd,0x8e,0x2f,0x03,0xff,0x6a,0x72,0x6d,0x6c,0x5b,0x51,
0x8d,0x1b,0xaf,0x92,0xbb,0xdd,0xbc,0x7f,0x11,0xd9,0x5c,0x41,0x1f,0x10,0x5a,0xd8,
0x0a,0xc1,0x31,0x88,0xa5,0xcd,0x7b,0xbd,0x2d,0x74,0xd0,0x12,0xb8,0xe5,0xb4,0xb0,
0x89,0x69,0x97,0x4a,0x0c,0x96,0x77,0x7e,0x65,0xb9,0xf1,0x09,0xc5,0x6e,0xc6,0x84,
0x18,0xf0,0x7d,0xec,0x3a,0xdc,0x4d,0x20,0x79,0xee,0x5f,0x3e,0xd7,0xcb,0x39,0x48
};
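(2) Linear transformation L
The output of the nonlinear transformation τ is the input to the linear transformation L. Let the input be B ∈ Z_2^32 and the output be C ∈ Z_2^32.
Then C = L(B) = B ⨁ (B <<< 2) ⨁ (B <<< 10) ⨁ (B <<< 18) ⨁ (B <<< 24)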
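As a small illustration of the formula for L, the sketch below applies it to a single 32-bit word. It uses `uint32_t` from `<stdint.h>`; the names `rotl32` and `sm4_linear_L` are made up for this example and are not part of the code later in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n must be in 1..31. */
static uint32_t rotl32(uint32_t x, unsigned n) {
    assert(n >= 1 && n <= 31);
    return (x << n) | (x >> (32 - n));
}

/* L(B) = B ^ (B <<< 2) ^ (B <<< 10) ^ (B <<< 18) ^ (B <<< 24) */
uint32_t sm4_linear_L(uint32_t b) {
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24);
}
```

A single set bit spreads to five positions: `sm4_linear_L(0x00000001)` gives `0x01040405`.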
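Encryption algorithm:
The encryption algorithm consists of 32 iterations and one reverse transformation R.
Let the plaintext input be (X_0, X_1, X_2, X_3) ∈ (Z_2^32)^4, the ciphertext output be (Y_0, Y_1, Y_2, Y_3) ∈ (Z_2^32)^4,
and the round keys be rk_i ∈ Z_2^32, i = 0, 1, ..., 31. Encryption proceeds as follows:
(1) 32 iterations: X_{i+4} = F(X_i, X_{i+1}, X_{i+2}, X_{i+3}, rk_i), i = 0, 1, ..., 31
(2) Reverse transformation: (Y_0, Y_1, Y_2, Y_3) = R(X_32, X_33, X_34, X_35) = (X_35, X_34, X_33, X_32)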
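Decryption algorithm:
The decryption transformation has the same structure as the encryption transformation. The only difference is the order of the round keys: decryption uses them as (rk_31, rk_30, ..., rk_0).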
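Why reversing the round keys is enough can be seen from the structure alone. The sketch below runs the 32 rounds and the reverse transformation R with a deliberately fake mixing function in place of T. Everything here (`mix_fn`, `toy_mix`, `run_rounds`) is hypothetical and only for illustration; `toy_mix` is not the SM4 transformation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*mix_fn)(uint32_t);

/* Hypothetical stand-in for T. It is NOT SM4's T and is not even
   invertible, which shows that the round structure alone makes the
   whole cipher invertible. */
uint32_t toy_mix(uint32_t v) {
    return (v & 0xFFFFu) * 0x9E3779B1u;
}

/* Run 32 rounds plus the reverse transformation R, in place.
   reverse_keys = 0 uses rk[0..31]; reverse_keys = 1 uses rk[31..0]. */
void run_rounds(uint32_t x[4], const uint32_t rk[32], int reverse_keys, mix_fn t) {
    assert(x && rk && t);
    for (int i = 0; i < 32; i++) {
        uint32_t k = rk[reverse_keys ? 31 - i : i];
        uint32_t next = x[0] ^ t(x[1] ^ x[2] ^ x[3] ^ k);
        x[0] = x[1]; x[1] = x[2]; x[2] = x[3]; x[3] = next;
    }
    /* R: (X32, X33, X34, X35) -> (X35, X34, X33, X32) */
    uint32_t tmp = x[0]; x[0] = x[3]; x[3] = tmp;
    tmp = x[1]; x[1] = x[2]; x[2] = tmp;
}
```

Calling `run_rounds` once with `reverse_keys = 0` and then again with `reverse_keys = 1` restores the original four words, whatever function is passed as `t`.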
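Key expansion algorithm:
The round keys are generated from the encryption key by the key expansion algorithm.
Given the encryption key MK = (MK_0, MK_1, MK_2, MK_3) ∈ (Z_2^32)^4, the round keys are generated as follows:
(K_0, K_1, K_2, K_3) = (MK_0 ⨁ FK_0, MK_1 ⨁ FK_1, MK_2 ⨁ FK_2, MK_3 ⨁ FK_3),
rk_i = K_{i+4} = K_i ⨁ T'(K_{i+1} ⨁ K_{i+2} ⨁ K_{i+3} ⨁ CK_i), i = 0, 1, ..., 31
where:
(1) T' is the composite permutation T with the linear transformation L replaced by L': L'(B) = B ⨁ (B <<< 13) ⨁ (B <<< 23);
(2) The system parameter FK takes the values:
FK_0 = (A3B1BAC6), FK_1 = (56AA3350), FK_2 = (677D9197), FK_3 = (B27022DC);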
// System parameter FK
const unsigned long FK[4] = {
0xa3b1bac6,
0x56aa3350,
0x677d9197,
0xb27022dc
};
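(3) The fixed parameter CK is obtained as follows:
Let ck_{i,j} be the j-th byte of CK_i (i = 0, 1, ..., 31; j = 0, 1, 2, 3), i.e. CK_i = (ck_{i,0}, ck_{i,1}, ck_{i,2}, ck_{i,3}) ∈ (Z_2^8)^4. Then ck_{i,j} = (4i + j) × 7 (mod 256).
The concrete values of CK_i (i = 0, 1, ..., 31) are: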
00070E15, 1C232A31, 383F464D, 545B6269,
70777E85, 8C939AA1, A8AFB6BD, C4CBD2D9,
E0E7EEF5, FC030A11, 181F262D, 343B4249,
50575E65, 6C737A81, 888F969D, A4ABB2B9,
C0C7CED5, DCE3EAF1, F8FF060D, 141B2229,
30373E45, 4C535A61, 686F767D, 848B9299,
A0A7AEB5, BCC3CAD1, D8DFE6ED, F4FB0209,
10171E25, 2C333A41, 484F565D, 646B7279
// Fixed parameter CK
const unsigned long CK[32] = {
0x00070e15,0x1c232a31,0x383f464d,0x545b6269,
0x70777e85,0x8c939aa1,0xa8afb6bd,0xc4cbd2d9,
0xe0e7eef5,0xfc030a11,0x181f262d,0x343b4249,
0x50575e65,0x6c737a81,0x888f969d,0xa4abb2b9,
0xc0c7ced5,0xdce3eaf1,0xf8ff060d,0x141b2229,
0x30373e45,0x4c535a61,0x686f767d,0x848b9299,
0xa0a7aeb5,0xbcc3cad1,0xd8dfe6ed,0xf4fb0209,
0x10171e25,0x2c333a41,0x484f565d,0x646b7279
};
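The CK table does not have to be typed in by hand, since it follows directly from the formula ck_{i,j} = (4i + j) × 7 (mod 256). Here is a minimal sketch that rebuilds one entry, taking byte j = 0 as the most significant byte; the function name `sm4_ck` is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Build CK_i from ck_{i,j} = (4i + j) * 7 mod 256,
   with j = 0 as the most significant byte. */
uint32_t sm4_ck(int i) {
    assert(i >= 0 && i < 32);
    uint32_t word = 0;
    for (int j = 0; j < 4; j++) {
        uint32_t byte = (uint32_t)((4 * i + j) * 7) & 0xFFu;
        word = (word << 8) | byte;
    }
    return word;
}
```

For instance, `sm4_ck(0)` is `0x00070e15` and `sm4_ck(31)` is `0x646b7279`, matching the first and last table entries.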
That concludes the concepts and theory; the code follows.
#include<stdio.h>
#include<stdlib.h>
#include<time.h>
#include<string.h>
#define ul unsigned long
#define uc unsigned char
#define dataSize 1048576
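Three utility functions, borrowed from someone else's code, handle type conversion and left rotation.
// Pack 4 big-endian bytes into one 32-bit word
void uCharToLong(uc *in, ul *out) {
    int i = 0;
    *out = 0;
    for (i = 0; i < 4; i++)
        *out = ((ul)in[i] << (24 - i * 8)) ^ *out;
}
// Unpack a 32-bit word into 4 big-endian bytes
void uLongToChar(ul in,uc *out) {
    int i = 0;
    // Take bytes starting from the most significant byte of the 32-bit value
    for (i = 0; i < 4; i++)
        *(out + i) = (uc)(in >> (24 - i * 8));
}
// 32-bit left rotation, valid for 0 < length < 32
ul moveLeft(ul data,int length) {
    // Mask to 32 bits so the result is also correct where unsigned long is 64 bits wide
    return ((data << length) | (data >> (32 - length))) & 0xFFFFFFFFUL;
}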
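Because `unsigned long` is 64 bits wide on many 64-bit Unix-like systems, helpers like these are easier to keep correct with the fixed-width `uint32_t` type. The sketch below shows roughly what the three helpers could look like in that style. The names `load_be32`, `store_be32` and `rotl32` are hypothetical and are not used by the rest of the code in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Read 4 bytes as one big-endian 32-bit word. */
uint32_t load_be32(const uint8_t *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Write one 32-bit word as 4 big-endian bytes. */
void store_be32(uint32_t v, uint8_t *out) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Rotate left by n bits; n must be in 1..31 to avoid an undefined shift by 32. */
uint32_t rotl32(uint32_t x, unsigned n) {
    assert(n >= 1 && n <= 31);
    return (x << n) | (x >> (32 - n));
}
```

With a 32-bit type the bits shifted out at the top are discarded automatically, so no extra masking is needed.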
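The key material goes through the S-box nonlinear transformation and then a linear transformation to produce the round keys.
// Key-schedule transform T': S-box substitution followed by the linear transformation L'
ul fun_Key(ul input) {
    uc ucList[4] = {0}; // Temporary array holding the 4 input bytes
    ul ulList = 0; // Temporary word holding the S-box output
    uc ucSBoxList[4] = {0};
    uLongToChar(input,ucList);
    for(int i=0; i<4; i++) {
        ucSBoxList[i] = sBox[ucList[i]]; // S-box nonlinear transformation
    }
    uCharToLong(ucSBoxList,&ulList);
    ulList = ulList^moveLeft(ulList,13)^moveLeft(ulList,23); // Linear transformation L'
    return ulList;
}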
Processing the plaintext works much the same way, but with a different formula.
// Data transform T: same as above, but with the linear transformation L
ul fun_data(ul input) {
    uc ucList[4] = {0};
    ul ulList = 0;
    uc ucSBoxList[4] = {0};
    uLongToChar(input,ucList);
    for(int i=0; i<4; i++) {
        ucSBoxList[i] = sBox[ucList[i]];
    }
    uCharToLong(ucSBoxList,&ulList);
    ulList = ulList^moveLeft(ulList,2)^moveLeft(ulList,10)^moveLeft(ulList,18)^moveLeft(ulList,24); // Linear transformation L
    return ulList;
}
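An earlier version allocated memory with malloc on every call, which cost a noticeable amount of time. On my teacher's advice the function now simply points at the caller's buffer.
// Encryption function (each 16-byte block is encrypted independently)
// If length is not a multiple of 16, input must have room for the zero padding
// and output must have room for the final full block.
void enc_code(ul length,uc *key, uc *input, uc *output) {
    // Avoid malloc here: dynamic allocation is slow, so cipher code should work on preallocated buffers
    uc *p = input; // Work directly on the caller's input buffer
    ul ulKeyTempList[4] = {0}; // Key converted to 32-bit words
    ul ulKeyList[36] = {0}; // K_0..K_35: key XOR FK, followed by the 32 round keys
    ul ulDataList[36] = {0}; // X_0..X_35 for the current block
    ul blocks = length/16+((length%16)?1:0); // Number of 16-byte blocks
    ul pad = (16-length%16)%16; // Zero bytes needed to fill the last block
    uCharToLong(key,&ulKeyTempList[0]); // Convert in 4 chunks of 4 bytes each
    uCharToLong(key+4,&ulKeyTempList[1]);
    uCharToLong(key+8,&ulKeyTempList[2]);
    uCharToLong(key+12,&ulKeyTempList[3]);
    ulKeyList[0] = ulKeyTempList[0]^FK[0];
    ulKeyList[1] = ulKeyTempList[1]^FK[1];
    ulKeyList[2] = ulKeyTempList[2]^FK[2];
    ulKeyList[3] = ulKeyTempList[3]^FK[3];
    // Generate the round keys rk with 32 iterations
    for(int i=0; i<32; i++) {
        ulKeyList[i+4] = ulKeyList[i]^fun_Key(ulKeyList[i+1]^ulKeyList[i+2]^ulKeyList[i+3]^CK[i]);
    }
    // Zero-pad the last block if it is shorter than 16 bytes
    for(ul i=0; i<pad; i++) {
        *(p+length+i) = 0;
    }
    for(ul j=0; j<blocks; j++) {
        // Load one block of data
        uCharToLong(p+16*j,&(ulDataList[0]));
        uCharToLong(p+16*j+4,&(ulDataList[1]));
        uCharToLong(p+16*j+8,&(ulDataList[2]));
        uCharToLong(p+16*j+12,&(ulDataList[3]));
        // Encrypt: 32 rounds, where ulKeyList[i+4] is rk_i
        for(int i=0; i<32; i++) {
            ulDataList[i+4] = ulDataList[i]^fun_data(ulDataList[i+1]^ulDataList[i+2]^ulDataList[i+3]^ulKeyList[i+4]);
        }
        // Write the ciphertext in reverse word order
        uLongToChar(ulDataList[35],output+16*j);
        uLongToChar(ulDataList[34],output+16*j+4);
        uLongToChar(ulDataList[33],output+16*j+8);
        uLongToChar(ulDataList[32],output+16*j+12);
    }
}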
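// Decryption function: same flow as encryption; the only difference is that the round keys are used in reverse order (not implemented yet)
void dec_code(ul length,uc *key, uc *input, uc *output) {
}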
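Since dec_code is left empty, here is a minimal self-contained sketch of how encryption and decryption can share one routine, with decryption simply reading the round keys in reverse. It is written independently of the functions in this post: the names `sm4_key_schedule`, `sm4_crypt_block` and the helpers are illustrative, `uint32_t` replaces `ul`, and CK is computed from its formula rather than stored as a table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const uint8_t SBOX[256] = {
0xd6,0x90,0xe9,0xfe,0xcc,0xe1,0x3d,0xb7,0x16,0xb6,0x14,0xc2,0x28,0xfb,0x2c,0x05,
0x2b,0x67,0x9a,0x76,0x2a,0xbe,0x04,0xc3,0xaa,0x44,0x13,0x26,0x49,0x86,0x06,0x99,
0x9c,0x42,0x50,0xf4,0x91,0xef,0x98,0x7a,0x33,0x54,0x0b,0x43,0xed,0xcf,0xac,0x62,
0xe4,0xb3,0x1c,0xa9,0xc9,0x08,0xe8,0x95,0x80,0xdf,0x94,0xfa,0x75,0x8f,0x3f,0xa6,
0x47,0x07,0xa7,0xfc,0xf3,0x73,0x17,0xba,0x83,0x59,0x3c,0x19,0xe6,0x85,0x4f,0xa8,
0x68,0x6b,0x81,0xb2,0x71,0x64,0xda,0x8b,0xf8,0xeb,0x0f,0x4b,0x70,0x56,0x9d,0x35,
0x1e,0x24,0x0e,0x5e,0x63,0x58,0xd1,0xa2,0x25,0x22,0x7c,0x3b,0x01,0x21,0x78,0x87,
0xd4,0x00,0x46,0x57,0x9f,0xd3,0x27,0x52,0x4c,0x36,0x02,0xe7,0xa0,0xc4,0xc8,0x9e,
0xea,0xbf,0x8a,0xd2,0x40,0xc7,0x38,0xb5,0xa3,0xf7,0xf2,0xce,0xf9,0x61,0x15,0xa1,
0xe0,0xae,0x5d,0xa4,0x9b,0x34,0x1a,0x55,0xad,0x93,0x32,0x30,0xf5,0x8c,0xb1,0xe3,
0x1d,0xf6,0xe2,0x2e,0x82,0x66,0xca,0x60,0xc0,0x29,0x23,0xab,0x0d,0x53,0x4e,0x6f,
0xd5,0xdb,0x37,0x45,0xde,0xfd,0x8e,0x2f,0x03,0xff,0x6a,0x72,0x6d,0x6c,0x5b,0x51,
0x8d,0x1b,0xaf,0x92,0xbb,0xdd,0xbc,0x7f,0x11,0xd9,0x5c,0x41,0x1f,0x10,0x5a,0xd8,
0x0a,0xc1,0x31,0x88,0xa5,0xcd,0x7b,0xbd,0x2d,0x74,0xd0,0x12,0xb8,0xe5,0xb4,0xb0,
0x89,0x69,0x97,0x4a,0x0c,0x96,0x77,0x7e,0x65,0xb9,0xf1,0x09,0xc5,0x6e,0xc6,0x84,
0x18,0xf0,0x7d,0xec,0x3a,0xdc,0x4d,0x20,0x79,0xee,0x5f,0x3e,0xd7,0xcb,0x39,0x48
};
static const uint32_t FK_WORDS[4] = {0xa3b1bac6u, 0x56aa3350u, 0x677d9197u, 0xb27022dcu};

static uint32_t rotl(uint32_t x, unsigned n) { return (x << n) | (x >> (32 - n)); }

/* Nonlinear transformation tau: four S-box lookups, one per byte. */
static uint32_t tau(uint32_t a) {
    return ((uint32_t)SBOX[a >> 24] << 24) | ((uint32_t)SBOX[(a >> 16) & 0xFF] << 16) |
           ((uint32_t)SBOX[(a >> 8) & 0xFF] << 8) | (uint32_t)SBOX[a & 0xFF];
}
/* T = L(tau(.)) for data rounds, T' = L'(tau(.)) for the key schedule. */
static uint32_t t_data(uint32_t x) {
    uint32_t b = tau(x);
    return b ^ rotl(b, 2) ^ rotl(b, 10) ^ rotl(b, 18) ^ rotl(b, 24);
}
static uint32_t t_key(uint32_t x) {
    uint32_t b = tau(x);
    return b ^ rotl(b, 13) ^ rotl(b, 23);
}
static uint32_t load_be(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) | ((uint32_t)p[2] << 8) | p[3];
}
static void store_be(uint32_t v, uint8_t *p) {
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}
/* CK_i built from ck_{i,j} = (4i + j) * 7 mod 256. */
static uint32_t ck(int i) {
    uint32_t w = 0;
    for (int j = 0; j < 4; j++) w = (w << 8) | ((uint32_t)((4 * i + j) * 7) & 0xFFu);
    return w;
}

/* Expand a 16-byte key into rk[0..31]. */
void sm4_key_schedule(const uint8_t key[16], uint32_t rk[32]) {
    uint32_t k[36];
    for (int i = 0; i < 4; i++) k[i] = load_be(key + 4 * i) ^ FK_WORDS[i];
    for (int i = 0; i < 32; i++) {
        k[i + 4] = k[i] ^ t_key(k[i + 1] ^ k[i + 2] ^ k[i + 3] ^ ck(i));
        rk[i] = k[i + 4];
    }
}

/* Process one 16-byte block; decrypt != 0 reads the round keys as rk[31..0]. */
void sm4_crypt_block(const uint32_t rk[32], int decrypt, const uint8_t in[16], uint8_t out[16]) {
    uint32_t x[36];
    assert(rk && in && out);
    for (int i = 0; i < 4; i++) x[i] = load_be(in + 4 * i);
    for (int i = 0; i < 32; i++) {
        uint32_t r = rk[decrypt ? 31 - i : i];
        x[i + 4] = x[i] ^ t_data(x[i + 1] ^ x[i + 2] ^ x[i + 3] ^ r);
    }
    for (int i = 0; i < 4; i++) store_be(x[35 - i], out + 4 * i); /* reverse transformation R */
}
```

With the key and plaintext both set to 01 23 45 67 89 ab cd ef fe dc ba 98 76 54 32 10 (the values used in main), the result should match the standard's example ciphertext 68 1e df 34 d2 06 96 5e 86 b3 e9 4f 53 6e 42 46, and decrypting that block returns the plaintext.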
The main function calls the encryption routine; it mainly measures encryption time and data throughput.
int main() {
    ul length;
    uc *enc_result; // Encryption output buffer
    clock_t beginTime,endTime;
    ul loops = 1000;
    uc *plaintText;
    // Plaintext block
    uc plaintText1[16] = { 0x01,0x23,0x45,0x67,0x89,0xab,0xcd,0xef,0xfe,0xdc,0xba,0x98,0x76,0x54,0x32,0x10 };
    // The test data can be generated dynamically
    plaintText=(uc*)malloc(dataSize);
    enc_result = (uc*)malloc(dataSize);
    if (plaintText == NULL || enc_result == NULL) return 1;
    for(int i=0;i< dataSize/16;i++)
    {
        memcpy( plaintText+i*16, plaintText1,16);
    }
    // Encryption key
    uc key[16] = { 0x01,0x23,0x45,0x67,0x89,0xab,0xcd,0xef,0xfe,0xdc,0xba,0x98,0x76,0x54,0x32,0x10 };
    length = 16*(dataSize/16)+16*((dataSize%16)?1:0);
    // Encrypt (nothing is printed inside the timed loop, so console I/O is not measured)
    beginTime = clock();
    for(ul i=1; i<=loops; i++) {
        enc_code(dataSize,key,plaintText,enc_result);
    }
    endTime = clock();
    double seconds = (double)(endTime-beginTime)/CLOCKS_PER_SEC;
    // printf("Encrypted data:\n");
    // for(ul i=0;i<length;i++){
    // printf("%x ",*(enc_result+i));
    // }
    printf("\n");
    // Use a 64-bit total: 2^20 bytes * 8 * 1000 loops does not fit in 32 bits
    unsigned long long totalSize = (unsigned long long)length*8*loops;
    printf("Data size: %llu bits\n",totalSize);
    printf("Elapsed time = %.3f s\n", seconds);
    if (seconds > 0)
        printf("Throughput: %.0f bps\n",(double)totalSize/seconds);
    free(plaintText);
    free(enc_result);
    system("pause"); // Windows only
    return 0;
}
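Encrypting 1 MB of data 100 times gives a reasonable result, but at 1000 repetitions the result is less satisfactory, which is something to improve later. One thing worth checking is overflow: 2^20 bytes × 8 × 1000 is about 8.4 × 10^9 bits, which exceeds the range of a 32-bit unsigned long.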
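The throughput figure is also sensitive to how the elapsed time is converted. Dividing clock ticks by CLOCKS_PER_SEC with integer arithmetic truncates to whole seconds, and a run shorter than one second becomes zero. Below is a minimal sketch of the calculation done in floating point; the function name `throughput_bps` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits per second from a byte count per loop, a loop count and an elapsed
   time in clock ticks. Returns 0.0 when no time has elapsed. */
double throughput_bps(uint64_t bytes_per_loop, uint64_t loops,
                      uint64_t ticks, uint64_t ticks_per_sec) {
    assert(ticks_per_sec > 0);
    if (ticks == 0) return 0.0;
    double bits = (double)bytes_per_loop * 8.0 * (double)loops;
    double seconds = (double)ticks / (double)ticks_per_sec;
    return bits / seconds;
}
```

In a real measurement the ticks would come from the difference of two `clock()` calls and `ticks_per_sec` from `CLOCKS_PER_SEC`. For 1 MiB encrypted 1000 times in 2 seconds this gives 4194304000 bits per second.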