转自:http://baike.baidu.com/view/7636.htm
Message Digest Algorithm MD5(中文名为消息摘要算法第五版)为计算机安全领域广泛使用的一种散列函数,用以提供消息的完整性保护。该算法的文件号为RFC 1321(R.Rivest,MIT Laboratory for Computer Science and RSA Data Security Inc. April 1992)。它的作用是让大容量信息在用数字签名软件签署私人密钥前被"压缩"成一种保密的格式(就是把一个任意长度的字节串变换成一定长的大整数)。
MD5最广泛被用于各种软件的密码认证和钥匙识别上。通俗的讲就是人们讲的序列号。
算法的应用
MD5的典型应用是对一段信息(Message)产生信息摘要(Message-Digest),以防止被篡改。比如,在UNIX下有很多软件在下载的时候都有一个文件名相同,文件扩展名为.md5的文件,在这个文件中通常只有一行文本,大致结构如:
MD5 (tanajiya.tar.gz) = 0ca175b9c0f726a831d895e269332461
算法描述
对MD5算法简要的叙述可以为:MD5以512位分组来处理输入的信息,且每一分组又被划分为16个32位子分组,经过了一系列的处理后,算法的输出由四个32位分组组成,将这四个32位分组级联后将生成一个128位散列值。
在MD5算法中,首先需要对信息进行填充,使其位长对512求余的结果等于448。因此,信息的位长(Bits Length)将被扩展至N*512+448,即N*64+56个字节(Bytes),N为一个非负整数,N可以是零。填充的方法如下,在信息的后面填充一个1和无数个0,直到满足上面的条件时才停止用0对信息的填充。然后,在这个结果后面附加一个以64位二进制表示的填充前信息长度。经过这两步的处理,现在的信息的位长=N*512+448+64=(N+1)*512,即长度恰好是512的整数倍。这样做的原因是为满足后面处理中对信息长度的要求。
MD5中有四个32位被称作链接变量(Chaining Variable)的整数参数,他们分别为:A=0x67452301,B=0xefcdab89,C=0x98badcfe,D=0x10325476。
当设置好这四个链接变量后,就开始进入算法的四轮循环运算。循环的次数是信息中512位信息分组的数目。
将上面四个链接变量复制到另外四个变量中:A到a,B到b,C到c,D到d。
主循环有四轮(MD4只有三轮),每轮循环都很相似。第一轮进行16次操作。每次操作对a、b、c和d中的其中三个作一次非线性函数运算,然后将所得结果加上第四个变量,文本的一个子分组和一个常数。再将所得结果向左环移一个不定的数,并加上a、b、c或d中之一。最后用该结果取代a、b、c或d中之一。
MD5算法的伪代码
//Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating
var int[64] r, k //r specifies the per-round shift amounts
r[ 0..15]:= {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22}
r[16..31]:= {5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20}
r[32..47]:= {4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23}
r[48..63]:= {6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21}
//Use binary integer part of the sines of integers as constants:
for i from 0 to 63
k[i] := floor(abs(sin(i + 1)) × 2^32)
//Initialize variables:
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476
//Pre-processing:
append "1" bit to message
append "0" bits until message length in bits ≡ 448 (mod 512)
append bit length of message as 64-bit little-endian integer to message
//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
break chunk into sixteen 32-bit little-endian words w[i], 0 ≤ i ≤ 15
//Initialize hash value for this chunk:
var int a := h0
var int b := h1
var int c := h2
var int d := h3
//Main loop:
for i from 0 to 63
if 0 ≤ i ≤ 15 then
f := (b and c) or ((not b) and d)
g := i
else if 16 ≤ i ≤ 31
f := (d and b) or ((not d) and c)
g := (5×i + 1) mod 16
else if 32 ≤ i ≤ 47
f := b xor c xor d
g := (3×i + 5) mod 16
else if 48 ≤ i ≤ 63
f := c xor (b or (not d))
g := (7×i) mod 16
temp := d
d := c
c := b
b := ((a + f + k[i] + w[g]) leftrotate r[i]) + b
a := temp
//Add this chunk's hash to result so far:
h0 := h0 + a
h1 := h1 + b
h2 := h2 + c
h3 := h3 + d
var int digest := h0 append h1 append h2 append h3
//(expressed as little-endian)
标准C语言实现
具体的一个MD5实现
/*
* md5 -- compute and check MD5 message digest.
* this version only can calculate the char string.
*
* MD5 (Message-Digest algorithm 5) is a widely used, partially
* insecure cryptographic hash function with a 128-bit hash value.
*
* Author: redraiment
* Date: Aug 27, 2008
* Version: 0.1.6
*/
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <math.h>
#define SINGLE_ONE_BIT 0x80
#define BLOCK_SIZE 512
#define MOD_SIZE 448
#define APP_SIZE 64
#define BITS 8
// MD5 Chaining Variable
#define A 0x67452301UL
#define B 0xEFCDAB89UL
#define C 0x98BADCFEUL
#define D 0x10325476UL
// Creating own types
#ifdef UINT64
# undef UINT64
#endif
#ifdef UINT32
# undef UINT32
#endif
typedef unsigned long long UINT64;
typedef unsigned long UINT32;
typedef unsigned char UINT8;
typedef struct
{
char * message;
UINT64 length;
}STRING;
const UINT32 X[4][2] = {{0, 1}, {1, 5}, {5, 3}, {0, 7}};
// Constants for MD5 transform routine.
const UINT32 S[4][4] = {
{ 7, 12, 17, 22 },
{ 5, 9, 14, 20 },
{ 4, 11, 16, 23 },
{ 6, 10, 15, 21 }
};
// F, G, H and I are basic MD5 functions.
UINT32 F( UINT32 X, UINT32 Y, UINT32 Z )
{
return ( X & Y ) | ( ~X & Z );
}
UINT32 G( UINT32 X, UINT32 Y, UINT32 Z )
{
return ( X & Z ) | ( Y & ~Z );
}
UINT32 H( UINT32 X, UINT32 Y, UINT32 Z )
{
return X ^ Y ^ Z;
}
UINT32 I( UINT32 X, UINT32 Y, UINT32 Z )
{
return Y ^ ( X | ~Z );
}
// rotates x left s bits.
UINT32 rotate_left( UINT32 x, UINT32 s )
{
return ( x << s ) | ( x >> ( 32 - s ) );
}
// Pre-processin
UINT32 count_padding_bits ( UINT32 length )
{
UINT32 div = length * BITS / BLOCK_SIZE;
UINT32 mod = length * BITS % BLOCK_SIZE;
UINT32 c_bits;
if ( mod == 0 )
c_bits = MOD_SIZE;
else
c_bits = ( MOD_SIZE + BLOCK_SIZE - mod ) % BLOCK_SIZE;
return c_bits / BITS;
}
STRING append_padding_bits ( char * argv )
{
UINT32 msg_length = strlen ( argv );
UINT32 bit_length = count_padding_bits ( msg_length );
UINT64 app_length = msg_length * BITS;
STRING string;
string.message = (char *)malloc(msg_length + bit_length + APP_SIZE / BITS);
// Save message
strncpy ( string.message, argv, msg_length );
// Pad out to mod 64.
memset ( string.message + msg_length, 0, bit_length );
string.message [ msg_length ] = SINGLE_ONE_BIT;
// Append length (before padding).
memmove ( string.message + msg_length + bit_length, (char *)&app_length, sizeof( UINT64 ) );
string.length = msg_length + bit_length + sizeof( UINT64 );
return string;
}
int main ( int argc, char *argv[] )
{
STRING string;
UINT32 w[16];
UINT32 chain[4];
UINT32 state[4];
UINT8 r[16];
UINT32 ( *auxi[ 4 ])( UINT32, UINT32, UINT32 ) = { F, G, H, I };
int roundIdx;
int argIdx;
int sIdx;
int wIdx;
int i;
int j;
if ( argc < 2 )
{
fprintf ( stderr, "usage: %s string .../n", argv[ 0 ] );
return EXIT_FAILURE;
}
for ( argIdx = 1; argIdx < argc; argIdx++ )
{
string = append_padding_bits ( argv[ argIdx ] );
// MD5 initialization.
chain[0] = A;
chain[1] = B;
chain[2] = C;
chain[3] = D;
for ( j = 0; j < string.length; j += BLOCK_SIZE / BITS)
{
memmove ( (char *)w, string.message + j, BLOCK_SIZE / BITS );
memmove ( state, chain, sizeof(chain) );
for ( roundIdx = 0; roundIdx < 4; roundIdx++ )
{
wIdx = X[ roundIdx ][ 0 ];
sIdx = 0;
for ( i = 0; i < 16; i++ )
{
// FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
// Rotation is separate from addition to prevent recomputation.
state[sIdx] = state [ (sIdx + 1) % 4 ] +
rotate_left ( state[sIdx] +
( *auxi[ roundIdx ] )
( state[(sIdx+1) % 4], state[(sIdx+2) % 4], state[(sIdx+3) % 4]) +
w[ wIdx ] +
(UINT32)floor( (1ULL << 32) * fabs(sin( roundIdx * 16 + i + 1 )) ),
S[ roundIdx ][ i % 4 ]);
sIdx = ( sIdx + 3 ) % 4;
wIdx = ( wIdx + X[ roundIdx ][ 1 ] ) & 0xF;
}
}
chain[ 0 ] += state[ 0 ];
chain[ 1 ] += state[ 1 ];
chain[ 2 ] += state[ 2 ];
chain[ 3 ] += state[ 3 ];
}
memmove ( r + 0, (char *)&chain[0], sizeof(UINT32) );
memmove ( r + 4, (char *)&chain[1], sizeof(UINT32) );
memmove ( r + 8, (char *)&chain[2], sizeof(UINT32) );
memmove ( r + 12, (char *)&chain[3], sizeof(UINT32) );
for ( i = 0; i < 16; i++ )
printf ( "%02x", r[i] );
putchar ( '/n' );
}
return EXIT_SUCCESS;
}
/* 以上程序可以在任意一款支持ANSI C的编译器上编译通过 */
/* 直接复制粘贴,请删除多余的空格,并调整格式,否则可能有编译错误 */
/* 在linux下编译,要添加链接库,命令如:gcc -o md5 md5.c -lm */
本文介绍了MD5算法的基本原理及其在信息安全领域的应用。MD5是一种常用的散列函数,用于确保数据完整性,通过生成固定长度的散列值来验证信息是否被篡改。此外,还提供了MD5算法的具体实现代码。
792

被折叠的 条评论
为什么被折叠?



