A few months ago, I introduced a simple algorithm that allows users to implement their own short URLs in their systems. Today I have some spare time, so I decided to write an implementation of the short URL algorithm in PHP.
First, we define a function called shorturl() that receives a URL as input and returns an array containing 4 hashed values (6 characters each).
function shorturl($input) {
    ...
    // return an array of results
}
Below is the original pseudocode:
...
loop2: from 1st 4 bytes to 4th 4 bytes of md5 result
    cast the 4 bytes to an integer
    loop3: for shortCodeChar[0] to shortCodeChar[5]
        use 1st 5 bits of the integer to find the value in codeMap
        remove 5 bits from the integer
    end loop3
    save shortCodeChar as shortCode
    ...
    // Database checking for duplication
end loop2
...
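The inner loop of the pseudocode can be tried on its own. The sketch below is only an illustration: the helper name encode30() is hypothetical, the code map is the a-z, 0-5 alphabet used later in this post, and it reads the lowest 5 bits first, which is how the implementation in this post interprets "1st 5 bits".

```php
<?php
// Hypothetical helper for illustration: encode the low 30 bits of an
// integer as 6 characters, 5 bits per character, lowest bits first.
function encode30(int $int): string {
    $codeMap = 'abcdefghijklmnopqrstuvwxyz012345'; // 32 symbols = 5 bits each
    $int &= 0x3FFFFFFF;                            // keep 30 bits (6 x 5)
    $out = '';
    for ($j = 0; $j < 6; $j++) {
        $out .= $codeMap[$int & 0x1F]; // look up the lowest 5 bits
        $int >>= 5;                    // drop the bits just used
    }
    return $out;
}

echo encode30(0), "\n";          // aaaaaa
echo encode30(1), "\n";          // baaaaa
echo encode30(0x3FFFFFFF), "\n"; // 555555
```

Six characters of 5 bits each cover exactly the 30 bits kept by the mask, which is why the top 2 bits of every 4-byte group are discarded.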
The following code is written according to the algorithm above, excluding the database checking part for duplication:
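function shorturl($input) {
    // Code map: 32 URL-safe characters, so each one encodes 5 bits
    $base32 = array (
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
        'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p',
        'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
        'y', 'z', '0', '1', '2', '3', '4', '5'
    );
    $hex = md5($input);            // 32 hex characters
    $hexLen = strlen($hex);
    $subHexLen = $hexLen / 8;      // 4 groups of 8 hex digits (4 bytes each)
    $output = array();
    for ($i = 0; $i < $subHexLen; $i++) {
        $subHex = substr($hex, $i * 8, 8);
        // Parse the 8 hex digits as an integer and keep the low 30 bits (6 x 5 bits).
        // hexdec() is used because PHP 7+ does not treat '0x...' strings as numbers.
        $int = 0x3FFFFFFF & hexdec($subHex);
        $out = '';
        for ($j = 0; $j < 6; $j++) {
            $val = 0x0000001F & $int;   // lowest 5 bits select a character
            $out .= $base32[$val];
            $int = $int >> 5;           // drop the bits just used
        }
        $output[] = $out;
    }
    return $output;
}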
Sample code to test/use the above function:
$input = 'http://www.snippetit.com/1';
$output = shorturl($input);
echo "Input : $input\n";
echo "Output : {$output[0]}\n";
echo " {$output[1]}\n";
echo " {$output[2]}\n";
echo " {$output[3]}\n";
echo "\n";
$input = 'http://www.snippetit.com/2';
$output = shorturl($input);
echo "Input : $input\n";
echo "Output : {$output[0]}\n";
echo " {$output[1]}\n";
echo " {$output[2]}\n";
echo " {$output[3]}\n";
echo "\n";
Output:
Input : http://www.snippetit.com/1
Output : h0xg4r
 bdr3tw
 osk2d3
 4azfqa

Input : http://www.snippetit.com/2
Output : tm5kxb
 ceoj2s
 yw3dvl
 nrmrxl
The function returns an array of 4 elements; you can use any one of them. The others can serve as alternative codes for the same input when you find a duplicate code in your database (same code but different input; this is unlikely, but it will happen eventually). The chance that a new code collides with an existing one is about n/(32^6), or n/1,073,741,824, where n is the number of records in your database.
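The fallback to an alternative code could look like the following sketch. The helper name pickCode() is hypothetical, and a plain PHP array ($taken, mapping code to URL) stands in for the database lookup; neither is part of the original algorithm, and a real system would query its own table instead.

```php
<?php
// Hypothetical helper: choose the first candidate code that is either
// unused or already assigned to the same URL. $taken stands in for a
// database table that maps short code => original URL.
function pickCode(array $candidates, string $url, array $taken): ?string {
    foreach ($candidates as $code) {
        if (!isset($taken[$code]) || $taken[$code] === $url) {
            return $code;
        }
    }
    return null; // every candidate collides with a different URL
}

// Example data: the first candidate is already used by a different URL.
$candidates = array('h0xg4r', 'bdr3tw', 'osk2d3', '4azfqa');
$taken = array('h0xg4r' => 'http://example.com/other');

echo pickCode($candidates, 'http://www.snippetit.com/1', $taken), "\n"; // bdr3tw
```

If all four candidates are taken by other URLs, the sketch returns null and the caller has to decide what to do, for example by hashing the input with an extra suffix.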
As you can see, the outputs look quite random even though the input strings differ by only one character. The output is also consistent: for the same input you will always get the same output.
To make the output harder for others to predict, you can scramble the values in the $base32 array, mix your own private key into the input, and/or XOR the value of $val with a value in the range 0 to 31.
For example, to scramble the values in the $base32 array, you can change the positions of the values and/or replace a value with another (make sure the replacement is a URL-safe character and that all 32 values stay distinct).
For example, to add a private key, you can add an additional string when calling the md5() function, e.g.:
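When editing the code map by hand it is easy to drop or repeat a character. The following sketch shows one way to sanity-check a scrambled map; the function name isValidCodeMap() and the example alphabet are made up for illustration, and "URL safe" is narrowed here to ASCII letters and digits as a simplifying assumption.

```php
<?php
// Hypothetical check: a usable code map needs exactly 32 distinct
// single characters, here restricted to ASCII letters and digits.
function isValidCodeMap(array $map): bool {
    if (count($map) !== 32 || count(array_unique($map)) !== 32) {
        return false;
    }
    foreach ($map as $ch) {
        if (!is_string($ch) || preg_match('/\A[A-Za-z0-9]\z/', $ch) !== 1) {
            return false;
        }
    }
    return true;
}

// An example scrambled map: different order and partly different symbols.
$scrambled = str_split('q7wz3xk9mhb2tfc8ynd4rgv6pjs5ABCD');
var_dump(isValidCodeMap($scrambled)); // bool(true)
```

A map with fewer than 32 distinct characters would either cause undefined-index errors or make different 5-bit values share a character, which raises the collision rate.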
$hex = md5('my-secret-key'.$input.'my-another-secret-key');
For example, to XOR the value of $val with the value 18:
$out .= $base32[$val ^ 18];
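The private key and the XOR step can also be combined in one function. This is a rough sketch rather than a definitive version: the name shorturlKeyed() and its parameters are hypothetical, and it assumes a 64-bit PHP build so that hexdec() returns an integer for 8 hex digits.

```php
<?php
// Sketch combining a private key and an XOR mask; the function name and
// parameters are hypothetical, not part of the original shorturl().
// Assumes 64-bit PHP so hexdec() of 8 hex digits yields an int.
function shorturlKeyed(string $input, string $key, int $xorMask = 18): array {
    $base32 = str_split('abcdefghijklmnopqrstuvwxyz012345');
    $hex = md5($key . $input);
    $output = array();
    for ($i = 0; $i < 4; $i++) {
        // 8 hex digits = 4 bytes; keep the low 30 bits.
        $int = hexdec(substr($hex, $i * 8, 8)) & 0x3FFFFFFF;
        $out = '';
        for ($j = 0; $j < 6; $j++) {
            // XOR with a value in 0..31 keeps the index inside the map.
            $out .= $base32[($int & 0x1F) ^ ($xorMask & 0x1F)];
            $int >>= 5;
        }
        $output[] = $out;
    }
    return $output;
}

print_r(shorturlKeyed('http://www.snippetit.com/1', 'my-secret-key'));
```

Masking $xorMask with 0x1F guards against values outside 0 to 31, which would otherwise produce an index that does not exist in the map.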