1

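Does anyone know of an algorithm that generates a unique 8- or 9-digit number for a given string? A PHP example would be ideal; failing that, at least the algorithm.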


4 Answers

3

You can use crc32() and return the first $len digits of the result.

<?php 
function crc_string($str, $len){
    return substr(sprintf("%u", crc32($str)),0,$len);
}
echo crc_string('some_string', 8);//65585849
?>
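One caveat with taking the leading digits: a CRC32 value is at most 4294967295, so 10-digit results always start with 1 to 4, and small CRC values yield fewer than $len digits. A possible variant, sketched here with a hypothetical helper named crc_digits() that is not part of the answer above, reduces the value modulo 10^$len and zero-pads it. It assumes 64-bit PHP, where crc32() never returns a negative number, and it does nothing to prevent collisions.

```php
<?php
// Hypothetical helper (not from the original answer): reduce the CRC32
// value modulo 10^$len and left-pad with zeros, so the result always has
// exactly $len digits. Assumes 64-bit PHP, where crc32() is non-negative.
function crc_digits($str, $len) {
    $modulus = (int) pow(10, $len);
    return sprintf('%0' . $len . 'd', crc32($str) % $modulus);
}

echo crc_digits('some_string', 8), "\n";
```

The fixed width makes the codes easier to store and compare, at the cost of allowing leading zeros.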

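Edit

After running a collision/reliability test on my answer: you are quite likely to hit collisions at length 8, somewhat fewer at length 9, and fewer still at length 10. In my test I hashed an incrementing value from 0 to 100k and got 26 collisions, the first at around iteration 36k.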

<?php 
set_time_limit(0);
header('Content-type: text/html; charset=utf-8');
$time_start = microtime(true);

function crc_string($str, $len){
    return substr(sprintf("%u", crc32($str)),0,$len);
}

echo 'Started, please wait...<br />';
$record = array();
$collisions = 0;
for($i=0; $i<100000;$i++){

    $new = crc_string($i, 8);
    // in_array() and array_search() scan $record linearly, which is why the run takes so long
    if(in_array($new,$record)){
        // $match is the index in $record of the earlier value with the same hash
        $match = array_search($new,$record);
        $took_time = microtime(true) - $time_start;
        echo $new.' has collided for iteration '.$i.' matching against a previous iteration ('.$match.') '.$record[$match].' (Process time: '.round($took_time,2).'seconds)<br />';
        $collisions++;
    }else{
        $record[]=$new;
    }

    ob_flush();
    flush();
}
echo 'Successfully iterated 100k incrementing values and '.$collisions.' collisions occurred; total processing time: '.round((microtime(true) - $time_start),2).'seconds.';
?>

Test results:

Started, please wait...
38862356 has collided for iteration 36084 matching against a previous iteration (8961) 38862356 (Process time: 165.47seconds)
18911887 has collided for iteration 36887 matching against a previous iteration (8162) 18911887 (Process time: 172.79seconds)
37462269 has collided for iteration 38245 matching against a previous iteration (33214) 37462269 (Process time: 185.81seconds)
20153794 has collided for iteration 38966 matching against a previous iteration (6083) 20153794 (Process time: 192.87seconds)
41429622 has collided for iteration 40329 matching against a previous iteration (24999) 41429622 (Process time: 206.41seconds)
20784356 has collided for iteration 48908 matching against a previous iteration (27095) 20784356 (Process time: 302.75seconds)
39932561 has collided for iteration 51926 matching against a previous iteration (12367) 39932561 (Process time: 340.88seconds)
14372225 has collided for iteration 53032 matching against a previous iteration (13211) 14372225 (Process time: 355.46seconds)
16636457 has collided for iteration 55490 matching against a previous iteration (39250) 16636457 (Process time: 389.44seconds)
23059743 has collided for iteration 63126 matching against a previous iteration (39808) 23059743 (Process time: 504.1seconds)
13627299 has collided for iteration 63877 matching against a previous iteration (21973) 13627299 (Process time: 516.08seconds)
24647738 has collided for iteration 63973 matching against a previous iteration (47328) 24647738 (Process time: 517.62seconds)
14471815 has collided for iteration 71118 matching against a previous iteration (37805) 14471815 (Process time: 641.93seconds)
13253269 has collided for iteration 73602 matching against a previous iteration (33064) 13253269 (Process time: 687.53seconds)
10732050 has collided for iteration 73706 matching against a previous iteration (9197) 10732050 (Process time: 689.44seconds)
18919349 has collided for iteration 80358 matching against a previous iteration (73190) 18919349 (Process time: 819.89seconds)
40795042 has collided for iteration 81875 matching against a previous iteration (31127) 40795042 (Process time: 851.3seconds)
14609922 has collided for iteration 82498 matching against a previous iteration (17366) 14609922 (Process time: 864.29seconds)
20425272 has collided for iteration 83914 matching against a previous iteration (9858) 20425272 (Process time: 894.32seconds)
24790147 has collided for iteration 84519 matching against a previous iteration (9754) 24790147 (Process time: 907.34seconds)
35605337 has collided for iteration 91434 matching against a previous iteration (36127) 35605337 (Process time: 1060.5seconds)
30935494 has collided for iteration 91857 matching against a previous iteration (91704) 30935494 (Process time: 1070.17seconds)
28520037 has collided for iteration 92929 matching against a previous iteration (28847) 28520037 (Process time: 1095.53seconds)
31109474 has collided for iteration 95584 matching against a previous iteration (30349) 31109474 (Process time: 1159.36seconds)
40842617 has collided for iteration 97330 matching against a previous iteration (13609) 40842617 (Process time: 1203.19seconds)
20309913 has collided for iteration 99224 matching against a previous iteration (94210) 20309913 (Process time: 1250.54seconds)
Successfully iterated 100k incrementing values and 26 collisions occurred; total processing time: 1269.98seconds.
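Most of the run time above comes from the linear in_array()/array_search() lookups rather than from hashing. A minimal sketch of the same counting logic, keyed by hash so that each lookup is a single isset(), might look like this. The helper name count_collisions() is hypothetical, and the sketch only reports the total rather than each collision.

```php
<?php
// Same truncated-CRC32 helper as in the answer.
function crc_string($str, $len) {
    return substr(sprintf('%u', crc32($str)), 0, $len);
}

// Hypothetical helper: counts inputs in 0..$n-1 whose hash was already seen.
// $seen is keyed by hash, so each lookup is a constant-time isset().
function count_collisions(callable $hash, $n) {
    $seen = array();
    $collisions = 0;
    for ($i = 0; $i < $n; $i++) {
        $code = $hash((string) $i);
        if (isset($seen[$code])) {
            $collisions++;
        } else {
            $seen[$code] = $i; // remember which input produced this code
        }
    }
    return $collisions;
}

$collisions = count_collisions(function ($s) { return crc_string($s, 8); }, 100000);
echo $collisions, " collisions\n";
```

As in the original loop, a colliding value is counted but not stored again, so the total should be comparable to the figure reported above.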

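The conclusion: unless you map each string one-to-one onto an auto-incrementing value, as below, you will always hit collisions at a fixed length while populating your users table, and more so as it fills: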

echo sprintf("%08d",'1');//00000001
echo sprintf("%08d",'2');//00000002
...                      //99999999

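You can work around this by appending another byte to a colliding value, or by allowing the a-z range (as the md5()/sha1() hash functions do), but that defeats the object ;p

Good luck.

answered 2012-08-21T11:55:53.383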
1

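Yes, there will be collisions, but since you haven't said why you need this, I'll assume collisions don't matter.

You can take the md5 hash of the string (in hexadecimal), convert it to our decimal number system, and truncate it to the desired number of digits.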

This may help you: php: number only hash?
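The md5 approach described above could be sketched roughly as follows. The helper name md5_digits() is hypothetical, and the sketch assumes 64-bit PHP so that the first 15 hex digits (60 bits) of the hash fit in a native integer; it reduces the value modulo 10^$len instead of cutting the decimal string, which is one of several ways to truncate.

```php
<?php
// Hypothetical helper: map a string to a fixed-width decimal code via MD5.
// Assumes 64-bit PHP: 15 hex digits (60 bits) fit in a native integer.
function md5_digits($str, $len) {
    $value = hexdec(substr(md5($str), 0, 15));
    return sprintf('%0' . $len . 'd', $value % (int) pow(10, $len));
}

echo md5_digits('some_string', 9), "\n";
```

The result is deterministic for a given string but still not unique: two different strings can map to the same code.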

answered 2012-08-21T11:55:25.630
0

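There are only 10^9 distinct 9-digit numbers, whereas there are 256^length strings of each length (assuming single-byte ASCII strings).

So, by the pigeonhole principle, you cannot get a unique number for strings of length 4 or more, because 256^4 = 4,294,967,296 is already greater than 10^9. (A collision must occur.)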

As an alternative, you may be looking for a conventional hash function (which will have collisions), or you could use unbounded numbers.
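How often a conventional hash collides can be estimated with the birthday approximation. The sketch below assumes an idealized hash spread uniformly over the output space, which a truncated CRC32 or MD5 only approximates, so treat the figures as rough guidance; the function names are illustrative.

```php
<?php
// Expected number of colliding pairs among $n inputs for an idealized hash
// that is uniform over $space outputs: n(n-1) / (2 * space).
function expected_collisions($n, $space) {
    return $n * ($n - 1) / (2 * $space);
}

// Approximate probability of at least one collision:
// 1 - exp(-n(n-1) / (2 * space)).
function collision_probability($n, $space) {
    return 1 - exp(-expected_collisions($n, $space));
}

printf("%.1f expected colliding pairs\n", expected_collisions(100000, pow(10, 8)));
printf("%.3f chance of a collision among 10000 inputs\n", collision_probability(10000, pow(10, 8)));
```

For 100,000 inputs and 10^8 possible 8-digit codes this gives about 50 expected colliding pairs, so at that scale collisions are the norm rather than the exception.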

answered 2012-08-21T11:52:04.963
0

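As already pointed out, "uniqueness" is impossible if the number has fewer digits than the strings you want to associate it with.

What you are looking for is a good hash function.

Take a look at the MD6 algorithm. It has a customizable digest length of up to 512 bits, so you can create a digest that fits in 8 to 9 decimal digits. I am not aware of any PHP implementation; the original implementation language is C.

answered 2012-08-21T12:00:42.960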