tb_content
(左)和tb_word
(右):
===================================== ================================
|id|sentence |sentence_id|content_id| |id|word|sentence_id|content_id|
===================================== ================================
| 1|sentence1| 0 | 1 | | 1| a | 0 | 1 |
| 2|sentence2| 1 | 1 | | 2| b | 0 | 1 |
| 3|sentence5| 0 | 2 | | 3| c | 1 | 1 |
| 4|sentence6| 1 | 2 | | 4| a | 1 | 1 |
| 5|sentence7| 2 | 2 | | 5| e | 1 | 1 |
===================================== | 6| f | 0 | 2 |
| 7| g | 1 | 2 |
| 8| h | 1 | 2 |
| 9| i | 1 | 2 |
|10| f | 2 | 2 |
|11| h | 2 | 2 |
|12| f | 2 | 2 |
================================
我需要检查每个句子是否包含每个句子中其他句子所拥有的单词content_id
。
例如 :
检查content_id
=1
他们是sentence1
和sentence2
。从tb_word
,我们可以看出sentence1
和sentence2
由同一个词组成a
。如果a
两个句子中的个数是>=2
,那么a
就是结果。因此,如果我打印结果,它必须是:
00Array ( [0] => a [1] => b) 01Array ( [3] => a ) 10Array ( [3] => a )11Array ( [0] => c [1] => a [2] => e)
where00
表示sentence_id
=0
和sentence_id
=0
首先,我计算每个拥有 functionTotal
多少:sentence
content_id
$total = array();
$sql = mysql_query('select content_id, count(*) as RowAmount
from tb_content Group By contente_id') or die(mysql_error());
while ($row = mysql_fetch_array($sql)) {
$total[] = $row['RowAmount'];
}
return $total;
从那个函数我得到 和 的值$total
,我需要检查一些单词(from tb_word
)在 2 的所有可能性之间的相似性sentence
foreach ($total as $content_id => $totals){
for ($x=0; $x <= ($totals-1); $x++) {
for ($y=0; $y <= ($totals-1); $y++) {
$shared = getShared($x, $y);
}
}
的功能getShared
是:
function getShared ($x, $y){
$token = array();
$shared = array();
$i = 0;
if ($x == $y) {
$query = mysql_query("SELECT word FROM `tb_word`
WHERE sentence_id ='$x' ");
while ($row = mysql_fetch_array($query)) {
$shared[$i] = $row['word'];
$i++;
}
} else {
$query = mysql_query("SELECT word, count(word) as jml
FROM `tb_word` WHERE sentence_id ='$x'
OR sentence_id ='$y'
GROUP BY word ");
while ($row = mysql_fetch_array($query)) {
$jml = $row['jml'];
$token[$i] = $row['word'];
if ($jml >= 2) {
$shared[$i] = $token[$i];
}
$i++;
}
但是我得到的结果仍然是错误的。结果还是混在了不同的地方content_id
。结果也必须分组content_id
。对不起我糟糕的英语和糟糕的解释。cmiiw,请帮助我..谢谢:)