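I'm trying to search an entire table and return the most frequently occurring phrases (up to three words) found in long strings. I believe I could use full-text search, but I'm not getting any matches...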

Table

I like Iron Man 3 so much
Iron Man 3 sucked alot
Iron Man really saved the day
I like cats
cats are cool

Result

Iron Man
Iron Man 3
cats

Query

SELECT *
    FROM table
    WHERE substring(text, up to 3 words) OCCURS MOST
    ORDER BY OCCURRENCE DESC

1 Answer

If you're really interested in this, I'd say you should parse the text outside of SQL into a separate table (called phrases here), for example:

create table phrases (word1 varchar(255), word2 varchar(255), word3 varchar(255), cnt int);

and the code:

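// Assumes $pdo is an open PDO connection to the database.
$update = $pdo->prepare(
    'UPDATE phrases SET cnt = cnt + 1 WHERE word1 = ? AND word2 = ? AND word3 = ?'
);
$insert = $pdo->prepare(
    'INSERT INTO phrases (word1, word2, word3, cnt) VALUES (?, ?, ?, 1)'
);

foreach ($pdo->query('SELECT comment FROM comments') as $row) {
    // Split on runs of whitespace and drop empty tokens.
    $words = preg_split('/\s+/', $row['comment'], -1, PREG_SPLIT_NO_EMPTY);
    $previous1 = false;
    $previous2 = false;

    foreach ($words as $word) {
        if ($previous1 !== false && $previous2 !== false) {
            // Prepared statements take care of quoting and SQL injection;
            // add any minimum-length filtering here.
            $update->execute([$previous1, $previous2, $word]);
            if ($update->rowCount() === 0) {
                // No existing row for this phrase yet, so create it.
                $insert->execute([$previous1, $previous2, $word]);
            }
        }
        // Slide the three-word window forward by one word.
        $previous1 = $previous2;
        $previous2 = $word;
    }
}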
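Note that the loop above only records three-word phrases, while the question asks for phrases of up to three words. As a rough sketch of how the same sliding-window idea could cover one-, two- and three-word phrases, the counting can be tried out in plain PHP arrays first. The function name `countPhrases` and the in-memory array are assumptions made for illustration, not part of the original answer, and the sample rows are the ones from the question.

```php
<?php
/**
 * Count every phrase of 1 to $maxWords consecutive words across $texts.
 * Hypothetical helper: returns an array mapping phrase => occurrences.
 */
function countPhrases(array $texts, int $maxWords = 3): array
{
    $counts = [];
    foreach ($texts as $text) {
        // Split on runs of whitespace and drop empty tokens.
        $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
        $n = count($words);
        for ($start = 0; $start < $n; $start++) {
            // Take windows of length 1..$maxWords beginning at $start.
            for ($len = 1; $len <= $maxWords && $start + $len <= $n; $len++) {
                $phrase = implode(' ', array_slice($words, $start, $len));
                $counts[$phrase] = ($counts[$phrase] ?? 0) + 1;
            }
        }
    }
    return $counts;
}

// Sample rows taken from the question.
$rows = [
    'I like Iron Man 3 so much',
    'Iron Man 3 sucked alot',
    'Iron Man really saved the day',
    'I like cats',
    'cats are cool',
];
$counts = countPhrases($rows);
```

Matching here is exact and case-sensitive, so in practice you would probably also want to lowercase the text and filter out very common single words before counting.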

Then order by cnt desc.
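The final step could look roughly like the sketch below. It assumes a phrases table with the columns word1, word2, word3 and cnt; an in-memory SQLite database is used only so that the example runs on its own, and the function name `topPhrases` is hypothetical. The three inserted rows are the counts that three-word windows over the question's sample data would produce for those phrases.

```php
<?php
// Assumption: in-memory SQLite stands in for the real database server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE phrases (word1 TEXT, word2 TEXT, word3 TEXT, cnt INT)');

// A few illustrative rows as the counting loop would leave them.
$insert = $pdo->prepare(
    'INSERT INTO phrases (word1, word2, word3, cnt) VALUES (?, ?, ?, ?)'
);
foreach ([
    ['Iron', 'Man', '3', 2],
    ['I', 'like', 'cats', 1],
    ['cats', 'are', 'cool', 1],
] as $row) {
    $insert->execute($row);
}

/**
 * Return the $limit most frequent phrases, highest count first.
 * Ties are broken alphabetically so the order is deterministic.
 */
function topPhrases(PDO $pdo, int $limit): array
{
    $stmt = $pdo->prepare(
        'SELECT word1, word2, word3, cnt FROM phrases
         ORDER BY cnt DESC, word1, word2, word3
         LIMIT :n'
    );
    $stmt->bindValue(':n', $limit, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$top = topPhrases($pdo, 2);
```

Adding a LIMIT keeps the result small when the table holds many distinct phrases; drop it if you want the full ranking.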

answered 2013-05-07T20:01:02.070