
I have a column that stores each user's bio/title. It is free-form text written by the user and can contain any number of words.

id title
1  Business Development Executive Cold Calling & Cold Emailing expert Entrepreneur
2  Director of Online Marketing and entrepreneur
3  Art Director and Entrepreneur 
4  Corporate Development at Yahoo!
5  Snr Program Manager, Yahoo 

I am trying to work out a MySQL query that shows word frequencies:

Entrepreneur 3
development  2
director     2 

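I know that if I could return each word in the value as a separate row, I could then use an ordinary GROUP BY. I have looked, but I could not find a function that splits text into words with one word per row.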

Can this be done?


2 Answers


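You can do this by joining to a manufactured series of numbers that is used to pick out the nth word. Unfortunately, MySQL has no built-in way to generate a series, so it's a bit ugly, but here it is: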

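select
  -- take the first num words, then keep only the last of them: the num-th word
  substring_index(substring_index(title, ' ', num), ' ', -1) word,
  count(*) count
from job j
-- hard-coded number series 1..12, one row per word position
join (select 1 num union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9 union select 10 union select 11 union select 12) n
-- (number of spaces + 1) is the word count, so keep only positions that exist in this title
on length(title) >= length(replace(title, ' ', '')) + num - 1
group by 1
order by 2 desc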

See a live demo on SQLFiddle that uses your data and produces the expected output.

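Sadly, having to hard-code every value of the number series also caps the number of words per title that will be processed (12 in this case). Having more numbers in the series than needed does no harm, so you can always add more to cover longer expected input.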
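
If the server is MySQL 8.0 or later, the number series can probably be generated with a recursive common table expression instead of being typed out by hand. The following is only a sketch under that assumption: it reuses the job table and title column from the query above, while the upper bound of 50 words and the alias word_count are illustrative choices, not part of the original answer.

with recursive n (num) as (
  -- assumption: MySQL 8.0+; builds the series 1..50 (50 is an arbitrary cap)
  select 1
  union all
  select num + 1 from n where num < 50
)
select
  -- same nth-word extraction as in the hard-coded version
  substring_index(substring_index(j.title, ' ', n.num), ' ', -1) as word,
  count(*) as word_count
from job j
join n
  -- (number of spaces + 1) is the word count of the title
  on length(j.title) - length(replace(j.title, ' ', '')) >= n.num - 1
group by word
order by word_count desc;

As with the hard-coded series, words are split on single spaces only, so a title with a trailing or doubled space yields an empty string as a "word", and punctuation stays attached (for example, "Yahoo!" and "Yahoo" are counted separately).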

answered 2013-10-17T10:07:10.183

Try selecting all the titles and returning them as an array. Then do something like this in PHP:

<?php
$array = array("Business Development Executive Cold Calling & Cold Emailing expert  Entrepreneur ", "Director of Online Marketing and entrepreneur", "Art Director and Entrepreneur", "Corporate Development at Yahoo!", "Snr Program Manager, Yahoo");
$words = "";
foreach($array as $val) $words .= " ".strtolower($val);
print_r(array_count_values(str_word_count($words, 1)));
?>

This will output:

Array ( [business] => 1 [development] => 2 [executive] => 1 [cold] => 2 [calling] => 1 [emailing] => 1 [expert] => 1 [entrepreneur] => 3 [director] => 2 [of] => 1 [online] => 1 [marketing] => 1 [and] => 2 [art] => 1 [corporate] => 1 [at] => 1 [yahoo] => 2 [snr] => 1 [program] => 1 [manager] => 1 )
answered 2013-10-16T02:21:27.503