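I would like a more elaborate version of this approach. My goal is simply to get 1-n word combinations from a table in which I store single words, one per row. The phrases should be built in id order, not in random order.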

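Currently I use MySQL variables, but a problem arises when I try to get combinations of more than 12 words, because of the limit on the number of tables MySQL can use internally in a JOIN. In addition, I haven't figured out how to get all combinations at once. As a result I only get "word1 word2", "word3 word4" or "word1 word2 word3", "word4 word5 word6", rather than "word1 word2", "word1 word2 word3", "word1 word2 word3 word4", and so on.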

SELECT  (@word3 := CONCAT(@word3, ' ', w.word)) AS word3,
    (@word2 := CONCAT(@word2, ' ', w.word)) AS word2,
    (@word3 := @word2) _word3,
    (@word2 := w.word) _word2,
    w.book_id

FROM    (SELECT @word3:='') _word3,
    (SELECT @word2:='') _word2,
    word w, text t, chapter c, verse v

WHERE   v.number IN (14) AND c.number IN (21) AND c.book_id IN (1) AND t.verse_id = v.id AND w.id = t.word_id AND v.chapter_id = c.id

The chapter, verse and text tables are there to show the larger context, but basically the word table is:

id, 
word, 
numeric_value, 
transliteration, 
translation, 
words_count, 
book_id

I think I hit the join table limit because I also want to concatenate the transliteration, words count and other fields.

I also tried this in the SELECT list:

if(@word3, @word3 := CONCAT(@word3, ' ', w.word), @word3 := '') AS word3,

if(@word2, @word2 := CONCAT(@word2, ' ', w.word), @word2 := '') AS word2,

But for some reason it did not work.

So what is the best way to get related word combinations from a table where the words are on separate rows?

Sample word table data:

this
is
a
phrase
containing
words
on
separate
rows

Sample output:

this is, is a, a phrase, phrase containing, containing words, words on, 
on separate, separate rows, this is a, is a phrase, a phrase containing, 
phrase containing words, containing words on, words on separate, 
on separate rows, ...

And so on, with combinations of 4 to n words.

2 Answers

With your current schema, you can use GROUP_CONCAT():

SELECT   GROUP_CONCAT(w.word)
FROM     chapter c
    JOIN verse   v ON v.chapter_id = c.id
    JOIN text    t ON t.verse_id   = v.id
    JOIN word    w ON w.id         = t.word_id
WHERE    c.book_id = 1
     AND c.number  = 21
     AND v.number  = 14
GROUP BY c.book_id, c.number, v.number

However, this seems somewhat over-normalized. Where appropriate, why not store the complete text in the text table with a FULLTEXT index?
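
The GROUP_CONCAT() query above returns the verse as one string. To get every run of consecutive words instead, one option is to join the word list to itself: one copy marks the start of a phrase, one the end, and one supplies the words in between. This is only a minimal sketch, assuming a simplified, hypothetical table verse_word(pos, word) that already holds the words of one verse in reading order; it is not the schema from the question.

```sql
-- Hypothetical simplified table: one word per row, pos gives the reading order.
CREATE TABLE verse_word (
    pos  INT UNSIGNED NOT NULL PRIMARY KEY,
    word VARCHAR(64)  NOT NULL
);

INSERT INTO verse_word (pos, word) VALUES
    (1, 'this'), (2, 'is'), (3, 'a'), (4, 'phrase'), (5, 'containing');

-- s = first word of a phrase, e = last word, m = every word from s to e.
-- Each (s, e) pair becomes one group whose words are concatenated in pos order.
SELECT   s.pos    AS start_pos,
         e.pos    AS end_pos,
         COUNT(*) AS word_count,
         GROUP_CONCAT(m.word ORDER BY m.pos SEPARATOR ' ') AS phrase
FROM     verse_word s
    JOIN verse_word e ON e.pos > s.pos
    JOIN verse_word m ON m.pos BETWEEN s.pos AND e.pos
GROUP BY s.pos, e.pos
ORDER BY word_count, start_pos;
```

The number of joins stays at three no matter how long the phrases get, so the JOIN table limit does not come into play. Adding HAVING word_count <= 4 would cap the phrase length. Note that GROUP_CONCAT() truncates its result at group_concat_max_len, which defaults to 1024, so very long phrases may need that setting raised.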

answered 2013-10-18T07:38:11.513

This query is based on the following method for parsing CSV values into multiple rows: http://www.sqlteam.com/article/parsing-csv-values-into-multiple-rows

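  1. It extracts each word from the string field
  2. It stores every combination of 1 to 4 words in the tag column
  3. It validates the length (>2) of the first and last word of the tag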

FOREACH table

pos

1
2
3
4
5
6
7
.
.
.
1000
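
The helper table does not have to be typed in by hand. A minimal sketch of one way to create and fill it, assuming it is named foreach with a single integer column pos, as the foreach f1 and f1.pos references in the query suggest (the column type and primary key are my assumptions):

```sql
-- Helper table of consecutive integers 1..1000.
CREATE TABLE `foreach` (
    pos INT UNSIGNED NOT NULL PRIMARY KEY
);

-- Cross join three digit sets (0-9): 10 * 10 * 10 = 1000 rows.
-- ones + 10 * tens + 100 * hundreds covers 0..999, so + 1 gives 1..1000.
INSERT INTO `foreach` (pos)
SELECT ones.d + 10 * tens.d + 100 * hundreds.d + 1
FROM       (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
            SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
            SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
            SELECT 9) AS ones
CROSS JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
            SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
            SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
            SELECT 9) AS tens
CROSS JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
            SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
            SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
            SELECT 9) AS hundreds;
```

The upper bound only needs to cover the longest phrase, since pos is used as a character position within it.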

WORDS table

row_id   phrase

1        an example of phrase to tag with many words
2        hello world!
3        another phrase 1000 characters long

MySQL query

SELECT 
DISTINCT
 @col1_len:=IF(num=1,@col2_len,@col1_len) AS col1_len
,@col1:=IF(num=1,@col2,@col1) AS col1
,@col2_len:=IF(num=1,@col3_len,@col2_len) AS col2_len
,@col2:=IF(num=1,@col3,@col2) AS col2
,@col3_len:=IF(num=1,@col4_len,@col3_len) AS col3_len
,@col3:=IF(num=1,@col4,@col3) AS col3
,@col4_len:=IF(num=1,LOCATE(' ',content, f.pos ) - f.pos,@col4_len) AS col4_len
,@col4:=SUBSTRING(content, f.pos,@col4_len) AS col4
,CASE
WHEN f.num=1
THEN 
   @tag:=@col4
WHEN f.num=2
THEN
   @tag:=TRIM(CONCAT(@col3,' ',@col4))
WHEN f.num=3
THEN 
   @tag:=TRIM(CONCAT(@col2,' ',@col3,' ',@col4))
WHEN f.num=4
THEN 
   @tag:=TRIM(CONCAT(@col1,' ',@col2,' ',@col3,' ',@col4))
END AS tag
,CASE
WHEN f.num=1
THEN 
   @valid_tag:=IF(@col4_len > 2,1,0)
WHEN f.num=2
THEN
   @valid_tag:=IF(@col3_len > 2,1,0)
WHEN f.num=3
THEN 
   @valid_tag:=IF(@col2_len > 2,1,0)
WHEN f.num=4
THEN 
   @valid_tag:=IF(@col1_len > 2,1,0)
END AS valid_tag
FROM 
(SELECT 
CONCAT(' ' ,phrase, ' ') AS content
,f1.pos AS pos
,f2.pos AS num 
,@col1:=''
,@col2:=''
,@col3:=''
,@col4:=''
,@col1_len:=0
,@col2_len:=0
,@col3_len:=0
,@col4_len:=0
FROM words
LEFT JOIN foreach f1
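ON ( f1.pos < CHAR_LENGTH(phrase) + 3 ) -- originally content_len+1; content_len is not a column of the words table shown above, so the length of the padded content, CHAR_LENGTH(phrase) + 2, is assumed here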
LEFT JOIN foreach f2 
ON ( f2.pos < 5 )
WHERE 1=1
AND row_id IN (1) 
)f 
WHERE 1=1
AND SUBSTRING(content, f.pos - 1, 1)=' '
HAVING 
LENGTH(tag)>2
AND valid_tag=1
AND col4_len > 2

Sample output

tag

example
example of phrase
phrase
phrase to tag
phrase to tag with
tag
tag with
tag with many
tag with many words
with
with many
with many words
many
many words
words
answered 2013-11-11T13:19:01.000