I have a database of words and phrases translated from, for example, English into 15 other languages, and likewise from every language in that list into the other 15. For now, each pair is stored in a single table like this (en -> de):

  • id_pair
  • word_en
  • word_de

What is the best way to design a database for such a huge list of words and phrases? I know that I must separate each primary language from the others, and I was thinking of something like this:

ENGLISH
ID | WORD
1  | 'dictionary'

GERMAN
ID | WORD
1  | 'lexikon'
2  | 'wörterbuch'

TRANSLATION_EN_DE
ID_EN | ID_DE
1     | 1
1     | 2

Is this the best way to normalize the DB? And what about phrases? If someone enters the word "dictionary", I also need it to return "This dictionary is good" and its translation. (I know this can be found in the first table with an SQL query; is that the best way?)

I also need the entries in alphabetical order at all times, with a lot of new entries arriving daily, so that I can show a couple of words before and after the word or phrase someone is looking up.

I'm stuck and can't decide on the best way to optimize it. Altogether these databases hold more than 15 GB of text-only translations and get around 100k requests a day, so every millisecond counts. :) Any help will be appreciated, thanks!

1 Answer

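With a separate table for each language, you'll need a lot of junction tables to cover all possible translation combinations. On top of that, adding a new language would require adding more tables, rewriting queries, client code, etc.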

It is better to do it in a more generic way, something like this:

[Image: diagram of the proposed schema]
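A minimal sketch of what such a generic schema might look like, run through Python's built-in sqlite3 module so it can be tried directly. The names WORD, TRANSLATION, LANGUAGE_ID, WORD_TEXT, WORD_ID1 and WORD_ID2 appear in this answer; the LANGUAGE table, the LANGUAGE_CODE and WORD_ID columns, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE LANGUAGE (
    LANGUAGE_ID   INTEGER PRIMARY KEY,
    LANGUAGE_CODE TEXT NOT NULL UNIQUE   -- assumed column, e.g. 'en', 'de'
);
CREATE TABLE WORD (
    WORD_ID     INTEGER PRIMARY KEY,     -- assumed surrogate key
    LANGUAGE_ID INTEGER NOT NULL REFERENCES LANGUAGE (LANGUAGE_ID),
    WORD_TEXT   TEXT NOT NULL,
    UNIQUE (LANGUAGE_ID, WORD_TEXT)
);
CREATE TABLE TRANSLATION (
    WORD_ID1 INTEGER NOT NULL REFERENCES WORD (WORD_ID),
    WORD_ID2 INTEGER NOT NULL REFERENCES WORD (WORD_ID),
    PRIMARY KEY (WORD_ID1, WORD_ID2)
);
""")

# Sample rows taken from the question's en -> de example.
conn.executemany("INSERT INTO LANGUAGE VALUES (?, ?)", [(1, "en"), (2, "de")])
conn.executemany(
    "INSERT INTO WORD VALUES (?, ?, ?)",
    [(1, 1, "dictionary"), (2, 2, "lexikon"), (3, 2, "wörterbuch")],
)
conn.executemany("INSERT INTO TRANSLATION VALUES (?, ?)", [(1, 2), (1, 3)])

# All stored translations of the English word 'dictionary'.
rows = conn.execute("""
    SELECT w2.WORD_TEXT
    FROM WORD w1
    JOIN TRANSLATION t ON t.WORD_ID1 = w1.WORD_ID
    JOIN WORD w2       ON w2.WORD_ID = t.WORD_ID2
    WHERE w1.LANGUAGE_ID = 1 AND w1.WORD_TEXT = 'dictionary'
    ORDER BY w2.WORD_TEXT
""").fetchall()
german = [r[0] for r in rows]
```

With this shape, supporting another language is a new row in LANGUAGE rather than a new set of tables.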

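Regarding the TRANSLATION table, I'd suggest also creating a CHECK (WORD_ID1 < WORD_ID2) and an index on {WORD_ID2, WORD_ID1} (the opposite "direction" from the PK), and representing both directions of a translation with just one row.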
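One way this could look in practice, sketched with Python's sqlite3 module. The index name TRANSLATION_REVERSE_IDX and the helper functions add_translation and translations_of are hypothetical names introduced here; only the CHECK and the column order of the index come from the suggestion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TRANSLATION (
    WORD_ID1 INTEGER NOT NULL,
    WORD_ID2 INTEGER NOT NULL,
    PRIMARY KEY (WORD_ID1, WORD_ID2),
    CHECK (WORD_ID1 < WORD_ID2)
);
-- Opposite column order from the PK, for lookups by WORD_ID2.
CREATE INDEX TRANSLATION_REVERSE_IDX ON TRANSLATION (WORD_ID2, WORD_ID1);
""")

def add_translation(conn, a, b):
    """Store the pair once, smaller id first, so the CHECK holds."""
    lo, hi = sorted((a, b))
    conn.execute("INSERT INTO TRANSLATION VALUES (?, ?)", (lo, hi))

def translations_of(conn, word_id):
    """Find partners in both directions; each branch filters on the
    leading column of one of the two indexes."""
    rows = conn.execute(
        """
        SELECT WORD_ID2 FROM TRANSLATION WHERE WORD_ID1 = ?
        UNION ALL
        SELECT WORD_ID1 FROM TRANSLATION WHERE WORD_ID2 = ?
        """,
        (word_id, word_id),
    ).fetchall()
    return sorted(r[0] for r in rows)

add_translation(conn, 2, 1)   # stored as (1, 2)
add_translation(conn, 1, 3)

# A row inserted in the "wrong" order is rejected by the CHECK.
try:
    conn.execute("INSERT INTO TRANSLATION VALUES (5, 4)")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True
```

Because each pair is stored only once, the application (or a helper like the one above) has to order the two ids before inserting, and every lookup has to consider both columns.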

Consider clustering the TRANSLATION table, if your DBMS supports it.
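How clustering is declared differs between database systems, and the question does not say which one is in use. As one illustration only, SQLite offers WITHOUT ROWID tables, which store the rows inside the primary-key B-tree itself; the sketch below assumes SQLite and is not meant as the syntax for any other DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- WITHOUT ROWID: rows live in the PRIMARY KEY B-tree, so all rows
-- sharing a WORD_ID1 are stored next to each other.
CREATE TABLE TRANSLATION (
    WORD_ID1 INTEGER NOT NULL,
    WORD_ID2 INTEGER NOT NULL,
    PRIMARY KEY (WORD_ID1, WORD_ID2)
) WITHOUT ROWID;
""")
conn.executemany(
    "INSERT INTO TRANSLATION VALUES (?, ?)", [(1, 3), (1, 2), (4, 7)]
)

# A lookup by WORD_ID1 reads one contiguous key range.
partners = [
    r[0]
    for r in conn.execute(
        "SELECT WORD_ID2 FROM TRANSLATION WHERE WORD_ID1 = 1 ORDER BY WORD_ID2"
    )
]
```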

"Also need it alphabetically all time"

The query...

SELECT * FROM WORD WHERE LANGUAGE_ID = :lid ORDER BY WORD_TEXT

...can use the index underneath the UNIQUE constraint {LANGUAGE_ID, WORD_TEXT}.
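The same index can also serve the "couple of words before and after" requirement from the question. The sketch below uses Python's sqlite3 module; the neighbours helper, the WORD_ID column and the sample words are hypothetical, and it relies on SQLite's default binary collation, which is an assumption that may not match true alphabetical order for every language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WORD (
    WORD_ID     INTEGER PRIMARY KEY,
    LANGUAGE_ID INTEGER NOT NULL,
    WORD_TEXT   TEXT NOT NULL,
    UNIQUE (LANGUAGE_ID, WORD_TEXT)
);
""")
words = ["atlas", "dictionary", "glossary", "index", "lexicon", "thesaurus"]
conn.executemany(
    "INSERT INTO WORD (LANGUAGE_ID, WORD_TEXT) VALUES (1, ?)",
    [(w,) for w in words],
)

def neighbours(conn, language_id, text, n=2):
    """Return up to n words before and n words after `text`, each in
    ascending order, using two range scans on (LANGUAGE_ID, WORD_TEXT)."""
    before = conn.execute(
        "SELECT WORD_TEXT FROM WORD "
        "WHERE LANGUAGE_ID = ? AND WORD_TEXT < ? "
        "ORDER BY WORD_TEXT DESC LIMIT ?",
        (language_id, text, n),
    ).fetchall()
    after = conn.execute(
        "SELECT WORD_TEXT FROM WORD "
        "WHERE LANGUAGE_ID = ? AND WORD_TEXT > ? "
        "ORDER BY WORD_TEXT ASC LIMIT ?",
        (language_id, text, n),
    ).fetchall()
    # `before` was fetched nearest-first, so flip it back to ascending.
    return [r[0] for r in reversed(before)], [r[0] for r in after]

before, after = neighbours(conn, 1, "glossary")
```

New rows inserted into WORD are placed in index order automatically, so no separate re-sorting step is needed as entries are added daily.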

Answered 2013-06-04T13:57:51.027