mysql - Sets of sets of sets of sets? Or, implementing version control for sets of sets

Question

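I'm developing a web application for quality-control checklists. I've already built the tables, but I have a hunch that our model is suboptimal and that I could get better performance. Please note that I'm using MySQL, so I'm limited to its features.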

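Each checklist has dozens, sometimes hundreds, of questions. Each question has 2 to 10 possible answers. Each question is a varchar string, and so is each answer. A completed checklist is one in which every question is associated with one of its possible answers, that is, one answer has been chosen.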

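Checklists differ by purpose and change over time. So, to keep completed checklists from changing accidentally when we want to change what new checklists look like, we have templates. Templates, their questions, and their answers mirror checklists, their questions, and their answers, and represent the "current version" of a checklist.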

So the table hierarchy looks like this:

Clients

Templates
- TemplateQuestions
  - TemplateQuestionAnswers
Checklists
- ChecklistQuestions
  - ChecklistQuestionAnswers

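Because we don't want changes to the current template to "go back in time" and alter completed checklists, the data is copied from the template into the checklist when a user starts a new checklist.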

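As you can guess, this creates a lot of duplication. In ChecklistQuestionAnswers, out of roughly one million answer rows, there are only 4,000 distinct answers. Of course, TemplatesQuestionAnswers has duplicates too, but it's not nearly as bad.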
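A duplication ratio like this can be measured with a single aggregate query. This is only a sketch: the column name `answer` is an assumption, since the column layout of the existing table is not shown here.

```sql
-- Compare the total row count with the number of distinct answer strings.
-- `answer` is an assumed column name for the varchar answer text.
SELECT COUNT(*)               AS total_rows,
       COUNT(DISTINCT answer) AS distinct_answers
FROM ChecklistQuestionAnswers;
```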

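So I think what I want to do is create a version-control system for checklist templates, so that I can save space by storing only unique questions with unique sets of answers. That way, instead of copying text wholesale, I can link a checklist to a template version, and the checklist itself is then just the set of which answer was chosen for which question.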

Here is what I've sketched out so far.

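A client has many templates. A template has many revisions, but only one current revision. Each revision has many questions, and each question has many (2 to 10) answers. Each checklist relates to one template. Each checklist has a set of answers indicating which answer was chosen for each question in its template version.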

Questions /* all unique question wordings */
Questions.id
Questions.question

Answers /* all unique answer wordings. */
Answers.id
Answers.answer 

Templates 
Templates.client_id /* relates to client table. */
Templates.template_name 
Templates.current_version /* this is related to TemplateVersions.version */

TemplateVersions /* A logical grouping of a set of questions and answers */
TemplateVersions.version
TemplateVersions.template_id /* relates this version to a template. */

TemplateQuestions
TemplateQuestions.template_version /* relates a question to a template version */
TemplateQuestions.question_id /* relates a unique question to this template version */
TemplateQuestions.id

TemplateQuestionAnswers
TemplateQuestionAnswers.template_question_id /* relates this answer to a particular template version question */
TemplateQuestionAnswers.answer_id /* relates the unique question to a unique answer */
TemplateQuestionAnswers.id

Checklists
Checklists.id
Checklists.template_version /* relates this checklist to a template version -- associating this checklist to a client happens through this relationship */

ChecklistAnswers /* ( I might call this something other than 'Answers' since the lack of ChecklistQuestionAnswers breaks 'name symmetry' with TemplateQuestionAnswers ) */
ChecklistAnswers.checklist_id 
ChecklistAnswers.question_id
ChecklistAnswers.answer_id

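The problem I'm hung up on is guaranteeing that ChecklistAnswers associates valid question/answer pairs: a relationship that exists in the template version referenced by its parent Checklist.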

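In other words, each row in ChecklistAnswers must "mirror" a question_id from TemplateQuestions to one of its children in TemplateQuestionAnswers, within the template_version given in Checklists. I'm trying to think through how to do this, and my thought process short-circuits here. This is really the "deliverable" of the database, a completed checklist, so the templates and everything else are incidental or abstractions. If I can't make this work, I've missed the point!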
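To make the constraint concrete, one possible way to state it declaratively in MySQL is with composite foreign keys, which InnoDB enforces (MyISAM parses but ignores them). The following is a sketch under several assumptions that are not part of the outline above: TemplateVersions gets a surrogate `id`, TemplateQuestionAnswers is keyed by (template_version, question_id) instead of template_question_id, and template_version is repeated in ChecklistAnswers so that both foreign keys can share it. Foreign keys to Questions and Answers, and all column types, are omitted or illustrative.

```sql
CREATE TABLE TemplateVersions (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- assumed surrogate key
    template_id INT UNSIGNED NOT NULL,
    version     INT UNSIGNED NOT NULL,
    UNIQUE KEY uq_template_version (template_id, version)
) ENGINE=InnoDB;

CREATE TABLE TemplateQuestions (
    id               INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    template_version INT UNSIGNED NOT NULL,
    question_id      INT UNSIGNED NOT NULL,
    -- Target for the composite foreign key below.
    UNIQUE KEY uq_version_question (template_version, question_id),
    FOREIGN KEY (template_version) REFERENCES TemplateVersions (id)
) ENGINE=InnoDB;

CREATE TABLE TemplateQuestionAnswers (
    id               INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    template_version INT UNSIGNED NOT NULL,  -- repeated from TemplateQuestions
    question_id      INT UNSIGNED NOT NULL,
    answer_id        INT UNSIGNED NOT NULL,
    -- Each (version, question, answer) triple is one allowed choice.
    UNIQUE KEY uq_version_question_answer (template_version, question_id, answer_id),
    FOREIGN KEY (template_version, question_id)
        REFERENCES TemplateQuestions (template_version, question_id)
) ENGINE=InnoDB;

CREATE TABLE Checklists (
    id               INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    template_version INT UNSIGNED NOT NULL,
    -- Lets ChecklistAnswers reference the checklist together with its version.
    UNIQUE KEY uq_checklist_version (id, template_version),
    FOREIGN KEY (template_version) REFERENCES TemplateVersions (id)
) ENGINE=InnoDB;

CREATE TABLE ChecklistAnswers (
    checklist_id     INT UNSIGNED NOT NULL,
    template_version INT UNSIGNED NOT NULL,
    question_id      INT UNSIGNED NOT NULL,
    answer_id        INT UNSIGNED NOT NULL,
    -- At most one chosen answer per question per checklist.
    PRIMARY KEY (checklist_id, question_id),
    -- The version stored here must be the version of the parent checklist.
    FOREIGN KEY (checklist_id, template_version)
        REFERENCES Checklists (id, template_version),
    -- The chosen pair must exist in that same template version.
    FOREIGN KEY (template_version, question_id, answer_id)
        REFERENCES TemplateQuestionAnswers (template_version, question_id, answer_id)
) ENGINE=InnoDB;
```

With this layout, an insert into ChecklistAnswers is rejected unless the question/answer pair exists in the template version of the parent checklist. The cost is one extra integer column per ChecklistAnswers row.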

This seems a bit unwieldy, so I wonder whether I'm building a solution whose complexity isn't worth the space I'd save by implementing it.

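Also note that I've simplified things a bit. There are other complexities, such as a category system for grouping questions in reports, but I don't think we need to go into that here.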

score 1 · Accepted Answer

As far as I understand:

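A simple improvement on what you're doing might be to use 3 tables for the templates and only 2 tables for the actual checklists: checklist (foreign key to the template version used) and answer (foreign key to checklist, foreign key to templateAnswer).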
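As a rough illustration, the five tables might be declared along these lines. The names `checklist`, `answer`, `templateAnswer`, `templateQuestion`, `checklist_id`, `ta_id`, and `tq_id` follow this answer; `templateVersion` and every other column name, type, and length are assumptions made for the sketch.

```sql
CREATE TABLE templateVersion (          -- assumed name for the third template table
    id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    template_name VARCHAR(255) NOT NULL,
    version       INT UNSIGNED NOT NULL,
    UNIQUE KEY uq_template_version (template_name, version)
) ENGINE=InnoDB;

CREATE TABLE templateQuestion (
    id                  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    template_version_id INT UNSIGNED NOT NULL,
    question            VARCHAR(255) NOT NULL,
    FOREIGN KEY (template_version_id) REFERENCES templateVersion (id)
) ENGINE=InnoDB;

CREATE TABLE templateAnswer (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    tq_id  INT UNSIGNED NOT NULL,        -- the question this option belongs to
    answer VARCHAR(255) NOT NULL,
    FOREIGN KEY (tq_id) REFERENCES templateQuestion (id)
) ENGINE=InnoDB;

CREATE TABLE checklist (
    id                  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    template_version_id INT UNSIGNED NOT NULL,  -- the template version used
    FOREIGN KEY (template_version_id) REFERENCES templateVersion (id)
) ENGINE=InnoDB;

CREATE TABLE answer (
    checklist_id INT UNSIGNED NOT NULL,
    ta_id        INT UNSIGNED NOT NULL,  -- the chosen templateAnswer row
    PRIMARY KEY (checklist_id, ta_id),
    FOREIGN KEY (checklist_id) REFERENCES checklist (id),
    FOREIGN KEY (ta_id) REFERENCES templateAnswer (id)
) ENGINE=InnoDB;
```

Because each templateAnswer row belongs to exactly one question, a row in answer always names a valid question/answer pair. The primary key here only prevents the same option from being stored twice for a checklist; it does not by itself stop two different options of the same question from being chosen, nor does it check that the option belongs to the checklist's template version.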

So, if you want to retrieve the list of answers for a particular checklist, you would:

select <whatever columns you like>
from checklist c
join answer a on a.checklist_id = c.id
join templateAnswer ta on ta.id = a.ta_id
join templateQuestion tq on tq.id = ta.tq_id
where c.id = <something>

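P.S. If questions share answers, and in many cases they probably will ("yes" and "no" come to mind), you could have a table for unique answers, templateAnswers, and a table templateAnswerUsage (foreign key to templateAnswers and foreign key to templateQuestion). That way you have no duplicated answer text. It's essentially a many-to-many relationship between questions and answers. This may or may not make sense, depending on whether the average size of an answer is larger than the size of the IDs you'd use.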
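A minimal sketch of that many-to-many layout follows. The table names `templateAnswers`, `templateAnswerUsage`, and `templateQuestion` come from the paragraph above; the column names, types, and lengths are assumptions. The unique index on a VARCHAR(255) column assumes InnoDB with the DYNAMIC row format, the default since MySQL 5.7.

```sql
CREATE TABLE templateQuestion (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    question VARCHAR(255) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE templateAnswers (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    answer VARCHAR(255) NOT NULL,
    UNIQUE KEY uq_answer (answer)        -- each answer text is stored once
) ENGINE=InnoDB;

CREATE TABLE templateAnswerUsage (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    tq_id INT UNSIGNED NOT NULL,         -- which question offers the answer
    ta_id INT UNSIGNED NOT NULL,         -- which shared answer text it offers
    UNIQUE KEY uq_question_answer (tq_id, ta_id),
    FOREIGN KEY (tq_id) REFERENCES templateQuestion (id),
    FOREIGN KEY (ta_id) REFERENCES templateAnswers (id)
) ENGINE=InnoDB;

-- Re-running this does not create duplicates: rows that would violate
-- the unique key are skipped.
INSERT IGNORE INTO templateAnswers (answer) VALUES ('Yes'), ('No');
```

An INT UNSIGNED column takes 4 bytes, so each usage row costs two 4-byte references plus its own key. Whether that beats repeating the text depends on how long the answers typically are.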
