php - MySQL recommendation: field vs. relation table

Question

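I have a MySQL InnoDB table with about 2,000,000 rows and 10 fields (table "cars"). It will keep growing at the current rate of roughly 500,000 rows per year. It is a busy table that receives different types of queries every second on average, 24/7.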

I now need to extend the information with one INT field ("country_id"). However, for at least 99% of all rows this field will default to "1".

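My question is: are there any specific reasons to choose either of the following solutions?

1. Add the INT field to the table and index it ("cars"."country_id")
2. Add a relation table ("car_countries") containing the fields "car_id" and "country_id"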
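As a rough sketch, the two solutions could look like the following in MySQL. The index names are hypothetical, and the primary key on "car_id" is an assumption made here to enforce the one-country-per-car rule; neither comes from the original post.

```sql
-- Solution 1: add the column directly to "cars".
-- DEFAULT 1 reflects the statement that at least 99% of rows use country 1.
ALTER TABLE cars
    ADD COLUMN country_id INT NOT NULL DEFAULT 1,
    ADD INDEX idx_cars_country_id (country_id);  -- hypothetical index name

-- Solution 2: a separate relation table.
-- Using car_id as the primary key allows at most one country per car.
CREATE TABLE car_countries (
    car_id     INT NOT NULL,
    country_id INT NOT NULL,
    PRIMARY KEY (car_id),
    KEY idx_car_countries_country_id (country_id)  -- hypothetical index name
) ENGINE=InnoDB;
```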

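I set up both examples in a test environment and ran a few thousand iterations against the data in the table to find out:

The database/table size will increase by 19% (~21 MB) because of the index
Queries will take 16% longer on average (0.37717 s vs. 0.32431 s per 1,000 queries)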
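One way to take a size measurement like this is to read the table statistics from information_schema before and after the change. This is only a sketch: it assumes the tables live in the currently selected database, and for InnoDB the reported sizes are estimates rather than exact figures.

```sql
-- Approximate data and index size, in MB, for the tables under test.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name IN ('cars', 'car_countries');
```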

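In the past I have tried to populate all fields with appropriate information and added relation tables wherever a table needed non-mandatory information. I have now read that this brings little benefit as long as the data does not need to be stored as arrays in a table (which MySQL does not handle, whereas PostgreSQL does). In my example, a particular car will never be sold to 2 countries, so there will never be a need to add more countries for a particular car.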

With solution 1, almost everything becomes easier, since disk space is not a concern. Should I still consider solution 2 anyway? If so, why?

Regards,

/Thomas

score 1 · Accepted Answer

The theoretical answer is that option 1 reflects your underlying relationships - a car can be sold to only one country, and therefore a "many to many" relationship (which option 2 suggests) is not appropriate. It would confuse future developers and pollute the data model.

The pragmatic answer is that option 2 doesn't appear to have a dramatic performance improvement today, and - crucially - it's likely to introduce complexity into your code. If 99% of the queries don't need the country data, you either have to write the query to include it (thus negating the performance benefit), or build nasty "if I need country THEN query = xxx ELSE query = yyy" logic.
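To make the added complexity concrete, here is a sketch of how the same lookup might be written under each option. It assumes "cars" has a primary key named "id" and that, under option 2, "car_countries" only holds rows for cars whose country is not the default 1. Both are assumptions for illustration, not details from the question.

```sql
-- Option 1: the country is just another column.
SELECT id, country_id
FROM cars
WHERE id = 42;

-- Option 2: every query that needs the country has to join,
-- and fall back to the default when no row exists.
SELECT c.id,
       COALESCE(cc.country_id, 1) AS country_id
FROM cars AS c
LEFT JOIN car_countries AS cc ON cc.car_id = c.id
WHERE c.id = 42;
```

Queries that skip the join stay simple, but then the application has to decide per query which form to use, which is the branching logic described above.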

Finally, apropos the indexing question - MySQL generally uses only one index per table in a query (index merge being the exception), so unless you're writing a query where "country" is in the where clause or being joined on, it's unlikely to have an impact.
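Whether a given query would use the new index can be checked with EXPLAIN. The sketch below assumes option 1 is in place, that the primary key is named "id", and that the index is called idx_cars_country_id (a hypothetical name).

```sql
-- Filters on country_id: the "key" column of the EXPLAIN output
-- shows whether the optimizer picked idx_cars_country_id.
EXPLAIN SELECT id FROM cars WHERE country_id = 2;

-- Filters on the primary key only: the country_id index is not
-- expected to play a role here.
EXPLAIN SELECT id, country_id FROM cars WHERE id = 42;
```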

score 0 · Accepted Answer

Thanks to bwoebi, Raphaël Althaus, AgRizzo, Alfons and Ed Gibbs for their input on the question!

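Short summary:

Since a car cannot have two countries and only one extra field is needed:
- Use solution 1
- Also, an index may not be necessary; check the cardinality and performance in the specific scenario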
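A minimal sketch of such a cardinality check, assuming solution 1 has been applied to "cars":

```sql
-- Distribution of values: if one country covers ~99% of rows,
-- an index mainly helps queries that look for the rare values.
SELECT country_id, COUNT(*) AS cars_in_country
FROM cars
GROUP BY country_id
ORDER BY cars_in_country DESC;

-- The Cardinality column gives MySQL's estimate of the number
-- of distinct values per index.
SHOW INDEX FROM cars;
```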

/Thomas
