mysql - Fetching/inserting large amounts of data from/into a big table

Question

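We have an online judge (similar to SPOJ.pl), and on weekends we run 3-hour contests, by the end of which we have nearly 1000 submissions. We store all of these runs in a single table (including the submitted code). The current structure of the table is as follows: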

+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| rid        | int(11)  | NO   | PRI | NULL    | auto_increment |
| pid        | int(11)  | YES  |     | NULL    |                |
| tid        | int(11)  | YES  |     | NULL    |                |
| language   | tinytext | YES  |     | NULL    |                |
| name       | tinytext | YES  |     | NULL    |                |
| code       | longtext | YES  |     | NULL    |                |
| time       | tinytext | YES  |     | NULL    |                |
| result     | tinytext | YES  |     | NULL    |                |
| error      | text     | YES  |     | NULL    |                |
| access     | tinytext | YES  |     | NULL    |                |
| submittime | int(11)  | YES  |     | NULL    |                |
| output     | longtext | YES  |     | NULL    |                |
+------------+----------+------+-----+---------+----------------+

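The problem is that every time we use an ORDER BY clause in a query, it ends up sorting the entire table. With more than 1000 rows, each containing a lot of data, the time taken is significant. Note that this happens even though we run OPTIMIZE on the table regularly, for example after submissions have been changed. As we see it, we have two options:

Split the table after, say, about 100 entries.
Store the bulk data (the submitted code) as files instead of inserting it as values in the table, to reduce overhead.

Is there another option or workaround that would let us keep the table structure as it is? I could really use some help here. Thanks.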

score 0 · Accepted Answer

My recommendation would be to do something called vertical partitioning: split the table into multiple tables, with different columns.

In this case, I would have one table that has all the small data: rid, pid, tid, language, name, time, result, access, submittime.

A second table would have: rid, code, error, output.

This way, you can do the sort on the first table, whose rows are small, and then join in the other fields after the sort. I put code, error, and output together since they seem to belong together.
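The split described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual setup: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own, and the table names `runs_meta` and `runs_blob` as well as the helpers `add_run` and `latest_runs` are hypothetical. The column types are SQLite approximations of the original `int(11)`, `tinytext`, and `longtext` columns.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical names: runs_meta holds the small columns, runs_blob the large ones.
cur.executescript("""
CREATE TABLE runs_meta (
    rid        INTEGER PRIMARY KEY AUTOINCREMENT,
    pid        INTEGER,
    tid        INTEGER,
    language   TEXT,
    name       TEXT,
    time       TEXT,
    result     TEXT,
    access     TEXT,
    submittime INTEGER
);
CREATE TABLE runs_blob (
    rid    INTEGER PRIMARY KEY REFERENCES runs_meta(rid),
    code   TEXT,
    error  TEXT,
    output TEXT
);
""")


def add_run(pid, tid, language, name, code, submittime):
    """Insert one submission, writing small and large columns to separate tables."""
    cur.execute(
        "INSERT INTO runs_meta (pid, tid, language, name, submittime) "
        "VALUES (?, ?, ?, ?, ?)",
        (pid, tid, language, name, submittime),
    )
    rid = cur.lastrowid  # shared key that links the two tables
    cur.execute("INSERT INTO runs_blob (rid, code) VALUES (?, ?)", (rid, code))
    conn.commit()
    return rid


def latest_runs(limit):
    """Sort and limit on the narrow table first, then join in the large columns."""
    cur.execute(
        """
        SELECT m.rid, m.name, m.submittime, b.code
        FROM (SELECT rid, name, submittime
              FROM runs_meta
              ORDER BY submittime DESC
              LIMIT ?) AS m
        JOIN runs_blob AS b ON b.rid = m.rid
        ORDER BY m.submittime DESC
        """,
        (limit,),
    )
    return cur.fetchall()


add_run(1, 10, "C", "alice", "int main(){}", 100)
add_run(2, 11, "Python", "bob", "print(1)", 300)
add_run(1, 12, "C++", "carol", "int main(){return 0;}", 200)
rows = latest_runs(2)
```

The inner query sorts only the narrow rows, and the join then fetches `code` for just the rows that survive the `LIMIT`. On MySQL the same SELECT should carry over largely unchanged; an index on `submittime` would probably help the sort further, though that goes beyond what the answer itself proposes.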
