
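A brief overview of the scenario:

We have a data logging system on board a ship, in which various sensors read real-time data and store it in a MySQL database.

Each sensor has its own table, in which the instantaneous sensor values are timestamped and stored.

The requirement now is to merge the data from all sensors into a single table, averaging the values per minute between two datetime values.

Here is what I have done so far:

1. Created a stored procedure that builds a calendar table. The procedure creates a table containing one datetime stamp per minute between two specified datetime values. For the cruise report I am working on, the calendar table looks like this: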

cal
-------------------+
dt            
-------------------+
2012-07-09 00:00:00
2012-07-09 00:01:00
2012-07-09 00:02:00

... etc

2012-07-29 23:57:00
2012-07-29 23:58:00
2012-07-29 23:59:00

30241 records in total, fetched in 0.016 sec, so no problem there.
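For reference, a minimal sketch of what such a calendar procedure might look like. This is not the actual procedure used on the ship: the procedure name `fill_calendar` and its parameters `p_start` and `p_end` are assumptions made for illustration. It also declares `dt` as `NOT NULL` with a primary key, which differs from the schema of `cal` shown later in this post.

```sql
-- Hypothetical sketch: fill_calendar, p_start and p_end are assumed names,
-- not the procedure actually used in this post.
-- DELIMITER is a mysql client command, needed so the body can contain ';'.
DROP PROCEDURE IF EXISTS fill_calendar;

DELIMITER //
CREATE PROCEDURE fill_calendar(IN p_start DATETIME, IN p_end DATETIME)
BEGIN
    -- Running timestamp, advanced one minute per loop iteration.
    DECLARE v_dt DATETIME;
    SET v_dt = p_start;

    -- Rebuild the calendar table from scratch on every call.
    DROP TABLE IF EXISTS cal;
    CREATE TABLE cal (
        dt DATETIME NOT NULL,
        PRIMARY KEY (dt)
    );

    -- One row per minute, both end points included.
    WHILE v_dt <= p_end DO
        INSERT INTO cal (dt) VALUES (v_dt);
        SET v_dt = v_dt + INTERVAL 1 MINUTE;
    END WHILE;
END //
DELIMITER ;

CALL fill_calendar('2012-07-09 00:00:00', '2012-07-29 23:59:00');
```

A row-by-row loop is slow for long ranges, but for a few weeks of one-minute steps it should be acceptable.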

2. Created temporary tables holding the per-minute averaged values for each sensor.

Example of an averaged sensor table:

tbl_gyro_hdt_1min_ave
-------------------+------------------
tmstamp            | average_heading
-------------------+------------------
2012-07-09 00:00:00, 135.633333333333
2012-07-09 00:01:00, 135.633333333333
2012-07-09 00:02:00, 136.1
2012-07-09 00:03:00, 135.433333333333
etc...

29546 records fetched in 0.047 secs

And another sensor table:

tbl_par_sensor_1min_ave
-------------------+------------------
tmstamp            | average_par
-------------------+------------------
2012-07-09 00:00:00, 16.269949
2012-07-09 00:01:00, 16.270832
2012-07-09 00:02:00, 16.2637752
2012-07-09 00:03:00, 16.2678025
2012-07-09 00:04:00, 16.269324
2012-07-09 00:05:00, 16.2721382
etc...

29543 records fetched in 0.047 secs
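A rough sketch of how one of these per-minute tables could be built, assuming a raw table `tbl_gyro_hdt(tmstamp DATETIME, heading DOUBLE)`. The raw table and its column names are hypothetical, since the real sensor tables are not shown here. The `CAST` keeps the bucketed value a `DATETIME`; `DATE_FORMAT` on its own returns a string.

```sql
-- Hypothetical raw table: tbl_gyro_hdt(tmstamp DATETIME, heading DOUBLE).
-- Its name and columns are assumptions for illustration only.
CREATE TEMPORARY TABLE tbl_gyro_hdt_1min_ave (
    tmstamp         DATETIME NOT NULL,
    average_heading DOUBLE,
    PRIMARY KEY (tmstamp)
);

-- Truncate each reading to the start of its minute, then average per minute.
INSERT INTO tbl_gyro_hdt_1min_ave (tmstamp, average_heading)
SELECT CAST(DATE_FORMAT(tmstamp, '%Y-%m-%d %H:%i:00') AS DATETIME) AS minute_start,
       AVG(heading)
FROM tbl_gyro_hdt
WHERE tmstamp >= '2012-07-09 00:00:00'
  AND tmstamp <  '2012-07-30 00:00:00'
GROUP BY minute_start;
```

Minutes with no readings simply produce no row, which is why the averaged tables have fewer rows than the calendar table.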

3. Joining the temporary tables to the calendar table is where the wheels come off.

To join a single table to the calendar table, I do this:

 SELECT cal.dt, tbl_gyro_hdt_1min_ave.average_heading
    FROM cal

    LEFT JOIN tbl_gyro_hdt_1min_ave
    ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp  

EXPLAIN output for the above query:

+----+---------------+-----------------------+--------+---------------+-------+---------+------+-------+-------------+
| Id |  Select_Type  |  Table                |  Type  | Possible_Keys | Key   | Key_Len | Ref  | Rows  | Extra       |
+----+---------------+-----------------------+--------+---------------+-------+---------+------+-------+-------------+
| 1  |  SIMPLE       | cal                   |  index | NULL          | dt    | 9       | NULL | 30243 | Using index |
| 1  |  SIMPLE       | tbl_gyro_hdt_1min_ave |  ALL   | date_index    | NULL  | NULL    | NULL | 29546 |             |
+----+---------------+-----------------------+--------+---------------+-------+---------+------+-------+-------------+

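For very small datasets this works fine, but for the example above it just hangs. I tried adding indexes to all the tables, with the same result.

Edit> I let the full dataset run overnight.

Result:

30243 records fetched.

Duration: 23.697 sec, fetched in 3000.352 sec

The next step is to join two or more tables to the calendar table, as follows: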

 SELECT cal.dt, tbl_par_sensor_1min_ave.average_par, tbl_gyro_hdt_1min_ave.average_heading
    FROM tbl_par_sensor_1min_ave

    LEFT JOIN cal
    ON cal.dt = tbl_par_sensor_1min_ave.tmstamp

    LEFT JOIN tbl_gyro_hdt_1min_ave
    ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp

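Not surprisingly, this also hangs.

Any pointers would be greatly appreciated.

As requested in the comments below, here are the table schemas: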

show columns from cal;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| dt    | datetime | YES  | MUL | NULL    |       |
+-------+----------+------+-----+---------+-------+
1 row in set (0.00 sec)


show columns from  tbl_gyro_hdt_1min_ave;
+-----------------+-------------+------+-----+---------+-------+
| Field           | Type        | Null | Key | Default | Extra |
+-----------------+-------------+------+-----+---------+-------+
| tmstamp         | varchar(24) | YES  | MUL | NULL    |       |
| average_heading | double      | YES  |     | NULL    |       |
+-----------------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)


show columns from tbl_par_sensor_1min_ave;
+-------------+-------------+------+-----+---------+-------+
| Field       | Type        | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| tmstamp     | varchar(24) | YES  | MUL | NULL    |       |
| average_par | double      | YES  |     | NULL    |       |
+-------------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Solved:

After implementing setsuna's changes:

Single outer join:

SELECT cal.dt, tbl_gyro_hdt_1min_ave.average_heading
FROM cal
LEFT JOIN tbl_gyro_hdt_1min_ave
ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp  

Fetched 30243 records 
Duration: 0.015 sec
Fetched in: 0.172 sec

Double outer join:

SELECT cal.dt, tbl_gyro_hdt_1min_ave.average_heading, tbl_par_sensor_1min_ave.average_par
FROM cal
LEFT JOIN tbl_gyro_hdt_1min_ave
ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp  
LEFT JOIN tbl_par_sensor_1min_ave
ON cal.dt = tbl_par_sensor_1min_ave.tmstamp  

Fetched 29543 records
Duration: 0.000s
Fetched in: 0.281 sec

2 Answers


Change the column cal.dt to NOT NULL, and change tmstamp to TIMESTAMP or DATETIME and NOT NULL. A JOIN over about 30,000 records, with the join-condition columns properly indexed, should run very fast.
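The EXPLAIN in the question shows type ALL with no key used for tbl_gyro_hdt_1min_ave, even though date_index is listed as a possible key, and the schemas show tmstamp as varchar(24) while cal.dt is datetime. A plausible reading is that this type mismatch keeps MySQL from using the index on tmstamp. A minimal sketch of the change, assuming every existing tmstamp value is non-NULL and parses as a valid datetime (under a strict SQL mode the ALTER fails otherwise):

```sql
-- Calendar table: make the join column NOT NULL.
ALTER TABLE cal
    MODIFY dt DATETIME NOT NULL;

-- Sensor tables: convert the join column from varchar(24) to DATETIME.
-- Assumes all stored strings are valid 'YYYY-MM-DD HH:MM:SS' values.
ALTER TABLE tbl_gyro_hdt_1min_ave
    MODIFY tmstamp DATETIME NOT NULL;

ALTER TABLE tbl_par_sensor_1min_ave
    MODIFY tmstamp DATETIME NOT NULL;

-- Re-check the plan: the sensor table should no longer show type ALL.
EXPLAIN
SELECT cal.dt, tbl_gyro_hdt_1min_ave.average_heading
FROM cal
LEFT JOIN tbl_gyro_hdt_1min_ave
    ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp;
```

If the averaged tables are rebuilt on every run, it is probably simpler to declare tmstamp as DATETIME NOT NULL when they are created than to alter them afterwards.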

Note: @Knapie has already posted the results of applying this answer.

answered 2012-08-15T08:55:24.827

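Solved!

Thanks to setsuna (see comments)

Change the column cal.dt to NOT NULL, and change tmstamp to TIMESTAMP or DATETIME and NOT NULL. A JOIN over about 30,000 records, with the join-condition columns properly indexed, should run very fast.

After implementing setsuna's changes:

Single outer join: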

SELECT cal.dt, tbl_gyro_hdt_1min_ave.average_heading
FROM cal
LEFT JOIN tbl_gyro_hdt_1min_ave
ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp 

Fetched 30243 records 
Duration: 0.015 sec
Fetched in: 0.172 sec

Double outer join:

SELECT cal.dt, tbl_gyro_hdt_1min_ave.average_heading, tbl_par_sensor_1min_ave.average_par
FROM cal
LEFT JOIN tbl_gyro_hdt_1min_ave
ON cal.dt = tbl_gyro_hdt_1min_ave.tmstamp  
LEFT JOIN tbl_par_sensor_1min_ave
ON cal.dt = tbl_par_sensor_1min_ave.tmstamp  

Fetched 29543 records
Duration: 0.000s
Fetched in: 0.281 sec
answered 2012-08-15T08:36:15.777