
I have a table with the fields (id, letter, date) and some data:

1 A 2012-01-01
2 B NULL
3 C NULL
4 D 2012-01-15

I want to fill the NULL values with the average of the nearest non-NULL dates, like this:

1 A 2012-01-01
2 B 2012-01-08
3 C 2012-01-08
4 D 2012-01-15

Or perhaps even like this:

1 A 2012-01-01
2 B 2012-01-08
3 C 2012-01-11
4 D 2012-01-15

Either variant would be great. Is there a simple way to do this in MySQL?

Thanks in advance.

UPD: The table is quite large, about 700,000 records, with roughly 50,000 gaps like the one described.

UPD2: To make it clearer, the table might look like this:

1 A 2012-01-01
2 B NULL
3 C NULL
4 D 2012-01-15
5 E NULL
6 F 2012-01-17
7 G NULL
8 H NULL
9 I 2012-01-20

The expected result is:

1 A 2012-01-01
2 B **2012-01-08**
3 C **2012-01-08**
4 D 2012-01-15
5 E **2012-01-16**
6 F 2012-01-17
7 G **2012-01-18**
8 H **2012-01-18**
9 I 2012-01-20

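(The asterisks mark the changed values.) Thanks.

UPD3: Thanks, everyone. But I am going to do it another way and compute the date with a simple formula: need_date = [(max(date) - min(date)) / (max(id) - min(id))] * (my_ID - min(id)) + min(date)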
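
The formula in UPD3 could be written in MySQL roughly as follows. This is only a sketch under assumptions not stated in the question: the table is named mytable, the formula is applied once over the whole table rather than per gap, min(date) belongs to the row with min(id) and max(date) to the row with max(id), and fractional days are rounded to the nearest day.

```sql
-- Sketch: global linear interpolation of `date` over id (assumed reading of UPD3).
-- The derived table uses aggregates, so MySQL materializes it before the update.
UPDATE mytable AS m
CROSS JOIN (
    SELECT MIN(id)     AS min_id,
           MAX(id)     AS max_id,
           MIN(`date`) AS min_date,
           MAX(`date`) AS max_date
    FROM mytable
) AS s
SET m.`date` = DATE_ADD(
        s.min_date,
        INTERVAL ROUND(
            DATEDIFF(s.max_date, s.min_date)   -- total span in days
            * (m.id - s.min_id)                -- offset of this row
            / (s.max_id - s.min_id)            -- total id span
        ) DAY)
WHERE m.`date` IS NULL;                        -- only fill the gaps
```

If the table holds a single id, the division yields NULL and the row simply stays NULL. Because the slope is computed over the whole table, the filled values will generally differ from the per-gap midpoints shown in UPD2.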


2 Answers


Suppose you have a table T like this:

CREATE TABLE T(
    id INT,
    time DATETIME
);

The following query gives you the lower and upper bounds for each NULL record:

SELECT T.Id
     , MAX(T1.Time) as MinDate
     , MIN(T2.Time) as MaxDate     
  FROM T
INNER JOIN T T1 ON T1.Id < T.Id
               AND T.time IS NULL 
               AND NOT T1.time IS NULL
INNER JOIN T T2 ON T2.id > T.id
               AND T.time IS NULL
               AND NOT T2.time IS NULL
GROUP BY T.Id

The output will be:

Id  MinDate     MaxDate
2   2012-01-01  2012-01-15
3   2012-01-01  2012-01-15

So the next step is to use the values from this result set to update the NULL values, for example:

UPDATE T
INNER JOIN 
(
   SELECT T.Id, MAX(T1.Time) as MinTime, MIN(T2.Time) as MaxTime
     FROM T
   INNER JOIN T T1 ON T1.id < T.id
                 AND T.time IS NULL 
                 AND NOT T1.time IS NULL
   INNER JOIN T T2 ON T2.id > T.id
                 AND T.time IS NULL
                 AND NOT T2.time IS NULL    
   GROUP BY T.ID) T3
 ON T3.id = T.id  
 SET T.time = FROM_UNIXTIME((UNIX_TIMESTAMP(T3.MinTime) + UNIX_TIMESTAMP(T3.MaxTime)) / 2)
 WHERE T.time IS NULL

Working SQLFiddle here.
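
One caveat with the grouped join above: it pairs every NULL row with all earlier and all later non-NULL rows, which may become expensive on a table of about 700,000 records. A possible alternative, sketched here under the assumption that id is indexed (for example as the primary key, which the table definition above does not declare), looks up only the nearest non-NULL neighbour on each side:

```sql
-- Sketch: find the nearest non-NULL neighbours of each NULL row by id.
-- Assumes an index on T.id so each correlated lookup is a short index scan.
SELECT g.id,
       (SELECT p.`time`
          FROM T AS p
         WHERE p.id < g.id AND p.`time` IS NOT NULL
         ORDER BY p.id DESC
         LIMIT 1) AS MinTime,   -- closest earlier row that has a value
       (SELECT n.`time`
          FROM T AS n
         WHERE n.id > g.id AND n.`time` IS NOT NULL
         ORDER BY n.id ASC
         LIMIT 1) AS MaxTime    -- closest later row that has a value
  FROM T AS g
 WHERE g.`time` IS NULL;
```

Note the difference in meaning: this picks neighbours by id, while MAX(T1.time) and MIN(T2.time) pick by value. The two agree only when the dates increase with id. Rows before the first or after the last non-NULL value get a NULL bound here, whereas the inner joins above drop them from the result.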

Answered 2013-03-22T18:22:05.493

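Query #1

This query replaces every NULL with the midpoint between the earliest and the latest date in the table: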

SELECT id,letter,IFNULL(date,dt) date FROM mytable,
(SELECT DATE(mindate + INTERVAL (secdiff/2) SECOND) dt
FROM (SELECT mindate,UNIX_TIMESTAMP(maxdate)
- UNIX_TIMESTAMP(mindate) secdiff
FROM (SELECT MIN(date) mindate FROM mytable) N,
(SELECT MAX(date) maxdate FROM mytable) X) AA) A;

Sample data

mysql> DROP TABLE IF EXISTS mytable;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE mytable
    -> (
    ->    id int not null auto_increment,
    ->    letter char(1),
    ->    `date` date,
    ->    primary key (id)
    -> );
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO mytable (letter,date) VALUES
    -> ('A','2012-01-01'),('B',NULL),('C',NULL),('D','2012-01-15');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM mytable;
+----+--------+------------+
| id | letter | date       |
+----+--------+------------+
|  1 | A      | 2012-01-01 |
|  2 | B      | NULL       |
|  3 | C      | NULL       |
|  4 | D      | 2012-01-15 |
+----+--------+------------+
4 rows in set (0.00 sec)

mysql>

Query #1 executed

mysql> SELECT id,letter,IFNULL(date,dt) date FROM mytable,
    -> (SELECT DATE(mindate + INTERVAL (secdiff/2) SECOND) dt
    -> FROM (SELECT mindate,UNIX_TIMESTAMP(maxdate)
    -> - UNIX_TIMESTAMP(mindate) secdiff
    -> FROM (SELECT MIN(date) mindate FROM mytable) N,
    -> (SELECT MAX(date) maxdate FROM mytable) X) AA) A;
+----+--------+------------+
| id | letter | date       |
+----+--------+------------+
|  1 | A      | 2012-01-01 |
|  2 | B      | 2012-01-08 |
|  3 | C      | 2012-01-08 |
|  4 | D      | 2012-01-15 |
+----+--------+------------+
4 rows in set (0.00 sec)

mysql>

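Query #2 (a more concise version)

This query uses the average of the UNIX timestamps of all non-NULL dates. If every date is NULL, it falls back to today's date: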

SELECT id,letter,IFNULL(date,dt) date FROM mytable,
(
    SELECT IF(K=0,DATE(NOW()),avgdt) dt FROM
    (SELECT DATE(FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(date))))
    avgdt FROM mytable) AA,
    (SELECT COUNT(date) K FROM mytable) BB
) A;

Query #2 executed

mysql> SELECT id,letter,IFNULL(date,dt) date FROM mytable,
    -> (
    ->     SELECT IF(K=0,DATE(NOW()),avgdt) dt FROM
    ->     (SELECT DATE(FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(date))))
    ->     avgdt FROM mytable) AA,
    ->     (SELECT COUNT(date) K FROM mytable) BB
    -> ) A;
+----+--------+------------+
| id | letter | date       |
+----+--------+------------+
|  1 | A      | 2012-01-01 |
|  2 | B      | 2012-01-08 |
|  3 | C      | 2012-01-08 |
|  4 | D      | 2012-01-15 |
+----+--------+------------+
4 rows in set (0.05 sec)

mysql>
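
Both queries only display the filled-in values. To store them, the same average could be applied with an UPDATE along these lines. This is a sketch rather than part of the tested session above, and it uses IFNULL with CURDATE() in place of the K=0 check, since AVG returns NULL when every date is NULL:

```sql
-- Sketch: persist the Query #2 result instead of only selecting it.
-- The aggregate forces the derived table to be materialized first.
UPDATE mytable AS m
CROSS JOIN (
    SELECT DATE(FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(`date`)))) AS avgdt
    FROM mytable
) AS a
SET m.`date` = IFNULL(a.avgdt, CURDATE())  -- today's date if no date exists at all
WHERE m.`date` IS NULL;
```

Like the SELECT versions, this writes one table-wide value into every gap, so it matches the first example in the question but not the per-gap results shown in UPD2.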

Give it a try!

Answered 2013-03-22T18:30:54.293