3

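I'm trying to query some information from a large data set about connections between a set of clients and servers. Below is sample data for the relevant columns of the table (connection_stats):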

+---------------------------------------------------------+
|   timestamp         | client_id | server_id |  status   | 
+---------------------------------------------------------+
| 2013-07-06 10:40:30 |   100     |   800     |  SUCCESS  |
+---------------------------------------------------------+
| 2013-07-06 10:40:50 |   101     |   801     |  FAILED   |
+---------------------------------------------------------+
| 2013-07-06 10:42:00 |   100     |   800     |  ABORTED  |
+---------------------------------------------------------+
| 2013-07-06 10:43:30 |   100     |   801     |  SUCCESS  |
+---------------------------------------------------------+
| 2013-07-06 10:56:00 |   100     |   800     |  FAILED   |
+---------------------------------------------------------+

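From this table, I'm trying to find every instance where, for a given (client_id, server_id) pair, a connection with status “ABORTED” is immediately followed (in timestamp order) by a connection with status “FAILED”. I want both records: the one with status “ABORTED” and the one with status “FAILED”. The sample data above contains one such case: for the pair (100, 800), a “FAILED” status comes right after an “ABORTED” one.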

I'm new to SQL and databases and I'm completely lost on this one. Any pointers on how to approach it would be much appreciated.

The database is MySQL.

4

7 Answers

2

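Admittedly not very elegant, but this is what I can come up with right off the bat that works with MySQL, which has no CTEs or ranking functions, and without relying on a unique row id that is guaranteed to be usable.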

SELECT aborted.* FROM Table1 aborted JOIN Table1 failed
  ON aborted.server_id = failed.server_id 
 AND aborted.client_id = failed.client_id
 AND aborted.timestamp < failed.timestamp
LEFT JOIN Table1 filler
  ON filler.server_id = aborted.server_id
 AND filler.client_id = aborted.client_id
 AND aborted.timestamp < filler.timestamp
 AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
  AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'
UNION
SELECT failed.* FROM Table1 aborted JOIN Table1 failed
  ON aborted.server_id = failed.server_id
 AND aborted.client_id = failed.client_id
 AND aborted.timestamp < failed.timestamp
LEFT JOIN Table1 filler
  ON filler.server_id = aborted.server_id
 AND filler.client_id = aborted.client_id
 AND aborted.timestamp < filler.timestamp
 AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
  AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'

An SQLfiddle to test with.

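If you're happy with a single row that combines both records, you can select the fields you want from aborted/failed and skip the entire second half of the union (i.e. the query is cut in half).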
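
As a rough sketch of that single-row variant, the snippet below runs only the first half of the query and picks columns from both aliases. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example is self-contained), loads the sample rows from the question into Table1, and the output aliases aborted_at and failed_at are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (timestamp TEXT, client_id INTEGER,"
    " server_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [
        ("2013-07-06 10:40:30", 100, 800, "SUCCESS"),
        ("2013-07-06 10:40:50", 101, 801, "FAILED"),
        ("2013-07-06 10:42:00", 100, 800, "ABORTED"),
        ("2013-07-06 10:43:30", 100, 801, "SUCCESS"),
        ("2013-07-06 10:56:00", 100, 800, "FAILED"),
    ],
)

# First half of the UNION only: one output row per ABORTED/FAILED pair.
# The filler self-join looks for rows strictly between the two; keeping
# only pairs with no such row means FAILED directly follows ABORTED.
single_row_sql = """
SELECT aborted.client_id, aborted.server_id,
       aborted.timestamp AS aborted_at,   -- hypothetical alias
       failed.timestamp  AS failed_at     -- hypothetical alias
FROM Table1 aborted JOIN Table1 failed
  ON aborted.server_id = failed.server_id
 AND aborted.client_id = failed.client_id
 AND aborted.timestamp < failed.timestamp
LEFT JOIN Table1 filler
  ON filler.server_id = aborted.server_id
 AND filler.client_id = aborted.client_id
 AND aborted.timestamp < filler.timestamp
 AND filler.timestamp < failed.timestamp
WHERE filler.timestamp IS NULL
  AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'
"""
pairs = conn.execute(single_row_sql).fetchall()
print(pairs)
```

On the sample data this should yield a single row for the (100, 800) pair, carrying both timestamps side by side.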

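Since I got a comment about the UNION, here is the same thing using a JOIN instead, assuming the timestamp is unique for each client/server combination (a unique row id would help here):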

SELECT * FROM Table1 t JOIN
(
 SELECT 
   aborted.server_id asid, aborted.client_id acid, aborted.timestamp ats,
    failed.server_id fsid,  failed.client_id fcid,  failed.timestamp fts
 FROM Table1 aborted JOIN Table1 failed
   ON aborted.server_id = failed.server_id
  AND aborted.client_id = failed.client_id
  AND aborted.timestamp < failed.timestamp
 LEFT JOIN Table1 filler
   ON filler.server_id = aborted.server_id
  AND filler.client_id = aborted.client_id
  AND aborted.timestamp < filler.timestamp
  AND filler.timestamp < failed.timestamp
 WHERE filler.timestamp IS NULL
   AND aborted.status = 'ABORTED' AND failed.status = 'FAILED'
) u
WHERE (t.server_id=asid AND t.client_id=acid AND t.timestamp=ats)
   OR (t.server_id=fsid AND t.client_id=fcid AND t.timestamp=fts)
ORDER BY timestamp

An SQLfiddle to test with.

answered 2013-07-06T11:30:10.790
1

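I'm answering this question (albeit late) because I want to offer a more general approach. MySQL doesn't have lag() or lead() functions, but you can emulate them with a subquery. The idea is to look up the next timestamp for the client_id/server_id pair, then join back to the original data to get the full record. This lets you pull as many columns as you like from the "next" record. It also lets you handle more complex relationships (for example, the "FAILED" must occur within 3 minutes):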

select cs.*, csnext.timestamp as nextTimeStamp, csnext.status as nextStatus
from (select cs.*,
             (select timestamp
              from connection_stats cs2
              where cs2.client_id = cs.client_id and
                    cs2.server_id = cs.server_id and
                    cs2.timestamp > cs.timestamp
              order by cs2.timestamp
              limit 1
             ) as Nexttimestamp
      from connection_stats cs
     ) cs join
     connection_stats csnext
     on csnext.client_id = cs.client_id and
        csnext.server_id = cs.server_id and
        csnext.timestamp = cs.nexttimestamp
where cs.status = 'ABORTED' and
      csnext.status = 'FAILED'

Performance can be improved with an index on connection_stats(client_id, server_id, timestamp).
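
To illustrate the "within 3 minutes" idea and the index together, here is a minimal sketch. It again uses Python's sqlite3 as a stand-in for MySQL (an assumption), so the time difference is computed with SQLite's strftime('%s', ...); in MySQL the equivalent condition would be something like TIMESTAMPDIFF(SECOND, cs.timestamp, csnext.timestamp) <= 180. The index name idx_pair_time, the helper aborted_then_failed, and the alias cur for the derived table are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE connection_stats (timestamp TEXT, client_id INTEGER,"
    " server_id INTEGER, status TEXT)"
)
# Composite index so the "next timestamp" lookup can seek on the pair
# and then walk timestamps in order (index name is hypothetical).
conn.execute(
    "CREATE INDEX idx_pair_time"
    " ON connection_stats (client_id, server_id, timestamp)"
)
conn.executemany(
    "INSERT INTO connection_stats VALUES (?, ?, ?, ?)",
    [
        ("2013-07-06 10:40:30", 100, 800, "SUCCESS"),
        ("2013-07-06 10:40:50", 101, 801, "FAILED"),
        ("2013-07-06 10:42:00", 100, 800, "ABORTED"),
        ("2013-07-06 10:43:30", 100, 801, "SUCCESS"),
        ("2013-07-06 10:56:00", 100, 800, "FAILED"),
    ],
)

def aborted_then_failed(max_gap_seconds):
    """ABORTED rows whose next row for the same pair is FAILED within the gap."""
    sql = """
    SELECT cur.client_id, cur.server_id, cur.timestamp, csnext.timestamp
    FROM (SELECT cs.*,
                 (SELECT cs2.timestamp
                  FROM connection_stats cs2
                  WHERE cs2.client_id = cs.client_id
                    AND cs2.server_id = cs.server_id
                    AND cs2.timestamp > cs.timestamp
                  ORDER BY cs2.timestamp
                  LIMIT 1) AS nexttimestamp
          FROM connection_stats cs) cur
    JOIN connection_stats csnext
      ON csnext.client_id = cur.client_id
     AND csnext.server_id = cur.server_id
     AND csnext.timestamp = cur.nexttimestamp
    WHERE cur.status = 'ABORTED'
      AND csnext.status = 'FAILED'
      AND CAST(strftime('%s', csnext.timestamp) AS INTEGER)
        - CAST(strftime('%s', cur.timestamp) AS INTEGER) <= ?
    """
    return conn.execute(sql, (max_gap_seconds,)).fetchall()

print(aborted_then_failed(180))  # 3-minute window
print(aborted_then_failed(900))  # 15-minute window
```

In the sample data the FAILED row comes 14 minutes after the ABORTED one, so the 3-minute window should return nothing while the 15-minute window should return the (100, 800) pair.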

answered 2013-07-06T12:20:13.007
0

I don't have a MySQL database to test with, but you might give this a try. You may need to add some GROUP BY columns.

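SELECT aborted.*, failed.*
FROM connection_stats aborted
INNER JOIN connection_stats failed ON aborted.client_id = failed.client_id AND aborted.server_id = failed.server_id AND failed.status = 'FAILED'
  -- an aggregate is not allowed in an ON clause, so the next timestamp comes from a correlated subquery
  AND failed.timestamp = (SELECT MIN(nexterror.timestamp) FROM connection_stats nexterror
                          WHERE nexterror.client_id = aborted.client_id AND nexterror.server_id = aborted.server_id
                            AND nexterror.timestamp > aborted.timestamp)
WHERE aborted.status = 'ABORTED'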
answered 2013-07-06T11:25:45.667
0

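-- note: this pairs every ABORTED row with every FAILED row on the same server; it does not check that one immediately follows the other
select * from connection_stats t1, connection_stats t2 where t1.server_id = t2.server_id and t1.status = 'ABORTED' and t2.status = 'FAILED'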

answered 2013-07-06T11:23:08.373
0

You can group the statuses and match on their order:

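-- ORDER BY inside GROUP_CONCAT so the statuses are concatenated chronologically
SELECT client_id,server_id,GROUP_CONCAT(status ORDER BY `timestamp`) as abort_fail
FROM   `table`
GROUP  BY client_id,server_id
HAVING abort_fail ='ABORTED,FAILED'
ORDER  BY client_id,server_id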

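When using GROUP_CONCAT, keep in mind that the result is truncated at group_concat_max_len, which defaults to 1024 bytes, so watch out for that.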

answered 2013-07-06T11:23:59.037
0

Not very elegant, but it should work. Based on GROUP_CONCAT().

Demo

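-- ORDER BY inside GROUP_CONCAT so the statuses are concatenated chronologically
SELECT client_id,server_id,GROUP_CONCAT(status ORDER BY timestamp) as all_statuses
FROM   statuses
GROUP  BY client_id,server_id
HAVING all_statuses LIKE '%ABORTED,FAILED%'
ORDER  BY client_id,server_id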
answered 2013-07-06T11:15:43.873
0
SELECT t0.clientid, t0.serverid
        , t0.logtime AS abort_time
        , t1.logtime AS fail_time
FROM tmp t0
JOIN tmp t1 ON t1.clientid = t0.clientid AND t1.serverid = t0.serverid
        -- t1 after t0
        AND t1.logtime > t0.logtime
WHERE t0. status = 'ABORTED'
AND t1. status = 'FAILED'
        -- no records inbetween 'aborted' and 'failed'
        -- (not even different 'aborted' and 'failed' records)
AND NOT EXISTS (
        SELECT *
        FROM tmp x
        WHERE x.clientid = t0.clientid AND x.serverid = t0.serverid
        AND x.logtime > t0.logtime
        AND x.logtime < t1.logtime
        )
        ;

Update: if you want to retrieve the two records as separate rows rather than joined into one, you can do this:

SELECT t0.*
FROM tmp t0
JOIN (
        SELECT t1.clientid, t1.serverid
        , t1.logtime AS abort_time
        , t2.logtime AS fail_time
        FROM tmp t1
        JOIN tmp t2 ON t2.clientid = t1.clientid AND t2.serverid = t1.serverid
                -- t2 after t1
                AND t2.logtime > t1.logtime
        WHERE t1. status = 'ABORTED'
        AND t2. status = 'FAILED'
                -- no records inbetween 'aborted' and 'failed'
                -- (not even different 'aborted' and 'failed' records)
        AND NOT EXISTS (
                SELECT *
                FROM tmp x
                WHERE x.clientid = t1.clientid AND x.serverid = t1.serverid
                AND x.logtime > t1.logtime
                AND x.LOGTIME < t2.logtime
                )
        ) two ON two.clientid = t0.clientid AND two.serverid = t0.serverid
                AND (two.abort_time = t0.logtime OR two.fail_time = t0.logtime)
        ;

Or rewrite it as an EXISTS clause, which is sometimes cleaner because the t1, t2 tables don't leak into the outer query:

SELECT *
FROM tmp t0
WHERE EXISTS (
        SELECT *
        FROM tmp t1
        JOIN tmp t2 ON t2.clientid = t1.clientid AND t2.serverid = t1.serverid
                -- t2 after t1
                AND t2.logtime > t1.logtime
        WHERE t1. status = 'ABORTED'
        AND t2. status = 'FAILED'
        AND t1.clientid = t0.clientid AND t1.serverid = t0.serverid
        AND (t1.logtime = t0.logtime OR t2.logtime = t0.logtime)
                -- no records inbetween 'aborted' and 'failed'
                -- (not even different 'aborted' and 'failed' records)
        AND NOT EXISTS (
                SELECT *
                FROM tmp x
                WHERE x.clientid = t1.clientid AND x.serverid = t1.serverid
                AND x.logtime > t1.logtime
                AND x.LOGTIME < t2.logtime
                )
                )
        ;
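
For anyone who wants to try the EXISTS form against the sample data from the question, here is a small sketch. It assumes Python's sqlite3 as a stand-in for MySQL and a tmp table with the column names used in this answer (logtime, clientid, serverid, status). Because AND binds more tightly than OR in SQL, the two logtime comparisons are grouped in parentheses here; the final ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE tmp (logtime TEXT, clientid INTEGER,"
    " serverid INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO tmp VALUES (?, ?, ?, ?)",
    [
        ("2013-07-06 10:40:30", 100, 800, "SUCCESS"),
        ("2013-07-06 10:40:50", 101, 801, "FAILED"),
        ("2013-07-06 10:42:00", 100, 800, "ABORTED"),
        ("2013-07-06 10:43:30", 100, 801, "SUCCESS"),
        ("2013-07-06 10:56:00", 100, 800, "FAILED"),
    ],
)

exists_sql = """
SELECT *
FROM tmp t0
WHERE EXISTS (
    SELECT *
    FROM tmp t1
    JOIN tmp t2 ON t2.clientid = t1.clientid AND t2.serverid = t1.serverid
            -- t2 after t1
            AND t2.logtime > t1.logtime
    WHERE t1.status = 'ABORTED'
      AND t2.status = 'FAILED'
      AND t1.clientid = t0.clientid AND t1.serverid = t0.serverid
      -- t0 is either the ABORTED row or the FAILED row of the pair
      AND (t1.logtime = t0.logtime OR t2.logtime = t0.logtime)
      -- no records in between 'aborted' and 'failed'
      AND NOT EXISTS (
          SELECT *
          FROM tmp x
          WHERE x.clientid = t1.clientid AND x.serverid = t1.serverid
            AND x.logtime > t1.logtime
            AND x.logtime < t2.logtime
      )
)
ORDER BY t0.logtime
"""
both_rows = conn.execute(exists_sql).fetchall()
for row in both_rows:
    print(row)
```

On the sample data this should print the ABORTED row and the FAILED row for the (100, 800) pair as two separate records.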
answered 2013-07-06T11:53:18.427