
I have a content collection with four columns: id, timestamp, locationID, authorID. Below is a sample of my data; in production it runs to tens of millions of rows.

id    timestamp              locationID   authorID
1     2012-03-01 11:52:00    1            1
2     2012-03-16 19:56:00    1            2
3     2012-04-02 11:26:00    2            1
4     2012-04-22 11:52:00    2            3
5     2012-05-19 09:48:00    2            2
6     2012-05-30 07:12:00    2            1
7     2012-06-04 19:17:00    1            2

I want to collect the authorIDs whose latest content (by timestamp) is at a specific locationID.

The correct result for a query with locationID = 2 is [ 1, 3 ], because authorIDs 1 and 3 were most recently 'seen' at locationID = 2, whereas authorID 2's latest content is at locationID 1.
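To make the requirement concrete, here is a minimal in-memory sketch of the expected result on the seven sample rows. The function name authors_last_seen_at and the ROWS list are illustrative assumptions, not part of the original schema, and this approach is only meant to pin down the semantics, not to scale to millions of rows.

```python
from datetime import datetime

# Sample rows from the question: (id, timestamp, locationID, authorID)
ROWS = [
    (1, "2012-03-01 11:52:00", 1, 1),
    (2, "2012-03-16 19:56:00", 1, 2),
    (3, "2012-04-02 11:26:00", 2, 1),
    (4, "2012-04-22 11:52:00", 2, 3),
    (5, "2012-05-19 09:48:00", 2, 2),
    (6, "2012-05-30 07:12:00", 2, 1),
    (7, "2012-06-04 19:17:00", 1, 2),
]


def authors_last_seen_at(rows, location_id):
    """Return sorted authorIDs whose most recent row (by timestamp) is at location_id."""
    latest = {}  # authorID -> (timestamp, locationID) of that author's newest row
    for _id, ts, loc, author in rows:
        stamp = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        if author not in latest or stamp > latest[author][0]:
            latest[author] = (stamp, loc)
    return sorted(a for a, (_, loc) in latest.items() if loc == location_id)


print(authors_last_seen_at(ROWS, 2))  # [1, 3]
print(authors_last_seen_at(ROWS, 1))  # [2]
```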

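I could of course run one query per authorID, but in production the authorID array has more than 100,000 entries. That seems very inefficient (especially since each 'sub-query' would hit this multi-million-row content collection), so I am looking for a better way to surface this data from my dataset, ideally fast enough to run on page render.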


2 Answers


Try a derived subquery:

-- For each author, pick their latest row (highest id) at locationID = 2
SELECT
    *
FROM content AS c
INNER JOIN (
    SELECT
        MAX(id) AS ID
    FROM content
    WHERE locationID = 2
    GROUP BY authorID
) AS t ON t.ID = c.id
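One way to check the derived-subquery query above against the sample data is to run it in an in-memory SQLite database through Python's sqlite3 module (SQLite stands in for MySQL here purely so the sketch is self-contained). On the sample rows the query as posted also returns authorID 2, since it looks only at rows where locationID = 2. The second query is a hypothetical variant, not part of the answer: it takes each author's latest row overall and only then filters by location. Both rely on the answer's implicit assumption that a higher id means a later timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content ("
    "id INTEGER PRIMARY KEY, timestamp TEXT, locationID INTEGER, authorID INTEGER)"
)
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?, ?)",
    [
        (1, "2012-03-01 11:52:00", 1, 1),
        (2, "2012-03-16 19:56:00", 1, 2),
        (3, "2012-04-02 11:26:00", 2, 1),
        (4, "2012-04-22 11:52:00", 2, 3),
        (5, "2012-05-19 09:48:00", 2, 2),
        (6, "2012-05-30 07:12:00", 2, 1),
        (7, "2012-06-04 19:17:00", 1, 2),
    ],
)

# The query as posted: latest row at location 2 for every author who has
# ever posted there, regardless of where their newest content is.
as_posted = [
    row[0]
    for row in conn.execute(
        """
        SELECT c.authorID
        FROM content AS c
        INNER JOIN (
            SELECT MAX(id) AS ID
            FROM content
            WHERE locationID = 2
            GROUP BY authorID
        ) AS t ON t.ID = c.id
        ORDER BY c.authorID
        """
    )
]

# Hypothetical variant: latest row per author overall, then keep only the
# authors whose latest row is at the requested location.
variant = [
    row[0]
    for row in conn.execute(
        """
        SELECT c.authorID
        FROM content AS c
        INNER JOIN (
            SELECT MAX(id) AS ID
            FROM content
            GROUP BY authorID
        ) AS t ON t.ID = c.id
        WHERE c.locationID = ?
        ORDER BY c.authorID
        """,
        (2,),
    )
]

print(as_posted)  # [1, 2, 3]
print(variant)    # [1, 3]
```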

SQLFiddle Demo

Answered 2013-01-05T04:01:15.087

Something like this? This is from SQL Server, but I think it should work in MySQL as well.

DECLARE @locationId INT
SET @locationId = 2;

SELECT * 
FROM (SELECT AuthorId, Max(TimeStamp) as MaxTimeStamp
    FROM Content C
    WHERE LocationId = @locationId
    GROUP BY AuthorId) AS CBL
    LEFT JOIN Content AS C ON CBL.AuthorId = C.AuthorId
        AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL

For locationId = 2 it returns 1 and 3; for locationId = 1 it returns 2.
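These results can be reproduced on the question's sample rows with a small sketch using Python's sqlite3 module. SQLite is an assumption made only to keep the example self-contained, and a bound parameter replaces the @locationId variable; the join logic is otherwise the same anti-join: find each author's latest timestamp at the location, then keep only authors with no later row anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content ("
    "id INTEGER PRIMARY KEY, timestamp TEXT, locationID INTEGER, authorID INTEGER)"
)
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?, ?)",
    [
        (1, "2012-03-01 11:52:00", 1, 1),
        (2, "2012-03-16 19:56:00", 1, 2),
        (3, "2012-04-02 11:26:00", 2, 1),
        (4, "2012-04-22 11:52:00", 2, 3),
        (5, "2012-05-19 09:48:00", 2, 2),
        (6, "2012-05-30 07:12:00", 2, 1),
        (7, "2012-06-04 19:17:00", 1, 2),
    ],
)

# Timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text, which sorts
# chronologically under plain string comparison.
ANTI_JOIN = """
    SELECT CBL.authorID
    FROM (SELECT authorID, MAX(timestamp) AS MaxTimeStamp
          FROM content
          WHERE locationID = ?
          GROUP BY authorID) AS CBL
    LEFT JOIN content AS C
           ON CBL.authorID = C.authorID
          AND C.timestamp > CBL.MaxTimeStamp
    WHERE C.authorID IS NULL   -- no later row exists for this author
    ORDER BY CBL.authorID
"""


def authors_last_seen_at(location_id):
    """Run the anti-join for one location and return the matching authorIDs."""
    return [row[0] for row in conn.execute(ANTI_JOIN, (location_id,))]


print(authors_last_seen_at(2))  # [1, 3]
print(authors_last_seen_at(1))  # [2]
```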

Per JW (thanks!), the correct MySQL version:

SET @locationId := 2;

SELECT * 
FROM (SELECT AuthorId, Max(TimeStamp) as MaxTimeStamp
    FROM Content C
    WHERE LocationId = @locationId
    GROUP BY AuthorId) AS CBL
    LEFT JOIN Content AS C ON CBL.AuthorId = C.AuthorId
        AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL
Answered 2013-01-05T04:01:51.460