
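I have this table, which I can access through LINQ to Entities:

  • Date
  • Amount
  • Type (1 or 2)

There can be many rows with the same date and the same type, and I need to sum their amounts by month. That is easy: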

from r in rows
group r by new { r.Date.Year, r.Date.Month }
into g
select
    new
        {
            Date = new DateTime(g.Key.Year, g.Key.Month, 1),
            Hours = g.Sum(a => a.Amount)
        };

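However, I have a special rule that needs to be implemented in the same LINQ query, and I need some help with it:

If a given day has any type 2 rows, that day should count only the sum of its type 2 rows. Otherwise it should sum the type 1 rows.

Note that the distinction between type 1 and type 2 is made per day, while the sum is per month.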
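
To make the rule concrete, here is a minimal in-memory sketch of the result I expect. It only illustrates the rule with LINQ to Objects; it is not the single database query I am after. The `SampleRow` class and the `MonthlyRuleSketch.SumByMonth` helper are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape, mirroring the Date / Amount / Type columns.
public sealed class SampleRow
{
    public DateTime Date { get; set; }
    public int Amount { get; set; }
    public int Type { get; set; }
}

public static class MonthlyRuleSketch
{
    // In-memory reference for the rule (assumption: Type is only ever 1 or 2).
    public static Dictionary<DateTime, int> SumByMonth(IEnumerable<SampleRow> rows)
    {
        return rows
            // The type 1 / type 2 decision is made per calendar day.
            .GroupBy(r => r.Date.Date)
            .Select(day => new
            {
                Day = day.Key,
                Amount = day.Any(r => r.Type == 2)
                    ? day.Where(r => r.Type == 2).Sum(r => r.Amount)
                    : day.Where(r => r.Type == 1).Sum(r => r.Amount)
            })
            // The per-day amounts are then summed per month.
            .GroupBy(d => new DateTime(d.Day.Year, d.Day.Month, 1))
            .ToDictionary(g => g.Key, g => g.Sum(d => d.Amount));
    }
}
```

For example, a day holding a type 1 row of 200 and a type 2 row of 100 contributes 100 to its month, while a day holding only a type 1 row of 500 contributes 500.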

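Update: Since I am working with a large amount of data, I need to get everything in a single database call; I cannot load it all into memory and process it there.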

4 Answers

What do you think of this?

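var query =
    from r in rows.ToArray()
    // The type 1 / type 2 rule applies per day, so group by day first.
    group r by r.Date.Date into day
    let lookup = day.ToLookup(x => x.Type, x => x.Amount)
    // Use only the type 2 amounts if the day has any, otherwise type 1.
    let dayHours = lookup[2].Any() ? lookup[2].Sum() : lookup[1].Sum()
    // Then sum the per-day values for each month.
    group dayHours by new { day.Key.Year, day.Key.Month } into g
    select new
    {
        Date = new DateTime(g.Key.Year, g.Key.Month, 1),
        Hours = g.Sum(),
    };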

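Note the .ToArray(): the data has to be in memory for this to work.

I am assuming that Type is an integer with the value 1 or 2.

On further thought, I don't think this kind of grouping can be done in a single query without pulling the data into memory.

So the best option is to minimize what goes into memory. If that doesn't work, you will need to break it into a couple of queries.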

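// Runs in the database: one pre-aggregated row per (day, type).
var query1 =
    from r in rows
    group r.Amount by new
    {
        r.Date.Year,
        r.Date.Month,
        r.Date.Day,
        r.Type,
    } into g
    select new
    {
        g.Key.Year,
        g.Key.Month,
        g.Key.Day,
        g.Key.Type,
        Amount = g.Sum(),
    };

// Runs in memory on the much smaller pre-aggregated result.
var query2 =
    from r in query1.ToArray()
    group r by new
    {
        r.Year,
        r.Month,
        r.Day,
    } into day
    let lookup = day.ToLookup(x => x.Type, x => x.Amount)
    // Per day: type 2 total if present, otherwise type 1 total.
    let dayHours = lookup[2].Any() ? lookup[2].Sum() : lookup[1].Sum()
    group dayHours by new { day.Key.Year, day.Key.Month } into g
    select new
    {
        Date = new DateTime(g.Key.Year, g.Key.Month, 1),
        Hours = g.Sum(),
    };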

Is this an option?

answered 2012-07-31T11:16:23.933

OK, I think I have figured it out. I tested it with some data and it works.

Test data and the SQL test query:

DECLARE @table TABLE (
    Datum DATETIME,
    Amount INT,
    [Type] INT
)

INSERT INTO @table (Datum, Amount, [Type]) values
('2012-01-01',200,1),
('2012-01-01',100,2),
('2012-01-02',500,1),
('2012-03-01',200,1),
('2012-03-01',100,1),
('2012-03-02',500,2)

SELECT MONTH(Datum), YEAR(Datum), COUNT(*), SUM(Amount)
FROM @table t
INNER JOIN (
    SELECT DAY (Datum) AS _day, MONTH(Datum) AS _month, YEAR(Datum) _year,
    MAX([Type]) as _type
    FROM @table
    GROUP BY DAY (Datum), MONTH(Datum), YEAR(Datum)
) X
ON _month = MONTH (T.Datum)
AND _year = YEAR(T.Datum)
AND _day = DAY(T.Datum)
AND _type = T.[Type]
GROUP BY MONTH(Datum), YEAR(Datum)

Result:

(No column name)    (No column name)    (No column name)    (No column name)
1   2012    2   600
3   2012    3   800

The translated LINQ query, tested with L2S against a "real" test table containing the test data:

using (DataClasses1DataContext ctx = new DataClasses1DataContext()) {
    var rows = ctx.Tests;

    var query = rows
        .Join(
            rows.GroupBy(rr =>
                new { rr.Datum.Day, rr.Datum.Month, rr.Datum.Year },
                (key, data) => new { Year = key.Year, Month = key.Month, Day = key.Day, MaxType = data.Select(xxx => xxx.Type).Max() }
            ),
            rr => new { Day = rr.Datum.Day, Month = rr.Datum.Month, Year = rr.Datum.Year, Type = rr.Type },
            rr => new { Day = rr.Day, Month = rr.Month, Year = rr.Year, Type = rr.MaxType },
            (r, r1) => r

        )
        .GroupBy(r =>
            new { Year = r.Datum.Year, Month = r.Datum.Month },
            (key, data) => new { Year = key.Year, Month = key.Month, Amount = data.Select(xx => xx.Amount).Sum() }
        )
        .ToList();
}

This returns the same result.

And, just for fun, here is the SQL query that L2S generates from the LINQ query:

SELECT [t7].[value] AS [Year], [t7].[value2] AS [Month], (
    SELECT SUM([t8].[Amount])
    FROM [dbo].[Test] AS [t8]
    INNER JOIN (
        SELECT [t11].[value3], [t11].[value2], [t11].[value], (
            SELECT MAX([t12].[Type])
            FROM [dbo].[Test] AS [t12]
            WHERE ((([t11].[value] IS NULL) AND (DATEPART(Day, [t12].[Datum]) IS NULL)) OR (([t11].[value] IS NOT NULL) AND (DATEPART(Day, [t12].[Datum]) IS NOT NULL) AND ((([t11].[value] IS NULL) AND (DATEPART(Day, [t12].[Datum]) IS NULL)) OR (([t11].[value] IS NOT NULL) AND (DATEPART(Day, [t12].[Datum]) IS NOT NULL) AND ([t11].[value] = DATEPART(Day, [t12].[Datum])))))) AND ((([t11].[value2] IS NULL) AND (DATEPART(Month, [t12].[Datum]) IS NULL)) OR (([t11].[value2] IS NOT NULL) AND (DATEPART(Month, [t12].[Datum]) IS NOT NULL) AND ((([t11].[value2] IS NULL) AND (DATEPART(Month, [t12].[Datum]) IS NULL)) OR (([t11].[value2] IS NOT NULL) AND (DATEPART(Month, [t12].[Datum]) IS NOT NULL) AND ([t11].[value2] = DATEPART(Month, [t12].[Datum])))))) AND ((([t11].[value3] IS NULL) AND (DATEPART(Year, [t12].[Datum]) IS NULL)) OR (([t11].[value3] IS NOT NULL) AND (DATEPART(Year, [t12].[Datum]) IS NOT NULL) AND ((([t11].[value3] IS NULL) AND (DATEPART(Year, [t12].[Datum]) IS NULL)) OR (([t11].[value3] IS NOT NULL) AND (DATEPART(Year, [t12].[Datum]) IS NOT NULL) AND ([t11].[value3] = DATEPART(Year, [t12].[Datum]))))))
            ) AS [value4]
        FROM (
            SELECT [t10].[value], [t10].[value2], [t10].[value3]
            FROM (
                SELECT DATEPART(Day, [t9].[Datum]) AS [value], DATEPART(Month, [t9].[Datum]) AS [value2], DATEPART(Year, [t9].[Datum]) AS [value3]
                FROM [dbo].[Test] AS [t9]
                ) AS [t10]
            GROUP BY [t10].[value], [t10].[value2], [t10].[value3]
            ) AS [t11]
        ) AS [t13] ON (DATEPART(Day, [t8].[Datum]) = [t13].[value]) AND (DATEPART(Month, [t8].[Datum]) = [t13].[value2]) AND (DATEPART(Year, [t8].[Datum]) = [t13].[value3]) AND ([t8].[Type] = [t13].[value4])
    WHERE ((([t7].[value] IS NULL) AND (DATEPART(Year, [t8].[Datum]) IS NULL)) OR (([t7].[value] IS NOT NULL) AND (DATEPART(Year, [t8].[Datum]) IS NOT NULL) AND ((([t7].[value] IS NULL) AND (DATEPART(Year, [t8].[Datum]) IS NULL)) OR (([t7].[value] IS NOT NULL) AND (DATEPART(Year, [t8].[Datum]) IS NOT NULL) AND ([t7].[value] = DATEPART(Year, [t8].[Datum])))))) AND ((([t7].[value2] IS NULL) AND (DATEPART(Month, [t8].[Datum]) IS NULL)) OR (([t7].[value2] IS NOT NULL) AND (DATEPART(Month, [t8].[Datum]) IS NOT NULL) AND ((([t7].[value2] IS NULL) AND (DATEPART(Month, [t8].[Datum]) IS NULL)) OR (([t7].[value2] IS NOT NULL) AND (DATEPART(Month, [t8].[Datum]) IS NOT NULL) AND ([t7].[value2] = DATEPART(Month, [t8].[Datum]))))))
    ) AS [Amount]
FROM (
    SELECT [t6].[value], [t6].[value2]
    FROM (
        SELECT DATEPART(Year, [t0].[Datum]) AS [value], DATEPART(Month, [t0].[Datum]) AS [value2]
        FROM [dbo].[Test] AS [t0]
        INNER JOIN (
            SELECT [t3].[value3], [t3].[value2], [t3].[value], (
                SELECT MAX([t4].[Type])
                FROM [dbo].[Test] AS [t4]
                WHERE ((([t3].[value] IS NULL) AND (DATEPART(Day, [t4].[Datum]) IS NULL)) OR (([t3].[value] IS NOT NULL) AND (DATEPART(Day, [t4].[Datum]) IS NOT NULL) AND ((([t3].[value] IS NULL) AND (DATEPART(Day, [t4].[Datum]) IS NULL)) OR (([t3].[value] IS NOT NULL) AND (DATEPART(Day, [t4].[Datum]) IS NOT NULL) AND ([t3].[value] = DATEPART(Day, [t4].[Datum])))))) AND ((([t3].[value2] IS NULL) AND (DATEPART(Month, [t4].[Datum]) IS NULL)) OR (([t3].[value2] IS NOT NULL) AND (DATEPART(Month, [t4].[Datum]) IS NOT NULL) AND ((([t3].[value2] IS NULL) AND (DATEPART(Month, [t4].[Datum]) IS NULL)) OR (([t3].[value2] IS NOT NULL) AND (DATEPART(Month, [t4].[Datum]) IS NOT NULL) AND ([t3].[value2] = DATEPART(Month, [t4].[Datum])))))) AND ((([t3].[value3] IS NULL) AND (DATEPART(Year, [t4].[Datum]) IS NULL)) OR (([t3].[value3] IS NOT NULL) AND (DATEPART(Year, [t4].[Datum]) IS NOT NULL) AND ((([t3].[value3] IS NULL) AND (DATEPART(Year, [t4].[Datum]) IS NULL)) OR (([t3].[value3] IS NOT NULL) AND (DATEPART(Year, [t4].[Datum]) IS NOT NULL) AND ([t3].[value3] = DATEPART(Year, [t4].[Datum]))))))
                ) AS [value4]
            FROM (
                SELECT [t2].[value], [t2].[value2], [t2].[value3]
                FROM (
                    SELECT DATEPART(Day, [t1].[Datum]) AS [value], DATEPART(Month, [t1].[Datum]) AS [value2], DATEPART(Year, [t1].[Datum]) AS [value3]
                    FROM [dbo].[Test] AS [t1]
                    ) AS [t2]
                GROUP BY [t2].[value], [t2].[value2], [t2].[value3]
                ) AS [t3]
            ) AS [t5] ON (DATEPART(Day, [t0].[Datum]) = [t5].[value]) AND (DATEPART(Month, [t0].[Datum]) = [t5].[value2]) AND (DATEPART(Year, [t0].[Datum]) = [t5].[value3]) AND ([t0].[Type] = [t5].[value4])
        ) AS [t6]
    GROUP BY [t6].[value], [t6].[value2]
    ) AS [t7]

I don't know how efficient this SQL is; you will have to try it.

answered 2012-07-31T12:59:11.743

rows
    .GroupBy(
        r => new { r.Date.Year, r.Date.Month, r.Date.Day, r.Type },
        (r, rr) => new { r.Year, r.Month, r.Day, r.Type, Amount = rr.Sum(rrr => rrr.Amount) })
    .GroupBy(
        r => new { r.Year, r.Month, r.Day },
        (r, rr) => new { r.Year, r.Month, r.Day, Amount = rr.OrderByDescending(rrr => rrr.Type).Select(rrr => rrr.Amount).First() })
    .GroupBy(
        r => new { r.Year, r.Month },
        (r, rr) => new { r.Year, r.Month, Amount = rr.Sum(rrr => rrr.Amount) })

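The rationale behind this is quite simple. The requirement "if there is at least one type 2 record, count only the type 2 records" can be met by simply grouping the records by type (within each day, of course). Why does it work? Because we split all the records into two groups: type 2 (which should be used if there is at least one type 2 record) and type 1 (which effectively means "all the records when there is no type 2"). The second part (choosing the sum) is even simpler: we sort the groups by type in descending order (that is, the type 2 sum first, then the type 1 sum) and take the first one, which is our type 2 if it exists and type 1 otherwise.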
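
As a rough check of that reasoning, the same three grouping stages can be run in memory with LINQ to Objects. This is only a sketch: the `Entry` class and the `ThreeStageGrouping.Run` helper are hypothetical names, and it assumes Type only takes the values 1 and 2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a table row.
public sealed class Entry
{
    public DateTime Date { get; set; }
    public int Amount { get; set; }
    public int Type { get; set; }
}

public static class ThreeStageGrouping
{
    public static List<(int Year, int Month, int Amount)> Run(IEnumerable<Entry> rows)
    {
        return rows
            // Stage 1: one total per (day, type).
            .GroupBy(
                r => new { r.Date.Year, r.Date.Month, r.Date.Day, r.Type },
                (k, g) => new { k.Year, k.Month, k.Day, k.Type, Amount = g.Sum(x => x.Amount) })
            // Stage 2: per day, keep the total of the highest type present
            // (type 2 if the day has any, otherwise type 1).
            .GroupBy(
                r => new { r.Year, r.Month, r.Day },
                (k, g) => new { k.Year, k.Month, Amount = g.OrderByDescending(x => x.Type).First().Amount })
            // Stage 3: sum the chosen per-day totals for each month.
            .GroupBy(
                r => new { r.Year, r.Month },
                (k, g) => (Year: k.Year, Month: k.Month, Amount: g.Sum(x => x.Amount)))
            .OrderBy(t => t.Year)
            .ThenBy(t => t.Month)
            .ToList();
    }
}
```

Fed the six sample rows from the SQL test in the other answer, this gives 600 for January 2012 and 800 for March 2012, matching the SQL result there.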

Frankly, this is the kind of "clever code" that everybody hates, because nobody can understand at a glance how it works.

answered 2012-07-31T13:11:14.373

How about this?

from r in rows
group r by new { r.Date.Year, r.Date.Month }
into g
let type2Days = g.Where( a => a.Type == 2 ).Select( a => a.Date.Day ).Distinct()
let filtered = g.Where( a => a.Type == 2 || type2Days.Contains(a.Date.Day) == false )
select
    new
        {
            Date = new DateTime(g.Key.Year, g.Key.Month, 1),
            Hours = filtered.Sum(a => a.Amount)
        };
answered 2012-07-31T23:00:51.127