I want to create a new Table B from the data in an existing Table A. Can MySQL take a range of time into account, group the values in column A, and then sum the values in column B for each of those groups?

Table A stores a log of user events, like a journal. A single user can log multiple events in a single day. Say, hypothetically, that I'm tracking when my users eat fruit, and I want to know how many pieces of fruit they eat in a week (7 days) and how many of those are apples.

So for each entry in Table A, Table B should hold the total number of fruit and the total number of apples over the previous 7 days.

EDIT:
I'm sorry, I oversimplified the information I gave and didn't think my example through.

Initially I have only Table A. I'm trying to create Table B from a query.

Assume:

  • A user (id) can log an entry multiple times in a day.
  • The sums should be per id, over the range from date - 7 days to date.
  • The fruit column is the total number of fruit during the 7-day interval (apples and bananas are both fruit).
  • The data doesn't start at 2013-9-5. It can go back to 2000, and I want to apply the 7-day sliding window over all dates from 2000 to 2013.

The sums are taken over a sliding window of 7 days.

Here's an example:

Table A:                           

| id | date-time          | apples | banana |     
---------------------------------------------
|  1 | 2013-9-5 08:00:00  |   1    |   1    |  
|  2 | 2013-9-5 09:00:00  |   1    |   0    |   
|  1 | 2013-9-5 16:00:00  |   1    |   0    |  
|  1 | 2013-9-6 08:00:00  |   0    |   1    |    
|  2 | 2013-9-9 08:00:00  |   1    |   1    |  
|  1 | 2013-9-11 08:00:00 |   0    |   1    |   
|  1 | 2013-9-12 08:00:00 |   0    |   1    |   
|  2 | 2013-9-13 08:00:00 |   1    |   1    |  

Note: user 1 logged two entries on 2013-9-5.

The result after the query should be Table B.

Table B:

| id | date-time          | apples | fruit  |
--------------------------------------------
|  1 | 2013-9-5 08:00:00  |   1    |   2    |
|  2 | 2013-9-5 09:00:00  |   1    |   1    |
|  1 | 2013-9-5 16:00:00  |   2    |   3    |
|  1 | 2013-9-6 08:00:00  |   2    |   4    |
|  2 | 2013-9-9 08:00:00  |   2    |   3    |
|  1 | 2013-9-11 08:00:00 |   2    |   5    |
|  1 | 2013-9-12 08:00:00 |   0    |   3    |
|  2 | 2013-9-13 08:00:00 |   2    |   4    |

At 2013-9-12 the sliding window moves and only includes 9-6 to 9-12. That's why id 1 goes from a sum of 2 apples to 0 apples.

2 Answers

Assumptions:

  • one row per id/date
  • the counts should be for id between date and date - 7 days
  • "fruit" = "banana"
  • the "date" column is actually a date (including year) and not just month/day

Then this SQL should do the trick:

-- Populate B with one row per (id, date) seen in the last 7 days.
INSERT INTO B
SELECT a1.id, a1.date, SUM( a2.banana ), SUM( a2.apples )
  FROM (SELECT DISTINCT id, date              -- the rows to report on
          FROM A
         WHERE date > NOW() - INTERVAL 7 DAY
       ) a1
  JOIN A a2                                   -- the rows inside each window
    ON a2.id    = a1.id
   AND a2.date <= a1.date
   AND a2.date >= a1.date - INTERVAL 7 DAY
 GROUP BY a1.id, a1.date;

Some questions:

  • Are the above assumptions correct?
  • Does table A contain more fruits than just Bananas and Apples? If so, what does the real structure look like?
answered 2013-09-09T18:52:06.367

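You need years in your data to use date arithmetic correctly. I added them.

There's something odd in your data. You seem to have multiple log entries per person per day. You're assuming an implicit order in which later log entries somehow come "after" earlier ones. If SQL and MySQL do that, it's only by accident: rows in a table have no implicit ordering. Also, if we have repeated date/id combinations, the self-join (read on) will produce a lot of duplicate rows and throw off the sums.

So we need to start by creating a daily summary of your data, like so: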

    select id, `date`, sum(apples) as apples, sum(banana) as banana
      from fruit
     group by id, `date`

This summary contains at most one row per id per day.

Next we need a limited cross-product self-join, so that we get seven days' worth of fruit eating.

select --whatever--
 from (
    -- summary query --
 ) as a  
  join (
    -- same summary query once again
 ) as b   
    on (      a.id = b.id 
         and  b.`date` between a.`date` - interval 6 day AND a.`date`   )

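The between clause in the on condition gives us seven days' worth (today and the six days before). Notice that the table aliased as b in the join is the seven-day stuff, and the table aliased as a is the today stuff.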
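
To make the window bounds concrete, here is a small check of the same date arithmetic, using 2013-09-12 from the question as "today". The literal date is only there for illustration.

```sql
-- Window bounds for "today" = 2013-09-12, computed with the same
-- expression that the on condition uses.
SELECT DATE('2013-09-12') - INTERVAL 6 DAY AS window_start,  -- 2013-09-06
       DATE('2013-09-12')                  AS window_end;    -- 2013-09-12
```

That is seven calendar days inclusive, which matches the question's remark that on 2013-9-12 the window covers 9-6 to 9-12.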

Finally, we have to summarize that result according to your specification. The resulting query is this.

  select a.id, a.`date`,
       sum(b.apples) + sum(b.banana) as fruit_last_week,
       a.apples as apple_today
  from (
        select id, `date`, sum(apples) as apples, sum(banana) as banana
          from fruit
         group by id, `date`
     ) as a  
  join (
        select id, `date`, sum(apples) as apples, sum(banana) as banana
          from fruit
         group by id, `date`
     ) as b   on (a.id = b.id and 
                      b.`date` between a.`date` - interval 6 day AND a.`date`   )
  group by a.id, a.`date`, a.apples
  order by a.`date`, a.id

Here's a fiddle: http://sqlfiddle.com/#!2/670b2/15/0
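
If you want one row per log entry rather than one per day, as in the Table B of the edited question, a possible sketch follows. It rests on some assumptions: `date` is a DATETIME column, the source table is called fruit as above, and the target name B and the output columns apples and fruit come from the question, not from the fiddle. Treat it as a starting point, not a tested solution.

```sql
-- Sketch: build B with one row per distinct (id, date-time) log entry.
-- Assumes `date` is DATETIME; B, apples and fruit are names from the question.
CREATE TABLE B AS
SELECT a.id,
       a.`date`,
       SUM(b.apples)            AS apples,  -- apples in the window
       SUM(b.apples + b.banana) AS fruit    -- apples and bananas in the window
  FROM (SELECT DISTINCT id, `date` FROM fruit) AS a   -- entries to report on
  JOIN fruit AS b                                     -- entries inside each window
    ON b.id = a.id
   AND b.`date` <= a.`date`                           -- nothing later than this entry
   AND b.`date` >= DATE(a.`date`) - INTERVAL 6 DAY    -- back to midnight, six days earlier
 GROUP BY a.id, a.`date`;
```

The DISTINCT on the a side plays the same role as the daily summary: each entry is reported once, so the join does not multiply the sums even if the same id and date-time were logged twice. Working through the eight sample rows in the question by hand gives the same values as its Table B.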

answered 2013-09-09T19:03:43.230