I have a MySQL table:

CREATE TABLE IF NOT EXISTS users_data (
  userid int(11) NOT NULL,
  computer varchar(30) DEFAULT NULL,
  logondate date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

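It is a large table with roughly 400 unique users and 20 computers, holding about 20,000 entries from 5 years of users logging on to computers.

I want to create a summary table that lists, for each computer and each year, the number of unique users, how many of those users were new (i.e., had no earlier logon to any computer before that year), and how many were terminated (i.e., had no later logon to any computer):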

CREATE TABLE IF NOT EXISTS summary_computer_use (
  computer varchar(30) DEFAULT NULL,
  year_used int(11) NOT NULL,  -- holds YEAR(logondate), which is an integer rather than a date
  number_of_users int(11) NOT NULL,
  number_of_new_users int(11) NOT NULL,
  number_of_terminated_users int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- DISTINCT has to follow SELECT directly; it applies to the whole (computer, year) pair
INSERT INTO summary_computer_use (computer, year_used)
    SELECT DISTINCT computer, YEAR(logondate) FROM users_data;

I can get the unique users per year:

UPDATE summary_computer_use as a 
inner join (
    select computer, year(logondate) as year_used,
        count(distinct userid) as number_of_users
    from users_data
    group by computer, year(logondate)
) as b on a.computer = b.computer and 
a.year_used = b.year_used
set a.number_of_users = b.number_of_users;

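But I'm having trouble working out how to write a SELECT statement that finds the number of users who used a computer for the first time in a given year (no logon date earlier than that year), or the number of users who never logged on again.

Any suggestions?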

2 Answers

I think this produces the summary you want:

   SELECT computers.computer,
          timespan.yyyy                 AS "year_used",
          COALESCE(allusers.num, 0)     AS "number_of_users",
          COALESCE(newusers.num, 0)     AS "number_of_new_users",
          COALESCE(terminations.num, 0) AS "number_of_terminated_users"
     FROM (SELECT DISTINCT computer
             FROM users_data) computers
     JOIN (SELECT (2000+i) AS yyyy
             FROM integers
            WHERE i BETWEEN 0 AND 10) timespan
LEFT JOIN (  SELECT YEAR(logondate) AS logonyear,
                   computer,
                   COUNT(DISTINCT userid) AS "num"
              FROM users_data
          GROUP BY 1, 2) allusers
       ON timespan.yyyy = allusers.logonyear AND computers.computer = allusers.computer
LEFT JOIN ( SELECT last_logon AS logonyear,
                   computer,
                   COUNT(DISTINCT userid) AS "num"
              FROM (  SELECT computer,
                             userid,
                             YEAR(MAX(logondate)) AS "last_logon"
                        FROM users_data
                    GROUP BY 1, 2) last_user_logons
           GROUP BY 1, 2) terminations
       ON timespan.yyyy = terminations.logonyear AND computers.computer = terminations.computer
LEFT JOIN ( SELECT first_logon AS logonyear,
                   computer,
                   COUNT(DISTINCT userid) AS "num"
              FROM (  SELECT computer,
                             userid,
                             YEAR(MIN(logondate)) AS "first_logon"
                        FROM users_data
                    GROUP BY 1, 2) first_user_logons
           GROUP BY 1, 2) newusers
       ON timespan.yyyy = newusers.logonyear AND computers.computer = newusers.computer;

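The subqueries represent:

  • the distinct set of computers
  • the timespan of years we are interested in
    • note: this uses an integers table
    • note: we exclude the latest year (2011, at the time of writing), because we can't "close out" terminations for that year until it is over
  • the number of distinct users per computer per year (allusers)
  • the number of newusers per computer per year
    (built on each user's first_logon record on that computer)
  • the number of terminations per computer per year
    (built on each user's last_logon record on that computer)
answered 2012-05-11T15:07:45.587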
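
The query relies on an integers table that isn't defined anywhere in this thread. A minimal sketch of one way to set it up, assuming a single column named i as used in the timespan subquery (the column type and the primary key are my assumptions, not something from the original):

```sql
-- Hypothetical helper table: the name `integers` and the column `i` come from
-- the query above, but this definition itself is an assumption.
CREATE TABLE IF NOT EXISTS integers (
  i int(11) NOT NULL,
  PRIMARY KEY (i)
);

-- 0..10 covers the years 2000-2010 selected by (2000+i) ... BETWEEN 0 AND 10.
-- INSERT IGNORE skips values that are already present, so this can be re-run.
INSERT IGNORE INTO integers (i)
VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
```

If your data covers a different range of years, adjust the base year and the BETWEEN bounds in the timespan subquery to match.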

This is what you're after:

select y, count(userid) as newusers from
(
    select userid, min(year(logondate)) as y from users_data group by userid
) tmp
group by y;
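
The same first-year idea can probably be extended to cover terminated users and to break the counts down per computer. Below is a rough sketch, assuming the question's definition (first or last logon on any computer) and reusing only the table and column names from the question; the aliases d, t, first_year and last_year are made up for illustration:

```sql
-- Per computer and year: all distinct users, those whose first-ever logon
-- (on any computer) falls in that year, and those whose last-ever logon does.
SELECT d.computer,
       YEAR(d.logondate) AS year_used,
       COUNT(DISTINCT d.userid) AS number_of_users,
       -- CASE yields NULL for non-matching rows, and COUNT ignores NULLs
       COUNT(DISTINCT CASE WHEN YEAR(d.logondate) = t.first_year
                           THEN d.userid END) AS number_of_new_users,
       COUNT(DISTINCT CASE WHEN YEAR(d.logondate) = t.last_year
                           THEN d.userid END) AS number_of_terminated_users
  FROM users_data d
  JOIN (SELECT userid,
               MIN(YEAR(logondate)) AS first_year,  -- first year seen anywhere
               MAX(YEAR(logondate)) AS last_year    -- last year seen anywhere
          FROM users_data
         GROUP BY userid) t
    ON t.userid = d.userid
 GROUP BY d.computer, YEAR(d.logondate);
```

Two caveats: a user who logged on to several computers in their first (or last) year is counted on each of them, and everyone active in the most recent year of data shows up as terminated, so that year's termination figure shouldn't be trusted until the year is over.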
answered 2012-05-10T02:50:34.533