
I'm playing around with the StackOverflow data dump, and I have a T-SQL problem.

I can select a list with the number of questions per month and year:

select datepart(year, posts.creationdate) as year,
       datepart(month, posts.creationdate) as month,
       count(distinct posts.id) as questions
from posts
    inner join posttags on posttags.postid = posts.id
    inner join tags on tags.id = posttags.tagid
where posts.posttypeid = 1
group by datepart(month, posts.creationdate),
         datepart(year, posts.creationdate)
order by datepart(year, posts.creationdate),
         datepart(month, posts.creationdate)

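If I add and tags.tagname = 'scala' to the WHERE clause, I get the number of all "scala questions" instead. Is there any way to show both the total number of questions and the number of questions carrying a specific tag in the same result set (in different columns)?

As soon as I add and tags.tagname = 'scala', I can no longer see the total number of questions per month.

Any ideas on how to combine these two result sets into one?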


2 Answers


You'd need two queries to do that, since you have two sets of data (questions by month and scala questions by month). One possible solution is using common table expressions to create two "temporary views" of the data. As an example:

with total as (
    select datepart(year, posts.creationdate) as year,
           datepart(month, posts.creationdate) as month, 
           count(distinct posts.id) as questions
    from posts
        inner join posttags on posttags.postid = posts.id
        inner join tags on tags.id = posttags.tagid
    where posts.posttypeid = 1
    group by datepart(month, posts.creationdate), datepart(year, posts.creationdate)
), scala as (
    select datepart(year, posts.creationdate) as year,
           datepart(month, posts.creationdate) as month, 
           count(distinct posts.id) as questions
    from posts
        inner join posttags on posttags.postid = posts.id
        inner join tags on tags.id = posttags.tagid
    where posts.posttypeid = 1 and tags.tagname = 'scala'
    group by datepart(month, posts.creationdate), datepart(year, posts.creationdate)
)
select total.year, total.month, total.questions as total_questions, scala.questions as scala_questions
from total
    join scala on total.year = scala.year and total.month = scala.month
order by total.year, total.month

The results of which can be seen here.
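
One caveat with the inner join between total and scala: a month that has questions but no scala questions has no row in scala, so it drops out of the result entirely. A minimal sketch of a variant that keeps those months, assuming the same tables and columns as the query above. The only changes are the left join and the coalesce, which reports 0 where no scala row exists:

```sql
with total as (
    -- all questions per year and month
    select datepart(year, posts.creationdate) as year,
           datepart(month, posts.creationdate) as month,
           count(distinct posts.id) as questions
    from posts
        inner join posttags on posttags.postid = posts.id
        inner join tags on tags.id = posttags.tagid
    where posts.posttypeid = 1
    group by datepart(month, posts.creationdate), datepart(year, posts.creationdate)
), scala as (
    -- questions tagged 'scala' per year and month
    select datepart(year, posts.creationdate) as year,
           datepart(month, posts.creationdate) as month,
           count(distinct posts.id) as questions
    from posts
        inner join posttags on posttags.postid = posts.id
        inner join tags on tags.id = posttags.tagid
    where posts.posttypeid = 1 and tags.tagname = 'scala'
    group by datepart(month, posts.creationdate), datepart(year, posts.creationdate)
)
select total.year,
       total.month,
       total.questions as total_questions,
       coalesce(scala.questions, 0) as scala_questions  -- 0 for months without scala questions
from total
    left join scala on total.year = scala.year and total.month = scala.month
order by total.year, total.month
```

Whether this matters depends on the data: if every month in the dump has at least one scala question, both versions return the same rows.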

answered 2011-07-30T13:31:30.277

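If you use a left outer join against posttags, count(posttags.tagid) will only count the non-null values. And since the left outer join only brings in the scala tag, each question appears at most once, so you can skip the distinct in count(distinct posts.id).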

select datepart(year, posts.creationdate) as year,
       datepart(month, posts.creationdate) as month,
       count(*) as questions,
       count(posttags.tagid) as sc
from posts
  left outer join posttags
    on posttags.postid = posts.id and
       posttags.tagid = (select id
                         from tags
                         where tagname = 'scala')
where posts.posttypeid = 1
group by datepart(month, posts.creationdate),
         datepart(year, posts.creationdate)
order by datepart(year, posts.creationdate),
         datepart(month, posts.creationdate)

Try it here: https://data.stackexchange.com/stackoverflow/q/107948/
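
A related approach is conditional aggregation, which keeps the joins from the original question and counts the tagged questions with a case expression inside the aggregate. This is only a sketch under the same schema assumptions as the question, not something taken from the linked query. The column names total_questions and scala_questions are chosen here for illustration:

```sql
select datepart(year, posts.creationdate) as year,
       datepart(month, posts.creationdate) as month,
       -- every question, counted once even though the tag joins repeat it
       count(distinct posts.id) as total_questions,
       -- case yields null for non-scala rows, and count ignores nulls
       count(distinct case when tags.tagname = 'scala' then posts.id end) as scala_questions
from posts
    inner join posttags on posttags.postid = posts.id
    inner join tags on tags.id = posttags.tagid
where posts.posttypeid = 1
group by datepart(month, posts.creationdate),
         datepart(year, posts.creationdate)
order by datepart(year, posts.creationdate),
         datepart(month, posts.creationdate)
```

Because the inner joins produce one row per question and tag, the distinct is needed in both counts here, unlike in the left outer join version above. Adding a second tag is a matter of adding another case column.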

answered 2011-07-30T14:42:00.287