I started writing this query, and I find it hard to see from its results why each question should be closed.

select
   TOP ##Limit:int?38369## -- The maximum value the hardware can handle.
   Posts.Id as [Post Link], -- Question title.
   Count(PendingFlags.PostId) as [Number of pending flags], -- Number of pending flags per question.
   Posts.OwnerUserId as [User Link], -- Lets you click on the column to see whether the same user often asks off-topic questions.
   Reputation as [User Reputation], -- Interesting to see that such questions are sometimes asked by high-rep users.
   Posts.Score as [Votes], -- Interesting to see that some questions have more than 100 upvotes.
   Posts.AnswerCount as [Number of Answers], -- I thought we shouldn't answer off-topic posts.
   Posts.FavoriteCount as [Number of Stars], -- Some questions seem to be very helpful :) .
   Posts.CreationDate as [Asked on], -- The older the question, the greater the chance that its flags can't get reviewed.
   Posts.LastActivityDate as [last activity], -- Similar effect as with Posts.CreationDate.
   Posts.LastEditDate as [modified on],
   Posts.ViewCount
from posts
   LEFT OUTER JOIN Users on Users.id = posts.OwnerUserId
   INNER JOIN PendingFlags on PendingFlags.PostId = Posts.Id
where ClosedDate IS NULL -- The question is not closed.
group by Posts.id, Posts.OwnerUserId, Reputation, Posts.Score, Posts.FavoriteCount, Posts.AnswerCount, Posts.CreationDate, Posts.LastActivityDate, Posts.LastEditDate, Posts.ViewCount
order by Count(PendingFlags.PostId) desc; -- Questions with more flags are more likely to get them handled, and more likely to be off-topic (since several users have already reviewed the question).

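Since each question has several flags, I can't use a simple table to show the reason given for every flag, but I think the result should be tied to the most common CloseReasonTypes.Id value for each post. This leads me to two problems: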

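  • First: after looking at this query, I should JOIN CloseReasonTypes to PendingFlags to display the reason names instead of their numbers. Since there is no common field between Posts and PendingFlags, and I use from posts as the base for joining tables, I don't know how to do this JOIN.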

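  • Second: I don't know how to select the most common close reason on each row. Several questions seem to have discussed similar cases, but I can't use their answers, because they ask how to find the most common value over a whole table, which produces a table with a single column and a single row, whereas I need to do this for the flags counted on each post.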

1 Answer

While it's not exactly what you're looking for, I believe this query will give you a good start.

select
    PostId as [Post Link], 
    duplicate = sum(case when closereasontypeid = 101 then 1 else 0 end), 
    offtopic = sum(case when closereasontypeid = 102 then 1 else 0 end),
    unclear = sum(case when closereasontypeid = 103 then 1 else 0 end),
    toobroad = sum(case when closereasontypeid = 104 then 1 else 0 end),
    opinion = sum(case when closereasontypeid = 105 then 1 else 0 end),
    ot_superuser = sum(case when CloseAsOffTopicReasonTypeId = 4 then 1 else 0 end),
    ot_findexternal = sum(case when CloseAsOffTopicReasonTypeId = 8 then 1 else 0 end),
    ot_serverfault = sum(case when CloseAsOffTopicReasonTypeId = 7 then 1 else 0 end),
    ot_lackinfo = sum(case when CloseAsOffTopicReasonTypeId = 12 then 1 else 0 end),
    ot_typo = sum(case when CloseAsOffTopicReasonTypeId = 11 then 1 else 0 end)
from pendingflags
where 
    flagtypeid in (13,14)   -- Close flags
    and creationdate > '2014-04-15'
group by PostId

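This only looks at posts with close flags raised since April 15 of this year (the creationdate filter applies to the flags in PendingFlags), and it returns about 23,500 records.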

I believe the Data Explorer does not include deleted posts, so those are not in the results.

This will need to be modified if/when close reasons are added or removed.
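If you would rather not hard-code the reason IDs, and want a single "most common reason" per post, something along these lines might work. It is only a sketch: I'm assuming CloseReasonTypes has Id and Name columns and that PendingFlags.CloseReasonTypeId refers to CloseReasonTypes.Id; the ReasonCounts and ReasonRank names are made up for the example. It counts flags per post and reason, ranks the reasons within each post, and keeps the top one.

```sql
-- Sketch only: assumes CloseReasonTypes has Id and Name columns and that
-- PendingFlags.CloseReasonTypeId references CloseReasonTypes.Id.
with ReasonCounts as (
    select
        pf.PostId,
        crt.Name as ReasonName,
        count(*) as FlagCount,
        -- Rank the reasons within each post, most frequent first;
        -- ties are broken alphabetically by reason name.
        row_number() over (
            partition by pf.PostId
            order by count(*) desc, crt.Name
        ) as ReasonRank
    from PendingFlags pf
        inner join CloseReasonTypes crt on crt.Id = pf.CloseReasonTypeId
    where pf.FlagTypeId in (13, 14)   -- Close flags, as in the query above
    group by pf.PostId, crt.Name
)
select
    PostId as [Post Link],
    ReasonName as [Most common reason],
    FlagCount as [Flags with that reason]
from ReasonCounts
where ReasonRank = 1                  -- keep only the top reason per post
order by FlagCount desc;
```

Because the reason names come from the join, this version should not need editing when close reasons change. Flags without a CloseReasonTypeId are dropped by the inner join, so switch to a left join if you want to keep them.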

answered 2014-05-12T13:49:58.507