
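Specifically, the tables are: articles, authors, authors_articles (which links authors to articles), subjectareas (the authors' subject areas), and authors_subjectareas (which links authors to their subject areas).

I want to read the articles table row by row, find each article's authors, look up their subject areas, count the subject areas across all of that article's co-authors, and finally assign the most frequent subject area to that article. I wrote the following code, but the problem is that it computes the result over all articles together rather than for each article separately!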

select art.name as title, art.theAbstract as abstract, sub.name as subjectArea
from
articles as art, authors as aut, subjectareas as sub, authors_articles as aa, 
authors_subjectareas as asub 
where
art.id = aa.article and aut.id = aa.author and asub.author = aut.id and
sub.id = asub.subjectArea and (art.year >= 2000 and art.year <= 2004)
group by subjectArea
Order by count(subjectArea) DESC
LIMIT 1

Thanks a lot for any comments...

1 Answer

You are looking for the groupwise maximum of a materialized table:

SELECT t2.name AS title, t2.theAbstract AS abstract, sub.name AS subjectArea
FROM (

  -- get each article's maximum co-author subject frequency
  SELECT t.id, MAX(t.freq) freq FROM (

    -- the subject frequencies of each article
    SELECT   art.id, COUNT(*) freq
    FROM     authors_articles aa
        JOIN authors_subjectareas asub USING (author)
        JOIN articles art ON art.id = aa.article
    GROUP BY art.id, asub.subjectArea

  ) t
  -- one row per article; the derived table is aliased t, so art.id is not visible here
  GROUP BY t.id

) t1 NATURAL JOIN (

  -- the information we actually want
  SELECT   art.id, art.name, art.theAbstract, asub.subjectArea, COUNT(*) freq
  FROM     authors_articles aa
      JOIN authors_subjectareas asub USING (author)
      JOIN articles art ON art.id = aa.article
  GROUP BY art.id, asub.subjectArea

) t2 JOIN subjectareas sub ON sub.id = t2.subjectArea
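
To see the groupwise-maximum pattern work end to end, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows, the `build_demo_db` and `top_subject_per_article` helper names, and the use of SQLite instead of MySQL are all assumptions made for illustration. The join between the two derived tables is written with an explicit ON clause rather than NATURAL JOIN, which should be equivalent here because `id` and `freq` are the only shared columns.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up sample data (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT,
                               theAbstract TEXT, year INTEGER);
        CREATE TABLE subjectareas (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE authors_articles (author INTEGER, article INTEGER);
        CREATE TABLE authors_subjectareas (author INTEGER, subjectArea INTEGER);

        INSERT INTO articles VALUES
            (1, 'Paper A', 'abstract A', 2001),
            (2, 'Paper B', 'abstract B', 2003);
        INSERT INTO subjectareas VALUES
            (1, 'Databases'), (2, 'Networks'), (3, 'Graphics');
        -- Paper A has authors 1, 2, 3; Paper B has authors 3, 4
        INSERT INTO authors_articles VALUES (1,1), (2,1), (3,1), (3,2), (4,2);
        -- author 1: Databases; 2: Databases, Networks;
        -- author 3: Graphics;  4: Graphics, Networks
        INSERT INTO authors_subjectareas VALUES
            (1,1), (2,1), (2,2), (3,3), (4,3), (4,2);
    """)
    return conn


QUERY = """
SELECT t2.name AS title, sub.name AS subjectArea, t2.freq
FROM (
    -- each article's maximum co-author subject frequency
    SELECT t.id, MAX(t.freq) AS freq
    FROM (
        SELECT aa.article AS id, COUNT(*) AS freq
        FROM authors_articles aa
        JOIN authors_subjectareas asub USING (author)
        GROUP BY aa.article, asub.subjectArea
    ) t
    GROUP BY t.id
) t1
JOIN (
    -- per-article, per-subject frequencies with the article details
    SELECT art.id, art.name, asub.subjectArea, COUNT(*) AS freq
    FROM authors_articles aa
    JOIN authors_subjectareas asub USING (author)
    JOIN articles art ON art.id = aa.article
    GROUP BY art.id, art.name, asub.subjectArea
) t2 ON t2.id = t1.id AND t2.freq = t1.freq
JOIN subjectareas sub ON sub.id = t2.subjectArea
ORDER BY t2.id
"""


def top_subject_per_article(conn):
    """Return (title, subject area, frequency) for each article's top subject."""
    return conn.execute(QUERY).fetchall()


conn = build_demo_db()
print(top_subject_per_article(conn))
# [('Paper A', 'Databases', 2), ('Paper B', 'Graphics', 2)]
```

Note that if two subject areas tie for the maximum frequency on an article, this pattern returns one row per tied subject area, so you would need an extra tie-breaking rule if exactly one row per article is required. The year filter from the question (2000 to 2004) is left out here and could be added as a WHERE clause inside the derived tables.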
answered 2012-11-19T15:26:28.717