I have the following query.

SELECT a.link_field1 AS journo, count(a.link_id) as articles, AVG( b.vote_value ) AS score FROM dan_links a LEFT JOIN dan_votes b ON link_id = vote_link_id WHERE link_field1 <> '' and link_status NOT IN ('discard', 'spam', 'page') GROUP BY link_field1 ORDER BY link_field1, link_id

This query returns a count of 3 for the first item in the list. What it should return is:

Journo | count | score
John S | 2 | 6.00
Joe B | 1 | 4

But for the first row, John S, it returns a count of 3.

If I query the table directly

select * from dan_links where link_field1 = 'John S' 

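I get 2 records back, as expected. I can't for the life of me figure out why the count is wrong, unless for some reason it is counting the records in the dan_votes table.

How can I get the correct count, or is my query completely wrong?

Edit: contents of the tables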

dan_links

link_id | link_field1 | link | source | link_status
1 | John S | http://test.com | test.com | approved
2 | John S | http://google.com | google | approved
3 | Joe B | http://facebook.com | facebook | approved

dan_votes

vote_id | link_id | vote_value
1 | 1 | 5
2 | 1 | 8
3 | 2 | 4
4 | 3 | 1

Edit: it looks like it is counting the rows in the votes table for some reason.

1 Answer

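When you do a left outer join on the condition link_id = vote_link_id, one row is produced for every matching pair of link and vote, so a link with two votes appears twice, e.g.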

link_id | link_field1 | link | source | link_status | vote_id | vote_value
1 | John S | http://test.com | test.com | approved | 1 | 5
1 | John S | http://test.com | test.com | approved | 2 | 8
2 | John S | http://google.com | google | approved | 3 | 4
3 | Joe B | http://facebook.com | facebook | approved | 4 | 1

Now, when you group on link_field1, COUNT(a.link_id) counts these joined rows rather than distinct links, so the count for John S is 3.
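The row duplication described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than the asker's database, trims the tables to the columns the query touches, and takes the column name vote_link_id from the query (the table listing in the question shows link_id). The helper article_counts is hypothetical, and COUNT(DISTINCT a.link_id) is shown only as one possible alternative to the nested query, not as part of the original answer.

```python
import sqlite3

# In-memory copy of the sample data from the question (trimmed columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dan_links (link_id INTEGER PRIMARY KEY, link_field1 TEXT, link_status TEXT);
    CREATE TABLE dan_votes (vote_id INTEGER PRIMARY KEY, vote_link_id INTEGER, vote_value INTEGER);
    INSERT INTO dan_links VALUES (1, 'John S', 'approved'), (2, 'John S', 'approved'), (3, 'Joe B', 'approved');
    INSERT INTO dan_votes VALUES (1, 1, 5), (2, 1, 8), (3, 2, 4), (4, 3, 1);
""")

def article_counts(count_expr):
    """Run the grouped join with the given COUNT expression (hypothetical helper)."""
    sql = f"""
        SELECT a.link_field1 AS journo, {count_expr} AS articles
        FROM dan_links a
        LEFT JOIN dan_votes b ON a.link_id = b.vote_link_id
        WHERE a.link_field1 <> ''
          AND a.link_status NOT IN ('discard', 'spam', 'page')
        GROUP BY a.link_field1
        ORDER BY a.link_field1
    """
    return conn.execute(sql).fetchall()

# Counts joined rows: link 1 has two votes, so it is counted twice.
joined_rows = article_counts("COUNT(a.link_id)")
# Counts each link once, however many votes it joined to.
distinct_links = article_counts("COUNT(DISTINCT a.link_id)")

print(joined_rows)     # [('Joe B', 1), ('John S', 3)]
print(distinct_links)  # [('Joe B', 1), ('John S', 2)]
```

If COUNT(DISTINCT ...) is acceptable, AVG(b.vote_value) over the same joined rows is already the per-vote average, so the original single-level query may only need that one change.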

A nested query that first aggregates per link might work:

SELECT journo, COUNT(linkid) AS articles, AVG(score) AS score
FROM
    (SELECT a.link_field1 AS journo, AVG(b.vote_value) AS score, a.link_id AS linkid
    FROM dan_links a
    LEFT JOIN dan_votes b
    ON a.link_id = b.vote_link_id
    WHERE a.link_field1 <> ''
    AND a.link_status NOT IN ('discard', 'spam', 'page')
    -- one row per link; a derived table must be given an alias
    GROUP BY a.link_id, a.link_field1) AS per_link
GROUP BY journo
ORDER BY journo

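The query above gives an incorrect average, because it averages the per-link averages: ((n1+n2)/2+n3)/2 != (n1+n2+n3)/3. So use the query below instead, which sums the votes and the vote counts per link and divides the totals.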

SELECT journo, COUNT(linkid) AS articles, SUM(vote_sum) / SUM(vote_count) AS score
FROM
    (SELECT a.link_field1 AS journo, a.link_id AS linkid,
        SUM(b.vote_value) AS vote_sum,
        -- count votes, not joined rows, so a link without votes contributes 0
        COUNT(b.vote_value) AS vote_count
    FROM dan_links a
    LEFT JOIN dan_votes b
    ON a.link_id = b.vote_link_id
    WHERE a.link_field1 <> ''
    AND a.link_status NOT IN ('discard', 'spam', 'page')
    GROUP BY a.link_id, a.link_field1) AS per_link
GROUP BY journo
ORDER BY journo
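To check the difference between the two averages on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the trimmed schema below, with vote_link_id taken from the query, and per-link columns named vote_avg, vote_sum and vote_count, which are illustrative. SQLite divides integers as integers, hence the 1.0 * factor; that detail belongs to this sketch and may not be needed on the asker's database.

```python
import sqlite3

# In-memory copy of the sample data from the question (trimmed columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dan_links (link_id INTEGER PRIMARY KEY, link_field1 TEXT, link_status TEXT);
    CREATE TABLE dan_votes (vote_id INTEGER PRIMARY KEY, vote_link_id INTEGER, vote_value INTEGER);
    INSERT INTO dan_links VALUES (1, 'John S', 'approved'), (2, 'John S', 'approved'), (3, 'Joe B', 'approved');
    INSERT INTO dan_votes VALUES (1, 1, 5), (2, 1, 8), (3, 2, 4), (4, 3, 1);
""")

# Inner query: exactly one row per link, carrying its vote statistics.
PER_LINK = """
    SELECT a.link_field1 AS journo, a.link_id AS linkid,
           AVG(b.vote_value)   AS vote_avg,
           SUM(b.vote_value)   AS vote_sum,
           COUNT(b.vote_value) AS vote_count
    FROM dan_links a
    LEFT JOIN dan_votes b ON a.link_id = b.vote_link_id
    WHERE a.link_field1 <> ''
      AND a.link_status NOT IN ('discard', 'spam', 'page')
    GROUP BY a.link_id, a.link_field1
"""

# Outer query: one row per journalist, with both ways of averaging.
rows = conn.execute(f"""
    SELECT journo,
           COUNT(linkid) AS articles,
           AVG(vote_avg) AS avg_of_avgs,
           1.0 * SUM(vote_sum) / SUM(vote_count) AS overall_avg
    FROM ({PER_LINK}) AS per_link
    GROUP BY journo
    ORDER BY journo
""").fetchall()

for journo, articles, avg_of_avgs, overall_avg in rows:
    print(journo, articles, avg_of_avgs, round(overall_avg, 2))
# Joe B 1 1.0 1.0
# John S 2 5.25 5.67
```

For John S the per-link averages are 6.5 and 4, so averaging them gives 5.25, while the three votes 5, 8 and 4 average to 5.67. The article count is 2 in both cases.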

Hope this helps.

answered 2012-06-05T03:02:10.037