
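I'm new to Postgres and I'm stuck on something fairly difficult that I really need to get working. Also, I'm not using a proper editor; it's some kind of web-based editor. Please bear that in mind.

Here is my query: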

select coalesce('user') as user_src,
       coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
       count (*) as nohits
from $log
where coalesce(root_domain(hostname), hostname, 'unknown') in
    (select coalesce(root_domain(hostname), hostname, 'unknown') as web_domain
          from $log
          group by web_domain
          limit 10
    ) 
group by user_src, web_domain
order by user_src, web_domain, nohits desc

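But the results don't look the way I want them to. I want every user plus each user's top 10 websites. Right now I get all users and 10 websites in total, split across all users, so some users have 0 rows because they never visited any of those 10 sites.

Thanks for looking into it!

Edit: this is how I converted it (not working; it fails with: ERROR: column "hostname" does not exist)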

select  coalesce('user') as user_src,
        coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
        count (*) as nohits
from
    (select coalesce('user') as user_src,
            coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
            count (*) as nohits,
            rank() over (partition by coalesce('user') order by coalesce('user'), count (*) desc) as rank
    from $log
    group by user_src, web_domain) w
where rank <= 2
order by user_src, rank

This, for example, does work (just to confirm that "hostname" exists):

select  coalesce('user') as user_src,
        coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
        count (*) as nohits
from $log
group by user_src, web_domain
order by user_src, nohits

1 Answer

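The query you posted can't show a per-user breakdown because the "coalesce('user')" part evaluates to a single value: 'user' in single quotes is a string literal, so every row ends up in the same group. What will help you is one of PostgreSQL's window functions. I'll demonstrate with a simple example that uses RANK() to get the top N rows for each user.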

begin;

drop table if exists weblog;
create table weblog (
"user"    int,
url     text
);

insert into weblog values
(1,'http://www.1.com'),
(1,'http://www.1.com'),
(1,'http://www.2.com'),
(1,'http://www.2.com'),
(1,'http://www.3.com'),
(1,'http://www.4.com'),
(1,'http://www.5.com'),
(1,'http://www.6.com'),

(2,'http://www.2.com'),
(2,'http://www.2.com'),
(2,'http://www.3.com'),
(2,'http://www.4.com'),
(2,'http://www.4.com'),
(2,'http://www.4.com'),
(2,'http://www.5.com'),
(2,'http://www.6.com');


select  "user",
        url,
        hits,
        rank
from    (select "user",
                url,
                count(*) as hits,
                rank() over (partition by "user" order by count(*) desc,url) as rank
        from weblog
        group by "user",url) w
where rank <= 2
order by "user",rank;

 user |       url        | hits | rank 
------+------------------+------+------
    1 | http://www.1.com |    2 |    1
    1 | http://www.2.com |    2 |    2
    2 | http://www.4.com |    3 |    1
    2 | http://www.2.com |    2 |    2


rollback;
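
One caveat about the example above: it orders by url as a tiebreaker, so every row within a user gets a distinct rank. Without a tiebreaker, RANK() gives tied rows the same rank, and "rank <= N" can return more than N rows per user. The following is a minimal sketch of that difference on made-up data (the three rows and the aliases rnk and rn are hypothetical, not from the question); ROW_NUMBER() is shown as an alternative when exactly N rows are wanted.

```sql
-- Hypothetical data: user 1 visited three sites once each (a three-way tie).
with weblog("user", url) as (
    values (1, 'http://www.1.com'),
           (1, 'http://www.2.com'),
           (1, 'http://www.3.com')
)
select "user",
       url,
       count(*) as hits,
       -- no tiebreaker: all tied rows share rank 1
       rank()       over (partition by "user" order by count(*) desc)      as rnk,
       -- url as tiebreaker: rows are numbered 1, 2, 3
       row_number() over (partition by "user" order by count(*) desc, url) as rn
from weblog
group by "user", url
order by "user", rn;
```

Filtering this result on rnk <= 2 would keep all three rows, while rn <= 2 would keep only two. Which behaviour is right depends on whether ties should be cut off or all shown.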

Hope this works for you.


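[After the OP's edit:]

Your outer query should only select the columns produced by the inner query, not redo the same steps. The derived table w exposes just user_src, web_domain, nohits and rank, so hostname is no longer visible outside it, which is why you get the "column does not exist" error. Try the following (based on your latest edit):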

select  user_src,
        web_domain,
        nohits
from
    (select coalesce('user') as user_src,
            coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
            count (*) as nohits,
            rank() over (partition by coalesce('user') order by coalesce('user'), count (*) desc) as rank
    from $log
    group by user_src, web_domain) w
where rank <= 2
order by user_src, rank
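
Note that this still partitions by coalesce('user'), which is a constant, so it ranks sites across all rows rather than per user. Below is a rough sketch of the per-user version, assuming the log has a real column named "user" (double-quoted, because unquoted user is a reserved word in PostgreSQL). The log_sample data and the rnk alias are hypothetical, and root_domain() is left out because it is specific to your environment.

```sql
-- Hypothetical stand-in for $log; "user" and hostname are assumed column names.
with log_sample("user", hostname) as (
    values ('alice', 'a.example.com'),
           ('alice', 'a.example.com'),
           ('alice', 'b.example.com'),
           ('bob',   'c.example.com'),
           ('bob',   null)
)
select user_src,
       web_domain,
       nohits
from (select coalesce("user", 'unknown')   as user_src,
             coalesce(hostname, 'unknown') as web_domain,
             count(*)                      as nohits,
             -- partition by the column value, not by the literal 'user'
             rank() over (partition by coalesce("user", 'unknown')
                          order by count(*) desc,
                                   coalesce(hostname, 'unknown')) as rnk
      from log_sample
      group by user_src, web_domain) w
where rnk <= 2
order by user_src, rnk;
```

If your log uses a different column for the user, substitute it in both the select list and the partition by clause.
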
answered 2013-07-02T16:18:35.730