
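I want to analyze an integer array column in PostgreSQL (9.1). Using the intarray module (see its documentation) I am able to compute:

  • the array length
  • the minimum value
  • the maximum value
  • the number of distinct elements

My query is: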

    select 
        (array_length(string_to_array(num_partition,' ')::int[], 1))::smallint as part_len,
        icount(uniq(sort(string_to_array(num_partition,' ')::int[])))::smallint as part_unq,
        ((sort(string_to_array(num_partition,' ')::int[],'desc'))[1])::smallint as part_max,
        ((sort(string_to_array(num_partition,' ')::int[]))[1])::smallint as part_min    
    from 
        tmp.npart 

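Now I want to compute the minimum difference between any two unequal elements. Example:

Array [1,5,5,10]
Expected result: 4 (because 5 - 1 = 4)

I think I could compute it as follows:

  1. Get the unique elements of the array
  2. Sort the array
  3. For each pair of adjacent elements, compute A[i] - A[i+1]
  4. Take the minimum of the results from step 3

Example: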

Input: [7,9,12,20,25,1,1,20,25]
1) Unique [1,7,9,12,20,25]
2) Sort (desc): [25,20,12,9,7,1] 
3) Diff A[i] - A[i+1]: [5,8,3,2,6]
4) Min: 2

Is there a simple way to do this? I need to compute it on a table with 150,000,000 rows.

Sample data (or SQL Fiddle):

    create table tmp (intarr int[]);

    insert into tmp values (ARRAY[1,1,3,6,9,25]);
    insert into tmp values (ARRAY[10,20,30,50]);
    insert into tmp values (ARRAY[1,4,8,15,21]);
    insert into tmp values (ARRAY[1]);
    insert into tmp values (ARRAY[1,1,1,1,9,9,9,9,20,20,20]);

2 Answers


A function that iterates over the array:

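    create or replace function array_min_diff(a int[])
    returns int as
    $$
    declare
        min_diff int := null;
    begin
        -- Replace the input with its distinct elements in ascending order.
        select array_agg(e order by e)
        from (
            select distinct e
            from unnest(a) s(e)
        ) s
        into a;

        -- Compare each element with its predecessor. The coalesce guards
        -- against NULL or empty input, where array_upper returns NULL and
        -- an "exit when i > array_upper(a, 1)" test would never be true.
        for i in 2 .. coalesce(array_upper(a, 1), 1) loop
            -- least() ignores NULL arguments, so the first difference
            -- seeds min_diff.
            min_diff := least(min_diff, a[i] - a[i - 1]);
        end loop;

        -- NULL when there are fewer than two distinct elements.
        return min_diff;
    end;
    $$ language plpgsql immutable;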
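
As a quick check, the function can be called on the sample table `tmp` from the question. The values in the comments were worked out by hand from the sample rows, and the row order is not guaranteed without an `order by`:

    select intarr, array_min_diff(intarr) as min_diff
    from tmp;

    -- {1,1,3,6,9,25}             -> 2     (3 - 1)
    -- {10,20,30,50}              -> 10    (20 - 10)
    -- {1,4,8,15,21}              -> 3     (4 - 1)
    -- {1}                        -> NULL  (fewer than two distinct elements)
    -- {1,1,1,1,9,9,9,9,20,20,20} -> 8     (9 - 1)

A single-element array yields NULL rather than 0, which seems to match the question's "unequal elements" requirement, but that is a choice you may want to change.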
answered 2013-02-28T18:25:53.453

SQL Fiddle

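    select intarr, min(diff) as min_diff
    from (
        select
            intarr,
            -- Difference between each value and the next smaller value
            -- within the same array (NULL for the smallest value).
            i - lag(i) over (partition by intarr order by i) as diff
        from (
            -- One row per distinct element of each array.
            select distinct intarr, unnest(intarr) as i
            from tmp
        ) s
    ) s
    group by intarr;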
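
Because this query partitions and groups by the array itself, rows holding identical arrays collapse into one result row, and the whole array is compared for every element. If the real table has a unique key, grouping by that key may be preferable. A minimal sketch, assuming a hypothetical unique column `id` on `tmp` (the sample table in the question does not have one):

    -- Assumption: tmp has a unique column "id"; it is not part of the
    -- question's sample table and is used here only for illustration.
    select id, min(diff) as min_diff
    from (
        select
            id,
            i - lag(i) over (partition by id order by i) as diff
        from (
            select distinct id, unnest(intarr) as i
            from tmp
        ) s
    ) s
    group by id;

I have not measured this against 150,000,000 rows, so treat it as something to benchmark rather than a guaranteed improvement.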
answered 2013-02-28T15:43:09.147