I want to return a table. The function takes an array (the query is 'select function_name(array_agg(column_name)) from table_name').

I wrote the code below:

create type pddesctype as(
    count float,
    mean float,
    std float,
    min float
);

create function pddesc(x numeric[])
returns pddesctype
as $$
    import pandas as pd
    data=pd.Series(x)
    
    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]
    
    return count, mean, std, min

$$ language plpython3u;

This code returns everything as a single composite value in one column: (float, float, float, ...).

I also tried:

create function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
    import pandas as pd
    data=pd.Series(x)
    
    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]
    
    return count, mean, std, min

$$ language plpython3u;

But I get an error:

ERROR:  key "count" not found in mapping
HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
CONTEXT:  while creating return value.

I want to show the result as columns (like a table) without creating a type beforehand.

How should I change the RETURN / RETURNS syntax?

1 Answer

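Here are the steps I went through to get a one-row table with four columns as output. The last step has the solution; the first step is another way to reproduce the error.

Checking np.array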

create or replace function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
    import pandas as pd
    import numpy as np
    data=pd.Series(x)

    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]
    
    ## print an INFO of the output:
    plpy.info(np.array([count, mean, std, min]))

    return np.array([count, mean, std, min])

$$ language plpython3u;

Test fails (reproducing the error from the question):

postgres=# SELECT * FROM pddesc(ARRAY[1,2,3]);
INFO:  [3 3 Decimal('1') 1]
ERROR:  key "count" not found in mapping
HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
CONTEXT:  while creating return value
PL/Python function "pddesc"

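Working solution: np.array([...]).reshape(1,-1)

You need to reshape the array so that it has the dimensions you want to get. In this case that is 1 row x 4 columns; .reshape(1,-1) means 1 row and -1 (= however many are needed) columns.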
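
To see why the extra dimension matters, here is a minimal plain-Python sketch, run outside PostgreSQL, of how the result of a set-returning function appears to be consumed: PL/Python iterates over the returned object and treats each element as one row. The helper `rows_seen` is hypothetical and only mimics that iteration; it is not part of PL/Python.

```python
def rows_seen(result):
    """Hypothetical stand-in for how a RETURNS TABLE result is consumed:
    iterate over the returned object and treat every element as one row."""
    return [row for row in result]

stats = (3.0, 2.0, 1.0, 1.0)   # count, mean, std, min (illustrative values)

flat = rows_seen(stats)        # four "rows", each a bare scalar
nested = rows_seen([stats])    # one row holding four column values

print(len(flat), flat[0])
print(len(nested), nested[0])
```

Under that assumption, returning a one-element list of tuples, `return [(count, mean, std, min)]`, should give the same one-row shape as `.reshape(1,-1)` without needing NumPy.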

create or replace function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
    import pandas as pd
    import numpy as np
    data=pd.Series(x)

    count=data.describe()[0]
    mean=data.describe()[1]
    std=data.describe()[2]
    min=data.describe()[3]

    ## print an INFO of the output:
    plpy.info(np.array([count, mean, std, min]).reshape(1,-1))

    return np.array([count, mean, std, min]).reshape(1,-1)
    ## or with the same result:
    # return np.hstack((count, mean, std, min)).reshape(1,-1)

$$ language plpython3u;

Test:

postgres=# SELECT * FROM pddesc(ARRAY[1,2,3]);
INFO:  [[3 3 Decimal('1') 1]]
 count | mean | std | min
-------+------+-----+-----
     3 |    3 |   1 |   1
(1 row)
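
The `Decimal('1')` in the INFO line hints at a side issue: PL/Python hands `numeric` values to Python as `Decimal` objects, so `pd.Series(x)` most likely ends up with object dtype, where `describe()` reports count, unique, top and freq instead of count, mean, std and min. That would explain why the `mean` column shows 3 for the input 1, 2, 3. Below is a minimal sketch of the intended statistics, assuming the values are converted to `float` first. It uses only the standard library, and `describe_row` is a hypothetical helper name.

```python
from decimal import Decimal
from statistics import mean, stdev

def describe_row(values):
    """Return one row (count, mean, std, min) as a list holding a single tuple."""
    nums = [float(v) for v in values]   # Decimal -> float
    # stdev() is the sample standard deviation (n - 1 denominator);
    # it is undefined for fewer than two values, so fall back to None.
    std = stdev(nums) if len(nums) > 1 else None
    return [(float(len(nums)), mean(nums), std, min(nums))]

print(describe_row([Decimal(1), Decimal(2), Decimal(3)]))
```

Inside the function, `pd.Series(x, dtype=float)` might have the same effect; that variant is untested here.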
answered 2021-09-04T17:04:02.547