
The full error message is:

ERROR: invalid input syntax for integer: "1e+06"
SQL state: 22P02
Context: In PL/R function sample

The query I am running is:

WITH a as
(
 SELECT a.tract_id_alias,
     array_agg(a.pgid ORDER BY a.pgid) as pgids,
     array_agg(a.sample_weight_geo ORDER BY a.pgid) as block_weights
 FROM results_20161109.block_microdata_res_joined a
 WHERE a.tract_id_alias in (66772, 66773, 66785, 66802, 66805, 66806, 66813)
 AND a.bldg_count_res > 0 
 GROUP BY a.tract_id_alias

)
SELECT NULL::INTEGER agent_id, 
     a.tract_id_alias,
     b.year,
    unnest(shared.sample(a.pgids, 
                       b.n_agents, 
                       1 * b.year, 
                       True, 
                       a.block_weights)
                       ) as pgid
FROM a
LEFT JOIN results_20161109.initial_agent_count_by_tract_res_11 b
ON a.tract_id_alias = b.tract_id_alias
ORDER BY b.year, a.tract_id_alias, pgid;

The shared.sample function I am using is:

CREATE OR REPLACE FUNCTION shared.sample(ids bigint[], size integer, seed integer DEFAULT 1, with_replacement boolean DEFAULT false, probabilities numeric[] DEFAULT NULL::numeric[])
  RETURNS integer[] AS
$BODY$
    set.seed(seed)
    if (length(ids) == 1) {
        s = rep(ids,size)
    } else {
        s = sample(ids,size, with_replacement,probabilities)
    }
    return(s)
$BODY$
  LANGUAGE plr VOLATILE
  COST 100;
ALTER FUNCTION shared.sample(bigint[], integer, integer, boolean, numeric[])
  OWNER TO "server-superusers";

I am new to all of this, so any help would be greatly appreciated.


1 Answer


The function is not the problem. It is just as the error message says: the string '1e+06' cannot be cast to integer.

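Evidently, the column n_agents in your table results_20161109.initial_agent_count_by_tract_res_11 is not an integer column. Probably of type text or varchar? (That information would have been helpful in your question.)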
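
One way to check that assumption is to look up the declared column type in the catalog. This is a minimal sketch that uses only the schema, table, and column names from the question:

```sql
-- Look up the declared data type of n_agents.
SELECT column_name, data_type
FROM   information_schema.columns
WHERE  table_schema = 'results_20161109'
AND    table_name   = 'initial_agent_count_by_tract_res_11'
AND    column_name  = 'n_agents';

-- Alternatively, ask for the type of an actual value.
SELECT pg_typeof(n_agents)
FROM   results_20161109.initial_agent_count_by_tract_res_11
LIMIT  1;
```

If either query reports integer, the assumption above does not hold and the cast suggested below will not help.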

Either way, the assignment cast does not work for the target type integer. But it does work for numeric.

Does not work:

SELECT '1e+06'::text::int;  -- error as in question

Works:

SELECT '1e+06'::text::numeric::int;

If my assumption holds, you can use this as a stepping stone. In your query, replace b.n_agents with b.n_agents::numeric::int.

It is your responsibility that the numbers stay within integer range, or you will run into the next exception.
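
To illustrate the range caveat, here is a small sketch. The literals are made up for illustration; the PostgreSQL integer type tops out at 2147483647:

```sql
-- Fits in integer range: works.
SELECT '2e+09'::text::numeric::int;

-- Exceeds the integer maximum: raises "integer out of range".
SELECT '3e+09'::text::numeric::int;
```

Also note that the cast from numeric to integer rounds fractional values rather than rejecting them.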


If that does not solve the problem, you need to look into function overloading and function type resolution.
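
As a starting point for that, the following sketch lists every overloaded variant of shared.sample with its argument and return types, so you can see which one your call might resolve to. It relies only on the standard system catalogs:

```sql
-- List all functions named "sample" in schema "shared".
SELECT p.oid::regprocedure           AS signature
     , pg_get_function_result(p.oid) AS returns
FROM   pg_proc p
JOIN   pg_namespace n ON n.oid = p.pronamespace
WHERE  n.nspname = 'shared'
AND    p.proname = 'sample';
```

If more than one row comes back, compare the argument types of each variant with the types you actually pass in the call.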

The schema search path is relevant in many related cases, but you did schema-qualify all objects, so we can rule that out.

Your query generally looks fine. I had a look and found only minor improvements:

SELECT NULL::int AS agent_id  -- never omit the AS keyword for column alias
     , a.tract_id_alias
     , b.year
     , s.pgid
FROM  (
   SELECT tract_id_alias
        , array_agg(pgid)              AS pgids
        , array_agg(sample_weight_geo) AS block_weights
   FROM  (  -- use a subquery, cheaper than CTE
      SELECT tract_id_alias
           , pgid
           , sample_weight_geo
      FROM   results_20161109.block_microdata_res_joined
      WHERE  tract_id_alias IN (66772, 66773, 66785, 66802, 66805, 66806, 66813)
      AND    bldg_count_res > 0
      ORDER  BY pgid  -- sort once in a subquery. cheaper.
      ) sub
   GROUP  BY 1
   ) a
LEFT   JOIN results_20161109.initial_agent_count_by_tract_res_11 b USING (tract_id_alias)
LEFT   JOIN LATERAL
   unnest(shared.sample(a.pgids
                      , b.n_agents
                      , b.year  -- why "1 * b.year"?
                      , true
                      , a.block_weights)) s(pgid) ON true
ORDER  BY b.year, a.tract_id_alias, s.pgid;
Answered 2016-11-09T18:01:12.003