I've run into some odd behavior with the crosstab function in postgres that I can't explain, but I'm hoping someone else can...

The version of the crosstab function I'm using requires a preliminary table to be built first.

This SQL successfully creates the preliminary table:

SELECT 
    ST.studyabrv||' '||S.labid||' '||S.subjectid||' '||S.box::varchar||' '||S.well AS "rowname",
    M.marker AS "bucket", 
    G.allele1||' '||G.allele2 AS "bucket_value" 
INTO TABLE ct 
FROM 
    geno.gmarkers M, 
    geno.genotypes G, 
    geno.gsamples S, 
    geno.guploads U, 
    geno.gibg_studies ST 
WHERE 
    G.markers_id=M.id 
    AND G.gsamples_id=S.id 
    AND S.guploads_id=U.id 
    AND U.ibg_study_id=ST.id 
    AND ( M.id=5 OR M.id=6 OR M.id=2 OR M.id=4 OR M.id=3) 
    AND ( S.labid='CL100001' OR S.labid='CL100002' OR S.labid='CL100003' OR S.labid='CL100004' OR S.labid='CL100005' OR S.labid='CL100006' OR S.labid='CL100007' OR S.labid='CL100008' OR S.labid='CL100009' OR S.labid='CL100010' OR S.labid='CL100011' OR S.labid='CL100012' OR S.labid='CL100013' OR S.labid='CL100014' OR S.labid='CL100015') 
ORDER BY box,well;

It produces output like this:

         rowname          |  bucket   | bucket_value 
--------------------------+-----------+--------------
 LTS CL100001 10011 1 A01 | 5HTTLPR-T | S La
 LTS CL100001 10011 1 A01 | 5HTTLPR-D | 14 16
 LTS CL100001 10011 1 A01 | DAT1      | 440 480
 LTS CL100001 10011 1 A01 | DRD4      | 475 475
 LTS CL100001 10011 1 A01 | Caspi     | 351 351
 LTS CL100009 10420 1 A02 | Caspi     |  
 LTS CL100009 10420 1 A02 | 5HTTLPR-T | La Lg
 LTS CL100009 10420 1 A02 | 5HTTLPR-D | 16 16
 LTS CL100009 10420 1 A02 | DAT1      | 440 480
 LTS CL100009 10420 1 A02 | DRD4      | 475 475
...

However, if I try to include a date column whose values are all null, like this:

SELECT 
    ST.studyabrv||' '||S.labid||' '||S.subjectid||' '||S.box::varchar||' '||S.well||' '||G.run_date::text AS "rowname", 
    M.marker AS "bucket", 
    G.allele1||' '||G.allele2 AS "bucket_value" 
INTO TABLE ct 
FROM 
    geno.gmarkers M, 
    geno.genotypes G, 
    geno.gsamples S, 
    geno.guploads U, 
    geno.gibg_studies ST 
WHERE 
    G.markers_id=M.id 
    AND G.gsamples_id=S.id 
    AND S.guploads_id=U.id 
    AND U.ibg_study_id=ST.id 
    AND ( M.id=5 OR M.id=6 OR M.id=2 OR M.id=4 OR M.id=3) 
    AND ( S.labid='CL100001' OR S.labid='CL100002' OR S.labid='CL100003' OR S.labid='CL100004' OR S.labid='CL100005' OR S.labid='CL100006' OR S.labid='CL100007' OR S.labid='CL100008' OR S.labid='CL100009' OR S.labid='CL100010' OR S.labid='CL100011' OR S.labid='CL100012' OR S.labid='CL100013' OR S.labid='CL100014' OR S.labid='CL100015') 
ORDER BY box,well;

it produces this output:

 rowname |  bucket   | bucket_value 
---------+-----------+--------------
         | 5HTTLPR-T | S La
         | 5HTTLPR-D | 14 16
         | DAT1      | 440 480
         | DRD4      | 475 475
         | Caspi     | 351 351
         | Caspi     |  
         | 5HTTLPR-T | La Lg
         | 5HTTLPR-D | 16 16

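As you can see, appending the run_date column to the end of the composite "rowname" column makes the entire composite come out blank... which is crazy. If I populate run_date with dummy data, it shows up... but if it is blank or null, the whole "rowname" becomes blank.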

I can't tell whether this is a bug in postgres, but I'd like to work around this odd result if possible.

TIA, Rixter

2 Answers

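You should think of null as an unknown value. A null is neither a number nor a string, so you cannot operate on it as if it were one. In particular, concatenating anything with null using || yields null, which is why the whole "rowname" comes out empty. Make sure to use a function that returns a non-null value, such as coalesce(), which returns the first non-null argument from left to right; pass a default value as the rightmost argument to guarantee a non-null result.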
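
As a minimal sketch of that behavior, the two standalone statements below can be run in psql. The literal strings are illustrative stand-ins rather than rows from the question's tables, and NULL::date plays the role of an empty run_date:

```sql
-- One NULL operand makes the whole || chain NULL.
-- The literals are made-up stand-ins for studyabrv and labid.
SELECT 'LTS' || ' ' || 'CL100001' || ' ' || NULL::date::text AS rowname;
-- rowname is NULL (psql displays it as blank)

-- Cast to text first, then let coalesce() substitute an empty string.
SELECT 'LTS' || ' ' || 'CL100001' || ' ' || coalesce(NULL::date::text, '') AS rowname;
-- rowname is 'LTS CL100001 ' (note the trailing space)
```

Applied to the query in the question, the same pattern would presumably mean ending the "rowname" expression with ||' '||coalesce(G.run_date::text, '') in place of ||' '||G.run_date::text.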

answered 2012-04-04T00:25:44.657
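-- cast run_date to text before coalesce, so the '' fallback is treated as text and not parsed as a date
|| coalesce(G.run_date::text, '')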
answered 2012-04-04T00:00:09.163