
I query an entire postgres table using:

c.execute("select * from train_temp")
trans=np.array(c.fetchall())

Along with the expected data, I get a row containing the column names.

trans[-1,]
Out[63]: 
array(['ACTION', 'RESOURCE', 'MGR_ID', 'ROLE_ROLLUP_1', 'ROLE_ROLLUP_2',
       'ROLE_DEPTNAME', 'ROLE_TITLE', 'ROLE_FAMILY_DESC', 'ROLE_FAMILY',
       'ROLE_CODE', None, None, None, None, None, None, None, None, None], dtype=object)

Even more puzzling, the number of rows returned matches the number of rows in the table:

trans.shape
Out[67]: (32770, 19)



select count(1) from train_temp ;
 count 
-------
 32770
(1 row)

Here is the table's schema:

                         Table "public.train_temp"
       Column        |       Type       | Modifiers | Storage  | Description 
---------------------+------------------+-----------+----------+-------------
 action              | text             |           | extended | 
 resource            | text             |           | extended | 
 mgr_id              | text             |           | extended | 
 role_rollup_1       | text             |           | extended | 
 role_rollup_2       | text             |           | extended | 
 role_deptname       | text             |           | extended | 
 role_title          | text             |           | extended | 
 role_family_desc    | text             |           | extended | 
 role_family         | text             |           | extended | 
 role_code           | text             |           | extended | 
 av_role_code        | double precision |           | plain    | 
 av_role_family      | double precision |           | plain    | 
 av_role_family_desc | double precision |           | plain    | 
 av_role_title       | double precision |           | plain    | 
 av_role_deptname    | double precision |           | plain    | 
 av_role_rollup_2    | double precision |           | plain    | 
 av_role_rollup_1    | double precision |           | plain    | 
 av_mgr_id           | double precision |           | plain    | 
 av_resource         | double precision |           | plain    | 
Has OIDs: no

What is going on here? Note that this does not happen with every table. In fact, for this last one the process works fine:

 Table "public.play"
  Column   |       Type       | Modifiers | Storage  | Description 
-----------+------------------+-----------+----------+-------------
 row.names | text             |           | extended | 
 action    | double precision |           | plain    | 
 color     | text             |           | extended | 
 type      | text             |           | extended | 
Has OIDs: no

This last table comes through entirely as strings, whereas the problematic table respects the data types.

play[1,]
Out[73]: 
array(['2', '0.0', 'blue', 'car'], 
      dtype='|S5')


trans[1,]
Out[74]: 
array(['1', '0', '36', '117961', '118413', '119968', '118321', '117906',
       '290919', '118322', 0.920412992041299, 0.942349726775956,
       0.933439675174014, 0.920412992041299, 0.976, 0.964478764478764,
       0.949222217031812, 0.909090909090909, 0.923076923076923], dtype=object)

Thanks for any insight.


1 Answer


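It turns out I had loaded the header line myself when importing the CSV into postgres, so it was stored as an ordinary row of the table.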

In this case I should have used the header option in psql:

\copy test from 'test.csv' with (delimiter ',' , format csv, header TRUE);
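
If the table has already been loaded with the header line in it, the stray row can also be spotted on the Python side. The following is only a rough sketch: `is_header_row` and `drop_header_rows` are hypothetical helper names, and the sample rows are a shortened, three-column illustration of the output shown in the question, not real query results. It assumes a header row is one whose non-NULL cells all equal the name of their column, ignoring case.

```python
def is_header_row(row, column_names):
    """Return True if every non-NULL cell equals its column's name (case-insensitive)."""
    cells = [(value, name) for value, name in zip(row, column_names) if value is not None]
    # A row made only of NULLs is not treated as a header.
    return bool(cells) and all(
        isinstance(value, str) and value.strip().lower() == name.lower()
        for value, name in cells
    )


def drop_header_rows(rows, column_names):
    """Return the rows with any header-like row removed."""
    return [row for row in rows if not is_header_row(row, column_names)]


# Shortened illustration: two text columns and one double precision column.
column_names = ["action", "resource", "av_resource"]
rows = [
    ("1", "0", 0.923076923076923),
    ("ACTION", "RESOURCE", None),  # header line that was imported as data
]

clean_rows = drop_header_rows(rows, column_names)
print(len(rows), len(clean_rows))
```

With a DB-API cursor such as `c` from the question, the column names can be taken from `[d[0] for d in c.description]` after `c.execute(...)`. Reloading the table with the `header` option remains the cleaner fix, since it keeps the stray row out of the table in the first place.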
answered 2013-06-20T16:39:37.167