I'm running Postgres 9.1.3 32-bit on Windows 7 x64. (I have to use 32-bit because there is no Windows build of PostGIS that is compatible with 64-bit Postgres.) (EDIT: As of PostGIS 2.0, it is compatible with 64-bit Postgres on Windows.)

I have a query that left joins a table (consistent.master) with a temporary table, then inserts the resulting data into a third table (consistent.masternew).

Since this is a left join, the resulting table should have the same number of rows as the left table in the query. However, if I run this:

SELECT count(*)
FROM consistent.master

I get 2085343. But if I run this:

SELECT count(*)
FROM consistent.masternew

I get 2085703.

How can masternew have more rows than master? Shouldn't masternew have the same number of rows as master, the left table in the query?

Below is the query. The master and masternew tables should be identically structured.

--temporary table created here
--I am trying to locate where multiple tickets were written on
--a single traffic stop
WITH stops AS (
    SELECT citation_id,
           rank() OVER (ORDER BY offense_timestamp,
                     defendant_dl,
                     offense_street_number,
                     offense_street_name) AS stop
    FROM   consistent.master
    WHERE  citing_jurisdiction=1
)

--Here's the insert statement. Below you'll see it's
--pulling data from a select query
INSERT INTO consistent.masternew (arrest_id,
  citation_id,
  defendant_dl,
  defendant_dl_state,
  defendant_zip,
  defendant_race,
  defendant_sex,
  defendant_dob,
  vehicle_licenseplate,
  vehicle_licenseplate_state,
  vehicle_registration_expiration_date,
  vehicle_year,
  vehicle_make,
  vehicle_model,
  vehicle_color,
  offense_timestamp,
  offense_street_number,
  offense_street_name,
  offense_crossstreet_number,
  offense_crossstreet_name,
  offense_county,
  officer_id,
  offense_code,
  speed_alleged,
  speed_limit,
  work_zone,
  school_zone,
  offense_location,
  source,
  citing_jurisdiction,
  the_geom)

--Here's the select query that the insert statement is using.    
SELECT stops.stop,
  master.citation_id,
  defendant_dl,
  defendant_dl_state,
  defendant_zip,
  defendant_race,
  defendant_sex,
  defendant_dob,
  vehicle_licenseplate,
  vehicle_licenseplate_state,
  vehicle_registration_expiration_date,
  vehicle_year,
  vehicle_make,
  vehicle_model,
  vehicle_color,
  offense_timestamp,
  offense_street_number,
  offense_street_name,
  offense_crossstreet_number,
  offense_crossstreet_name,
  offense_county,
  officer_id,
  offense_code,
  speed_alleged,
  speed_limit,
  work_zone,
  school_zone,
  offense_location,
  source,
  citing_jurisdiction,
  the_geom
FROM consistent.master LEFT JOIN stops
ON stops.citation_id = master.citation_id

In case it matters, I have run a VACUUM FULL ANALYZE and reindexed both tables. (Not sure of the exact commands; it was done through pgAdmin III.)

2 Answers

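A left join does not necessarily return the same number of rows as the left table. Basically, it is like a normal (inner) join, except that rows of the left table that would not appear in the normal join are added as well. So if more than one row in the right table matches a row in the left table, the result can have more rows than the left table.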
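
The effect can be reproduced on a small scale. The following is only a minimal sketch: demo_left and demo_right are hypothetical temporary tables with made-up values, not the tables from the question. One left row matches two right rows, so the left join returns more rows than the left table holds.

```sql
-- Hypothetical demo tables; names and values are illustrative only.
CREATE TEMP TABLE demo_left  (citation_id integer);
CREATE TEMP TABLE demo_right (citation_id integer, stop integer);

INSERT INTO demo_left  VALUES (1), (2), (3);
-- citation_id 1 appears twice on the right; citation_id 3 has no match.
INSERT INTO demo_right VALUES (1, 10), (1, 11), (2, 20);

-- The left table has 3 rows.
SELECT count(*) FROM demo_left;

-- The left join returns 4 rows: two for citation_id 1, one for 2,
-- and one NULL-extended row for 3.
SELECT l.citation_id, r.stop
FROM   demo_left l
LEFT   JOIN demo_right r ON r.citation_id = l.citation_id
ORDER  BY l.citation_id, r.stop;
```

In the question, stops is built from consistent.master itself, so one plausible cause of the extra rows is a citation_id that occurs more than once in master among the rows with citing_jurisdiction=1.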

To do what you want to do, you should use a group by and a count to detect the multiples.

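-- stops is the CTE from the question; its WITH clause must precede this query.
-- citation_id is qualified because both stops and master have that column.
select master.citation_id
from stops join consistent.master on stops.citation_id = master.citation_id
group by master.citation_id
having count(*) > 1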
answered 2012-03-18T20:29:08.843

Sometimes you know there are multiples, but you don't care. You just want to take the first or top entry.
If so, you can use SELECT DISTINCT ON:

FROM consistent.master LEFT JOIN (SELECT DISTINCT ON (citation_id) * FROM stops) s
ON s.citation_id = master.citation_id

Here citation_id is the column for which you want to take the first (any) row for each match.

You will probably want to make this deterministic by adding an ORDER BY on some other sortable column:

SELECT DISTINCT ON (citation_id) * FROM stops ORDER BY citation_id, created_at
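
As a rough illustration of how DISTINCT ON picks one row per key, here is a minimal sketch on a hypothetical temporary table. demo_stops and its values are made up, and created_at is an assumed column, as in the query above; the stops CTE in the question only has citation_id and stop, so there one would presumably order by stop instead.

```sql
-- Hypothetical demo table; created_at is an assumed column, not one from the question.
CREATE TEMP TABLE demo_stops (
    citation_id integer,
    stop        integer,
    created_at  timestamp
);

INSERT INTO demo_stops VALUES
    (1, 10, '2012-01-01 09:00'),
    (1, 11, '2012-01-01 08:00'),
    (2, 20, '2012-01-02 12:00');

-- Keeps one row per citation_id: the one with the earliest created_at.
-- The DISTINCT ON expression must match the leftmost ORDER BY expression.
SELECT DISTINCT ON (citation_id) *
FROM   demo_stops
ORDER  BY citation_id, created_at;
-- Returns (1, 11, 2012-01-01 08:00:00) and (2, 20, 2012-01-02 12:00:00).
```

Without the ORDER BY, which of the two rows for citation_id 1 survives is not predictable.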
answered 2018-07-28T01:38:33.577