I'm using PostgreSQL 13.

With this query, PostgreSQL uses the indexes:

SELECT *
FROM
    "players"
WHERE team_id = 3
    AND (
    code ILIKE 'lushij'
    OR
    REPLACE(lastname||firstname,' ','') ILIKE '%lushij%'
    OR REPLACE(firstname||lastname,' ','') ILIKE '%lushij%'
    OR personal_info->>'houses' ILIKE '%lushij%'
    )
LIMIT 15

Limit  (cost=333.01..385.77 rows=15 width=360)
  ->  Bitmap Heap Scan on players  (cost=333.01..4061.29 rows=1060 width=360)
        Recheck Cond: ((code ~~* 'lushij'::text) OR (replace((lastname || firstname), ' '::text, ''::text) ~~* '%lushij%'::text) OR (replace((firstname || lastname), ' '::text, ''::text) ~~* '%lushij%'::text) OR ((personal_info ->> 'houses'::text) ~~* '%lushij%'::text))
        Filter: (team_id = 3)
        ->  BitmapOr  (cost=333.01..333.01 rows=1060 width=0)
              ->  Bitmap Index Scan on players_code_trgm  (cost=0.00..116.75 rows=100 width=0)
                    Index Cond: (code ~~* 'lushij'::text)
              ->  Bitmap Index Scan on players_replace_last_first_name_trgm  (cost=0.00..66.40 rows=320 width=0)
                    Index Cond: (replace((lastname || firstname), ' '::text, ''::text) ~~* '%lushij%'::text)
              ->  Bitmap Index Scan on players_replace_first_last_name_trgm  (cost=0.00..66.40 rows=320 width=0)
                    Index Cond: (replace((firstname || lastname), ' '::text, ''::text) ~~* '%lushij%'::text)
              ->  Bitmap Index Scan on players_personal_info_houses_trgm_idx  (cost=0.00..82.40 rows=320 width=0)
                    Index Cond: ((personal_info ->> 'houses'::text) ~~* '%lushij%'::text)

With the same query, but with the search text one character shorter (lushi instead of lushij), the indexes are not used:

SELECT *
FROM
    "players"
WHERE team_id = 3
    AND (
    code ILIKE 'lushi'
    OR
    REPLACE(lastname||firstname,' ','') ILIKE '%lushi%'
    OR REPLACE(firstname||lastname,' ','') ILIKE '%lushi%'
    OR personal_info->>'houses' ILIKE '%lushi%'
    )
LIMIT 15

Limit  (cost=0.00..235.65 rows=15 width=360)
  ->  Seq Scan on players  (cost=0.00..76853.53 rows=4892 width=360)
        Filter: ((team_id = 3) AND ((code ~~* 'lushi'::text) OR (replace((lastname || firstname), ' '::text, ''::text) ~~* '%lushi%'::text) OR (replace((firstname || lastname), ' '::text, ''::text) ~~* '%lushi%'::text) OR ((personal_info ->> 'houses'::text) ~~* '%lushi%'::text)))

Why?

UPDATE

If I comment out the LIMIT 15 line, the indexes are used.


Here is the structure:

Players table structure
-- ----------------------------
-- Table structure for players
-- ----------------------------
DROP TABLE IF EXISTS "public"."players";
CREATE TABLE "public"."players" (
  "id" int8 NOT NULL DEFAULT nextval('players_id_seq'::regclass),
  "created_at" timestamptz(6) NOT NULL DEFAULT now(),
  "updated_at" timestamptz(6),
  "team_id" int8 NOT NULL,
  "firstname" text COLLATE "pg_catalog"."default",
  "lastname" text COLLATE "pg_catalog"."default",
  "code" text COLLATE "pg_catalog"."default",
  "personal_info" jsonb
)
;

-- ----------------------------
-- Indexes structure for table players
-- ----------------------------
CREATE INDEX "players_personal_info_houses_trgm_idx" ON "public"."players" USING gin (
  (personal_info ->> 'houses'::text) COLLATE "pg_catalog"."default" "public"."gin_trgm_ops"
);
CREATE INDEX "players_code_trgm" ON "public"."players" USING gin (
  "code" COLLATE "pg_catalog"."default" "public"."gin_trgm_ops"
);
CREATE INDEX "players_lower_code" ON "public"."players" USING btree (
  lower(code) COLLATE "pg_catalog"."default" "pg_catalog"."text_ops" ASC NULLS LAST
);
CREATE INDEX "players_replace_first_last_name_trgm" ON "public"."players" USING gin (
  replace(firstname || lastname, ' '::text, ''::text) COLLATE "pg_catalog"."default" "public"."gin_trgm_ops"
);
CREATE INDEX "players_replace_last_first_name_trgm" ON "public"."players" USING gin (
  replace(lastname || firstname, ' '::text, ''::text) COLLATE "pg_catalog"."default" "public"."gin_trgm_ops"
);

-- ----------------------------
-- Primary Key structure for table players
-- ----------------------------
ALTER TABLE "public"."players" ADD CONSTRAINT "players_pkey" PRIMARY KEY ("id");

-- ----------------------------
-- Foreign Keys structure for table players
-- ----------------------------
ALTER TABLE "public"."players" ADD CONSTRAINT "players_team_id_fkey" FOREIGN KEY ("team_id") REFERENCES "public"."teams" ("id") ON DELETE NO ACTION ON UPDATE NO ACTION;

2 Answers

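OK... this is based on my knowledge of SQL Server and of SQL in general, but it probably applies here too.

First... because you are doing a SELECT *, it needs to go to the clustered index at some point (in PostgreSQL, that means the table heap itself).

The non-clustered indexes, if used, serve to identify the relevant rows; it then picks those rows out one by one (a nested loop join, sometimes called an index seek/scan + key lookup).

If there are too many rows, this is actually inefficient: you end up doing more reads etc. than if you had just read the whole table.

Reducing the length of the LIKE filter increases the cardinality estimate, i.e. the number of rows the query planner/optimizer expects the filter to match.

My guess is that the SQL engine makes an estimate (using the statistics on the indexes/data) and determines that reading all the data from the clustered index is likely to be more efficient than identifying the rows and reading them one by one.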


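Update following the OP's update about removing the LIMIT.

Well... once again, it comes down to how many rows it estimates will match the filter.

Imagine you ran ILIKE '%e%' in your original query. Every second row might match that. As you have no sort, it only needs to read (say) the first 30 rows of the clustered index to get your answer. Once again, the query planner/optimizer would probably conclude that this is the most efficient way to get them.

However, without the limit, it would need to read all the rows to get all the results.

  • For %e%, a single full clustered index scan is probably more efficient, since it expects many rows to match
  • For more complex/selective filters, it is usually more efficient to search the indexes first (and then go straight to the data in the clustered index)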
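
As a rough illustration of the "first 30 rows" reasoning, here is a minimal Python sketch. The function name and the even-spread assumption are mine, not anything the planner exposes: if a given fraction of rows matches and matches are spread evenly through the table, a scan that stops at the LIMIT reads roughly limit / fraction rows.

```python
import math


def expected_rows_scanned(limit: int, match_fraction: float) -> int:
    """Rough number of rows a sequential scan reads before it has `limit`
    matches, assuming matches are spread evenly through the table."""
    if not 0 < match_fraction <= 1:
        raise ValueError("match_fraction must be in (0, 1]")
    return math.ceil(limit / match_fraction)


# Every second row matches (the '%e%' case): 30 rows for LIMIT 15.
print(expected_rows_scanned(15, 0.5))

# A hypothetical, far more selective filter: one row in 1024 matches,
# so the scan has to read many more rows before it can stop.
print(expected_rows_scanned(15, 1 / 1024))
```

The more selective the filter, the further the scan has to go before it can stop, which is when looking the rows up through an index starts to pay off.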
answered 2020-10-11T21:53:31.173

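The shorter the string, the less selective your condition is. Based on its estimates, PostgreSQL thinks that for the short string enough rows match the condition that it is cheaper to fetch rows sequentially and discard the ones that don't match until it has found 15 matching rows.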
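
The numbers in the two plans from the question are consistent with the Limit node charging only a proportional share of its child's cost. Below is a minimal Python sketch of that arithmetic; the function name is hypothetical, and the sequential-scan figure for 'lushij' is an assumption (same total cost as the 'lushi' scan, with the 1060-row estimate), since the question does not show that plan.

```python
def limit_cost(startup: float, total: float, est_rows: int, limit: int) -> float:
    """Cost of fetching `limit` rows from a child node, interpolating
    linearly between the child's startup cost and its total cost."""
    fraction = min(1.0, limit / est_rows)
    return startup + (total - startup) * fraction


# 'lushi': Seq Scan (cost=0.00..76853.53 rows=4892), taken from the question.
seq_lushi = limit_cost(0.00, 76853.53, 4892, 15)

# 'lushij': Bitmap Heap Scan (cost=333.01..4061.29 rows=1060), from the question.
bitmap_lushij = limit_cost(333.01, 4061.29, 1060, 15)

# ASSUMPTION: a sequential scan for 'lushij' would have the same total cost
# but only 1060 estimated matches, so 15 rows need a larger share of the scan.
seq_lushij = limit_cost(0.00, 76853.53, 1060, 15)

print(round(seq_lushi, 2))      # 235.65, the Limit cost in the second plan
print(round(bitmap_lushij, 2))  # 385.77, the Limit cost in the first plan
print(seq_lushij > bitmap_lushij)
```

With more estimated matches, the sequential scan expects to find its 15 rows sooner, and its prorated cost drops below what the bitmap scan has to pay just to build the bitmaps.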

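It may well be that the many OR conditions make the optimizer underestimate the selectivity (that is, overestimate the number of matching rows), because the conditions are treated as uncorrelated, which may not be the case.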
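
To show why treating the conditions as uncorrelated inflates the estimate, here is a small Python sketch. The 1% per-condition figure is made up for illustration, and the function is only a model of the independence rule P(A or B) = P(A) + P(B) - P(A) * P(B), not PostgreSQL's actual code.

```python
def or_selectivity(selectivities):
    """Combine per-condition selectivities for an OR, assuming the
    conditions are independent: P(A or B) = P(A) + P(B) - P(A) * P(B)."""
    combined = 0.0
    for s in selectivities:
        combined = combined + s - combined * s
    return combined


# Hypothetical: each of the four OR branches matches 1% of the rows.
independent_estimate = or_selectivity([0.01] * 4)

# If the branches are fully correlated (the same rows match every branch),
# the true matching fraction is still just 1%.
fully_correlated = 0.01

print(independent_estimate)                     # close to 4%
print(independent_estimate / fully_correlated)  # overestimate factor
```

In this made-up case the independence assumption predicts nearly four times as many matching rows as actually exist, which pushes the planner toward the "scan until 15 rows are found" plan.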

answered 2020-10-12T05:32:07.327