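I'm playing with hstore in PostgreSQL 9.3. I'm trying to use and index an hstore column as the documentation describes. My problem is that the index doesn't seem to be used. Let me give you an example:

I created a table "Person":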

=# CREATE TABLE Person (Id BIGSERIAL PRIMARY KEY NOT NULL, Values hstore);

and inserted a test value:

=# INSERT INTO Person (Values) VALUES ('a=>1,b=>3');

Then, if I EXPLAIN a SELECT query that uses the "@>" operator on the "Values" column, I unsurprisingly get:

=# EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');
                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on person p  (cost=0.00..24.50 rows=1 width=40)
   Filter: ("values" @> '"a"=>"1"'::hstore)

No index <-> sequential scan. Makes sense. However, whether I create a GIN or a GiST index makes no difference; EXPLAIN keeps reporting a sequential scan:

=# CREATE INDEX IX_GIN_VALUES ON Person USING GIN (values);
CREATE INDEX

=# EXPLAIN SELECT P.* FROM Person P WHERE P.values @> hstore('a', '1');

                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on person p  (cost=0.00..1.01 rows=1 width=246)
   Filter: ("values" @> '"a"=>"1"'::hstore)

Maybe I'm missing something obvious?

2 Answers

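If you're just playing with it, be sure to add enough data for an index scan to make sense. If you only have a few rows, or if many rows contain similar values (i.e. your WHERE condition isn't selective enough), a seq scan will usually be faster than an index scan.

Also, be sure to ANALYZE your table after populating it with test data.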
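
As a minimal sketch of that advice, the statements below load a larger data set into the question's Person table, refresh the planner statistics, and re-run the query. The row count of 100000 is an arbitrary assumption, and the plan you get will depend on your data and configuration.

```sql
-- Load enough rows that an index lookup can beat a full scan
-- (100000 is an arbitrary choice for illustration).
INSERT INTO Person (Values)
SELECT hstore('a', n::text) || hstore('b', n::text)
FROM generate_series(1, 100000) AS n;

-- Refresh the statistics the planner uses for its cost estimates.
ANALYZE Person;

-- Re-check the plan for the containment query.
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');
```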


Some additional reading for @maxm:

(Performance has improved a lot since the latter was written.)

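Why isn't his/her index being used?

Because it is faster for Postgres to seq scan the whole table (which holds only one row) and filter that row out of a single disk page than to do an index lookup and then read the table anyway to retrieve the row's data.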
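
One way to confirm that the index is usable, and that the planner is merely choosing not to use it, is to discourage sequential scans for the current session. This is a diagnostic sketch only, assuming the question's table and GIN index exist; enable_seqscan does not forbid sequential scans, it just makes them look very expensive to the planner.

```sql
-- Diagnostic only: make sequential scans look very expensive in this session.
SET enable_seqscan = off;

-- If the GIN index can serve the query, the plan should now use it.
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

If the plan switches to the index here, the index itself is fine and the earlier Seq Scan was simply the cheaper choice for a tiny table.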

Is there a problem with how the asker created the index?

No, but see the links above for when it is better to use normalized data.

And prefer json or jsonb over hstore.
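
For comparison, a rough jsonb equivalent of the question's setup might look like the following. The table name PersonJson, the column name Vals, and the index name are hypothetical, and jsonb requires PostgreSQL 9.4 or later, so this does not apply to the asker's 9.3 installation.

```sql
-- Hypothetical jsonb counterpart of the Person table (PostgreSQL 9.4+).
CREATE TABLE PersonJson (Id BIGSERIAL PRIMARY KEY, Vals jsonb);

INSERT INTO PersonJson (Vals) VALUES ('{"a": "1", "b": "3"}');

-- The default GIN operator class for jsonb supports the @> containment operator.
CREATE INDEX IX_GIN_VALS ON PersonJson USING GIN (Vals);

EXPLAIN SELECT P.* FROM PersonJson AS P WHERE P.Vals @> '{"a": "1"}'::jsonb;
```

The same caveat applies as with hstore: on a table this small the planner will most likely still pick a sequential scan.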

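Querying the hstore column? What needs to be fixed so that the SELECT query uses such an index?

Nothing, but again see the links above for when it is better to use normalized data.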

answered 2013-12-15T10:49:39.453

In short: when a table occupies only a few pages, Postgres's planner prefers to skip the index and simply load and scan the rows.

CREATE SCHEMA stackoverflow20589058;
--- CREATE SCHEMA

SET search_path TO stackoverflow20589058,"$user",public;
--- SET

CREATE EXTENSION hstore;
--- CREATE EXTENSION

CREATE TABLE Person (Id BIGSERIAL PRIMARY KEY NOT NULL, Values hstore);
--- CREATE TABLE

WITH Vals(n) AS (SELECT * FROM generate_series(1,10))
INSERT INTO Person (
  SELECT n AS Id, hstore('a=>'||n||', b=>'||n) AS Values FROM Vals
);
--- INSERT 0 10

EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');
---                         QUERY PLAN                        
--- ----------------------------------------------------------
---  Seq Scan on person p  (cost=0.00..24.50 rows=1 width=40)
---    Filter: ("values" @> '"a"=>"1"'::hstore)
--- (2 rows)

CREATE INDEX IX_GIN_VALUES ON Person USING GIN (values);
--- CREATE INDEX

------------------------- When there are few values, a sequential scan is
------------------------- often the best search strategy. Grabbing a few
------------------------- pages in sequence can be cheaper than making an
------------------------- extra disk seek to load the index.
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');
---                        QUERY PLAN                        
--- ---------------------------------------------------------
---  Seq Scan on person p  (cost=0.00..1.12 rows=1 width=40)
---    Filter: ("values" @> '"a"=>"1"'::hstore)
--- (2 rows)

TRUNCATE Person;
--- TRUNCATE TABLE

WITH Vals(n) AS (SELECT * FROM generate_series(1,100000))
INSERT INTO Person (
  SELECT n AS Id, hstore('a=>'||n||', b=>'||n) AS Values FROM Vals
);
--- INSERT 0 100000

------------------------- When there are many rows, using the index can
------------------------- allow us to skip quite a lot of I/O; so
------------------------- Postgres's planner makes use of the index.
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');
---                                    QUERY PLAN                                   
--- --------------------------------------------------------------------------------
---  Bitmap Heap Scan on person p  (cost=916.83..1224.56 rows=107 width=40)
---    Recheck Cond: ("values" @> '"a"=>"1"'::hstore)
---    ->  Bitmap Index Scan on ix_gin_values  (cost=0.00..916.80 rows=107 width=0)
---          Index Cond: ("values" @> '"a"=>"1"'::hstore)
--- (4 rows)

DROP SCHEMA stackoverflow20589058 CASCADE;
--- NOTICE:  drop cascades to 2 other objects
--- DETAIL:  drop cascades to extension hstore
--- drop cascades to table person
--- DROP SCHEMA
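
To see the page counts that drive this decision, you can look at the planner's bookkeeping in pg_class. This is a sketch that assumes it is run before the DROP SCHEMA step, while the table and index from the script still exist; relpages and reltuples are estimates that are refreshed by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.

```sql
-- Bring the estimates up to date first.
ANALYZE Person;

-- relpages: estimated size on disk in pages; reltuples: estimated row count.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('person', 'ix_gin_values');
```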
answered 2015-02-15T06:16:02.870