postgresql - SELECT from a set of tables one by one

Question

I have a set of tables partitioned by date and named in the following format:

public.schedule_20121019

I can successfully return the list of these tables for a given number of days:

SELECT 'public.schedule_' || to_char(current_date - d, 'YYYYMMDD')
FROM generate_series(6, 0, -1) s(d);

But what is a good way to select * from each of these tables and insert the results into a new table? Thanks!

score 2 · Accepted Answer

If they are partitioned by date, query the parent table. To create a new table:

create table another_table as
select *
from schedule_parent
where the_date between current_date - 6 and current_date

To insert into an existing table:

insert into another_table
select *
from schedule_parent
where the_date between current_date - 6 and current_date

Each partition table has a check constraint:

create table schedule_20121012 (
    check (the_date = date '2012-10-12')
) inherits (schedule_parent);

So when you query the parent table by date, the planner knows which table to look in:

select * from schedule_parent where the_date = date '2012-10-12'

I have a set of tables that use inheritance. The table usuarios has child tables partitioned by one of its columns. One of its children:

\d+ usuarios_25567
             Table "public.usuarios_25567"
 Column  |  Type   | Modifiers | Storage | Description 
---------+---------+-----------+---------+-------------
 usuario | integer | not null  | plain   | 
 data    | integer | not null  | plain   | 
 wus     | integer | not null  | plain   | 
 pontos  | real    | not null  | plain   | 
Indexes:
    "ndx_usuarios_25567" UNIQUE, btree (usuario)
Check constraints:
    "b25567" CHECK (data = 25567)
Foreign-key constraints:
    "fk_usuarios_25567" FOREIGN KEY (data) REFERENCES datas(data_serial)
Inherits: usuarios
Has OIDs: no

Its check constraint is on the data column. Now look at the query plan when I filter a query on the parent table by that column:

explain select * from usuarios where data = 25567;
                                          QUERY PLAN                                          
----------------------------------------------------------------------------------------------
 Result  (cost=0.00..26590.45 rows=1484997 width=16)
   ->  Append  (cost=0.00..26590.45 rows=1484997 width=16)
         ->  Seq Scan on usuarios  (cost=0.00..0.00 rows=1 width=16)
               Filter: (data = 25567)
         ->  Seq Scan on usuarios_25567 usuarios  (cost=0.00..26590.45 rows=1484996 width=16)
               Filter: (data = 25567)
(6 rows)

It only looks at that one child table (plus the empty parent), not the hundreds of other tables.
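
If a plan like the one above lists every child table instead, the constraint_exclusion setting is the first thing I would check. A minimal sketch, reusing the usuarios table and data column from the example; the expected plan shape is an assumption about your setup, not a guarantee:

```sql
-- Show the current value; "partition" is the default and covers
-- queries against inheritance children. With "off", every child is scanned.
SHOW constraint_exclusion;

-- Enable it for the current session if it was switched off.
SET constraint_exclusion = partition;

-- Re-check the plan: only usuarios and usuarios_25567 should be listed.
EXPLAIN SELECT * FROM usuarios WHERE data = 25567;
```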

score 1 · Accepted Answer

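For information on partitioning, see the detailed documentation. There is a wealth of information there, and then some. You will use table inheritance to do the partitioning.

Two caveats. First, make sure partitioning really solves a problem. It works well for dropping old data and for querying date ranges. Realistically, individual partitions should hold at least a few million rows for it to be worthwhile.

Second, in its current state, PostgreSQL's partitioning works well for a few dozen tables. Hundreds seems a bit of a stretch. Consider partitioning by month rather than by day.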
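
A monthly partition could look roughly like this. It is only a sketch: the child table name schedule_201210 is made up for illustration, and it assumes a parent called schedule_parent with a date column the_date, as in the other answer.

```sql
-- Hypothetical monthly child table; the name is illustrative only.
-- The half-open range keeps adjacent months from overlapping.
CREATE TABLE schedule_201210 (
    CHECK (the_date >= date '2012-10-01' AND the_date < date '2012-11-01')
) INHERITS (schedule_parent);
```

With a range check like this the planner can still skip months that a date filter rules out, while the number of child tables drops by a factor of about thirty.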

score 1 · Accepted Answer

I figured out a way to solve it:

-- Copies schedule_date from every existing daily partition of the last
-- _traceback days (today included) into _table.
CREATE OR REPLACE FUNCTION looper(_schema varchar, _partition varchar,
                                  _traceback integer, _table varchar)
RETURNS VOID AS $$
DECLARE row RECORD;
BEGIN
    -- Visit only the partitions that actually exist in the catalog.
    FOR row IN
        SELECT table_schema
            , table_name
        FROM
            information_schema.tables
        WHERE
            table_type = 'BASE TABLE'
        AND
            table_schema = _schema
        AND
            table_name IN (
                SELECT _partition || to_char(current_date - d, 'YYYYMMDD')
                FROM
                    generate_series(_traceback, 0, -1) s(d)
            )
        ORDER BY table_name
    LOOP
        -- _table is concatenated as-is, so pass only a trusted table name.
        EXECUTE 'INSERT INTO ' || _table || ' SELECT schedule_date FROM ' ||
            quote_ident(row.table_schema) || '.' ||
            quote_ident(row.table_name);
    END LOOP;
END;
$$ LANGUAGE plpgsql VOLATILE;
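
Calling it for the tables in the question would look something like this. The target table public.schedule_dates is a hypothetical name; it is assumed to already exist with a single column compatible with schedule_date.

```sql
-- Arguments: schema, partition name prefix, days to look back, target table.
-- Covers public.schedule_<today - 6> through public.schedule_<today>.
SELECT looper('public', 'schedule_', 6, 'public.schedule_dates');
```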
