由于数据量大,我需要扩展我们的应用程序数据库。它在 PostgreSQL 9.3 上。所以,我找到了 PostgreSQL-XL,它看起来很棒,但是我很难理解分布式表的限制。通过复制(在每个数据节点中复制整个表)来分发它们是非常好的,但是假设我有两个需要沿着数据节点“分片”的大相关表:
CREATE TABLE foos
(
id bigserial NOT NULL,
project_id integer NOT NULL,
template_id integer NOT NULL,
batch_id integer,
dataset_id integer NOT NULL,
name text NOT NULL,
CONSTRAINT pk_foos PRIMARY KEY (id),
CONSTRAINT fk_foos_batch_id FOREIGN KEY (batch_id)
REFERENCES batches (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_foos_dataset_id FOREIGN KEY (dataset_id)
REFERENCES datasets (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_foos_project_id FOREIGN KEY (project_id)
REFERENCES projects (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_foos_template_id FOREIGN KEY (template_id)
REFERENCES templates (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT uc_foos UNIQUE (project_id, name)
);
CREATE TABLE foo_childs
(
id bigserial NOT NULL,
foo_id bigint NOT NULL,
template_id integer NOT NULL,
batch_id integer,
ffdata hstore,
CONSTRAINT pk_ff_foos PRIMARY KEY (id),
CONSTRAINT fk_fffoos_batch_id FOREIGN KEY (batch_id)
REFERENCES batches (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_fffoos_foo_id FOREIGN KEY (foo_id)
REFERENCES foos (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT fk_fffoos_template_id FOREIGN KEY (template_id)
REFERENCES templates (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE
);
现在 Postgres-XL 文档指出:
- “(...)在分布式表中,UNIQUE 约束必须包括表的分布列”
- “(...)分布列必须包含在 PRIMARY KEY 中”
- “(...) 带有 REFERENCES (FK) 的列应该是分布列。(...) PRIMARY KEY 也必须是分布列。”
他们的例子过于简单化和伤痕累累,所以有人可以使用DISTRIBUTE BY HASH()将上面两个表用于 postgres-XL吗?
或者也许建议其他方式来扩展?