sql - Is it possible to use a PG sequence on a per-record label?

Question

Does PostgreSQL 9.2+ provide any functionality for generating a sequence that is namespaced to a particular value? For example:

 .. | user_id | seq_id | body | ...
 ----------------------------------
  - |    4    |   1    |  "abc...."
  - |    4    |   2    |  "def...."
  - |    5    |   1    |  "ghi...."
  - |    5    |   2    |  "xyz...."
  - |    5    |   3    |  "123...."

This would be useful for generating custom URLs for users:

domain.me/username_4/posts/1    
domain.me/username_4/posts/2

domain.me/username_5/posts/1
domain.me/username_5/posts/2
domain.me/username_5/posts/3

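I did not find anything in the PG docs (on sequences and sequence functions) that does this. Are subqueries in the INSERT statement or custom PG functions the only other options?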

score 1 · Accepted Answer

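You can use a subquery in the INSERT statement, as @Clodoaldo demonstrates. However, this defeats the nature of a sequence, which is to be safe to use in concurrent transactions. This solution can and will result in race conditions and, ultimately, duplicate key violations.

You should rethink your approach. Use one plain sequence for the whole table and combine it with user_id to get the sort order you want.

You can always generate the custom URLs with the desired numbers using row_number() in a simple query like this: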

SELECT format('domain.me/username_%s/posts/%s'
              ,user_id
              ,row_number() OVER (PARTITION BY user_id ORDER BY seq_id)
             )
FROM   t;

-> SQLfiddle.
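
The same numbering can also be derived on the application side. The following is a minimal sketch, not part of the original answer: it assumes the rows have already been fetched as (user_id, seq_id) pairs, where seq_id comes from one table-wide sequence and may contain gaps. The function name per_user_urls and the sample data are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def per_user_urls(rows):
    """Number each user's posts 1..n in ascending order of the global seq_id.

    rows: iterable of (user_id, seq_id) pairs. This mirrors what
    row_number() OVER (PARTITION BY user_id ORDER BY seq_id) computes.
    """
    urls = []
    ordered = sorted(rows)  # sorts by user_id first, then seq_id
    for user_id, group in groupby(ordered, key=itemgetter(0)):
        for position, _row in enumerate(group, start=1):
            urls.append("domain.me/username_%s/posts/%s" % (user_id, position))
    return urls


# Hypothetical data: users 4 and 5 inserting in interleaved order,
# sharing a single sequence that has gaps.
rows = [(4, 1), (5, 2), (5, 3), (4, 6), (5, 9)]
for url in per_user_urls(rows):
    print(url)
```

One consequence of deriving the number at read time, in SQL or in application code: if a post is deleted, the later posts of that user are renumbered, so these URLs are only stable as long as rows are never removed.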

score 1 · Accepted Answer

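Maybe this answer is a little off-piste, but I would consider partitioning the data and giving each user their own table for posts.

There is some overhead in the setup, because you need triggers to manage the DDL statements for the partitions. The effect, however, is that each user has their own table of posts with its own sequence, while all posts can still be treated as one big table.

The general gist of the concept...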

psql# CREATE TABLE posts (user_id integer, seq_id integer);
CREATE TABLE

psql# CREATE TABLE posts_001 (seq_id serial) INHERITS (posts);
CREATE TABLE

psql# CREATE TABLE posts_002 (seq_id serial) INHERITS (posts);
CREATE TABLE

psql# INSERT INTO posts_001 VALUES (1);
INSERT 0 1

psql# INSERT INTO posts_001 VALUES (1);
INSERT 0 1

psql# INSERT INTO posts_002 VALUES (2);
INSERT 0 1

psql# INSERT INTO posts_002 VALUES (2);
INSERT 0 1

psql# select * from posts;
 user_id | seq_id 
---------+--------
       1 |      1
       1 |      2
       2 |      1
       2 |      2
(4 rows)

I left out some rather important CHECK constraints in the setup above. Make sure you read the docs on how this kind of setup is used.
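
As a rough illustration of where such a CHECK constraint could go, here is a small sketch that generates the DDL for one user's child table. It is an assumption layered on the example above, not the answer author's code: the helper name partition_ddl and the choice of CHECK (user_id = N) are hypothetical, and the posts_NNN naming simply follows the transcript.

```python
def partition_ddl(user_id):
    """Return DDL for one user's child table of the parent table `posts`.

    The CHECK constraint ties the child table to a single user_id, so a row
    for another user cannot be inserted into it by mistake.
    """
    # Only plain non-negative integers are accepted, because the value is
    # interpolated directly into the SQL text.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id < 0:
        raise ValueError("user_id must be a non-negative integer")
    return (
        "CREATE TABLE posts_{uid:03d} (\n"
        "    seq_id serial,\n"
        "    CHECK (user_id = {uid})\n"
        ") INHERITS (posts);"
    ).format(uid=user_id)


print(partition_ddl(1))
```

A trigger or application code that creates partitions on demand could call a helper like this when the first post of a new user arrives.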

score 0 · Accepted Answer

insert into t (user_id, seq_id) values
(4, (select coalesce(max(seq_id), 0) + 1 from t where user_id = 4))

Check for a duplicate primary key error in the front end and retry if needed.
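
A minimal sketch of that retry idea follows. To keep it runnable without a server it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL; the function name insert_next and the max_retries limit are assumptions for illustration. With a single connection the retry branch is never actually taken here; it only shows the shape of the loop.

```python
import sqlite3

# Same statement shape as above: the next seq_id is max + 1 for that user.
INSERT_NEXT = """
    INSERT INTO t (user_id, seq_id)
    VALUES (?, (SELECT coalesce(max(seq_id), 0) + 1 FROM t WHERE user_id = ?))
"""


def insert_next(conn, user_id, max_retries=5):
    """Insert the next row for user_id, retrying on a duplicate key."""
    for _attempt in range(max_retries):
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(INSERT_NEXT, (user_id, user_id))
            return
        except sqlite3.IntegrityError:
            # Another writer took this seq_id first: recompute and try again.
            continue
    raise RuntimeError("gave up after %d attempts" % max_retries)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (user_id INTEGER, seq_id INTEGER, "
    "PRIMARY KEY (user_id, seq_id))"
)
for uid in (4, 4, 5, 4, 5):
    insert_next(conn, uid)
print(conn.execute("SELECT user_id, seq_id FROM t ORDER BY 1, 2").fetchall())
```

The composite primary key is what makes this safe: two writers that compute the same seq_id cannot both succeed, and the loser simply retries. Bounding the retries avoids spinning forever under heavy contention.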

Update

Although @Erwin's advice is sensible, that is, a single sequence with the ordering done in the select query, it can be expensive.

If you don't use a sequence there is no defeat of the nature of the sequence. Also it will not result in a duplicate key violation. To demonstrate it I created a table and made a python script to insert into it. I launched 3 parallel instances of the script inserting as fast as possible. And it just works.

The table must have a primary key on those columns:

create table t (
    user_id int,
    seq_id int,
    primary key (user_id, seq_id)
);

The python script:

#!/usr/bin/env python

import psycopg2, psycopg2.extensions

query = """
    begin;
    insert into t (user_id, seq_id) values
    (4, (select coalesce(max(seq_id), 0) + 1 from t where user_id = 4));
    commit;
"""

conn = psycopg2.connect('dbname=cpn user=cpn')
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
cursor = conn.cursor()

for i in range(0, 1000):

    while True:
        try:
            cursor.execute(query)
            break
        except psycopg2.IntegrityError as e:
            print(e.pgerror)
            cursor.execute("rollback;")

cursor.close()
conn.close()

After the parallel run:

select count(*), max(seq_id) from t;
 count | max  
-------+------
  3000 | 3000

Just as expected. I have developed at least two applications using that logic; one of them is more than 13 years old and has never failed. I concede that if you are Facebook or some other giant, you could have a problem.

score -2 · Accepted Answer

Yes:

CREATE TABLE your_table
(
    column type DEFAULT NEXTVAL('sequence_name'),
    ...
);

More details here: http://www.postgresql.org/docs/9.2/static/ddl-default.html
