python-3.x - Updating column content during an Alembic migration

Question

Suppose my database model contains an object User:

Base = declarative_base() 

class User(Base):                                                               
    __tablename__ = 'users'                                                     

    id = Column(String(32), primary_key=True, default=...) 
    name = Column(Unicode(100))

My database contains a users table with n rows. At some point I decide to split name into firstname and lastname, and I would like my data to be migrated as well during alembic upgrade head.

The auto-generated Alembic migration is as follows:

def upgrade():
    op.add_column('users', sa.Column('lastname', sa.Unicode(length=50), nullable=True))
    op.add_column('users', sa.Column('firstname', sa.Unicode(length=50), nullable=True))

    # Assuming that the two new columns have been committed and exist at
    # this point, I would like to iterate over all rows of the name column,
    # split the string, write it into the new firstname and lastname rows,
    # and once that has completed, continue to delete the name column.

    op.drop_column('users', 'name')                                             

def downgrade():
    op.add_column('users', sa.Column('name', sa.Unicode(length=100), nullable=True))

    # Do the reverse of the above.

    op.drop_column('users', 'firstname')                                        
    op.drop_column('users', 'lastname')

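There seem to be several more or less hacky solutions to this problem. This question and this one both suggest using execute() and bulk_insert() to run raw SQL statements during the migration. This (incomplete) solution imports the current database model, but that approach is fragile once the model changes.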

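How do I migrate and modify the existing contents of a column during an Alembic migration? What is the recommended approach, and where is it documented?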

score 26 · Accepted Answer

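The solution proposed in norbertpy's answer sounds good at first, but I think it has one fundamental flaw: it introduces multiple transactions, and between those steps the database would be in a funky, inconsistent state. It also seems odd to me (see my comment) that a tool would migrate a database's schema without the database's data; the two are too closely tied together to be separated.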

After some poking around and several conversations (see the code snippets in this Gist), I decided on the following solution:

def upgrade():

    # Schema migration: add all the new columns.
    op.add_column('users', sa.Column('lastname', sa.Unicode(length=50), nullable=True))
    op.add_column('users', sa.Column('firstname', sa.Unicode(length=50), nullable=True))

    # Data migration: takes a few steps...
    # Declare ORM table views. Note that the view contains old and new columns!        
    t_users = sa.Table(
        'users',
        sa.MetaData(),
        sa.Column('id', sa.String(32)),
        sa.Column('name', sa.Unicode(length=100)), # Old column.
        sa.Column('lastname', sa.Unicode(length=50)), # Two new columns.
        sa.Column('firstname', sa.Unicode(length=50)),
        )
    # Use Alchemy's connection and transaction to noodle over the data.
    connection = op.get_bind()
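    # Select all existing names that need migrating.
    # (SQLAlchemy 1.4+ also accepts, and 2.0 requires, the form
    # sa.select(t_users.c.id, t_users.c.name) without the list.)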
    results = connection.execute(sa.select([
        t_users.c.id,
        t_users.c.name,
        ])).fetchall()
    # Iterate over all selected data tuples.
    for id_, name in results:
        # Split the existing name into first and last.
        firstname, lastname = name.rsplit(' ', 1)
        # Update the new columns.
        connection.execute(t_users.update().where(t_users.c.id == id_).values(
            lastname=lastname,
            firstname=firstname,
            ))

    # Schema migration: drop the old column.
    op.drop_column('users', 'name')
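
One caveat about the loop above: name.rsplit(' ', 1) raises a ValueError when a name contains no space, and an AttributeError when name is NULL. Below is a minimal sketch of a more forgiving splitter that could be called inside the loop instead. split_name is a hypothetical helper, and storing single-word names in firstname only (as well as collapsing repeated whitespace) is an assumption, not something this answer prescribes.

```python
from typing import Optional, Tuple


def split_name(name: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Split a full name into (firstname, lastname) on the last space.

    Assumption: NULL or blank names map to (None, None), and a
    single-word name is stored as firstname with no lastname.
    """
    if name is None:
        return None, None
    # str.split() with no argument drops leading, trailing and repeated spaces.
    words = name.split()
    if not words:
        return None, None
    if len(words) == 1:
        return words[0], None
    # Everything before the last word is treated as the first name.
    return " ".join(words[:-1]), words[-1]
```

In the migration, the loop body would then read firstname, lastname = split_name(name) before issuing the update.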

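Two comments about this solution:

1. As noted in the referenced Gist, newer versions of Alembic use a slightly different notation.
2. Depending on the DB driver, the code may behave differently. Apparently, MySQL does not handle the code above as a single transaction, because DDL statements such as ALTER TABLE trigger an implicit commit (see "Statements That Cause an Implicit Commit"). So you have to check your database implementation.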

The downgrade() function can be implemented similarly.
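
For the data step of the downgrade, the reverse of the split is a join. The sketch below shows only that piece. join_name is a hypothetical helper, and falling back to whichever part exists when the other is missing is an assumption. Inside downgrade(), after re-adding the name column, one would select id, firstname and lastname through the same kind of table definition as in upgrade(), write the joined value into name, and only then drop the two new columns.

```python
from typing import Optional


def join_name(firstname: Optional[str], lastname: Optional[str]) -> Optional[str]:
    """Rebuild the single name value from its two parts.

    Assumption: missing or empty parts are skipped, and a row with
    neither part gets NULL (None) rather than an empty string.
    """
    parts = [part for part in (firstname, lastname) if part]
    return " ".join(parts) if parts else None
```

Note that the round trip is only lossless for names that were cleanly split in the first place.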

Addendum. For an example of pairing a schema migration with a data migration, see the Conditional Migration Elements section of the Alembic Cookbook.

score 4 · Accepted Answer

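Alembic is a schema migration tool, not a data migration tool, although it can be used that way too. That is why you won't find much documentation about it. That said, I would create three separate revisions: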

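1. Add firstname and lastname without dropping name.

2. Read all users just as you would in the application, split their names, and then update firstname and lastname. For example:

for user in session.query(User).all():
    user.firstname, user.lastname = user.name.split(' ')
session.commit()

3. Drop name.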
