
Using Python and SQLAlchemy to store the same value in an SQLite database, once as a Boolean and once as an Integer, gives the following timings.

Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 62.5009999275 secs
SqlAlchemy Core: Total time for 40000 records 56.0600001812 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 5.72099995613 secs
SqlAlchemy Core: Total time for 40000 records 0.770999908447 secs

Why is there such a large performance penalty when using the Boolean type?

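I know that SQLite has no native Boolean type and instead stores such values as the integers 1 (true) or 0 (false). I would have assumed that SQLAlchemy simply maps a Python bool to an SQLite integer.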
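As a quick check of the storage claim, here is a minimal sketch that bypasses SQLAlchemy and uses only the standard-library `sqlite3` module. The table name `flags` is made up for illustration, and the snippet is written for Python 3. It shows what SQLite itself stores for a Python bool and says nothing about where SQLAlchemy spends its time.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# "flags" is a hypothetical table used only for this illustration.
conn.execute("CREATE TABLE flags (id INTEGER PRIMARY KEY, value BOOLEAN)")
conn.execute("INSERT INTO flags (value) VALUES (?)", (True,))
conn.execute("INSERT INTO flags (value) VALUES (?)", (False,))

# typeof() reports the storage class SQLite actually used for each value.
rows = conn.execute(
    "SELECT value, typeof(value) FROM flags ORDER BY id"
).fetchall()
print(rows)

conn.close()
```

Both rows come back with the storage class `integer`, so at the SQLite level a Boolean column and an Integer column hold the same data.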

The script used to produce the timings above (adapted from another question):

import time
import sqlite3

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String,  create_engine, Boolean
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
DBSession = scoped_session(sessionmaker())

class CustomerInteger(Base):
    __tablename__ = "customerInteger"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    value = Column(Integer)

class CustomerBoolean(Base):
    __tablename__ = "customerBoolean"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    value = Column(Boolean)

def init_sqlalchemy(dbname = 'sqlite:///sqlalchemy.db'):
    global engine
    engine = create_engine(dbname, echo=False)
    DBSession.remove()
    DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)

def test_sqlalchemy_orm(n, table):
    init_sqlalchemy()
    t0 = time.time()
    for i in range(n):
        customer = table()
        customer.name = 'NAME ' + str(i)
        customer.value = True
        DBSession.add(customer)
        if i % 1000 == 0:
            DBSession.flush()
    DBSession.commit()
    print("SqlAlchemy ORM: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs")


def test_sqlalchemy_core(n, table):
    init_sqlalchemy()
    t0 = time.time()
    engine.execute(
        table.__table__.insert(),
        [{"name":'NAME ' + str(i), "value":True } for i in range(n)]
    )
    print("SqlAlchemy Core: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs")


if __name__ == '__main__':

    print("Value stored as Boolean:")
    test_sqlalchemy_orm(40000, CustomerBoolean)
    test_sqlalchemy_core(40000, CustomerBoolean)

    print("Value stored as Integer:")
    test_sqlalchemy_orm(40000, CustomerInteger)
    test_sqlalchemy_core(40000, CustomerInteger)

1 Answer


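I ran the test in three configurations. There is a difference in run time between Boolean and Integer, but it is nowhere near a factor of 10. You may want to try switching to a different Python version.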

P.S. I ran my tests on a machine with a Core i5 M430 CPU running Windows 8.

I also suggest running a profiler to see where SQLAlchemy spends so much time on your system.
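One way to do that is with the standard-library `cProfile` and `pstats` modules. The sketch below is written for Python 3. The helper `profile_call` and the stand-in `sample_workload` are hypothetical names introduced here, not part of the question's script. To profile the real benchmark, pass one of the test functions instead, for example `profile_call(test_sqlalchemy_orm, 40000, CustomerBoolean)`.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(15)
    return result, stream.getvalue()


def sample_workload(n):
    # Hypothetical stand-in for the benchmark functions in the question.
    return sum(int(bool(i)) for i in range(n))


result, report = profile_call(sample_workload, 1000)
print(report)
```

Comparing the report for the Boolean table with the report for the Integer table should show which calls account for the extra time.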

1)

python: 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.7.8
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 8.84400010109 secs
SqlAlchemy Core: Total time for 40000 records 0.725000143051 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 8.0680000782 secs
SqlAlchemy Core: Total time for 40000 records 0.443000078201 secs

2)

python: 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.8.1
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 9.69299983978 secs
SqlAlchemy Core: Total time for 40000 records 0.572000026703 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 9.35899996758 secs
SqlAlchemy Core: Total time for 40000 records 0.40700006485 secs

3)

python: 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.8.1
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 8.531000137329102 secs
SqlAlchemy Core: Total time for 40000 records 0.7139999866485596 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 8.023000001907349 secs
SqlAlchemy Core: Total time for 40000 records 0.44099998474121094 secs
Answered 2013-06-29T20:02:45.650