我有一个要在 SQLAlchemy 中表示的星型架构数据库。现在我有一个问题是如何以最好的方式做到这一点。现在我有很多带有自定义连接条件的属性,因为数据存储在不同的表中。如果可以为不同的事实表重新使用维度,那就太好了,但我还没有弄清楚如何才能很好地做到这一点。
问问题
2417 次
1 回答
20
星型模式中的典型事实表包含对所有维度表的外键引用,因此通常不需要自定义连接条件 - 它们是根据外键引用自动确定的。
例如,具有两个事实表的星型模式如下所示:
Base = declarative_meta()
class Store(Base):
__tablename__ = 'store'
id = Column('id', Integer, primary_key=True)
name = Column('name', String(50), nullable=False)
class Product(Base):
__tablename__ = 'product'
id = Column('id', Integer, primary_key=True)
name = Column('name', String(50), nullable=False)
class FactOne(Base):
__tablename__ = 'sales_fact_one'
store_id = Column('store_id', Integer, ForeignKey('store.id'), primary_key=True)
product_id = Column('product_id', Integer, ForeignKey('product.id'), primary_key=True)
units_sold = Column('units_sold', Integer, nullable=False)
store = relation(Store)
product = relation(Product)
class FactTwo(Base):
__tablename__ = 'sales_fact_two'
store_id = Column('store_id', Integer, ForeignKey('store.id'), primary_key=True)
product_id = Column('product_id', Integer, ForeignKey('product.id'), primary_key=True)
units_sold = Column('units_sold', Integer, nullable=False)
store = relation(Store)
product = relation(Product)
但是假设您无论如何都想减少样板文件。我将创建维度类的本地生成器,这些生成器在事实表上进行自我配置:
class Store(Base):
__tablename__ = 'store'
id = Column('id', Integer, primary_key=True)
name = Column('name', String(50), nullable=False)
@classmethod
def add_dimension(cls, target):
target.store_id = Column('store_id', Integer, ForeignKey('store.id'), primary_key=True)
target.store = relation(cls)
在这种情况下,用法如下:
class FactOne(Base):
...
Store.add_dimension(FactOne)
但是,这样做有问题。假设您要添加的维度列是主键列,映射器配置将失败,因为类需要在设置映射之前设置其主键。因此,假设我们使用的是声明式(您将在下面看到它有很好的效果),为了使这种方法起作用,我们必须使用instrument_declarative()
函数而不是标准元类:
meta = MetaData()
registry = {}
def register_cls(*cls):
for c in cls:
instrument_declarative(c, registry, meta)
因此,我们将按照以下方式做一些事情:
class Store(object):
# ...
class FactOne(object):
__tablename__ = 'sales_fact_one'
Store.add_dimension(FactOne)
register_cls(Store, FactOne)
如果您确实有充分的理由使用自定义连接条件,只要这些条件的创建方式有某种模式,您就可以使用以下方式生成add_dimension()
:
class Store(object):
...
@classmethod
def add_dimension(cls, target):
target.store_id = Column('store_id', Integer, ForeignKey('store.id'), primary_key=True)
target.store = relation(cls, primaryjoin=target.store_id==cls.id)
但是如果你在 2.6 上,最后一件很酷的事情就是add_dimension
变成一个类装饰器。这是一个清理所有内容的示例:
from sqlalchemy import *
from sqlalchemy.ext.declarative import instrument_declarative
from sqlalchemy.orm import *
class BaseMeta(type):
classes = set()
def __init__(cls, classname, bases, dict_):
klass = type.__init__(cls, classname, bases, dict_)
if 'metadata' not in dict_:
BaseMeta.classes.add(cls)
return klass
class Base(object):
__metaclass__ = BaseMeta
metadata = MetaData()
def __init__(self, **kw):
for k in kw:
setattr(self, k, kw[k])
@classmethod
def configure(cls, *klasses):
registry = {}
for c in BaseMeta.classes:
instrument_declarative(c, registry, cls.metadata)
class Store(Base):
__tablename__ = 'store'
id = Column('id', Integer, primary_key=True)
name = Column('name', String(50), nullable=False)
@classmethod
def dimension(cls, target):
target.store_id = Column('store_id', Integer, ForeignKey('store.id'), primary_key=True)
target.store = relation(cls)
return target
class Product(Base):
__tablename__ = 'product'
id = Column('id', Integer, primary_key=True)
name = Column('name', String(50), nullable=False)
@classmethod
def dimension(cls, target):
target.product_id = Column('product_id', Integer, ForeignKey('product.id'), primary_key=True)
target.product = relation(cls)
return target
@Store.dimension
@Product.dimension
class FactOne(Base):
__tablename__ = 'sales_fact_one'
units_sold = Column('units_sold', Integer, nullable=False)
@Store.dimension
@Product.dimension
class FactTwo(Base):
__tablename__ = 'sales_fact_two'
units_sold = Column('units_sold', Integer, nullable=False)
Base.configure()
if __name__ == '__main__':
engine = create_engine('sqlite://', echo=True)
Base.metadata.create_all(engine)
sess = sessionmaker(engine)()
sess.add(FactOne(store=Store(name='s1'), product=Product(name='p1'), units_sold=27))
sess.commit()
于 2009-09-19T23:03:54.197 回答