orm - Speeding up SQLAlchemy ORM dynamic relationship slicing with a negative index

Question

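I have the following SQLA models and relationship. I record one measurement per channel every second, so there are a lot of measurements in the database.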

class Channel( Model ) :
    __tablename__   = 'channel'
    id              = Column( Integer, primary_key=True )
    #! --- Relationships ---
    measurements    = relationship( 'Measurement', back_populates='channel', lazy='dynamic' )

class Measurement( Model ) :
    __tablename__   = 'measurement'
    id              = Column( Integer, primary_key=True )
    timestamp       = Column( DateTime, nullable=False )
    value           = Column( Float, nullable=False )
    #! --- Relationships ---
    channel         = relationship( 'Channel', back_populates='measurements', uselist=False )

If I want the latest measurement, I can get it through the ORM by slicing the relationship with a negative index.

channel.measurements[-1]

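However, it is very, very slow!

I can further filter the relationship query with .filter(), .order_by(), etc. to get what I want, but I like using the ORM relationship (why have it otherwise?).

I noticed that if I slice with a positive index, it is fast (similar to the explicit SQLA queries mentioned above).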

channel.measurements[0]

I changed the relationship to keep measurements in reverse order, and that seems to work together with the zero index.

    measurements    = relationship( 'Measurement', back_populates='channel', lazy='dynamic', order_by='Measurement.id.desc()' )

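So, why is slicing with a negative index so slow?

Is it a bug in SQLAlchemy? I would have thought it would be smart enough to execute the right SQL to fetch only the latest item from the database.

Is there something else I need to do to keep the measurements sorted in natural order, slice with a negative index, and get the same speed as the other methods?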

score 0 · Accepted Answer

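You haven't specified any ordering, so it has to load all the objects into a list and then take the last one.

If you add the echo=True parameter, you can see the difference in the queries:

For measurements[0], it selects just one of the measurements matching the channel (LIMIT 1):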

SELECT measurement.id AS measurement_id, measurement.ts AS measurement_ts,
  measurement.value AS measurement_value,
  measurement.channel_id AS measurement_channel_id
FROM measurement
WHERE %(param_1)s = measurement.channel_id
 LIMIT %(param_2)s
{'param_1': 6, 'param_2': 1}

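For measurements[-1], it selects all the measurements matching the channel. You haven't ordered it, so the database returns the rows in whatever order it decides (probably by the primary key on measurement, but that is not guaranteed):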

SELECT measurement.id AS measurement_id, measurement.ts AS measurement_ts,  
  measurement.value AS measurement_value,
  measurement.channel_id AS measurement_channel_id
FROM measurement
WHERE %(param_1)s = measurement.channel_id
{'param_1': 6}
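To make the difference between the two query shapes concrete, here is a minimal sketch using only the standard-library sqlite3 module rather than SQLAlchemy. The table layout mirrors the columns in the logged SQL above; the 1000 rows, the integer ts values and channel 6 are made-up illustration data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    "id INTEGER PRIMARY KEY, channel_id INTEGER NOT NULL, "
    "ts INTEGER NOT NULL, value REAL NOT NULL)"
)
# Made-up data: 1000 rows for one channel; ts is a plain counter here.
conn.executemany(
    "INSERT INTO measurement (channel_id, ts, value) VALUES (?, ?, ?)",
    [(6, i, i * 0.5) for i in range(1000)],
)

# Shape of measurements[0]: LIMIT 1, so at most one row is returned.
first_rows = conn.execute(
    "SELECT id, ts, value FROM measurement WHERE channel_id = ? LIMIT 1", (6,)
).fetchall()

# Shape of measurements[-1]: no LIMIT, so every matching row is transferred
# and the last one is picked on the Python side.
all_rows = conn.execute(
    "SELECT id, ts, value FROM measurement WHERE channel_id = ?", (6,)
).fetchall()
last_row = all_rows[-1]

rows_fetched = (len(first_rows), len(all_rows))
print(rows_fetched)
```

The cost of the second form grows with the number of measurements per channel, which is why it gets slower as the table fills up.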

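If you only want the latest measurement, select it explicitly, ordered by the timestamp field (named ts in the queries here); you probably also want indexes on channel_id and your timestamp field: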

db.session.query(Measurement)\
    .filter(Measurement.channel_id == channel_id)\
    .order_by(Measurement.ts.desc())\
    .limit(1)\
    .first()
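As a rough illustration of the indexing advice, the sketch below again uses plain sqlite3 instead of SQLAlchemy. The composite index name ix_measurement_channel_ts, the integer ts values and the sample rows are assumptions made for this example; other databases may plan the query differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    "id INTEGER PRIMARY KEY, channel_id INTEGER NOT NULL, "
    "ts INTEGER NOT NULL, value REAL NOT NULL)"
)
# Hypothetical composite index covering the filter column and the sort column.
conn.execute(
    "CREATE INDEX ix_measurement_channel_ts ON measurement (channel_id, ts)"
)
# Made-up data: 100 rows each for channels 5 and 6.
conn.executemany(
    "INSERT INTO measurement (channel_id, ts, value) VALUES (?, ?, ?)",
    [(ch, t, float(ch * 100 + t)) for ch in (5, 6) for t in range(100)],
)

latest_sql = (
    "SELECT id, ts, value FROM measurement "
    "WHERE channel_id = ? ORDER BY ts DESC LIMIT 1"
)
latest = conn.execute(latest_sql, (6,)).fetchone()

# Ask SQLite how it would run the query; the last column holds the description.
plan = conn.execute("EXPLAIN QUERY PLAN " + latest_sql, (6,)).fetchall()
plan_text = " ".join(row[-1] for row in plan)

print(latest)
print(plan_text)
```

With an index on (channel_id, ts), the database can find the newest row for a channel without sorting all of that channel's measurements.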

score 0 · Accepted Answer

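It seems the answer is that SQLA does not support efficient slicing of queries or relationship collections with negative indexes. In fact, there appears to have been a clumsy attempt at it in the code, but since it was not carefully thought through, it is going to be removed from SQLA.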

https://github.com/sqlalchemy/sqlalchemy/issues/5605

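I solved my problem by implementing a hybrid property that returns the latest measurement, instead of slicing the relationship collection directly.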

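    @hybrid_property
    def latest_measurement( self ) -> 'Measurement' :
        """
        Hybrid property that returns the latest Measurement for the channel
        (or None if the channel has no measurements).
        """
        # ORDER BY id DESC plus first() lets the database return only the newest row.
        measurement = self.measurements.order_by( Measurement.id.desc() ).first()
        return measurement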
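For anyone who wants to try this end to end, here is a self-contained sketch under a few assumptions: SQLAlchemy 1.4 or newer, an in-memory SQLite database, and a channel_id foreign key column that the question's model does not show but the logged SQL implies. It uses a plain @property instead of hybrid_property, since only instance-level access is exercised here.

```python
from datetime import datetime, timedelta

from sqlalchemy import Column, DateTime, Float, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Channel(Base):
    __tablename__ = "channel"
    id = Column(Integer, primary_key=True)
    measurements = relationship(
        "Measurement", back_populates="channel", lazy="dynamic"
    )

    @property
    def latest_measurement(self):
        # The ordering and the single-row limit are applied in the database;
        # the result is None when the channel has no measurements.
        return self.measurements.order_by(Measurement.id.desc()).first()


class Measurement(Base):
    __tablename__ = "measurement"
    id = Column(Integer, primary_key=True)
    # Assumed foreign key; not shown in the question's model.
    channel_id = Column(Integer, ForeignKey("channel.id"), nullable=False)
    timestamp = Column(DateTime, nullable=False)
    value = Column(Float, nullable=False)
    channel = relationship("Channel", back_populates="measurements")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    channel = Channel()
    session.add(channel)
    session.flush()  # assigns channel.id

    start = datetime(2020, 1, 1)
    session.add_all(
        Measurement(
            channel_id=channel.id,
            timestamp=start + timedelta(seconds=i),
            value=float(i),
        )
        for i in range(5)
    )
    session.commit()

    latest_value = channel.latest_measurement.value

print(latest_value)
```

Passing echo=True to create_engine should show a single SELECT with ORDER BY and LIMIT for the property access, rather than a load of every row.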
