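I have an SQL query that selects from two inner-joined tables. Executing the select statement takes about 50 seconds. However, fetchall() takes 788 seconds and only fetches 981 results. This is the query and the fetchall code: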

time0 = time.time()
self.cursor.execute("SELECT spectrum_id, feature_table_id "+
                    "FROM spectrum AS s "+
                    "INNER JOIN feature AS f "+
                    "ON f.msrun_msrun_id = s.msrun_msrun_id "+
                    "INNER JOIN (SELECT feature_feature_table_id, min(rt) AS rtMin, max(rt) AS rtMax, min(mz) AS mzMin, max(mz) as mzMax "+
                                 "FROM convexhull GROUP BY feature_feature_table_id) AS t "+
                    "ON t.feature_feature_table_id = f.feature_table_id "+
                    "WHERE s.msrun_msrun_id = ? "+
                    "AND s.scan_start_time >= t.rtMin "+
                    "AND s.scan_start_time <= t.rtMax "+
                    "AND base_peak_mz >= t.mzMin "+
                    "AND base_peak_mz <= t.mzMax", spectrumFeature_InputValues)
print 'query took:',time.time()-time0,'seconds'

time0 = time.time()
spectrumAndFeature_ids = self.cursor.fetchall()      
print time.time()-time0,'seconds since to fetchall'

Is there any reason why fetchall takes this long?


Update

Doing:

while 1:
    info = self.cursor.fetchone()
    if info:
        <do something>
    else:
        break

is just as slow as

allInfo = self.cursor.fetchall()         
for info in allInfo:
    <do something>

1 Answer

By default, fetchall() is as slow as looping over fetchone(), because the arraysize of the Cursor object is set to 1.

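To speed things up you can loop over fetchmany(), but to see a performance gain you need to pass it a size parameter larger than 1; otherwise it fetches "many" in batches of arraysize, i.e. 1.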
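A minimal sketch of such a loop, using an in-memory sqlite3 database as a stand-in for the real one. The table `t`, the helper name `iter_rows`, and the batch size of 500 are made up for illustration; whether batching actually helps will depend on the driver and the query.

```python
import sqlite3

def iter_rows(cursor, batch_size=500):
    """Yield rows one at a time while fetching them in batches via fetchmany()."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:  # an empty list means the result set is exhausted
            break
        for row in rows:
            yield row

# Hypothetical demo data: a single-column table with 981 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(981)])

cu = conn.cursor()
cu.execute("SELECT n FROM t ORDER BY n")
fetched = list(iter_rows(cu))
print(len(fetched))
```

The generator keeps the calling code shaped like the original `for info in allInfo:` loop, so only the fetching strategy changes.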

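It is quite possible that you can get the performance gain simply by raising the value of arraysize, but I have no experience doing this, so you may want to experiment with it first by doing something like: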

>>> import sqlite3
>>> conn = sqlite3.connect(":memory:")
>>> cu = conn.cursor()
>>> cu.arraysize
1
>>> cu.arraysize = 10
>>> cu.arraysize
10
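To take that experiment one step further, the sketch below checks that fetchmany() called without a size argument returns arraysize rows per call. The table `t` and its 25 rows are invented for the demonstration; it says nothing about timing, only about batch sizes.

```python
import sqlite3

# Hypothetical demo data: 25 rows in an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(25)])

cu = conn.cursor()
cu.execute("SELECT n FROM t ORDER BY n")

# No size given: fetchmany() falls back to cu.arraysize, which defaults to 1.
default_batch = cu.fetchmany()

# After raising arraysize, the same call returns up to 10 rows.
cu.arraysize = 10
bigger_batch = cu.fetchmany()

print(len(default_batch), len(bigger_batch))
```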

More on the above: http://docs.python.org/library/sqlite3.html#sqlite3.Cursor.fetchmany

Answered 2012-04-26T20:50:53.780