19

I have been using the following function to produce a (supposedly) "more readable" format for data fetched from Oracle. Here is the function:

def rows_to_dict_list(cursor):
    """ 
    Create a list, each item contains a dictionary outlined like so:
    { "col1_name" : col1_data }
    Each item in the list is technically one row of data with named columns,
    represented as a dictionary object
    For example:
    list = [
        {"col1":1234567, "col2":1234, "col3":123456, "col4":BLAH},
        {"col1":7654321, "col2":1234, "col3":123456, "col4":BLAH}
    ]
    """

    # Get all the column names of the query.
    # Each column name corresponds to the row index
    # 
    # cursor.description returns a list of tuples, 
    # with the 0th item in the tuple being the actual column name.
    # everything after i[0] is just misc Oracle info (e.g. datatype, size)
    columns = [i[0] for i in cursor.description]

    new_list = []
    for row in cursor:
        row_dict = dict()
        for col in columns:
            # Create a new dictionary with field names as the key, 
            # row data as the value.
            #
            # Then add this dictionary to the new_list
            row_dict[col] = row[columns.index(col)]

        new_list.append(row_dict)
    return new_list

I would then use the function like this:

sql = "Some kind of SQL statement"
curs.execute(sql)
data = rows_to_dict_list(curs)
#
for row in data:
    item1 = row["col1"]
    item2 = row["col2"]
    # Do stuff with item1, item2, etc...
    # You don't necessarily have to assign them to variables,
    # but you get the idea.

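While this seems to perform reasonably well under varying levels of load, I'm wondering whether there is a more efficient or more "Pythonic" way of doing this.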

6 Answers

28

There are other improvements to make, but this one really jumped out at me:

    for col in columns:
        # Create a new dictionary with field names as the key, 
        # row data as the value.
        #
        # Then add this dictionary to the new_list
        row_dict[col] = row[columns.index(col)]

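Besides being inefficient (each call to index rescans the list from the start), using index here is error-prone, at least in cases where the same item may appear twice in the list. Use enumerate instead: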

    for i, col in enumerate(columns):
        # Create a new dictionary with field names as the key, 
        # row data as the value.
        #
        # Then add this dictionary to the new_list
        row_dict[col] = row[i]
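
To make the duplicate-name problem concrete, here is a minimal sketch. The column names and row values are made up for illustration; they stand in for a query that selects two columns with the same name.

```python
# Hypothetical data: the same column name appears twice in the result.
columns = ["ID", "NAME", "ID"]
row = (1, "Ann", 2)

# list.index always returns the position of the FIRST match,
# so the third column is read from position 0 instead of position 2.
via_index = [row[columns.index(col)] for col in columns]

# enumerate pairs each name with its own position.
via_enumerate = [row[i] for i, col in enumerate(columns)]

print(via_index)      # [1, 'Ann', 1]
print(via_enumerate)  # [1, 'Ann', 2]
```

Note that a dict can still hold only one value per key, so with duplicate names the later column overwrites the earlier one in either version. Aliasing the columns in the SQL avoids that.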

But that's small potatoes, really. Here is a much more compact version of the function:

def rows_to_dict_list(cursor):
    columns = [i[0] for i in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]

Let me know if that works.
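
One way to try the compact version without an Oracle instance is to run it against any DB API 2 cursor. The sketch below uses the standard library's sqlite3 as a stand-in; that choice, the table `t`, and its contents are assumptions made only for this example. Only cursor.description and cursor iteration are relied on, both of which cx_Oracle cursors also provide.

```python
import sqlite3

def rows_to_dict_list(cursor):
    columns = [i[0] for i in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]

# sqlite3 stands in for Oracle here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
curs = conn.cursor()
curs.execute("CREATE TABLE t (col1 INTEGER, col2 TEXT)")
curs.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

curs.execute("SELECT col1, col2 FROM t ORDER BY col1")
data = rows_to_dict_list(curs)
print(data)  # [{'col1': 1, 'col2': 'a'}, {'col1': 2, 'col2': 'b'}]
conn.close()
```

Against Oracle the keys would normally come back in uppercase (for example "COL1"), because Oracle folds unquoted identifiers to uppercase.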

answered 2012-05-04T20:53:25.743
10

For a clean way to avoid the memory cost of dumping everything into a list up front, you can wrap the cursor in a generator function:

def rows_as_dicts(cursor):
    """ returns cx_Oracle rows as dicts """
    colnames = [i[0] for i in cursor.description]
    for row in cursor:
        yield dict(zip(colnames, row))

Then use it as follows. Rows from the cursor are converted to dicts as you iterate:

for row in rows_as_dicts(cursor):
    item1 = row["col1"]
    item2 = row["col2"]
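
The laziness can be seen in a small sketch. It uses sqlite3 from the standard library as a stand-in for cx_Oracle, and the table `t` with 1000 rows is made up for the example. Only three dicts are ever built, and the cursor is left positioned on the fourth row.

```python
import sqlite3
from itertools import islice

def rows_as_dicts(cursor):
    colnames = [i[0] for i in cursor.description]
    for row in cursor:
        yield dict(zip(colnames, row))

# sqlite3 stands in for Oracle here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
cursor.executemany("INSERT INTO t VALUES (?, ?)",
                   [(n, n * n) for n in range(1000)])
cursor.execute("SELECT col1, col2 FROM t ORDER BY col1")

# Take only the first three rows; the generator is not run any further.
first_three = list(islice(rows_as_dicts(cursor), 3))
print(first_three)
# [{'col1': 0, 'col2': 0}, {'col1': 1, 'col2': 1}, {'col1': 2, 'col2': 4}]

# The underlying cursor has advanced by exactly three rows.
next_row = cursor.fetchone()
print(next_row)  # (3, 9)
conn.close()
```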
answered 2013-08-29T20:09:13.133
4

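You shouldn't use a dict for large result sets, because the memory usage will be huge. I use cx_Oracle a lot, and not having a nice dictionary cursor bothered me enough to write a module for it. I also have to connect Python to many different databases, so I wrote it in a way that works with any DB API 2 connector.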

It is available on PyPI: DBMS - DataBases Made Simpler

>>> import dbms
>>> db = dbms.OraConnect('myUser', 'myPass', 'myInstance')
>>> cur = db.cursor()
>>> cur.execute('SELECT * FROM people WHERE id = :id', {'id': 1123})
>>> row = cur.fetchone()
>>> row['last_name']
Bailey
>>> row.last_name
Bailey
>>> row[3]
Bailey
>>> row[0:4]
[1123, 'Scott', 'R', 'Bailey']
answered 2013-08-08T21:48:34.237
0

Assuming the cursor "Cursor" is already defined and raring to go:

byCol = {cl:i for i,(cl,type, a, b, c,d,e) in enumerate(Cursor.description)}

Then you can go:

for row in Cursor: column_of_interest = row[byCol["COLUMN_NAME_OF_INTEREST"]]

Not as clean and smooth as having the system handle it for you, but not horrible.
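
A minimal sketch of this name-to-index lookup, runnable without a database. The `description` list and `rows` below are hypothetical stand-ins for Cursor.description and the fetched rows. The comprehension reads only element 0 of each description entry rather than unpacking all seven fields, which is a small variation on the answer's code, not the answer's exact form.

```python
# Hypothetical stand-in for Cursor.description: DB API 2 describes each
# column with a 7-item sequence whose first item is the column name.
description = [
    ("ID", None, None, None, None, None, None),
    ("LAST_NAME", None, None, None, None, None, None),
]
# Hypothetical stand-in for the rows the cursor would yield.
rows = [(1123, "Bailey"), (1124, "Scott")]

# Map each column name to its position in the row tuple.
byCol = {d[0]: i for i, d in enumerate(description)}

last_names = [row[byCol["LAST_NAME"]] for row in rows]
print(byCol)       # {'ID': 0, 'LAST_NAME': 1}
print(last_names)  # ['Bailey', 'Scott']
```

The rows stay plain tuples, so no per-row dict is allocated. The cost is one extra dict lookup on each access.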

answered 2016-10-05T01:29:02.917
0

Create a dictionary that maps column names to positions:

cols=dict()
for col, desc in enumerate(cur.description):
    cols[desc[0]] = col

Access:

for result in cur:
    print(result[cols['COL_NAME']])
answered 2017-01-24T00:41:09.020
0

I have a better one:

import cx_Oracle

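def makedict(cursor):
    """Return a row factory that converts each cx_Oracle row to a dict."""
    # Column names are the first item of each cursor.description entry.
    cols = [d[0] for d in cursor.description]

    def createrow(*args):
        # Called once per row with the column values as positional arguments.
        return dict(zip(cols, args))

    return createrow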

db = cx_Oracle.connect('user', 'pw', 'host')
cursor = db.cursor()
rs = cursor.execute('SELECT * FROM Tablename')
cursor.rowfactory = makedict(cursor)
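
Because rowfactory needs a live Oracle connection, the sketch below calls the factory by hand to show what each fetched row would look like. The `fake_cursor` object and its column names are hypothetical stand-ins. The only assumption about cx_Oracle is the one the `*args` signature above already makes, namely that the column values arrive as positional arguments.

```python
from types import SimpleNamespace

def makedict(cursor):
    """Return a row factory that converts each row to a dict."""
    cols = [d[0] for d in cursor.description]

    def createrow(*args):
        return dict(zip(cols, args))

    return createrow

# Hypothetical stand-in for a cursor that has already executed a query.
fake_cursor = SimpleNamespace(description=[("ID",), ("LAST_NAME",)])

createrow = makedict(fake_cursor)
row = createrow(1123, "Bailey")  # what the driver would do for each row
print(row)  # {'ID': 1123, 'LAST_NAME': 'Bailey'}
```

With a real cursor, once rowfactory is set, each subsequent fetch (fetchone, fetchall, or iterating the cursor) should return these dicts instead of tuples.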
answered 2017-05-26T04:31:13.327