5

In a previous program, I read data from a csv file like this:

AllData = np.genfromtxt(open("PSECSkew.csv", "rb"),
                        delimiter=',',
                        dtype=[('CalibrationDate', datetime),('Expiry', datetime), ('B0', float), ('B1', float), ('B2', float), ('ATMAdjustment', float)],
                        converters={0: ConvertToDate, 1: ConvertToDate})

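I'm now writing a very similar program, and I want to end up with a data structure very much like AllData (except that this time the floats will all be in a csv string), but sourced from SQL Server instead of a csv file. What is the best approach?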

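pyodbc looks like it involves heavy use of cursors, which I'm unfamiliar with and would like to avoid. I just want to run the query and get the data back in the structure above (or something like a DataTable in C#).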

4

4 Answers

4

Here is a minimal example, based on the other question you linked to:

import pyodbc
import numpy

conn = pyodbc.connect('DRIVER={SQL Server};SERVER=MyServer;Trusted_Connection=yes;')
cur = conn.cursor()
cur.execute('select object_id from sys.objects')
results = cur.fetchall()
results_as_list = [i[0] for i in results]
array = numpy.fromiter(results_as_list, dtype=numpy.int32)
print(array)
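
The example above pulls a single integer column. As a rough sketch of how the same idea might extend to several columns, the rows can be converted to plain tuples and handed to `numpy.array` with a structured dtype. The sample rows below are invented stand-ins for what `cur.fetchall()` would return, and the field names are borrowed from the question; this is one possible approach under those assumptions, not the only one.

```python
import numpy as np
from datetime import datetime

# Invented sample rows standing in for cur.fetchall(); real rows would come from the database.
rows = [
    (datetime(2013, 5, 1), datetime(2013, 6, 21), 0.25, -0.10, 0.03, 0.001),
    (datetime(2013, 5, 1), datetime(2013, 9, 20), 0.22, -0.08, 0.02, 0.002),
]

# Same field layout as the AllData array in the question (datetime becomes an object field).
dtype = [('CalibrationDate', datetime), ('Expiry', datetime),
         ('B0', float), ('B1', float), ('B2', float), ('ATMAdjustment', float)]

# NumPy treats each tuple as one record, so every row is converted with tuple() first.
all_data = np.array([tuple(r) for r in rows], dtype=dtype)

print(all_data['B0'])
```

Fields can then be read by name, for example `all_data['Expiry']`.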
answered 2013-05-08T13:10:50.937
3

In the meantime, a better way has come along. Check out the turbodbc package. To convert a result set into an OrderedDict of NumPy arrays, just do this:

import turbodbc
connection = turbodbc.connect(dsn="My data source name")
cursor = connection.cursor()
cursor.execute("SELECT 42")
results = cursor.fetchallnumpy()

It should also be much faster than pyodbc (depending on your database, a factor of 10 is entirely possible).
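
If a single structured array like the question's AllData is wanted rather than one array per column, the columns could be copied into it field by field. The sketch below uses a hand-built OrderedDict with made-up values as a stand-in for the result of `fetchallnumpy()`, so it runs without a database; the exact array types turbodbc returns are not checked here and may differ (for example, masked arrays).

```python
from collections import OrderedDict
import numpy as np

# Hand-built stand-in for cursor.fetchallnumpy(): one NumPy array per column (values are made up).
columns = OrderedDict([
    ('B0', np.array([0.25, 0.22])),
    ('B1', np.array([-0.10, -0.08])),
    ('B2', np.array([0.03, 0.02])),
])

# Derive a structured dtype from the column names and each array's own dtype.
dtype = [(name, col.dtype) for name, col in columns.items()]
num_rows = len(next(iter(columns.values())))

all_data = np.empty(num_rows, dtype=dtype)
for name, col in columns.items():
    all_data[name] = col  # copy one whole column into its field

print(all_data[0])
```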

answered 2016-08-01T06:34:04.500
2

How about using pandas? For example:

import psycopg2
import pandas

try:
    con = psycopg2.connect(
        host="host",
        database="innovate",
        user="username",
        password="password",
    )
except psycopg2.Error:
    print("Could not connect to database.")
    raise  # without a connection the query below cannot run

data = pandas.read_sql_query("SELECT * FROM table", con)
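
To get from the DataFrame to something close to the question's structured array, `DataFrame.to_records` may be enough. The sketch below uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; the table contents and the reduced column set are invented for illustration, and a real SQL Server connection would replace `con`.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database as a stand-in for the real server connection.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE PSECSkew (CalibrationDate TEXT, Expiry TEXT, B0 REAL, B1 REAL)')
con.executemany('INSERT INTO PSECSkew VALUES (?, ?, ?, ?)', [
    ('2013-05-01', '2013-06-21', 0.25, -0.10),
    ('2013-05-01', '2013-09-20', 0.22, -0.08),
])

df = pd.read_sql_query(
    'SELECT * FROM PSECSkew ORDER BY CalibrationDate, Expiry',
    con,
    parse_dates=['CalibrationDate', 'Expiry'],  # turn the text columns into datetimes
)

# to_records returns a NumPy record array with one named field per column.
all_data = df.to_records(index=False)
print(all_data.dtype.names)
con.close()
```

No explicit cursor handling is needed on this route, which seems to match what the question asks for.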
answered 2017-11-14T10:46:05.810
0

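In the end I just used pyodbc and iterated over the cursor/result set, putting each result into a manually constructed structured array after a lot of trial and error. If there is a more direct way, I'm all ears!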

import numpy as np
import pyodbc as SQL
from datetime import datetime


cxn = SQL.connect('Driver={SQL Server};Server=myServer; Database=myDB; UID=myUserName; PWD=myPassword')
c = cxn.cursor()

#Work out how many rows the query returns in order to initialise the structured array with the correct number of rows
num_rows = c.execute('SELECT count(*) FROM PSECSkew').fetchone()[0]

#Create the structured array
AllData = np.zeros(num_rows, dtype=[('CalibrationDate', datetime),('Expiry', datetime), ('B0', float), ('B1', float), ('B2', float), ('ATMAdjustment', float)])

ConvertToDate = lambda s:datetime.strptime(s,"%Y-%m-%d")

# Iterate using the cursor and fill the structured array.
for r, row in enumerate(c.execute('SELECT * FROM PSECSkew ORDER BY CalibrationDate, Expiry')):
    # If you don't need to manipulate the data (i.e. convert the dates, as here), tuple(row) would suffice.
    AllData[r] = (ConvertToDate(row[0]), ConvertToDate(row[1])) + tuple(row[2:])
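
The question mentions that the floats arrive as a csv string this time. A loop like the one above could be adapted by splitting that string before building each record. The row layout below (two ISO date strings followed by one comma-separated string of four floats) is an assumption made for illustration, as are the sample values and the helper name `to_record`. The sketch also swaps the `datetime` object fields for `datetime64[D]`, which is a change from the code above.

```python
import numpy as np

# Assumed row layout: two ISO date strings plus one comma-separated string of floats.
rows = [
    ('2013-05-01', '2013-06-21', '0.25,-0.10,0.03,0.001'),
    ('2013-05-01', '2013-09-20', '0.22,-0.08,0.02,0.002'),
]

dtype = [('CalibrationDate', 'datetime64[D]'), ('Expiry', 'datetime64[D]'),
         ('B0', float), ('B1', float), ('B2', float), ('ATMAdjustment', float)]

def to_record(row):
    """Turn (date, date, 'b0,b1,b2,atm') into a flat 6-tuple matching dtype."""
    calibration, expiry, csv_floats = row
    floats = tuple(float(x) for x in csv_floats.split(','))
    return (np.datetime64(calibration, 'D'), np.datetime64(expiry, 'D')) + floats

all_data = np.array([to_record(r) for r in rows], dtype=dtype)

# datetime64 fields support vectorised date arithmetic.
days_to_expiry = (all_data['Expiry'] - all_data['CalibrationDate']).astype(int)
print(days_to_expiry)
```

Building the list first also avoids the separate `count(*)` query, at the cost of holding all rows in memory before the array is created.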
answered 2013-05-08T15:39:56.540