1

我正在尝试将时间序列数据从 传递Pythonq/kdb+

一种解决方案是qPythonmoduleq ,提供从表格/字典到 Pandas的无缝转换。

问题是当试图Pandas 传递到时, (在列中)q的时间索引并没有完全放在一边。可重现的代码:DataFrameDateq

import pandas.io.data as web
import datetime
import numpy
import qpython.qconnection as qconnection # requires installation of qPython module from https://github.com/exxeleron/qPython

start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2015, 2, 6)
f=web.DataReader("F", 'yahoo', start, end) # download Ford stock data (ticker "F") from Yahoo Finance web service
f.ix[:5]  # explore first 5 rows of the DataFrame
# Out:
#             Open  High  Low  Close    Volume  Adj Close
#    Date
# 2010-01-04 10.17 10.28 10.05 10.28  60855800       9.43 
# 2010-01-05 10.45 11.24 10.40 10.96 215620200      10.05
# 2010-01-06 11.21 11.46 11.13 11.37 200070600      10.43
# 2010-01-07 11.46 11.69 11.32 11.66 130201700      10.69
# 2010-01-08 11.67 11.74 11.46 11.69 130463000      10.72

q = qconnection.QConnection(host = 'localhost', port = 5000, pandas = True) # define connection interface parameters. Assumes we have previously started q server on port 5000 with `q.exe -p 5000` command
q.open() # open connection
q('set', numpy.string_('yahoo'), f) # pass DataFrame to q table named `yahoo`
q('5#yahoo') # display top 5 rows from newly created table on q server 
# Out:
#    Open  High  Low  Close    Volume  Adj Close
# 0 10.17 10.28 10.05 10.28  60855800       9.43 
# 1 10.45 11.24 10.40 10.96 215620200      10.05
# 2 11.21 11.46 11.13 11.37 200070600      10.43
# 3 11.46 11.69 11.32 11.66 130201700      10.69
# 4 11.67 11.74 11.46 11.69 130463000      10.72

如您所见, q 表没有DataFrameDate中存在的列作为索引。f

如何有效地(对于大数据)将日期时间索引传递给 q?

4

2 回答 2

2

在序列化DataFrame对象时qPython检查meta属性的存在。如果该属性不存在,DataFrame则序列化为 q 表,并在此过程中跳过索引列。如果要保留索引列,则必须设置meta属性并提供类型提示以强制表示 aq 键控表。

请看一下修改后的示例:

import pandas.io.data as web
import datetime
import numpy
import qpython.qconnection as qconnection # requires installation of qPython module from https://github.com/exxeleron/qPython

from qpython import MetaData
from qpython.qtype import QKEYED_TABLE


start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2015, 2, 6)
f=web.DataReader("F", 'yahoo', start, end) # download Ford stock data (ticker "F") from Yahoo Finance web service
f.ix[:5]  # explore first 5 rows of the DataFrame
# Out:
#             Open  High  Low  Close    Volume  Adj Close
#    Date
# 2010-01-04 10.17 10.28 10.05 10.28  60855800       9.43 
# 2010-01-05 10.45 11.24 10.40 10.96 215620200      10.05
# 2010-01-06 11.21 11.46 11.13 11.37 200070600      10.43
# 2010-01-07 11.46 11.69 11.32 11.66 130201700      10.69
# 2010-01-08 11.67 11.74 11.46 11.69 130463000      10.72

q = qconnection.QConnection(host = 'localhost', port = 5000, pandas = True) # define connection interface parameters. Assumes we have previously started q server on port 5000 with `q.exe -p 5000` command
q.open() # open connection
f.meta = MetaData(**{'qtype': QKEYED_TABLE}) # enforce to serialize DataFrame as keyed table
q('set', numpy.string_('yahoo'), f) # pass DataFrame to q table named `yahoo`
q('5#yahoo') # display top 5 rows from newly created table on q server 
# Out:
#              Open   High    Low  Close     Volume  Adj Close
# Date                                                         
# 2010-01-04  10.17  10.28  10.05  10.28   60855800       9.43
# 2010-01-05  10.45  11.24  10.40  10.96  215620200      10.05
# 2010-01-06  11.21  11.46  11.13  11.37  200070600      10.43
# 2010-01-07  11.46  11.69  11.32  11.66  130201700      10.69
# 2010-01-08  11.67  11.74  11.46  11.69  130463000      10.72
于 2015-02-07T23:13:43.603 回答
0

您是否考虑过尝试其他 API 之一? http://www.timestored.com/kdb-guides/python-api 我也会在他们的 github 页面上报告这个问题。

于 2015-02-07T18:51:27.480 回答