2

我正在阅读这样的 CSV 文件

Date,Open,High,Low,Close,Volume,Adj Close
2000-12-29,30.88,31.31,28.69,29.06,31702200,27.57
2000-12-28,30.56,31.62,30.38,31.06,25053600,29.46
2000-12-27,30.38,31.06,29.38,30.69,26437500,29.11
2000-12-26,31.50,32.19,30.00,30.94,20589500,29.34
2000-12-22,30.38,31.98,30.00,31.88,35568200,30.23
2000-12-21,27.81,30.25,27.31,29.50,46719700,27.98
2000-12-20,28.06,29.81,27.50,28.50,54440500,27.03
2000-12-19,31.81,33.12,30.12,30.62,58653700,29.05
...
2000-01-13,108.50,109.88,103.50,105.06,55779200,24.91
2000-01-12,112.25,112.25,103.69,105.62,83443600,25.05
2000-01-11,112.62,114.75,109.50,112.38,86585200,26.65
2000-01-10,108.00,116.00,105.50,115.75,91518000,27.45
2000-01-07,95.00,103.50,93.56,103.38,91755600,24.51
2000-01-06,100.16,105.00,94.69,96.00,109880000,22.76
2000-01-05,101.62,106.38,96.00,102.00,166054000,24.19
2000-01-04,115.50,118.62,105.00,107.69,116824800,25.54
2000-01-03,124.62,125.19,111.62,118.12,98114800,28.01

可以使用下载完整数据

python -c "from pyalgotrade.tools import yahoofinance; yahoofinance.download_daily_bars('orcl', 2000, 'orcl-2000.csv')"

http://gbeced.github.io/pyalgotrade/docs/v0.15/html/tutorial.html

read_csv我尝试使用 Python、Pandas、SQLAlchemy和以下方法将此 CSV 数据放入 MySQL 数据库to_sql

filename = "orcl-2000.csv"
df = pd.read_csv(filename, sep=',')
db_uri = "mysql+mysqlconnector://{user}:{password}@{host}:{port}/{db}" # or without mysqlconnector (need MySQLdb)
db_uri = db_uri.format(
    user = "root",
    password = "123456",
    host = "127.0.0.1",
    db = "test",
    port = 3306
)
engine = sqlalchemy.create_engine(db_uri)
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

print(df)
print(df.dtypes)
print(type(df.index), df.index.dtype)
print(type(df.index[0]))

df.to_sql("test_table", engine, flavor="mysql", if_exists="replace")

在这里查看完整代码

我得到以下输出:

$ python main.py
          Date    Open    High     Low   Close     Volume  Adj Close
0   2000-12-29   30.88   31.31   28.69   29.06   31702200      27.57
1   2000-12-28   30.56   31.62   30.38   31.06   25053600      29.46
2   2000-12-27   30.38   31.06   29.38   30.69   26437500      29.11
3   2000-12-26   31.50   32.19   30.00   30.94   20589500      29.34
4   2000-12-22   30.38   31.98   30.00   31.88   35568200      30.23
5   2000-12-21   27.81   30.25   27.31   29.50   46719700      27.98
6   2000-12-20   28.06   29.81   27.50   28.50   54440500      27.03
7   2000-12-19   31.81   33.12   30.12   30.62   58653700      29.05
8   2000-12-18   30.00   32.44   29.94   32.00   61640100      30.35
9   2000-12-15   29.44   30.08   28.19   28.56  120004000      27.09
10  2000-12-14   29.25   29.94   27.25   27.50   45894400      26.08
11  2000-12-13   31.94   32.00   28.25   28.38   37933600      26.91
12  2000-12-12   31.88   32.50   30.41   30.75   26481200      29.17
13  2000-12-11   30.50   32.25   30.00   31.94   50279700      30.29
14  2000-12-08   30.06   30.62   29.25   30.06   40052600      28.51
15  2000-12-07   29.62   29.94   28.12   28.31   41088300      26.85
16  2000-12-06   31.19   31.62   29.31   30.19   42125600      28.63
17  2000-12-05   29.44   31.50   28.88   31.50   59754700      29.88
18  2000-12-04   26.25   28.88   26.19   28.19   40710400      26.74
19  2000-12-01   26.38   27.88   25.50   26.44   48663500      25.08
20  2000-11-30   21.75   27.62   21.50   26.50   84386200      25.14
21  2000-11-29   23.19   23.62   21.81   22.88   75409600      21.70
22  2000-11-28   23.50   23.81   22.25   22.66   43075300      21.49
23  2000-11-27   25.44   25.81   22.88   23.12   45665200      21.93
24  2000-11-24   23.31   24.25   23.12   24.12   22443900      22.88
25  2000-11-22   23.62   24.06   22.06   22.31   53315300      21.16
26  2000-11-21   24.81   25.62   23.50   23.88   58647400      22.65
27  2000-11-20   24.31   25.88   24.00   24.75   89778400      23.48
28  2000-11-17   26.94   29.25   25.25   28.81   59636000      27.33
29  2000-11-16   28.75   29.81   27.25   27.38   37986600      25.96
..         ...     ...     ...     ...     ...        ...        ...
222 2000-02-14   60.88   62.25   58.62   62.19   37599800      29.49
223 2000-02-11   62.50   64.75   58.75   59.69   55774000      28.31
224 2000-02-10   60.00   62.62   58.00   62.31   45288600      29.55
225 2000-02-09   60.06   61.31   58.81   59.94   52471600      28.43
226 2000-02-08   60.75   61.44   59.00   59.56   55718000      28.25
227 2000-02-07   59.31   60.00   58.88   59.94   44691200      28.43
228 2000-02-04   57.62   58.25   56.81   57.81   40916000      27.42
229 2000-02-03   55.38   57.00   54.25   56.69   55533200      26.88
230 2000-02-02   54.94   56.00   54.00   54.31   63933000      25.76
231 2000-02-01   51.25   54.31   50.00   54.00   57105600      25.61
232 2000-01-31   47.94   50.12   47.06   49.95   68148000      23.69
233 2000-01-28   51.50   51.94   46.62   47.38   86394000      22.47
234 2000-01-27   55.81   56.69   50.00   51.81   61054000      24.57
235 2000-01-26   56.75   58.94   55.00   55.06   47569200      26.11
236 2000-01-25   55.06   57.50   54.88   56.44   53059200      26.77
237 2000-01-24   60.25   60.38   54.00   54.19   50022400      25.70
238 2000-01-21   61.50   61.50   59.00   59.69   50891000      28.31
239 2000-01-20   59.00   60.25   58.12   59.25   54526800      28.10
240 2000-01-19   56.12   58.25   54.00   57.12   49198400      27.09
241 2000-01-18  107.88  114.50  105.62  111.25   66780000      26.38
242 2000-01-14  109.00  111.38  104.75  106.81   57078000      25.33
243 2000-01-13  108.50  109.88  103.50  105.06   55779200      24.91
244 2000-01-12  112.25  112.25  103.69  105.62   83443600      25.05
245 2000-01-11  112.62  114.75  109.50  112.38   86585200      26.65
246 2000-01-10  108.00  116.00  105.50  115.75   91518000      27.45
247 2000-01-07   95.00  103.50   93.56  103.38   91755600      24.51
248 2000-01-06  100.16  105.00   94.69   96.00  109880000      22.76
249 2000-01-05  101.62  106.38   96.00  102.00  166054000      24.19
250 2000-01-04  115.50  118.62  105.00  107.69  116824800      25.54
251 2000-01-03  124.62  125.19  111.62  118.12   98114800      28.01

[252 rows x 7 columns]
Date         datetime64[ns]
Open                float64
High                float64
Low                 float64
Close               float64
Volume                int64
Adj Close           float64
dtype: object
(<class 'pandas.tseries.index.DatetimeIndex'>, dtype('<M8[ns]'))
<class 'pandas.tslib.Timestamp'>
Traceback (most recent call last):
  File "main.py", line 28, in <module>
    main()
  File "main.py", line 25, in main
    df.to_sql("test_table", engine, flavor="mysql", if_exists="replace")
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 950, in to_sql
    index_label=index_label)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 475, in to_sql
    index_label=index_label)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 842, in to_sql
    table.insert()
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 611, in insert
    self.pd_sql.execute(ins, data_list)
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 810, in execute
    return self.engine.execute(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1614, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 662, in execute
    params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 761, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 874, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1024, in _handle_dbapi_exception
    exc_info
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 196, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 856, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 321, in do_executemany
    cursor.executemany(statement, parameters)
  File "/usr/lib/python2.7/dist-packages/mysql/connector/cursor.py", line 557, in executemany
    values.append(fmt % self._process_params(params))
  File "/usr/lib/python2.7/dist-packages/mysql/connector/cursor.py", line 344, in _process_params
    return self._process_params_dict(params)
  File "/usr/lib/python2.7/dist-packages/mysql/connector/cursor.py", line 335, in _process_params_dict
    "Failed processing pyformat-parameters; %s" % err)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) Failed processing pyformat-parameters; 'MySQLConverter' object has no attribute '_timestamp_to_mysql' u'INSERT INTO test_table (`index`, `Date`, `Open`, `High`, `Low`, `Close`, `Volume`, `Adj Close`) VALUES (%(index)s, %(Date)s, %(Open)s, %(High)s, %(Low)s, %(Close)s, %(Volume)s, %(Adj Close)s)' ({'index': 0, 'High': 31.31, 'Adj Close': 27.57, 'Volume': 31702200, 'Low': 28.69, 'Date': Timestamp('2000-12-29 00:00:00'), 'Close': 29.06, 'Open': 30.88}, {'index': 1, 'High': 31.62, 'Adj Close': 29.46, 'Volume': 25053600, 'Low': 30.38, 'Date': Timestamp('2000-12-28 00:00:00'), 'Close': 31.06, 'Open': 30.56}, {'index': 2, 'High': 31.06, 'Adj Close': 29.11, 'Volume': 26437500, 'Low': 29.38, 'Date': Timestamp('2000-12-27 00:00:00'), 'Close': 30.69, 'Open': 30.38}, {'index': 3, 'High': 32.19, 'Adj Close': 29.34, 'Volume': 20589500, 'Low': 30.0, 'Date': Timestamp('2000-12-26 00:00:00'), 'Close': 30.94, 'Open': 31.5}, {'index': 4, 'High': 31.98, 'Adj Close': 30.23, 'Volume': 35568200, 'Low': 30.0, 'Date': Timestamp('2000-12-22 00:00:00'), 'Close': 31.88, 'Open': 30.38}, {'index': 5, 'High': 30.25, 'Adj Close': 27.98, 'Volume': 46719700, 'Low': 27.31, 'Date': Timestamp('2000-12-21 00:00:00'), 'Close': 29.5, 'Open': 27.81}, {'index': 6, 'High': 29.81, 'Adj Close': 27.03, 'Volume': 54440500, 'Low': 27.5, 'Date': Timestamp('2000-12-20 00:00:00'), 'Close': 28.5, 'Open': 28.06}, {'index': 7, 'High': 33.12, 'Adj Close': 29.05, 'Volume': 58653700, 'Low': 30.12, 'Date': Timestamp('2000-12-19 00:00:00'), 'Close': 30.62, 'Open': 31.81}  ... displaying 10 of 252 total bound parameter sets ...  {'index': 250, 'High': 118.62, 'Adj Close': 25.54, 'Volume': 116824800, 'Low': 105.0, 'Date': Timestamp('2000-01-04 00:00:00'), 'Close': 107.69, 'Open': 115.5}, {'index': 251, 'High': 125.19, 'Adj Close': 28.01, 'Volume': 98114800, 'Low': 111.62, 'Date': Timestamp('2000-01-03 00:00:00'), 'Close': 118.12, 'Open': 124.62})

Date类型为datetime64[ns]. SQLAlchemy 似乎不喜欢这种 Numpy 类型,所以它提出了:

sqlalchemy.exc.ProgrammingError: (ProgrammingError) Failed processing pyformat-parameters; 'MySQLConverter' object has no attribute '_timestamp_to_mysql'

我怎样才能完全避免这种错误?

4

2 回答 2

2

我对 MySQL 连接器不是很熟悉,但根据这个,您应该能够使用类似的东西添加 datetime64 转换器

class Datetime64Converter(mysql.connector.conversion.MySQLConverter):
    """ A mysql.connector Converter that handles datetime64 types """

    def _timestamp_to_mysql(self, value):
        return value.view('<i8')

config = {
    'user'    : 'user',
    'host'    : 'localhost',
    'password': 'xxx',
    'database': 'db1'}

conn = mysql.connector.connect(**config)
conn.set_converter_class(Datetime64Converter)

由于所有 datetime64 dtypes 都是 8 字节,它们可以被查看和存储为 8 字节整数。我不确定 MySQL 连接器为将数据拉回为datetime64s 提供了哪些工具。但如果一切都失败了,您可以将 8 字节整数转换回datetime64[ns]如下所示:

In [33]: s.view('<i8')
Out[33]: 
0    978307200000000000
1    978393600000000000
2    978480000000000000
3    978566400000000000
4    978652800000000000
5    978739200000000000
6    978825600000000000
7    978912000000000000
8    978998400000000000
9    979084800000000000
dtype: int64

In [34]: s.view('<i8').view('<M8[ns]')
Out[34]: 
0   2001-01-01
1   2001-01-02
2   2001-01-03
3   2001-01-04
4   2001-01-05
5   2001-01-06
6   2001-01-07
7   2001-01-08
8   2001-01-09
9   2001-01-10
dtype: datetime64[ns]
于 2014-08-02T10:23:25.170 回答
1

我发现只需将日期重新格式化为字符串即可解决问题。

df['Date'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))

MySQL 仍然将其识别为日期而不是 VarChar。我在工作中遇到了完全相同的问题,并且该解决方案一直有效。

于 2015-07-08T16:47:25.607 回答