
I am trying to find a way to group my data by day.

Here is an example of my dataset.

Dates                              Price1                                 Price 2

2002-10-15  11:17:03pm              0.6                                     5.0

2002-10-15  11:20:04pm              1.4                                     2.4

2002-10-15  11:22:12pm              4.1                                     9.1

2002-10-16  12:21:03pm              1.6                                     1.4

2002-10-16  12:22:03pm              7.7                                     3.7

3 Answers


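Yes, I would definitely use pandas for this. The trickiest part is working out the datetime parser that pandas needs to load the data. After that, it is just a matter of resampling the resulting DataFrame.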

In [62]: parse = lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %I:%M:%S%p')
In [63]: dframe = pandas.read_table("data.txt", delimiter=",", index_col=0, parse_dates=True, date_parser=parse)
In [64]: print dframe
                                 Price1                                   Price 2
Dates                                                                            
2002-10-15 23:17:03                                0.6                        5.0
2002-10-15 23:20:04                                1.4                        2.4
2002-10-15 23:22:12                                4.1                        9.1
2002-10-16 12:21:03                                1.6                        1.4
2002-10-16 12:22:03                                7.7                        3.7
In [78]: means = dframe.resample("D", how='mean', label='left')
In [79]: print means
                                 Price1                                   Price 2
Dates                                                                            
2002-10-15                                    2.033333                       5.50
2002-10-16                                    4.650000                       2.55

where data.txt contains:

Dates                 ,         Price1    ,                  Price 2
2002-10-15  11:17:03pm,          0.6      ,                    5.0
2002-10-15  11:20:04pm,          1.4      ,                    2.4
2002-10-15  11:22:12pm,          4.1      ,                    9.1
2002-10-16  12:21:03pm,          1.6      ,                    1.4
2002-10-16  12:22:03pm,          7.7      ,                    3.7
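The session above uses Python 2 `print` and the `how=` and `date_parser=` arguments, which, as far as I know, are removed or deprecated in recent pandas releases. A minimal sketch of the same steps on a newer pandas follows. It assumes the rows are inlined as a string with the column padding removed (so it runs without `data.txt` on disk) and parses the dates with `pd.to_datetime` instead of a custom parser function.

```python
import io

import pandas as pd

# Same rows as data.txt, inlined without padding so the sketch is self-contained.
raw = """Dates,Price1,Price 2
2002-10-15 11:17:03pm,0.6,5.0
2002-10-15 11:20:04pm,1.4,2.4
2002-10-15 11:22:12pm,4.1,9.1
2002-10-16 12:21:03pm,1.6,1.4
2002-10-16 12:22:03pm,7.7,3.7
"""

dframe = pd.read_csv(io.StringIO(raw))

# Same format string as the strptime-based parser in the answer:
# %I is the 12-hour clock and %p the am/pm marker.
dframe["Dates"] = pd.to_datetime(dframe["Dates"], format="%Y-%m-%d %I:%M:%S%p")
dframe = dframe.set_index("Dates")

# Resample into calendar-day bins and average each bin.
means = dframe.resample("D").mean()
print(means)
```

The daily means should match the session above: about 2.033333 and 5.50 for 2002-10-15, and 4.65 and 2.55 for 2002-10-16.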
Answered 2012-10-19T14:55:17.147

From the pandas documentation: http://pandas.pydata.org/pandas-docs/stable/pandas.pdf

 # 72 hours starting with midnight Jan 1st, 2011
 In [1073]: rng = date_range('1/1/2011', periods=72, freq='H')
Answered 2012-10-19T13:21:40.837

Use

data.groupby(data['dates'].map(lambda x: x.day))
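One caveat with the one-liner above: `x.day` is only the day of the month, so rows from, say, 15 October and 15 November would land in the same group. A minimal sketch that groups on the full calendar date instead is shown here. The DataFrame is rebuilt from the question's sample rows, and the column name `dates` is taken from the one-liner (an assumption, since the question's header reads `Dates`).

```python
import pandas as pd

# Sample rows from the question, with timestamps already in 24-hour form.
data = pd.DataFrame({
    "dates": pd.to_datetime([
        "2002-10-15 23:17:03",
        "2002-10-15 23:20:04",
        "2002-10-15 23:22:12",
        "2002-10-16 12:21:03",
        "2002-10-16 12:22:03",
    ]),
    "Price1": [0.6, 1.4, 4.1, 1.6, 7.7],
    "Price 2": [5.0, 2.4, 9.1, 1.4, 3.7],
})

# .dt.date keeps year, month and day, so each calendar day gets its own group.
daily = data.groupby(data["dates"].dt.date)[["Price1", "Price 2"]].mean()
print(daily)
```

For the sample data both approaches give two groups, since the rows cover only two days in the same month.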
Answered 2012-10-19T13:10:24.280