I have the following dataframe covering one day, where each row is one minute:

stock,date,open,high,low,close,volume
AACG,202005010928,0.73,0.73,0.73,0.73,200
AACG,202005010929,0.73,0.73,0.73,0.73,100
AACG,202005010930,0.8,0.8,0.8,0.8,1250
AACG,202005010934,0.72,0.72,0.72,0.72,100
AACG,202005010937,0.71,0.71,0.68,0.68,3599
AACG,202005010938,0.65,0.65,0.65,0.65,2200
AACG,202005010947,0.73,0.73,0.73,0.73,125
AACG,202005010955,0.71,0.71,0.71,0.71,300
AACG,202005011002,0.7,0.7,0.7,0.7,10818
AACG,202005011112,0.73,0.73,0.73,0.73,100
AACG,202005011125,0.7,0.7,0.7,0.7,1103
AACG,202005011153,0.7,0.7,0.66,0.66,3334
AACG,202005011223,0.7,0.7,0.7,0.7,100
AACG,202005011234,0.73,0.73,0.73,0.73,250
AACG,202005011258,0.71,0.71,0.71,0.71,100
AACG,202005011321,0.73,0.73,0.72,0.72,1200
AACG,202005011329,0.7,0.7,0.7,0.7,4200
AACG,202005011427,0.73,0.73,0.73,0.73,100
AACG,202005011432,0.65,0.65,0.65,0.65,369
AACG,202005011529,0.66,0.66,0.66,0.66,254
AACG,202005011544,0.73,0.73,0.7,0.73,1397
AACG,202005011545,0.74,0.74,0.74,0.74,100
AACG,202005011548,0.73,0.73,0.73,0.73,100
AACG,202005011549,0.74,0.74,0.74,0.74,100
AAL,202005010900,11.29,11.3,11.29,11.29,8201
AAL,202005010901,11.28,11.31,11.26,11.28,26935
AAL,202005010902,11.3,11.34,11.3,11.33,31958
AAL,202005010903,11.33,11.36,11.31,11.35,44487
AAL,202005010904,11.35,11.35,11.32,11.33,22240

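I want to use a vectorized approach (not iterrows, because it is slow) to add a column that gives, for each row, the cumulative daily volume traded so far. How can I do this? Thanks.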

2 Answers

Create a new cumulative column, grouped by stock and by calendar day.

df['date'] = pd.to_datetime(df['date'], format='%Y%m%d%H%M')
df['cumsum'] = df.groupby(['stock', df['date'].dt.date])['volume'].cumsum()

df
    stock   date    open    high    low close   volume  cumsum
0   AACG    2020-05-01 09:28:00 0.73    0.73    0.73    0.73    200 200
1   AACG    2020-05-01 09:29:00 0.73    0.73    0.73    0.73    100 300
2   AACG    2020-05-01 09:30:00 0.80    0.80    0.80    0.80    1250    1550
3   AACG    2020-05-01 09:34:00 0.72    0.72    0.72    0.72    100 1650
4   AACG    2020-05-01 09:37:00 0.71    0.71    0.68    0.68    3599    5249
5   AACG    2020-05-01 09:38:00 0.65    0.65    0.65    0.65    2200    7449
6   AACG    2020-05-01 09:47:00 0.73    0.73    0.73    0.73    125 7574
7   AACG    2020-05-01 09:55:00 0.71    0.71    0.71    0.71    300 7874
8   AACG    2020-05-01 10:02:00 0.70    0.70    0.70    0.70    10818   18692
9   AACG    2020-05-01 11:12:00 0.73    0.73    0.73    0.73    100 18792
10  AACG    2020-05-01 11:25:00 0.70    0.70    0.70    0.70    1103    19895
11  AACG    2020-05-01 11:53:00 0.70    0.70    0.66    0.66    3334    23229
12  AACG    2020-05-01 12:23:00 0.70    0.70    0.70    0.70    100 23329
13  AACG    2020-05-01 12:34:00 0.73    0.73    0.73    0.73    250 23579
14  AACG    2020-05-01 12:58:00 0.71    0.71    0.71    0.71    100 23679
15  AACG    2020-05-01 13:21:00 0.73    0.73    0.72    0.72    1200    24879
16  AACG    2020-05-01 13:29:00 0.70    0.70    0.70    0.70    4200    29079
17  AACG    2020-05-01 14:27:00 0.73    0.73    0.73    0.73    100 29179
18  AACG    2020-05-01 14:32:00 0.65    0.65    0.65    0.65    369 29548
19  AACG    2020-05-01 15:29:00 0.66    0.66    0.66    0.66    254 29802
20  AACG    2020-05-01 15:44:00 0.73    0.73    0.70    0.73    1397    31199
21  AACG    2020-05-01 15:45:00 0.74    0.74    0.74    0.74    100 31299
22  AACG    2020-05-01 15:48:00 0.73    0.73    0.73    0.73    100 31399
23  AACG    2020-05-01 15:49:00 0.74    0.74    0.74    0.74    100 31499
24  AAL 2020-05-01 09:00:00 11.29   11.30   11.29   11.29   8201    8201
25  AAL 2020-05-01 09:01:00 11.28   11.31   11.26   11.28   26935   35136
26  AAL 2020-05-01 09:02:00 11.30   11.34   11.30   11.33   31958   67094
27  AAL 2020-05-01 09:03:00 11.33   11.36   11.31   11.35   44487   111581
28  AAL 2020-05-01 09:04:00 11.35   11.35   11.32   11.33   22240   133821
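
For anyone who wants to try this without the full file, here is a minimal self-contained sketch of the same grouped cumulative sum. It assumes a few rows copied from the question plus one made-up next-day AACG row (not in the original data) to show the total resetting each day. The column name `cum_volume` and the explicit sort are my own additions for illustration.

```python
import io
import pandas as pd

# A few rows from the question. The 20200502 row is hypothetical,
# added only to show that the running total restarts on a new day.
csv_text = """stock,date,open,high,low,close,volume
AACG,202005010928,0.73,0.73,0.73,0.73,200
AACG,202005010929,0.73,0.73,0.73,0.73,100
AACG,202005010930,0.8,0.8,0.8,0.8,1250
AACG,202005020930,0.8,0.8,0.8,0.8,500
AAL,202005010900,11.29,11.3,11.29,11.29,8201
AAL,202005010901,11.28,11.31,11.26,11.28,26935
"""

# Read the timestamp as text so the format string applies cleanly.
df = pd.read_csv(io.StringIO(csv_text), dtype={'date': str})
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d%H%M')

# Sort so the running total follows time order within each stock.
df = df.sort_values(['stock', 'date']).reset_index(drop=True)

# One running total per (stock, calendar day) group, no Python-level loop.
df['cum_volume'] = df.groupby(['stock', df['date'].dt.date])['volume'].cumsum()

print(df[['stock', 'date', 'volume', 'cum_volume']])
```

The sort only matters if the rows are not already in time order within each stock; with the data as posted it changes nothing.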
answered 2020-06-19T03:51:33.630
df['cumsum'] = df['volume'].cumsum()
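
Note that a plain `cumsum()` keeps accumulating across the whole frame, so it only matches the per-stock daily total when the frame holds a single stock and a single day. A small sketch of the difference, using three volume values from the question (the variable names `plain` and `per_stock` are just for illustration):

```python
import pandas as pd

# Two AACG rows followed by the first AAL row from the question.
df = pd.DataFrame({
    'stock': ['AACG', 'AACG', 'AAL'],
    'volume': [200, 100, 8201],
})

# Running total over the whole frame: carries AACG volume into AAL.
plain = df['volume'].cumsum()

# Running total that restarts for each stock.
per_stock = df.groupby('stock')['volume'].cumsum()

print(plain.tolist())      # [200, 300, 8501]
print(per_stock.tolist())  # [200, 300, 8201]
```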

answered 2020-06-18T21:25:59.503