I have CSV data that looks like this:
CUSIP BuyDate SellDate BuyAmount SellAmount Profit DaysHolding Over365Days
037833100 12/1/2015 3/1/2017 45 27 -18 456 1
17275R102 1/28/2016 2/21/2017 28 25 -3 390 1
38259P508 10/29/2015 2/18/2017 39 36 -3 478 1
594918104 3/1/2016 3/2/2017 35 40 5 366 1
68389X105 4/14/2016 2/21/2017 47 37 -10 313 0
037833100 12/11/2015 2/19/2017 46 40 -6 436 1
17275R102 1/12/2016 2/24/2017 29 34 5 409 1
38259P508 12/22/2015 2/20/2017 46 39 -7 426 1
594918104 12/19/2015 2/22/2017 26 36 10 431 1
68389X105 2/13/2016 3/2/2017 33 34 1 383 1
037833100 12/9/2015 2/18/2017 32 37 5 437 1
17275R102 2/13/2016 2/27/2017 48 25 -23 380 1
38259P508 11/30/2015 2/23/2017 45 34 -11 451 1
594918104 11/14/2015 2/27/2017 47 28 -19 471 1
68389X105 2/10/2016 2/17/2017 39 38 -1 373 1
037833100 4/7/2016 3/5/2017 44 29 -15 332 0
17275R102 3/3/2016 2/19/2017 26 36 10 353 0
037833100 11/25/2015 2/17/2017 28 40 12 450 1
037833100 1/10/2016 3/6/2017 35 36 1 421 1
037833100 3/4/2016 2/22/2017 45 25 -20 355 0
38259P508 2/10/2016 3/7/2017 42 40 -2 391 1
38259P509 12/5/2015 2/25/2017 31 39 8 448 1
38259P510 4/7/2016 2/27/2017 27 34 7 326 0
38259P511 3/26/2016 2/17/2017 27 39 12 328 0
17275R102 2/11/2016 2/27/2017 29 39 10 382 1
17275R102 11/24/2015 2/18/2017 45 35 -10 452 1
38259P509 3/29/2016 3/7/2017 46 27 -19 343 0
38259P509 4/5/2016 2/23/2017 38 38 0 324 0
17275R102 2/13/2016 2/26/2017 35 31 -4 379 1
594918104 3/10/2016 3/4/2017 29 28 -1 359 0
17275R102 10/30/2015 2/23/2017 40 30 -10 482 1
17275R102 12/15/2015 3/2/2017 25 38 13 443 1
594918104 2/2/2016 2/22/2017 26 32 6 386 1
594918105 3/8/2016 2/20/2017 26 29 3 349 0
594918106 11/21/2015 3/6/2017 44 38 -6 471 1
594918107 3/21/2016 2/20/2017 48 39 -9 336 0
594918108 12/21/2015 3/5/2017 37 28 -9 440 1
594918109 1/16/2016 3/5/2017 35 33 -2 414 1
594918110 2/8/2016 3/2/2017 41 39 -2 388 1
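The file has millions of rows. I want to sort all the trades by CUSIP, then subtotal the results by Profit and Over365Days. Here is an image of what the final result should look like. I only added some color for effect.
My guess is that it should be something like this: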
# Read the CSV file
mydata <- read.csv("AllTrades.csv")
# Sort by CUSIP, then by Over365Days (columns must be referenced through mydata)
sortdata <- mydata[order(mydata$CUSIP, mydata$Over365Days), ]
# Aggregate Profit and Over365Days by CUSIP
finalresults <- aggregate(cbind(Profit, Over365Days) ~ CUSIP, data = sortdata, FUN = sum)
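One possible reading of "subtotal by Profit and Over365Days" is a Profit total for each CUSIP, split by the Over365Days flag. The sketch below shows that reading in base R; it is an assumption about the intended result, not a confirmed match for the image. The name `trades` and the inline sample (six rows copied from the data above) are only there so the example runs on its own.

```r
# Inline sample: six rows copied from the data above, so no file is needed.
# CUSIP is forced to character so the leading zero in 037833100 is kept.
trades <- read.table(text = "
CUSIP BuyDate SellDate BuyAmount SellAmount Profit DaysHolding Over365Days
037833100 12/1/2015 3/1/2017 45 27 -18 456 1
17275R102 1/28/2016 2/21/2017 28 25 -3 390 1
68389X105 4/14/2016 2/21/2017 47 37 -10 313 0
037833100 12/11/2015 2/19/2017 46 40 -6 436 1
037833100 4/7/2016 3/5/2017 44 29 -15 332 0
17275R102 3/3/2016 2/19/2017 26 36 10 353 0
", header = TRUE, colClasses = c(CUSIP = "character"))

# Assumed interpretation: sum Profit within each (CUSIP, Over365Days) group.
subtotals <- aggregate(Profit ~ CUSIP + Over365Days, data = trades, FUN = sum)

# Sort the subtotals by CUSIP, then by the Over365Days flag.
subtotals <- subtotals[order(subtotals$CUSIP, subtotals$Over365Days), ]

print(subtotals)
```

For the real file, `trades` would come from `read.csv("AllTrades.csv")` instead of the inline text. Base R's `aggregate` may be slow on millions of rows, so a package built for large tables, such as data.table, might be worth trying; that is not shown here.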
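I can easily manage small data sets in Excel, but here I need to handle millions of rows. Could someone give me some sample code that does what I described? Thanks, everyone.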