
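I get an Excel report published every day, and I need to summarize it and provide trend analysis. The report contains a list of work items, each with a creation date and a work item type. How can I get the count of work items created in 2011 and in 2012? Also, how can I get counts by work item type? So far I have been able to load the Excel data and get the number of rows by doing the following: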

library(gdata)
wi20121812 = read.xls("WorkItemReport20121812.xls")
nrow(wi20121812)

Sample data

   > dput(head(workItemReport2))
structure(list(DocType = structure(c(6L, 7L, 6L, 6L, 8L, 6L), .Label = c("TYPE10WI", 
"TYPE11WI", "TYPE12WI", "TYPE13WI", "TYPE14WI", "TYPE1WI", "TYPE2WI", 
"TYPE3WI", "TYPE4WI", "TYPE5WI", "TYPE6WI", "TYPE7WI", "TYPE8WI", 
"TYPE9WI"), class = "factor"), CreatedDate = structure(c(7L, 
22L, 146L, 181L, 153L, 191L), .Label = c("1/10/12 15:43 AM/PM ", 
"1/10/12 16:06 AM/PM ", "1/10/12 5:28 AM/PM ", "1/10/12 5:56 AM/PM ", 
"1/11/12 19:51 AM/PM ", "1/11/12 5:26 AM/PM ", "1/12/11 21:58 AM/PM ", 
"1/12/12 11:08 AM/PM ", "1/12/12 5:41 AM/PM ", "1/12/12 9:56 AM/PM ", 
"1/13/12 14:01 AM/PM ", "1/13/12 15:08 AM/PM ", "1/13/12 15:11 AM/PM ", 
"1/13/12 8:51 AM/PM ", "1/16/12 10:27 AM/PM ", "1/16/12 10:28 AM/PM ", 
"1/16/12 16:37 AM/PM ", "1/16/12 7:52 AM/PM ", "1/18/12 15:02 AM/PM ", 
"1/18/12 16:03 AM/PM ", "1/18/12 16:13 AM/PM ", "1/19/11 19:23 AM/PM ", 
"1/20/12 10:48 AM/PM ", "1/20/12 12:23 AM/PM ", "1/20/12 8:38 AM/PM ", 
"1/23/12 5:53 AM/PM ", "1/24/12 15:18 AM/PM ", "1/24/12 8:23 AM/PM ", 
"1/24/12 8:58 AM/PM ", "1/25/12 11:38 AM/PM ", "1/25/12 5:28 AM/PM ", 
"1/26/12 13:48 AM/PM ", "1/26/12 15:53 AM/PM ", "1/26/12 15:58 AM/PM ", 
"1/26/12 16:13 AM/PM ", "1/26/12 16:18 AM/PM ", "1/26/12 7:33 AM/PM ", 
"1/27/12 7:48 AM/PM ", "1/3/12 17:48 AM/PM ", "1/3/12 18:33 AM/PM ", 
"1/3/12 9:07 AM/PM ", "1/30/12 11:22 AM/PM ", "1/30/12 22:52 AM/PM ", 
"1/30/12 23:10 AM/PM ", "1/31/12 19:54 AM/PM ", "1/31/12 20:39 AM/PM ", 
"1/31/12 5:42 AM/PM ", "1/31/12 9:42 AM/PM ", "1/4/12 14:02 AM/PM ", 
"1/4/12 9:52 AM/PM ", "1/5/12 13:42 AM/PM ", "1/5/12 17:42 AM/PM ", 
....
....
"9/6/12 9:02 AM/PM ", "9/7/12 11:48 AM/PM ", "9/7/12 12:58 AM/PM ", 
"9/7/12 13:52 AM/PM ", "9/7/12 15:07 AM/PM ", "9/7/12 15:12 AM/PM ", 
"9/7/12 15:22 AM/PM ", "9/7/12 15:47 AM/PM ", "9/7/12 15:52 AM/PM ", 
"9/7/12 8:42 AM/PM ", "9/7/12 9:32 AM/PM ", "9/8/11 23:43 AM/PM "
), class = "factor")), .Names = c("DocType", "CreatedDate"), row.names = c(NA, 
6L), class = "data.frame")
> 

2 Answers


The part of your question that is still unanswered, "how to get counts by work item type", is quite simple.

res <- table(wi20121812[, "WorkItemType"])

This gives you a simple table showing how often each WorkItemType occurs. If you need proportions rather than absolute counts, run prop.table() on the result:

prop.table(res)

Or do both in one step:

res <- prop.table(table(wi20121812[, "WorkItemType"]))
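For illustration, here is a small self-contained run of the same idea. It is only a sketch: the data frame `wi` is a hypothetical stand-in built from the six DocType values shown by head() in the question, and it uses the column name DocType from the sample data rather than "WorkItemType".

```r
# Hypothetical stand-in for the loaded report: the six DocType values
# shown by head() in the question's sample data.
wi <- data.frame(
  DocType = c("TYPE1WI", "TYPE2WI", "TYPE1WI", "TYPE1WI", "TYPE3WI", "TYPE1WI")
)

# Absolute counts per work item type
counts <- table(wi[, "DocType"])

# The same counts expressed as proportions of all rows
shares <- prop.table(counts)

print(counts)
print(shares)
```

On the real data, substitute whatever the type column is actually called in your spreadsheet.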
Answered 2012-12-30T16:13:21.580

You can use ddply from the plyr package:

res = ddply(df, "year", summarise, amount = length(year))

Or use count from the same package (which is easier):

res = count(df, "year")

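where df is a data.frame containing your data, and year is the name of the column holding the categorical variable that records the year in which each row was created.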
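The sample data in the question has no year column, only CreatedDate stored as a factor of strings such as "1/12/11 21:58 AM/PM ". The following is a minimal base R sketch of one way to derive it, assuming the strings are month/day/two-digit-year (values like "1/30/12" in the sample suggest this). The data frame `df` here is a hypothetical three-row stand-in, and the pairing of types with dates in the third row is illustrative only.

```r
# Hypothetical stand-in for the report; the date strings are taken from the
# factor levels shown in the question, the third row's pairing is made up.
df <- data.frame(
  DocType = c("TYPE1WI", "TYPE2WI", "TYPE1WI"),
  CreatedDate = c("1/12/11 21:58 AM/PM ",
                  "1/19/11 19:23 AM/PM ",
                  "1/10/12 15:43 AM/PM ")
)

# as.character() guards against CreatedDate being a factor. as.Date() parses
# only as much of each string as the format needs and ignores the rest, so
# the time part and the "AM/PM" suffix are dropped.
created <- as.Date(as.character(df$CreatedDate), format = "%m/%d/%y")

# Four-digit year as a character column
df$year <- format(created, "%Y")

# Counts per year, and a year-by-type cross-tabulation
year_counts <- table(df$year)
by_year_type <- table(df$year, df$DocType)

print(year_counts)
print(by_year_type)
```

Once the year column exists, it can also be passed to the ddply or count calls shown above.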

Answered 2012-12-18T15:08:04.653