
I am trying to run normality tests in batch.

My data looks like this:

"Date","Department","Discipline","Employee ID","SumOfBillable Hrs"
"10/09/2012","D","B",50084.00,8.00
"10/09/2012","D","C",51870.00,10.00
"10/09/2012","D","E",50216.00,10.00
"10/09/2012","D","E",53422.00,9.00
"10/09/2012","D","E",53765.00,10.00
"14/01/2013","E","Y",53146.00,9.00
"14/01/2013","E","Y",53202.00,9.00
"14/01/2013","E","Y",54470.00,9.00
"14/01/2013","SITE","0",54525.00,9.00
"14/02/2013","D","C",51870.00,10.00
"14/02/2013","D","E",50029.00,8.50
"14/02/2013","D","E",50216.00,9.00
"14/02/2013","D","E",53422.00,4.00

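I would like to check each Employee ID.

Is there a way to do this in batch? I have more than 80 IDs, so taking each ID individually and plotting or computing descriptive statistics for it would be quite tedious.

Thanks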


1 Answer


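You could start with something like this. If you want something different, you will have to provide more information about what you want to do with the results.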

# check.names (on by default) turns "Employee ID" into Employee.ID, and so on
data <- read.table(header=TRUE, sep=",",
 text='"Date","Department","Discipline","Employee ID","SumOfBillable Hrs"
"10/09/2012","D","B",50084.00,8.00
"10/09/2012","D","C",51870.00,10.00
"10/09/2012","D","E",50216.00,10.00
"10/09/2012","D","E",53422.00,9.00
"10/09/2012","D","E",53765.00,10.00
"14/01/2013","E","Y",53146.00,9.00
"14/01/2013","E","Y",53202.00,9.00
"14/01/2013","E","Y",54470.00,9.00
"14/01/2013","SITE","0",54525.00,9.00
"14/02/2013","D","C",51870.00,10.00
"14/02/2013","D","E",50029.00,8.50
"14/02/2013","D","E",50216.00,9.00
"14/02/2013","D","E",53422.00,4.00')



# Means:
aggregate(SumOfBillable.Hrs ~ Employee.ID, data=data, FUN=mean)

# Standard Deviations:
aggregate(SumOfBillable.Hrs ~ Employee.ID, data=data, FUN=sd)

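# Or a Shapiro-Wilk normality test (p-value) per Employee.ID. shapiro.test() needs
# at least 3 observations that are not all identical, so return NA otherwise:
aggregate(SumOfBillable.Hrs ~ Employee.ID, data=data,
          FUN=function(x) if (length(x) >= 3 && diff(range(x)) > 0) shapiro.test(x)$p.value else NA_real_)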
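
If you would rather get all the descriptive statistics and the test result in one table, here is a minimal sketch. The helper name `normality_by_group` and the simulated `demo` data are made up for illustration and are not part of the question; the only assumption about your data is that it has one numeric column and one grouping column.

```r
# Hypothetical helper: one row of summary statistics per group.
normality_by_group <- function(values, groups) {
  rows <- lapply(split(values, groups), function(x) {
    x <- x[!is.na(x)]
    # shapiro.test() accepts 3 to 5000 non-missing values that are not all identical
    testable <- length(x) >= 3 && length(x) <= 5000 && diff(range(x)) > 0
    test <- if (testable) shapiro.test(x) else NULL
    data.frame(n       = length(x),
               mean    = mean(x),
               sd      = if (length(x) > 1) sd(x) else NA_real_,
               W       = if (testable) unname(test$statistic) else NA_real_,
               p.value = if (testable) test$p.value else NA_real_)
  })
  data.frame(group = names(rows), do.call(rbind, rows), row.names = NULL)
}

# Simulated data for illustration only (not the data from the question):
# two employees with 10 observations each and one with only 2.
set.seed(1)
demo <- data.frame(
  Employee.ID       = rep(c(1, 2, 3), times = c(10, 10, 2)),
  SumOfBillable.Hrs = c(rnorm(10, mean = 9, sd = 1), rexp(10), 9, 9)
)

result <- normality_by_group(demo$SumOfBillable.Hrs, demo$Employee.ID)
print(result)
```

With the data frame built above you would call `normality_by_group(data$SumOfBillable.Hrs, data$Employee.ID)`. Note that in the sample posted in the question no employee has more than 2 rows, so `W` and `p.value` would be `NA` for all of them; you need the full data set for the test to run.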
Answered 2013-02-27T08:49:02.660