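I am still working on a problem from a few days ago and would like feedback or help on how to write a function. Your expertise is much appreciated.

I have created the following: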

##### 1)
> raceIDs
[1] "GER" "SUI" "NZ2" "US1" "US2" "POR" "FRA" "AUS" "NZ1" "SWE"

##### 2)
#For each "raceIDs", there is a csv file which I have made a loop to read and created a list of data frames (assigned to the symbol "boatList")
#For example, if I select "NZ1" the output is:
> head(boatList[[9]]) #Only selected the first six lines as there is more than 30000 rows
  Boat       Date    Secs    LocalTime   SOG
1  NZ1 01:09:2013 38150.0 10:35:49.997 22.17
2  NZ1 01:09:2013 38150.2 10:35:50.197 22.19
3  NZ1 01:09:2013 38150.4 10:35:50.397 22.02
4  NZ1 01:09:2013 38150.6 10:35:50.597 21.90
5  NZ1 01:09:2013 38150.8 10:35:50.797 21.84
6  NZ1 01:09:2013 38151.0 10:35:50.997 21.95

##### 3)
# A matrix showing the race times for each raceIDs
> raceTimes
    start      finish    
GER "11:10:02" "11:35:05"
SUI "11:10:02" "11:35:22"
NZ2 "11:10:02" "11:34:12"
US1 "11:10:01" "11:33:29"
US2 "11:10:01" "11:36:05"
POR "11:10:02" "11:34:31"
FRA "11:10:02" "11:34:45"
AUS "11:10:03" "11:36:48"
NZ1 "11:10:01" "11:35:16"
SWE "11:10:03" "11:35:08"

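What I need to do is calculate a boat's mean speed (SOG) "while racing" (between its start and finish times) by creating a function named meanRaceSpeed that takes three arguments.

What I have tried so far is a function with 3 arguments (written with some help from experts here):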

meanRaceSpeed <- function(raceIDs, boatList, raceTimes)  
 {
  #Probably need to compare times, and thought it might be useful to convert character values into `DateTime` values but not to sure how to use it
  #DateTime <- as.POSIXct(paste(boatList$Date, boatList$Time), format="%Y%m%d %H%M%S")

  #To get the times for each boat
  start_time <- raceTimes$start[rownames(raceTimes) = raceIDs] 
  finish_time <- raceTimes$finish[rownames(raceTimes) = raceIDs]
  start_LocalTime <- min(grep(start_time, boatList$LocalTime))
  finish_LocalTime <- max(grep(finish_time, boatList$LocalTime))

  #which `SOG`s contain all the `LocalTimes` between start and finish
  #take their `mean`
  mean(boatList$SOG[start_LocalTime : finish_LocalTime])
 }
 ### Obviously, my code does not work :( and I don't know where.

So basically, I need to create a function with three arguments, and the expected results are:

#e.g For NZ1
> meanRaceSpeed("NZ1", boatList, raceTimes)   
[1] 18.32   #Mean speed for NZ1 between 11:10:01 - 11:35:16

#e.g for US1
> meanRaceSpeed("US1", boatList, raceTimes)
[1] 17.23    #Mean speed for US1 between 11:10:01 - 11:33:29

Can anyone help me see where I might be going wrong? Thank you very much for your help.

2 Answers

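I will give some general advice about R, but I will also help with your specific problem. Whenever I run into trouble in R, I usually find it helps to make things more explicit.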

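If the function cannot use these methods (is the object inside the function a data frame or a matrix?), then you should try a different approach. For example, `$` extracts a column from a data frame or list, but it does not work on a matrix. If these table-manipulation methods do not work, try something else. How?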
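To make the data frame versus matrix point concrete, here is a minimal sketch. It assumes `raceTimes` is a character matrix with row names, as the printed output in the question suggests; the two-row matrix below is a made-up miniature for illustration, not the real data.

```r
# Assumed miniature of the question's raceTimes: a character matrix with row names
raceTimes <- matrix(c("11:10:01", "11:10:01", "11:35:16", "11:33:29"),
                    ncol = 2,
                    dimnames = list(c("NZ1", "US1"), c("start", "finish")))

# Check what kind of object we are dealing with
is.matrix(raceTimes)
is.data.frame(raceTimes)

# `$` is not defined for a matrix, so this call raises an error;
# try() captures it instead of stopping the script
res <- try(raceTimes$start, silent = TRUE)
inherits(res, "try-error")

# Indexing by row name and column name works on a matrix
nz1_start <- raceTimes["NZ1", "start"]
nz1_start
```

The same `raceTimes["NZ1", "start"]` indexing also works on a data frame that has row names, so it may be the safer choice when you are not sure which type you were handed.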

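Here are a few different things you can do to test your function, plus some suggestions that may move you forward. (I don't want to solve the whole problem for you, since this is your homework, but rather get you on your way.)

1) Why not try a loop instead of bracket indexing?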

start_time <- raceTimes$start[rownames(raceTimes) = raceIDs]

Turn it into a for loop. It is not hard to do.
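One way the loop could look is sketched below. It assumes `raceTimes` has been turned into a data frame with row names; the two-row data frame and the `ID` variable are stand-ins for illustration. Note that the comparison uses `==`, because a single `=` is not a comparison operator in R.

```r
# Assumed stand-in for raceTimes as a data frame with row names
raceTimes <- data.frame(start  = c("11:10:01", "11:10:01"),
                        finish = c("11:35:16", "11:33:29"),
                        row.names = c("NZ1", "US1"),
                        stringsAsFactors = FALSE)

ID <- "US1"                  # the boat we are looking for
start_time <- NA_character_  # stays NA if the ID is never found

# Walk over the rows and stop at the first row whose name matches ID
for (i in seq_len(nrow(raceTimes))) {
  if (rownames(raceTimes)[i] == ID) {
    start_time <- raceTimes$start[i]
    break
  }
}

start_time
```

The loop is longer than the bracket version, but each step can be printed and inspected, which helps while you are still finding the bug.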

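2) Debug your function. R has many built-in tools for this, and more are available in packages. Since you are probably short on time for your homework, this is what I suggest: take the function apart and run each piece on its own with the variables you intend to use. Are they the right length? Are they the right data type? Does each piece give the right answer before you put them together? Make sure of that.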
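As a rough illustration of checking each piece separately, the sketch below runs the steps of the function body one at a time. The five-row `boat` data frame and its values are invented for the example; only the column names `LocalTime` and `SOG` come from the question.

```r
# Invented five-row stand-in for one element of boatList
boat <- data.frame(LocalTime = c("11:09:59.800", "11:10:01.000", "11:10:01.200",
                                 "11:35:16.000", "11:35:16.200"),
                   SOG = c(20, 21, 22, 23, 24),
                   stringsAsFactors = FALSE)
start_time  <- "11:10:01"
finish_time <- "11:35:16"

# Step 1: is each input the type and length we expect?
str(boat)
stopifnot(is.character(start_time), length(start_time) == 1)

# Step 2: which rows match the start second? (fixed = TRUE: plain text, no regex)
start_hits <- grep(start_time, boat$LocalTime, fixed = TRUE)
start_hits

# Step 3: first row of the race and last row of the race
first_row <- min(start_hits)
last_row  <- max(grep(finish_time, boat$LocalTime, fixed = TRUE))
stopifnot(first_row <= last_row)

# Step 4: only now combine the pieces
race_mean <- mean(boat$SOG[first_row:last_row])
race_mean
```

If any intermediate value looks wrong (an empty `start_hits`, for instance), you have found the step to fix before touching the rest.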

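3) If all else fails, don't worry if the function and code are not elegant. R is not always an elegant language. (In fact, it rarely is.) Especially when you are a beginner, your code may be ugly. Just make sure it works.

answered 2013-10-25T06:53:39.360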

Since I am already familiar with your data, I sat down and worked out a complete example.

First, some data that looks like yours:

raceIDs <- c("GER", "SUI", "NZ2", "US1", "US2", "POR", "FRA", "AUS", "NZ1", "SWE")

raceTimes <- as.matrix(read.table(text = '    start      finish    
GER "11:10:02" "11:35:05"
SUI "11:10:02" "11:35:22"
NZ2 "11:10:02" "11:34:12"
US1 "11:10:01" "11:33:29"
US2 "11:10:01" "11:36:05"
POR "11:10:02" "11:34:31"
FRA "11:10:02" "11:34:45"
AUS "11:10:03" "11:36:48"
NZ1 "11:10:01" "11:35:16"
SWE "11:10:03" "11:35:08"', header = TRUE))

#turn matrix to data.frame or, else, `$` won't work
raceTimes <- as.data.frame(raceTimes, stringsAsFactors = F)

blDF <- data.frame(Boat = rep(raceIDs, 3), 
  LocalTime = c(raceTimes$start, rep("11:20:25", length(raceIDs)), raceTimes$finish),
       SOG = runif(3 * length(raceIDs), 15, 25), stringsAsFactors = F)

boatList <- split(blDF, blDF$Boat)
#remove `names` to create them from scratch
names(boatList) <- NULL

Then:

#create `names` by searching each element of 
#`boatList` of what `boat` it contains
names(boatList) <- unlist(lapply(boatList, function(x) unique(x$Boat)))

#the function
meanRaceSpeed <- function(ID, boatList, raceTimes)  
 {              #named the first argument `ID` instead of `raceIDs`
  start_time <- raceTimes$start[rownames(raceTimes) == ID] 
  finish_time <- raceTimes$finish[rownames(raceTimes) == ID]

  start_LocalTime <- min(grep(start_time, boatList[[ID]]$LocalTime))
  finish_LocalTime <- max(grep(finish_time, boatList[[ID]]$LocalTime))

  mean(boatList[[ID]]$SOG[start_LocalTime : finish_LocalTime])
 }

Testing:

  meanRaceSpeed("US1", boatList, raceTimes)
#[1] 19.7063
  meanRaceSpeed("NZ1", boatList, raceTimes)
#[1] 21.74729
  mean(boatList$NZ1$SOG) #to test function
#[1] 21.74729
  mean(boatList$US1$SOG) #to test function 
#[1] 19.7063
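The question also wondered about comparing times instead of matching them with `grep`. A minimal sketch of that alternative follows; it is not the approach used above, and the helper `to_secs`, the function name `meanRaceSpeedByTime`, and the small test data are all hypothetical. It assumes `LocalTime` is always formatted as `HH:MM:SS` with optional fractional seconds.

```r
# Hypothetical helper: convert "HH:MM:SS(.fff)" strings to seconds since midnight
to_secs <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical variant of meanRaceSpeed that compares times numerically
meanRaceSpeedByTime <- function(ID, boatList, raceTimes) {
  boat   <- boatList[[ID]]
  t      <- to_secs(boat$LocalTime)
  start  <- to_secs(raceTimes[ID, "start"])
  finish <- to_secs(raceTimes[ID, "finish"])
  # keep the whole finishing second, as the grep-based version does
  in_race <- t >= start & t < finish + 1
  mean(boat$SOG[in_race])
}

# Invented data to try it on
demoTimes <- matrix(c("11:10:01", "11:35:16"), nrow = 1,
                    dimnames = list("NZ1", c("start", "finish")))
demoList <- list(NZ1 = data.frame(
  LocalTime = c("11:09:59.800", "11:10:01.000", "11:20:00.000",
                "11:35:16.400", "11:35:17.000"),
  SOG = c(10, 20, 22, 24, 30),
  stringsAsFactors = FALSE))

demo_mean <- meanRaceSpeedByTime("NZ1", demoList, demoTimes)
demo_mean
```

Because it indexes with `raceTimes[ID, "start"]`, this version should work whether `raceTimes` is a matrix or a data frame with row names, and it does not depend on the exact start or finish second appearing in the log.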
answered 2013-10-25T08:24:34.470