r - Calculating encounter histories for mark-recapture models

Question

I'm trying to create (in R) encounter histories for use in RMark; that is, return a "1" if an encounter occurred and a "0" if it did not.

Sample data:

zm <- structure(list(date.time = structure(c(1365905306, 1365919237, 
1365923863, 1365929487, 1365931725, 1365942003, 1365945361, 1366143204, 
1366159355, 1366159863, 1366164285, 1366202496, 1366224357, 1366238428, 
1366243685, 1366250254, 1366252570, 1366314236, 1366315282, 1366386242
), class = c("POSIXct", "POSIXt"), tzone = ""), station = c("M1", 
"M2", "M2", "M3", "M4", "M3", "M4", "M7", "L1", "M1", "M2", "M2", 
"L4", "M2", "M2", "M3", "M4", "M1", "M2", "M1"), code = c(10908, 
10908, 10897, 10908, 10908, 10897, 10897, 10908, 10908, 10914, 
10914, 10916, 10908, 10917, 10910, 10917, 10917, 10913, 10913, 
10896)), .Names = c("date.time", "station", "code"), row.names = c(5349L, 
51L, 60L, 7168L, 65L, 7178L, 70L, 6968L, 8647L, 5362L, 79L, 94L, 
9027L, 96L, 105L, 7200L, 114L, 5382L, 123L, 5388L), class = "data.frame")

Possible encounter history (the stations at which to check whether an encounter occurred):

rec<- c("M1", "M2","M3","M4","M5","M6","M7")

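It is important that the encounter-history output follows the order of rec above.

So for each code I want to see whether it was detected at the first station, "M1", and if so return a "1"; then see whether it was detected at the second station, "M2", and if not return a "0"; this would ultimately end up as a string of 0s and 1s.

I was able to pull out the data for the stations in rec with: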

library("plyr")
zm2 <- ddply(zm, c("code"), function(df)
 data.frame(arrive=(df[which(df$station %in% rec),])))

But I'm not sure how to run through rec in order and return a "0" or a "1".

Ultimately I'd like a data.frame output structured like this:

ch       code
00101    1
00011    2

and so on...

score 2 · Accepted Answer

table() is definitely the way to go, followed by paste0() to collapse the table into a string. (Thanks for the reproducible example!)

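rec <- sort(unique(zm$station))
cfun <- function(x) {
    # count detections per station, in the order given by rec
    tab <- with(x,table(factor(station,levels=rec)))
    # tab > 0 keeps the history 0/1 even if a code is seen more than once at a station
    data.frame(ch=paste0(as.integer(tab > 0),collapse=""))
}
library(plyr)
ddply(zm,"code",cfun)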
##    code      ch
## 1 10896 0010000
## 2 10897 0001110
## 3 10908 1111111
## 4 10910 0001000
## 5 10913 0011000
## 6 10914 0011000
## 7 10916 0001000
## 8 10917 0001110
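
Note that rec is rebuilt here from the stations present in the data, so the string covers L1, L4, M1, M2, M3, M4, M7 in sorted order. If the history has to follow the rec from the question instead (M1 to M7, including stations with no detections), one possible sketch is to pass that vector as the factor levels. The toy data frame det and the helper make_ch are made up for illustration, and base R's split()/vapply() stand in for ddply() so the snippet needs no packages:

```r
# Toy detections (hypothetical data, not the zm sample above)
det <- data.frame(
  code    = c(1, 1, 1, 2, 2, 3),
  station = c("M1", "M3", "M3", "M2", "L1", "M7"),
  stringsAsFactors = FALSE
)
rec <- c("M1", "M2", "M3", "M4", "M5", "M6", "M7")

# Hypothetical helper: one 0/1 string per code, in the order of `levels`
make_ch <- function(stations, levels) {
  # stations outside `levels` (here L1) become NA and are dropped by table()
  tab <- table(factor(stations, levels = levels))
  # presence/absence rather than counts, so repeat detections still give 1
  paste0(as.integer(tab > 0), collapse = "")
}

ch  <- vapply(split(det$station, det$code), make_ch, character(1), levels = rec)
res <- data.frame(ch = unname(ch), code = names(ch), stringsAsFactors = FALSE)
res
```

Stations never visited (M5, M6 here) still get a position in the string because they are factor levels, which is what keeps every history the same length.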

Or, as @alexis_laz suggested:

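tab2 <- with(zm,table(code,station))
# tab2 > 0 turns detection counts into presence/absence before collapsing
ctab <- apply(tab2 > 0,1,function(r) paste0(as.integer(r),collapse=""))
data.frame(code=names(ctab),ch=ctab)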

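(The codes are listed twice, once as the row names and once as the code column.) The latter version is probably a little faster, in case you have a very large data set or need to do this thousands of times...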
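
Whether the single cross-tabulation is actually faster will depend on the data, so it may be worth timing both on something that resembles your own. The following is only a rough sketch on simulated detections; sim, lev, per_group and cross_tab are made-up names, and split()/vapply() is used as a base-R stand-in for the ddply() version:

```r
set.seed(1)  # simulated detections, purely illustrative
n <- 20000
lev <- paste0("M", 1:7)
sim <- data.frame(
  code    = sample(1000:1499, n, replace = TRUE),
  station = sample(lev, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Per-group version: one small table per code
per_group <- function(d) {
  vapply(split(d$station, d$code), function(s) {
    paste0(as.integer(table(factor(s, levels = lev)) > 0), collapse = "")
  }, character(1))
}

# Single cross-tabulation of code by station, collapsed row by row
cross_tab <- function(d) {
  tab <- table(d$code, factor(d$station, levels = lev)) > 0
  apply(tab, 1, function(r) paste0(as.integer(r), collapse = ""))
}

t_group <- system.time(a <- per_group(sim))
t_cross <- system.time(b <- cross_tab(sim))

all(a == b)  # the two approaches should produce the same histories
c(per_group = t_group[["elapsed"]], cross_tab = t_cross[["elapsed"]])
```

A single system.time() call is noisy, so repeat the comparison a few times before drawing conclusions.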

score 0 · Accepted Answer

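Thought I would offer another solution for creating encounter histories, in case you want to cross-check the results with a different method: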

## Begin
zm$code <- as.character(zm$code)
tag.list = as.character(unique(zm$code)) # create a vector of all tags (codes) detected
sta.list = as.character(unique(zm$station)) # make a vector of the station names

# create empty data frame for filling encounter history later
enc.hist = as.data.frame(matrix(rep(NA,(length(tag.list)*length(sta.list))),
                            length(tag.list), length(sta.list)))
colnames(enc.hist) = sta.list
rownames(enc.hist) = tag.list

# fill in data frame using a for-loop:
for (i in seq_along(sta.list))
{
  sub <- zm[zm$station == sta.list[i],] # subset the data down to just the station currently being looped over
  subtags <- unique(sub$code) # vector of tags present at that station
  enc.hist[,i] <- tag.list %in% subtags # fill column i of enc.hist with TRUE or FALSE depending on whether each tag was seen
}
head(enc.hist) # you now have a data frame with TRUE (1)/FALSE (0) for encounters:

         M1   M2    M3    M4    M7    L1    L4
10908  TRUE TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
10897 FALSE TRUE  TRUE  TRUE FALSE FALSE FALSE
10914  TRUE TRUE FALSE FALSE FALSE FALSE FALSE
10916 FALSE TRUE FALSE FALSE FALSE FALSE FALSE
10917 FALSE TRUE  TRUE  TRUE FALSE FALSE FALSE
10910 FALSE TRUE FALSE FALSE FALSE FALSE FALSE

## Finally, use logical syntax to convert TRUE to '1' and FALSE to '0'
enc.hist[enc.hist==TRUE] <- 1
enc.hist[enc.hist==FALSE] <- 0
enc.hist

      M1 M2 M3 M4 M7 L1 L4
10908  1  1  1  1  1  1  1
10897  0  1  1  1  0  0  0
10914  1  1  0  0  0  0  0
10916  0  1  0  0  0  0  0
10917  0  1  1  1  0  0  0
10910  0  1  0  0  0  0  0
10913  1  1  0  0  0  0  0
10896  1  0  0  0  0  0  0
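
The two assignments above work because the logical columns are coerced to numeric on the first assignment. A more direct conversion, sketched here on a small made-up data frame eh rather than the real enc.hist, is to apply as.integer() to every column:

```r
# Small logical data frame standing in for enc.hist (hypothetical values)
eh <- data.frame(M1 = c(TRUE, FALSE), M2 = c(TRUE, TRUE),
                 row.names = c("10908", "10897"))

# eh[] <- ... replaces the columns in place, keeping row and column names
eh[] <- lapply(eh, as.integer)  # TRUE -> 1L, FALSE -> 0L
eh
```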

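Now you can use @alexis_laz's excellent code to collapse enc.hist into an .inp for RMark.

This is more verbose, but it offers an alternative approach that (hopefully) also works and preserves station order, although if you have millions of detections the for-loop will certainly slow you down.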
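
Because sta.list is taken in order of first detection, the columns come out as M1, M2, M3, M4, M7, L1, L4 rather than in the order of rec from the question. One possible way to reorder and collapse, sketched on a small made-up 0/1 data frame eh in place of enc.hist (the name out is also just illustrative):

```r
# Hypothetical 0/1 encounter table with columns in order of first detection
eh <- data.frame(M1 = c(1, 0), M2 = c(1, 1), M7 = c(1, 0), L1 = c(1, 0),
                 row.names = c("10908", "10897"))
rec <- c("M1", "M2", "M3", "M4", "M5", "M6", "M7")

# Add all-zero columns for stations in rec that had no detections,
# then keep only the rec stations, in rec order (L1 is dropped)
for (s in setdiff(rec, names(eh))) eh[[s]] <- 0
eh <- eh[, rec]

# Collapse each row into one string of 0s and 1s
out <- data.frame(ch = unname(apply(eh, 1, paste0, collapse = "")),
                  code = rownames(eh), stringsAsFactors = FALSE)
out
```

The ch column is kept as character rather than numeric so that leading zeros in the history are not lost.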
