r - R中的向量化IF语句？

Question

x <- seq(0.1,10,0.1)
y <- if (x < 5) 1 else 2

我希望if对每个案例进行操作，而不是对整个向量进行操作。我必须改变什么？

score 62 · Accepted Answer

x <- seq(0.1,10,0.1)

> x
  [1]  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4  1.5
 [16]  1.6  1.7  1.8  1.9  2.0  2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9  3.0
 [31]  3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8  3.9  4.0  4.1  4.2  4.3  4.4  4.5
 [46]  4.6  4.7  4.8  4.9  5.0  5.1  5.2  5.3  5.4  5.5  5.6  5.7  5.8  5.9  6.0
 [61]  6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8  6.9  7.0  7.1  7.2  7.3  7.4  7.5
 [76]  7.6  7.7  7.8  7.9  8.0  8.1  8.2  8.3  8.4  8.5  8.6  8.7  8.8  8.9  9.0
 [91]  9.1  9.2  9.3  9.4  9.5  9.6  9.7  9.8  9.9 10.0

> ifelse(x < 5, 1, 2)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

score 15 · Accepted Answer

y <- if (x < 5) 1 else 2不对整个向量进行操作（您收到的警告告诉您只使用条件的第一个元素）。你想要ifelse：

y <- ifelse(x < 5, 1, 2)

ifelse逐个元素地对整个逻辑向量进行操作。 if只接受一个逻辑值。见?"if"和?ifelse

score 15 · Accepted Answer

为了完整性：在大向量中，您可以使用索引来加快速度（我们经常在模拟中这样做，其中函数通常运行 1000 到 10000 次）。但只要不是必需的，只需使用ifelse. 这读起来容易得多。

> set.seed(100)
> x <- runif(1000,1,10)

> system.time(replicate(10000,{
+     y <- ifelse(x < 5,1,2)
+ }))
   user  system elapsed 
   2.56    0.08    2.64 

> system.time(replicate(10000,{
+   y <- rep(2,length(x))
+   y[x < 5]<- 1
+ }))
   user  system elapsed 
   0.48    0.00    0.48

score 3 · Accepted Answer

您也可以只创建一个逻辑向量和 1 给它

x <- seq(0.1, 10, 0.1) # Your data set   
(x >= 5) + 1
#  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
# [92] 2 2 2 2 2 2 2 2 2

如果想比较性能，这将是最快的解决方案

set.seed(100)
x <- runif(1e6, 1, 10)

RL <- function(x) y <- ifelse(x < 5,1,2)
JM <- function(x) {y <- rep(2, length(x)); y[x < 5] <- 1}
DA <- function(x) y <- (x >= 5) + 1

library(microbenchmark)
microbenchmark(RL(x),
               JM(x),
               DA(x))

# Unit: milliseconds
#  expr       min        lq      mean    median        uq       max neval
# RL(x) 331.83448 366.52940 378.89182 374.99741 381.08659 609.21218   100
# JM(x)  38.72894  42.18745  44.36493  43.25086  44.09626  82.76168   100
# DA(x)  10.01644  11.96482  14.21593  13.17825  14.12930  53.76923   100

score 0 · Accepted Answer

按照上面的帖子，您甚至可以使用和修改满足条件的向量的元素。在我看来，如果更快地计算成本不是更高，那么应该总是这样做。

x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*2

上一篇文章的代码最好能回答这个问题。但如果我不得不使用上面的代码，我会这样做：

x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*0 +1

score 0 · Accepted Answer

nzMean <- function(x) { mean(x[x!=-1],na.rm=TRUE)}

nzMin <- function(x) {min(x[x!=-1],na.rm=TRUE)}

nzMax <- function(x) { max(x[x!=-1],na.rm=TRUE)}

nzRange<-function(x) {nzMax(x)-nzMin(x)}

nzSD <- function(x) { SD(x[x!=-1],na.rm=TRUE)}

#following function works
nzN1<- function(x) {ifelse(x!=-1,(x-nzMin(x))/nzRange(x) ,x) }

#following is bad as it returns only 4 not 5 elements of vector
nzN2<- function(x) {ifelse(x!=-1,(x[x!=-1]-nzMin(x))/nzRange(x) ,x) }

#following is bad as it returns 5 elements of vector but not correct answer
nzN3<- function(x) {ifelse(x!=-1,(x[x!=-1]-nzMin(x))/nzRange(x) ,-1) }

y<-c(1,-1,-20,2,4)
a<-nzMean(y)
b<-nzMin(y)
c<-nzMax(y)
d<-nzRange(y)
# test the working function
z<-nzN1(y)

print(z)

r - R中的向量化IF语句？

6 回答 6

Related

Reference