I'm having a little R situation that I can't get my head around.

Supposedly the code for this should only take two or three lines.

What I have to do is figure out how many samples I have to draw from the numbers 1 to 10 before I have seen every number at least once.

In other words, how many rolls of a die (in this case with 10 sides) does it take before I have seen every side?

So far I have something along these lines:

    param<-1:10
    count<-0
    seen<-0
    for (i in 1:10) {
      if (sample(param, size=1)==i);
        if i in seen;
          count+=1
          seen+=1
        elif count+=1
    when seen==10 return(count) 
    }

But this is way too long. I also know the syntax isn't right (I'm pretty sure I'm using Python syntax at some points), but this is the first time I have written a loop in R.

Any help would be much appreciated!

Yes, this is for a project, but I can't think of anything else. Yes, I have tried looking at other questions and answers for help, but my brain is just in a muddle now.

2 Answers

If you are looking for shorter code:

    set.seed(1984)

    n = 10
    param = 1:n
    count = 0

    while(length(param) != 0){              ## stop when all numbers are seen
      param = setdiff(param, sample(1:n,1)) ## remove the element
      count = count + 1
    }

    count

    ## 28
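
For comparison, the loop sketched in the question could be written in valid R roughly as follows. This is only a sketch: the variable names follow the question, and tracking `seen` as a logical vector (one flag per side) is an assumption about what the original attempt was aiming for.

```r
set.seed(1984)

param <- 1:10
seen <- rep(FALSE, length(param))  # seen[k] becomes TRUE once side k has come up
count <- 0

while (!all(seen)) {
  roll <- sample(param, size = 1)  # one roll of the 10-sided die
  seen[roll] <- TRUE               # mark this side as seen
  count <- count + 1               # every roll counts, new side or not
}

count
```

A `while` loop fits better than `for (i in 1:10)` here because the number of rolls needed is not known in advance.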

Edit (a slightly vectorized approach)

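    set.seed(1984)

    n = 10
    param = 1:n
    count = 0

    ## Draw as many values at once as are still missing. With k values
    ## missing, at least k more draws are needed, so a batch never overshoots.
    while(length(param) != 0){
      count = count + length(param)
      param = setdiff(param, sample(1:n,length(param),replace=TRUE))
    }

    count

    ## 28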

Edit 2 (multiple runs)

    set.seed(1984)

    n = 10
    num.runs = 5
    count = rep(0,num.runs)   ## one counter per run

    for(i in 1:num.runs)
    {
      param = 1:n             ## reset the unseen values for each run
      while(length(param) != 0){
        count[i] = count[i] + length(param)
        param = setdiff(param, sample(1:n,length(param),replace=TRUE))
      }
    }

    count

    ## 28 24 23 30 23
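
As a sanity check on runs like these, the simulated counts can be compared with the known expected value for this setup (the coupon collector's problem), which is n * (1 + 1/2 + ... + 1/n), or about 29.29 for n = 10. The sketch below is only an illustration: the function name `rolls_until_all_seen` and the choice of 2000 runs are assumptions, not part of the code above.

```r
# Hypothetical helper (the name is an assumption): number of rolls of an
# n-sided die until every side has appeared at least once.
rolls_until_all_seen <- function(n) {
  unseen <- 1:n
  count <- 0
  while (length(unseen) > 0) {
    unseen <- setdiff(unseen, sample(n, 1))  # drop the side just rolled
    count <- count + 1
  }
  count
}

n <- 10
expected <- n * sum(1 / (1:n))   # theoretical mean, n times the n-th harmonic number

set.seed(1984)
sims <- replicate(2000, rolls_until_all_seen(n))  # 2000 runs is an arbitrary choice

expected     # about 29.29 for n = 10
mean(sims)   # should land close to the theoretical value
```

Individual runs vary a lot around that mean, so a handful of results in the 20s and 30s is what one would expect.
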
answered 2013-05-09T06:09:49.343

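You can vectorize this problem by deliberately oversampling. In this example I create a sample vector of length 1000 and then use sapply to find the solution:

Edit: now using match instead of sapply, as suggested by @MadScone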

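    set.seed(1984)
    n <- 10

    x <- sample(n, 1e3, replace=TRUE)
    ## match() gives the position of the first occurrence of each value in x;
    ## the largest of these is the roll on which the last side first appeared
    max(match(1:n, x))
    [1] 28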

If you want to repeat the experiment, you can use replicate:

    do_experiment = function() {
        n <- 10
        x <- sample(n, 1e3, replace=TRUE)
        return(max(match(1:n, x)))
    }
    replicate(100, do_experiment())
     [1] 28 26 26 15 30 14 29 18 35 24 24 35 42 20 29 18 18 38 14 22 26 26 22 29 31
     [26] 51 14 35 26 19 40 22 23 19 28 15 27 20 16 18 20 19 18 37 24 38 37 54 29 19
     [51] 22 22 14 17 33 22 35 15 32 23 35 27 22 18 30 31 38 36 26 31 43 27 23 21 40
     [76] 25 36 21 39 27 55 28 36 15 48 31 32 46 28 21 40 23 46 24 31 30 25 21 24 20
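
One caveat with oversampling: if some value never shows up in the sampled vector, `match` returns `NA` for it, and `max` then returns `NA` as well. With n = 10 and 1000 draws that is vanishingly unlikely, but it can matter for larger n. A minimal sketch of a guard follows, assuming a hypothetical function name `do_experiment_safe` and a doubling strategy that is just one possible choice:

```r
# Hypothetical variant of do_experiment (the name, the default size and the
# doubling rule are assumptions made for illustration).
do_experiment_safe <- function(n = 10, size = 100 * n) {
  x <- sample(n, size, replace = TRUE)
  first_seen <- match(1:n, x)        # index of first occurrence, NA if absent
  while (anyNA(first_seen)) {
    # Not every value appeared yet: append as many draws again and re-check.
    x <- c(x, sample(n, length(x), replace = TRUE))
    first_seen <- match(1:n, x)
  }
  max(first_seen)
}

set.seed(1984)
res <- replicate(100, do_experiment_safe())
summary(res)
```

Appending draws rather than resampling from scratch keeps the rolls already made, so the positions of values that were already seen do not change.
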
answered 2013-05-09T08:38:35.870