1

我是 R 的初学者,我正在做一项模拟研究,我设法制作了一个我想做的样本。但是,我不知道应该如何复制我所做的。

这是我到目前为止编写的程序:

I <- 500       # number of observations
J <- 18        # total number of items
K <- 6         # number of testlets
JK <-3         # number of items within a testlet
response <- matrix(0, I, J)  # null binary (0, 1) response matrix 
unit <- matrix(1, JK, 1)     # unit vector

set.seed(1234)

# Multidimensional 3-pl model
pij <- function(a,b,c,theta,gamma) {c+(1-c)*(1/(1+exp(-1.7*a*(theta-b-gamma))))}

# Assigning a and b parameter values
a <- c(.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7)
b <-c(1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5)
# Assigning c-parameter, each 3 items (c-parameter & testlet effect)
#(small&small, small&large, large&small, large&large, mixed&small, mixed&large)
c <- c(.2,.2,.2,.2,.2,.2,.5,.5,.5,.5,.5,.5,.2,.33,.5,.2,.33,.5)    

theta <- rnorm(I, 0, 1)   # random sampling theta-values from normal dist. M=0, SD=1

gamma1 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma2 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1
gamma3 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma4 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1
gamma5 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma6 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1

# implementing that the testlet effect is same for the items within a testlet
gamma1T <- gamma1 %*% t(unit)
gamma2T <- gamma2 %*% t(unit)
gamma3T <- gamma3 %*% t(unit)
gamma4T <- gamma4 %*% t(unit)
gamma5T <- gamma5 %*% t(unit)
gamma6T <- gamma6 %*% t(unit)

gammaT <- matrix(c(gamma1T, gamma2T, gamma3T, gamma4T, gamma5T, gamma6T), I, J)  # getting all the gammas together in a large matrix

# Generating data using the information above
for(i in 1:I) {
  for(j in 1:J) {
    response[i, j] <- ifelse(pij(a=a[j], b=b[j], c=c[j], theta=theta[i], gamma=gammaT[i,j]) < runif(1), 0, 1)
  }
}

因此,我得到了一个数据集“响应”。我想要做的是复制它并获得 1000 个“响应”数据集。我认为这可以通过复制“theta”和“gamma”的随机抽样来完成,但我不知道实际这样做。

非常非常感谢提前,

韩乔。

4

2 回答 2

4

The advice of Stedy is reasonable, apart from one thing: DON'T increment the seed in the for loop.

As fas as I understand Stedy's suggestion, set.seed(i) would be called within the for loop for each simulation, with i being incremented in each iteration. This strategy is known to introduce (potentially large) bias due to correlations among the generated sequences.

Instead, set the seed once at the beginning, i.e. before the for loop. For example, you could use the current Unix time as a seed, or read one from a file with random numbers (e.g. from random.org). Also, make sure to store the seed with your results, e.g. print it to a log file. If you want to reproduce the exact results of a previous set of replications again, you just have to set the corresponding seed.

If you want others to be able to exactly replicate your results, you should also specify the R version (and maybe the operating system) you used (as the RNG implementations may vary).

As an aside, simulation replication is an embarassingly parallel task, i.e. you can easily execute the replications in parallel if you have a multi-core machine (e.g. with rparallel). In this case, however, extra care regarding random numbers is necessary (e.g. see this paper).

于 2012-02-22T11:59:34.850 回答
1

我会把局部变量变成一个函数。然后创建一个循环,调用该函数并在每次调用该函数时for()递增一循环的长度。set.seed()for()

于 2012-02-22T07:12:41.133 回答