r - Randomly select values from an existing matrix after adding a vector (in R)

Question

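Thank you very much in advance for your help!

I'm trying to modify an existing matrix so that, when a new row is added to it, values are removed from the pre-existing matrix.

For example, I have the matrix: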

[,1] [,2] [,3] [,4]
 1     1    0    0
 0     1    0    0
 1     0    1    0
 0     0    1    1

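I'd like to add another vector, I.vec, which has two values (I.vec=c(0,1,1,0)). That's easy enough to do: I just rbind it to the matrix. Now, for every column where I.vec equals 1, I'd like to randomly pick a value from the other rows and set it to zero. Ideally, this would end up with a matrix like: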

[,1] [,2] [,3] [,4]
 1     0    0    0
 0     1    0    0
 1     0    0    0
 0     0    1    1
 0     1    1    0

But every time I run an iteration, I'd like it to sample randomly again.

So this is what I've tried:

mat1<-matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1),byrow=T, nrow=4)
I.vec<-c(0,1,1,0)
mat.I<-rbind(mat1,I.vec)
mat.I.r<-mat.I
d1<-mat.I[,which(mat.I[5,]==1)]
mat.I.r[sample(which(d1[1:4]==1),1),which(mat.I[5,]==1)]<-0

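But this only removes one of the two values I'd like to remove. I've also tried variations on subsetting the matrix, without success.

Thanks again!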

score 5 · Accepted Answer

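The OP's description is a little ambiguous, so two solutions are suggested:

Assuming only the `1`s present in the relevant columns can be set to `0`

I would alter the original function (see below). The change is to the line defining `rows`. I now have (the original had a bug; the version below has been amended to handle it):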

rows <- sapply(seq_along(cols), 
                   function(x, mat, cols) {
                       ones <- which(mat[,cols[x]] == 1L)
                       out <- if(length(ones) == 1L) {
                                  ones
                              } else {
                                  sample(ones, 1)
                              }
                       out
                   }, mat = mat, cols = cols)

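Basically, what this does is: for each column where we need to swap a 1 for a 0, we work out which rows of that column contain 1s and sample one of them.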

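Edit 1: We have to handle the case where there is only a single 1 in a column. If we just sample from a length-1 vector, R's sample() treats it as if we wanted to sample from the set seq_len(n) rather than from the length-1 set containing n. We now handle this with an if, else statement.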
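The `sample()` behaviour described above can be seen directly. This is only a minimal sketch: `safe_sample1` is a hypothetical helper name that does not appear in the original answer, and it assumes `x` has at least one element.

```r
## sample() on a single number n >= 1 draws from 1:n, not from the set {n}
set.seed(1)
draws <- replicate(200, sample(4, 1))
range(draws)   # values are spread over 1:4, not fixed at 4

## Hypothetical helper: pick one element of x by position,
## so a length-1 x is returned unchanged (assumes length(x) >= 1)
safe_sample1 <- function(x) x[sample.int(length(x), 1)]

all(replicate(200, safe_sample1(4)) == 4)   # always returns 4
safe_sample1(c(3, 4)) %in% c(3, 4)          # one of the supplied values
```

Indexing by position is an alternative to the if, else branch; both avoid handing a single number to `sample()`.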

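We have to do this separately for each column so that we get the correct rows. I suppose we could do some clever manipulation to avoid the repeated calls to which() and sample(), but how to do so escapes me at the moment, because we have to handle the case where there is only one 1 in a column. Here is the finished function (updated to handle the length-1 sample bug in the original):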

foo <- function(mat, vec) {
    nr <- nrow(mat)
    nc <- ncol(mat)

    cols <- which(vec == 1L)
    rows <- sapply(seq_along(cols), 
                   function(x, mat, cols) {
                       ones <- which(mat[,cols[x]] == 1L)
                       out <- if(length(ones) == 1L) {
                                  ones
                              } else {
                                  sample(ones, 1)
                              }
                       out
                   }, mat = mat, cols = cols)

    ind <- (nr*(cols-1)) + rows
    mat[ind] <- 0

    mat <- rbind(mat, vec)
    rownames(mat) <- NULL

    mat
}

Here it is in action:

> set.seed(2)
> foo(mat1, ivec)
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    1    1    0

It also works when there is only a single 1 in a column where we want to make the swap:

> foo(mat1, c(0,0,1,1))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    0
[5,]    0    0    1    1
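One case the function above does not cover is a flagged column that contains no 1s at all, where `sample()` would be asked to draw from an empty set. The following is a hedged sketch of one way to skip such columns, not the answer author's code; `foo_safe` and `pick_one` are hypothetical names, and it uses two-column matrix indexing in place of the linear index.

```r
## Sketch: same idea as foo(), but skipping flagged columns with no 1s.
## foo_safe and pick_one are hypothetical names used only for illustration.
foo_safe <- function(mat, vec) {
    ## pick by position so a single candidate row is returned as-is
    pick_one <- function(x) x[sample.int(length(x), 1)]

    cols <- which(vec == 1L)
    ## keep only flagged columns that actually contain a 1
    cols <- cols[colSums(mat[, cols, drop = FALSE] == 1L) > 0]

    ## one row per remaining column, drawn from the rows holding a 1
    rows <- vapply(cols, function(j) pick_one(which(mat[, j] == 1L)),
                   integer(1))

    ## (row, col) matrix indexing instead of linear indices
    mat[cbind(rows, cols)] <- 0

    unname(rbind(mat, vec))
}

mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)

## column 4 holds a single 1 (row 4), so that cell is always zeroed;
## column 3 loses one of its two 1s at random
foo_safe(mat1, c(0,0,1,1))
```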

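Original answer: assuming any value in the relevant columns can be set to zero

Here is a vectorised answer, in which we treat the matrix as a vector when doing the replacement. Using the example data: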

mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
ivec <- c(0,1,1,0)

## Set a seed to make reproducible
set.seed(2)

## number of rows and columns of our matrix
nr <- nrow(mat1)
nc <- ncol(mat1)

## which of ivec are 1L
cols <- which(ivec == 1L)

## sample length(cols) row indices, with replacement
## so same row can be drawn more than once
rows <- sample(seq_len(nr), length(cols), replace = TRUE)

## Compute the index of each rows cols combination
## if we treated mat1 as a vector
ind <- (nr*(cols-1)) + rows
## ind should be of length length(cols)

## copy for illustration
mat2 <- mat1

## replace the indices we want with 0, note sub-setting as a vector
mat2[ind] <- 0

## bind on ivec
mat2 <- rbind(mat2, ivec)

This gives us:

> mat2
     [,1] [,2] [,3] [,4]
        1    0    0    0
        0    1    0    0
        1    0    0    0
        0    0    1    1
ivec    0    1    1    0
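The index arithmetic works because R stores a matrix column by column, so the cell at (row, col) sits at position `nr*(col-1) + row` of the underlying vector. A small check, with the row and column values chosen arbitrarily for illustration rather than sampled:

```r
mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
nr <- nrow(mat1)

## arbitrary (row, col) pairs chosen for illustration
rows <- c(1, 3)
cols <- c(2, 3)

## linear (column-major) positions of those cells
ind <- (nr * (cols - 1)) + rows
ind                                             # 5 11

## the same cells reached via two-column matrix indexing
identical(mat1[ind], mat1[cbind(rows, cols)])   # TRUE
```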

If I were doing this more than once or twice, I'd wrap it in a function:

foo <- function(mat, vec) {
    nr <- nrow(mat)
    nc <- ncol(mat)

    cols <- which(vec == 1L)
    rows <- sample(seq_len(nr), length(cols), replace = TRUE)

    ind <- (nr*(cols-1)) + rows
    mat[ind] <- 0

    mat <- rbind(mat, vec)
    rownames(mat) <- NULL

    mat
}

Which gives:

> foo(mat1, ivec)
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    1    1    0

If you wanted to do this for multiple ivecs, growing mat1 each time, then you probably don't want to do that in a loop as growing objects is slow (it involves copies etc). But you could just modify the definition of ind to include the extra n rows you bind on for the n ivecs.
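As a rough sketch of that idea, under the same assumption as this original answer (any value in a flagged column may be zeroed): the result can be allocated at its final size once, with `ind` computed from the full row count. This is not the answer author's code. `foo_many` is a hypothetical name, `vecs` is assumed to be a list of 0/1 vectors each of length `ncol(mat)`, and the sketch still loops over the vectors but never grows the matrix.

```r
## Sketch: add several vectors without growing the matrix each time.
## foo_many is a hypothetical name; vecs is assumed to be a list of
## 0/1 vectors, each of length ncol(mat).
foo_many <- function(mat, vecs) {
    nr <- nrow(mat)
    n  <- length(vecs)

    ## allocate the final size once; the n new rows start as 0
    out <- rbind(mat, matrix(0, nrow = n, ncol = ncol(mat)))
    nr_out <- nrow(out)

    for (k in seq_len(n)) {
        cols <- which(vecs[[k]] == 1L)
        ## rows that exist before vector k is added: 1 .. nr + k - 1
        rows <- sample.int(nr + k - 1, length(cols), replace = TRUE)

        ## linear index uses the row count of the preallocated matrix
        ind <- (nr_out * (cols - 1)) + rows
        out[ind] <- 0

        ## fill in row nr + k with the new vector
        out[nr + k, ] <- vecs[[k]]
    }
    out
}

mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
foo_many(mat1, list(c(0,1,1,0), c(1,0,0,1)))
```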

score 1 · Accepted Answer

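You could try something like this. Having "nrow" in there allows you to run it multiple times with other "I.vec"s. I tried to do this in one line using "apply" but couldn't get a matrix back out again.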

mat1<-matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1),byrow=T, nrow=4)
I.vec<-c(0,1,1,0)
mat.I.r<-rbind(mat1,I.vec)

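for (i in seq_len(ncol(mat.I.r))) {
  n <- nrow(mat.I.r)
  if (mat.I.r[n, i] == 1) {
    # rows above the new one that hold a 1 in this column
    ones <- which(mat.I.r[seq_len(n - 1), i] == 1)
    # pick by position so a single candidate is not expanded by sample()
    if (length(ones) > 0) mat.I.r[ones[sample.int(length(ones), 1)], i] <- 0
  }
}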
mat.I.r
