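I'm fairly new to R, so forgive me if this question is too basic. I would like to know whether there is a good, fast way to create a full diallel (every line crossed with every line) using R.

I have a matrix that looks like this: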

          M1 M2 M3
   Line1  A  B  A
   Line2  A  A  B
   Line3  B  A  A

From this matrix I would like to create the following data frame:

 X       Y       M1   M2   M3
 Line1   Line1   AA   BB   AA
 Line1   Line2   AA   BA   AB
 Line1   Line3   AB   BA   AA
 Line2   Line1   AA   AB   BA
 Line2   Line2   AA   AA   BB
 Line2   Line3   AB   AA   BA
 Line3   Line1   BA   AB   AA
 Line3   Line2   BA   AA   AB
 Line3   Line3   BB   AA   AA

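I think this could be done with a couple of nested loops, using paste() to combine the A and B letter codes. But there are probably better and more "R-like" options (using cbind()?).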

2 Answers

One approach is to think of the indices of the rows of your data that make up each line of the desired output. Using your data:

mat <- matrix(c("A","B","A",
                "A","A","B",
                "B","A","A"), ncol = 3, byrow = TRUE)

I create those indices using expand.grid(). The first row of your output is formed by concatenating row 1 of mat with row 1 of mat, the second by concatenating row 1 with row 2, and so on. These indices are produced as follows:

> ind <- expand.grid(r1 = 1:3, r2 = 1:3)
> ind
  r1 r2
1  1  1
2  2  1
3  3  1
4  1  2
5  2  2
6  3  2
7  1  3
8  2  3
9  3  3

Note that to get what your output shows we need to take columns r2 then r1 rather than the other way round.

Now I just index mat with the second column of ind and then with the first column of ind, and supply both results to paste0(). Its output is a vector, so we need to reshape it into a matrix.

> matrix(paste0(mat[ind[,2], ], mat[ind[,1], ]), ncol = 3)
      [,1] [,2] [,3]
 [1,] "AA" "BB" "AA"
 [2,] "AA" "BA" "AB"
 [3,] "AB" "BA" "AA"
 [4,] "AA" "AB" "BA"
 [5,] "AA" "AA" "BB"
 [6,] "AB" "AA" "BA"
 [7,] "BA" "AB" "AA"
 [8,] "BA" "AA" "AB"
 [9,] "BB" "AA" "AA"

The paste0() step returns a vector of the pasted strings:

> paste0(mat[ind[,2], ], mat[ind[,1], ])
 [1] "AA" "AA" "AB" "AA" "AA" "AB" "BA" "BA" "BB" "BB" "BA" "BA" "AB" "AA" "AA"
[16] "AB" "AA" "AA" "AA" "AB" "AA" "BA" "BB" "BA" "AA" "AB" "AA"

The matrix reshaping shown above works because the entries in the output from paste0() are in column-major order: the first nine values belong to column 1 of the result, the next nine to column 2, and so on. The two arguments passed to paste0() are:

> mat[ind[,2], ]
      [,1] [,2] [,3]
 [1,] "A"  "B"  "A" 
 [2,] "A"  "B"  "A" 
 [3,] "A"  "B"  "A" 
 [4,] "A"  "A"  "B" 
 [5,] "A"  "A"  "B" 
 [6,] "A"  "A"  "B" 
 [7,] "B"  "A"  "A" 
 [8,] "B"  "A"  "A" 
 [9,] "B"  "A"  "A" 
> mat[ind[,1], ]
      [,1] [,2] [,3]
 [1,] "A"  "B"  "A" 
 [2,] "A"  "A"  "B" 
 [3,] "B"  "A"  "A" 
 [4,] "A"  "B"  "A" 
 [5,] "A"  "A"  "B" 
 [6,] "B"  "A"  "A" 
 [7,] "A"  "B"  "A" 
 [8,] "A"  "A"  "B" 
 [9,] "B"  "A"  "A"

R treats each as a vector and hence the output is a vector, but because R stores matrices by columns, we fill our output matrix with the pasted strings by columns also.
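
To get from this character matrix to the data frame in the question, the same indices can be reused to label the rows. The following is only a minimal sketch: it assumes the row and column names from the question are attached to mat (the code above leaves them off), and crossed and res are hypothetical names introduced for illustration.

```r
mat <- matrix(c("A","B","A",
                "A","A","B",
                "B","A","A"), ncol = 3, byrow = TRUE,
              dimnames = list(c("Line1","Line2","Line3"),
                              c("M1","M2","M3")))

# All pairs of row indices; r1 varies fastest
ind <- expand.grid(r1 = seq_len(nrow(mat)), r2 = seq_len(nrow(mat)))

# Paste row r2 (X) with row r1 (Y), then refold the vector column by column
crossed <- matrix(paste0(mat[ind$r2, ], mat[ind$r1, ]),
                  ncol = ncol(mat),
                  dimnames = list(NULL, colnames(mat)))

# Label each row with the names of the two lines that were combined
res <- data.frame(X = rownames(mat)[ind$r2],
                  Y = rownames(mat)[ind$r1],
                  crossed, stringsAsFactors = FALSE)
```

Because r2 is used for X and r1 for Y, the rows of res should come out in the same order as in the question.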

answered 2012-09-18T10:46:10.663

You may not need several loops to get this output; here is a suggestion.

First, let's generate your sample matrix:

M <- matrix(c("A","B","A","A","A","B","B","A","A"), ncol = 3, byrow = TRUE)
rownames(M) <- c("Line1","Line2","Line3")
colnames(M) <- c("M1","M2","M3")

An easy way to generate all possible pairs of the items in a vector is expand.grid():

d <- expand.grid(rownames(M), rownames(M))

This generates the X and Y columns of your desired output:

   Var1  Var2
1 Line1 Line1
2 Line2 Line1
3 Line3 Line1
4 Line1 Line2
5 Line2 Line2
6 Line3 Line2
7 Line1 Line3
8 Line2 Line3
9 Line3 Line3

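What you can then do is apply() to each row a function that pastes the corresponding M1, M2, M3 values together:

apply(d, 1, function(x) paste0(M[x[1], ], M[x[2], ]))

This generates the right combinations, but not (yet) in the right format: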

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] "AA" "AA" "BA" "AA" "AA" "BA" "AB" "AB" "BB"
[2,] "BB" "AB" "AB" "BA" "AA" "AA" "BA" "AA" "AA"
[3,] "AA" "BA" "AA" "AB" "BB" "AB" "AA" "BA" "AA"

To flip the matrix the right way round, you just need to transpose it with t().

From there, you can wrap everything into a data frame in one go:

df <- data.frame(d, t(apply(d, 1, function(x) paste0(M[x[1], ], M[x[2], ]))))
colnames(df) <- c("X", "Y", "M1", "M2", "M3")

And there you have it.

To save effort, you can wrap this in a small function that accepts any matrix M with row and column names:

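get.it <- function(M) {
    # All ordered pairs of row names (first column varies fastest)
    d <- expand.grid(rownames(M), rownames(M))
    # Paste the two rows element-wise; transpose to one row per pair
    e <- t(apply(d, 1, function(x) paste0(M[x[1], ], M[x[2], ])))
    output <- data.frame(d, e)
    # Take the marker names from M rather than hard-coding them
    colnames(output) <- c("X", "Y", colnames(M))
    return(output)
}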

And get.it(M) should work!
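
One thing to be aware of: expand.grid() varies its first column fastest, so the rows come out grouped by Y, whereas the table in the question is grouped by X. If the exact row order matters, one option is to sort afterwards. This is only a sketch; get.it() is redefined here (taking the column names from M) so that the snippet runs on its own, and out is a hypothetical name.

```r
M <- matrix(c("A","B","A","A","A","B","B","A","A"), ncol = 3, byrow = TRUE,
            dimnames = list(c("Line1","Line2","Line3"),
                            c("M1","M2","M3")))

# Same idea as get.it() above, repeated so this block is self-contained
get.it <- function(M) {
    d <- expand.grid(rownames(M), rownames(M))
    e <- t(apply(d, 1, function(x) paste0(M[x[1], ], M[x[2], ])))
    output <- data.frame(d, e)
    colnames(output) <- c("X", "Y", colnames(M))
    output
}

out <- get.it(M)

# Sort by X, then by Y, to match the row order shown in the question
out <- out[order(out$X, out$Y), ]
rownames(out) <- NULL
```

The sort relies on X and Y being factors whose levels follow the row order of M, which is what expand.grid() returns by default; with other inputs the ordering may need to be set explicitly.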

answered 2012-09-18T11:07:27.173