I have multiple text files that I imported using:

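# Column names for the imported tables (read.table turns "GUI-ID" into "GUI.ID")
colnames <- c("cellID", "X", "Y", "Area", "AVGFP", "DeviationGFP", "AvgRFP", "DeviationsRFP", "Slice", "GUI-ID")
# Read every file in the working directory into a list of data frames
stats <- lapply(list.files(), read.table, sep="", header=FALSE, col.names=colnames)
# Name the list elements slice1, slice2, ... (40 files in this case)
names(stats) <- paste0("slice", seq_along(stats))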

This is what slice1 from stats looks like:

   cellID         X          Y Area    AVGFP DeviationGFP   AvgRFP DeviationsRFP Slice GUI.ID
1       1  18.20775  26.309859  568 5.389085     7.803248 12.13028      5.569880     0      1
2       2  39.78755   9.505495  546 5.260073     6.638375 17.44505     17.220153     0      1
3       3  30.50000  28.250000    4 6.000000     4.000000  8.50000      1.914854     0      1
4       4  38.20233 132.338521  257 3.206226     5.124264 14.04669      4.318130     0      1
5       5  43.22467  35.092511  454 6.744493     9.028574 11.49119      5.186897     0      1
6       6  57.06534 130.355114  352 3.781250     5.713022 20.96591     14.303546     0      1
7       7  86.81765  15.123529 1020 6.043137     8.022179 16.36471     19.194279     0      1
8       8  75.81932 132.146417  321 3.666667     5.852172 99.47040     55.234726     0      1
9       9 110.54277  36.339233  678 4.159292     6.689660 12.65782      4.264624     0      1
10     10 127.83480  11.384886  569 4.637961     6.992881 11.39192      4.287963     0      1

All of the other data sets have the same columns, but the number of rows varies (some have up to 2000 cells).

I want to take one column from each data.frame (slice1 ... slice40) and put them all into a new data.frame. The new data.frame should be named after that column, and its columns should be named slice1 ... slice40.

To summarize with specifics:

From each of slice1 to slice40, I want to take all of the values from AVGFP and put them in a new data.frame. The new data.frame should be called "AVGFP". It should have 40 columns with the headers "slice1, slice2, ... , slice40". Every empty cell that arises from one slice being shorter than another should contain NA.

I really appreciate any and all help. I have been fumbling around with apply, plyr, split, reshape, melt, merge, and aggregate with no luck.

1 Answer

If you want to match rows on cellID, try this:

L <- lapply(stats, `[`, c("cellID","AVGFP"))

AVGFP <- Reduce(function(x,y)
         merge(x,y,by="cellID",all=TRUE,suffixes=c(ncol(x),ncol(x)+1)), L)

names(AVGFP)[-1] <- paste0("slice", 1:40)
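
As a rough illustration of the merge approach, here is a small self-contained sketch. The list `stats_toy` and its values are made up for the example and are not the real data. It is also a slight variation on the code above: each AVGFP column is renamed to its slice name before merging, so the `suffixes` argument and the final renaming step are not needed.

```r
# Toy stand-in for `stats`: three small data frames with different row counts.
# The values are invented for illustration only.
stats_toy <- list(
  slice1 = data.frame(cellID = 1:3, AVGFP = c(5.4, 5.3, 6.0)),
  slice2 = data.frame(cellID = 1:2, AVGFP = c(3.2, 6.7)),
  slice3 = data.frame(cellID = 1:4, AVGFP = c(3.8, 6.0, 3.7, 4.2))
)

# Keep only the key and the column of interest from each data frame.
L_toy <- lapply(stats_toy, `[`, c("cellID", "AVGFP"))

# Rename the value column to the slice name so merged columns never clash.
L_toy <- Map(function(df, nm) setNames(df, c("cellID", nm)), L_toy, names(L_toy))

# Full outer join on cellID; cells missing from a slice become NA.
AVGFP_toy <- Reduce(function(x, y) merge(x, y, by = "cellID", all = TRUE), L_toy)

print(AVGFP_toy)
```

Note that this keeps the cellID column in the result, so a value in row i always belongs to the cell with that ID, whichever slice it came from.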

If you just want to paste the columns side by side, try the following.

First get the maximum number of rows across the data frames:

maxL <- max(sapply(stats, nrow))

Now create a list in which each column is padded with NAs up to that maximum length:

L <- lapply(stats, function(x) c(x$AVGFP, rep(NA, maxL-nrow(x))))

Put the columns into a matrix:

M <- do.call(cbind, L)

Coerce it to a data frame:

AVGFP <- as.data.frame(M)

Add the names you want:

names(AVGFP) <- paste0("slice", 1:40)
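
The padding steps above can be tried on a small scale with the sketch below. The list `stats_toy` and its values are hypothetical, and the padding uses base R's `length<-` in place of `c(..., rep(NA, ...))`, which should give the same result.

```r
# Toy stand-in for `stats`: three data frames with 3, 2 and 4 rows.
# The values are invented for illustration only.
stats_toy <- list(
  slice1 = data.frame(AVGFP = c(5.4, 5.3, 6.0)),
  slice2 = data.frame(AVGFP = c(3.2, 6.7)),
  slice3 = data.frame(AVGFP = c(3.8, 6.0, 3.7, 4.2))
)

# Longest data frame determines the number of rows in the result.
maxL <- max(sapply(stats_toy, nrow))

# `length<-` extends a vector to the requested length, filling with NA.
L_toy <- lapply(stats_toy, function(x) `length<-`(x$AVGFP, maxL))

# Bind the padded vectors as columns; the list names become the column names.
AVGFP_toy <- as.data.frame(do.call(cbind, L_toy))

print(AVGFP_toy)
```

Because the list is already named, the columns come out as slice1, slice2, slice3 without a separate renaming step. Unlike the merge approach, rows here are aligned by position only, not by cellID.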
answered 2013-09-28T16:32:57.340