
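I imported a CSV file into a data frame. Now I want to copy a block of 48 rows, starting at row i, into a new data frame, then skip the next 3 blocks of 48 rows, append the following block of 48 rows to the end of the new data frame, and so on until the end of the data frame. I have spent a lot of time on this without success. Thanks in advance for any hints.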


2 Answers


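A very simple one-liner:

new.df <- old.df[ c( rep( FALSE, i - 1 ), rep( TRUE, 48 ), rep( FALSE, 48 * 3 ), rep( TRUE, 48 ) ), ]

But hey, let's make it even simpler:

new.df <- old.df[ c( rep( FALSE, i - 1 ), rep( c( TRUE, FALSE, FALSE, FALSE, TRUE ), each=48 ) ), ]

Or even:

new.df <- old.df[ i - 1 + which( rep( c( TRUE, FALSE, FALSE, FALSE, TRUE ), each=48 ) ), ]

Explanation:

We build a vector of TRUE/FALSE values; the rows that correspond to TRUE are selected. We use c() to concatenate the pieces. First we skip i - 1 rows (FALSE), then we take 48 (TRUE), then we skip 3 * 48 (FALSE), and then we take another 48 (TRUE).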
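
The index above spells out two kept blocks. To continue the keep-one, skip-three pattern to the end of the data frame, one option is to repeat a single period of the pattern over rows i through the last row. The following is a minimal sketch, not part of the original answer: the toy `old.df`, the value of `i`, and the names `block`, `pattern`, and `mask` are assumptions made for illustration.

```r
# Toy data standing in for the imported CSV (illustrative only)
old.df <- data.frame(x = 1:1000)
i <- 5        # first row of the first block to keep (assumed value)
block <- 48   # rows per block

# One period of the pattern: keep 1 block, then skip 3 blocks
pattern <- rep(c(TRUE, FALSE, FALSE, FALSE), each = block)

# FALSE for the rows before i, then the pattern repeated over rows i..n
n <- nrow(old.df)
mask <- c(rep(FALSE, i - 1), rep_len(pattern, n - i + 1))

# drop = FALSE keeps the result a data frame even with a single column
new.df <- old.df[mask, , drop = FALSE]
```

Because `mask` has exactly one entry per row, nothing depends on R recycling a shorter logical index, and a final block that runs past the end of the data is simply truncated.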

Answered 2012-10-13T17:00:39.607

df <- data.frame(x = 1:1000, y = rnorm(1000))
dim(df)
# [1] 1000    2
# The data frame has 1000 rows.
# Say I want to copy 48 rows starting at row 102
new_df <- df[102:(102 + 47), ]
# Or do it with variables
i <- 102
j <- i + 47  # i:j spans exactly 48 rows
new_df <- df[i:j, ]
# If you need an uneven range, just build a vector of row numbers,
# mixing ranges and single rows as needed
rows_i_want <- c(1:48, 52, 55, 100:120, 128)
new_new_df <- df[rows_i_want, ]
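
When the rows follow a regular pattern rather than an uneven one, the selection can also be computed arithmetically instead of typed out. This is a minimal sketch under assumed values (`i <- 102`, a block size of 48, keeping every 4th block); the names `block`, `block_id`, and `keep` are illustrative and not from the original answer.

```r
# Toy data (illustrative only)
df <- data.frame(x = 1:1000)
i <- 102      # first row of the first block to keep (assumed value)
block <- 48   # rows per block

rows <- seq_len(nrow(df))
# Zero-based block number of each row, counted from row i
block_id <- (rows - i) %/% block
# Keep rows at or after i whose block number is a multiple of 4
keep <- rows >= i & block_id %% 4 == 0

new_df <- df[keep, , drop = FALSE]
```

The `rows >= i` condition is needed because rows before `i` get negative block numbers, some of which are also multiples of 4.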

Below is an example of a general function that can do this for any data.frame:

# This function takes a data.frame, a starting row index, and a block size.
# Starting at row i, it keeps one block of rows, skips the next three, and repeats.
keep_rows <- function(df, i, block = 48) {
    n <- nrow(df)
    if (i < 1 || i > n)
        stop("index is out of range")

    # First row of each kept block: i, i + 4 * block, i + 8 * block, ...
    start <- seq(i, n, by = 4 * block)
    # Last row of each kept block, truncated at the end of the data frame
    end <- pmin(start + block - 1, n)

    # Expand each start/end pair into row numbers and subset once
    rows <- unlist(Map(seq, start, end))
    df[rows, , drop = FALSE]
}
Answered 2012-10-13T16:58:50.717