
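I just downloaded a lot of temperature data from one of our data loggers. The data frame gives me mean hourly temperature observations for 87 temperature sensors over 1691 hours (so there is a lot of data here). It looks like this: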

D1_A     D1_B     D1_C
13.43    14.39    12.33
12.62    13.53    11.56
11.67    12.56    10.36
10.83    11.62    9.47

I would like to reshape this dataset into a matrix that looks like this:

#create a blank matrix 5 columns 131898 rows 
matrix1<-matrix(nrow=131898, ncol=5)
colnames(matrix1)<- c("year", "ID", "Soil_Layer", "Hour", "Temperature")

where:

year is always "2012"
ID corresponds to the header ID (e.g. D1)
Soil_Layer corresponds to the second bit of the header (e.g. A, B, or C)
Hour= 1:1691 for each sensor 
and Temperature= the observed values in the original dataframe. 

Can this be done with the reshape package in R? Does it need to be done as a loop? Any input on how to approach this dataset would be useful. Cheers!


1 Answer


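I think this does what you want... you can take advantage of the colsplit() and melt() functions from the reshape2 package. It is not clear where you identify the Hour in your data, so I have assumed it follows the row order of the original dataset. If that is not the case, please update your question: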

library(reshape2)
#read in your data
x <- read.table(text = "

    D1_A    D1_B  D1_C
    13.43 14.39   12.33
    12.62 13.53   11.56
    11.67 12.56   10.36
    10.83 11.62   9.47
    9.98  10.77   9.04
    9.24  10.06   8.65
    8.89  9.55    8.78
    9.01  9.39    9.88
", header = TRUE)

#add hour index, if data isn't ordered, replace this with whatever 
#tells you which hour goes where
x$hour <- 1:nrow(x)
#Melt into long format
x.m <- melt(x, id.vars = "hour")
#Split into two columns
x.m[, c("ID", "Soil_Layer")] <- colsplit(x.m$variable, "_", c("ID", "Soil_Layer"))
#Add the year
x.m$year <- 2012

#Return the first 6 rows
head(x.m[, c("year", "ID", "Soil_Layer", "hour", "value")])
#----
  year ID Soil_Layer hour value
1 2012 D1          A    1 13.43
2 2012 D1          A    2 12.62
3 2012 D1          A    3 11.67
4 2012 D1          A    4 10.83
5 2012 D1          A    5  9.98
6 2012 D1          A    6  9.24
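
On the question of whether a loop is needed: it is not. As a rough sketch of the same reshaping in base R only (no reshape2), the version below builds the long table with rep() and unlist(). It assumes every column name contains exactly one underscore separating the ID from the soil layer, and that row order is the hour order; the names parts and long are made up for illustration.

```r
# Toy data with the same header pattern as the question (ID_SoilLayer).
x <- data.frame(
  D1_A = c(13.43, 12.62, 11.67, 10.83),
  D1_B = c(14.39, 13.53, 12.56, 11.62),
  D1_C = c(12.33, 11.56, 10.36, 9.47)
)

n_hours <- nrow(x)

# Split each header at the underscore: "D1_A" -> c("D1", "A").
# Assumption: exactly one underscore per column name.
parts <- do.call(rbind, strsplit(names(x), "_", fixed = TRUE))

# unlist() walks the data frame column by column, so each sensor's
# label is repeated n_hours times and the hour index restarts per sensor.
long <- data.frame(
  year        = 2012,
  ID          = rep(parts[, 1], each = n_hours),
  Soil_Layer  = rep(parts[, 2], each = n_hours),
  Hour        = rep(seq_len(n_hours), times = ncol(x)),
  Temperature = unlist(x, use.names = FALSE),
  stringsAsFactors = FALSE
)

head(long)
```

A data frame is probably a better target than the blank matrix from the question: a matrix holds a single type, so mixing ID and Soil_Layer with the temperatures would coerce every column to character.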
answered 2013-04-30T00:08:51.203