3

我得到的是 txt 文件(mydata.txt)中的数据,如下所示:

Variable, DateTime, Value, Quality
A, 01-01-1970 00:00:00, 0, 0
A, 01-01-1970 00:02:00, 2, 2
A, 01-01-1970 00:04:00, 4, 1
A, 01-01-1970 00:06:00, 6, 0
B, 01-01-1970 00:02:00, 0.2, 0
B, 01-01-1970 00:04:00, 0.4, 1
B, 01-01-1970 00:06:00, 0.6, 1
B, 01-01-1970 00:10:00, 1.0, 0
C, 01-01-1970 00:00:00, 20.0, 0
C, 01-01-1970 00:04:00, 16.0, 0
C, 01-01-1970 00:08:00, 12.0, 3

我可以将它加载到 R 中而不会出现问题

read.csv("mydata.txt", header = TRUE, sep = ",")

或者

read.table("mydata.txt", header = TRUE, sep = ",")

但是我想在 r 中使用的是这样的:

DateTime, A_Value, A_Quality, B_Value, B_Quality, C_Value, C_Quality
01-01-1970 00:00:00, 0, 0, NA, NA, 20.0, 0
01-01-1970 00:02:00, 2, 2, 0.2, 0, NA, NA
01-01-1970 00:04:00, 4, 1, 0.4, 1, 16.0, 0
01-01-1970 00:06:00, 6, 0, 0.6, 1, NA, NA
01-01-1970 00:08:00, NA, NA, NA, NA, 12.0, 3
01-01-1970 00:10:00, NA, NA, 1.0, 0, NA, NA

(其中第一列是日期/时间类型)。

我不知道我的文件中有哪些或多少个不同的变量(即 A、B、... Z),我也不知道它们的名称——我只知道它们的列。

如何从文本文件中的数据集到我想在 R 中使用的数据集?

提前致谢!

4

2 回答 2

5

正常读取数据:

mydata <- read.table("mydata.txt", header = TRUE, sep = ",")

然后使用几种方法之一将其从所谓的“长”格式“重塑”为“宽”格式。

这只是基础 R 中的 1 行,使用reshape

reshape(mydata, direction = "wide", idvar = "DateTime", timevar = "Variable")
#                DateTime Value.A Quality.A Value.B Quality.B Value.C Quality.C
# 1   01-01-1970 00:00:00       0         0      NA        NA      20         0
# 2   01-01-1970 00:02:00       2         2     0.2         0      NA        NA
# 3   01-01-1970 00:04:00       4         1     0.4         1      16         0
# 4   01-01-1970 00:06:00       6         0     0.6         1      NA        NA
# 8   01-01-1970 00:10:00      NA        NA     1.0         0      NA        NA
# 11  01-01-1970 00:08:00      NA        NA      NA        NA      12         3
于 2013-02-02T09:52:59.650 回答
4

您可以使用以下reshape2软件包执行此操作:

第一步:melt你的data.frame

require(reshape2)
df.m <- melt(df, id.var = 1:2) # changed names(df)[1:2] to 1:2 (following @Anandamahto's comment)

第二步:cast到结果:

dcast(df.m, DateTime ~ Variable + variable, fill=NA)

#               DateTime A_Value A_Quality B_Value B_Quality C_Value C_Quality
# 1  01-01-1970 00:00:00       0         0      NA        NA      20         0
# 2  01-01-1970 00:02:00       2         2     0.2         0      NA        NA
# 3  01-01-1970 00:04:00       4         1     0.4         1      16         0
# 4  01-01-1970 00:06:00       6         0     0.6         1      NA        NA
# 5  01-01-1970 00:08:00      NA        NA      NA        NA      12         3
# 6  01-01-1970 00:10:00      NA        NA     1.0         0      NA        NA
于 2013-02-02T09:46:58.097 回答