r - How to combine two ordered series of different resolution in R?

Question

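I have some borehole geology data, ordered by depth from the surface down to some total depth. I want to combine several datasets into one, and each has a different resolution. The highest-resolution dataset has the desired output resolution (it also has evenly spaced depths, whereas the others do not). I have a lot of these to manage, so manually editing spreadsheets would take a long time.

For example, here is some high-resolution log data for a selected depth range (roughly 151 to 152):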

data <-
structure(list(DEPTH = c(150.876, 151.0284, 151.1808, 151.3332, 
151.4856, 151.638, 151.7904, 151.9428, 152.0952, 152.2476), DT = c(435.6977, 
437.6732, 441.4934, 444.6542, 445.771, 444.4603, 443.5679, 444.5042, 
447.3567, 450.4373), GR = c(13.8393, 14.549, 15.7866, 16.9114, 
18.4841, 18.8695, 17.7494, 16.7178, 12.8839, 11.7309)), .Names = c("DEPTH", 
"DT", "GR"), row.names = c(NA, -10L), class = "data.frame")

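(The full log data file is much larger, so I am not sure how to make it available here. Instead, I took a subset that matches an interval in the next dataset, analyses.)

There is also some lower-resolution discrete numeric data, whose depth ranges are not evenly spaced like the log data above. Each row is a sample interval of a given length over a specific depth range, and its values do not vary within that range: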

analyses <-
structure(list(from = c(151L, 198L, 284L, 480L), to = c(151.1, 
198.1, 284.1, 480.1), TC = c(1.276476312, 1.383553608, 1.46771308, 
1.125049954), DEN = c(1.842555733, 1.911724824, 1.997592565, 
NA), PORO = c(50.21947697, 44.26392579, 39.31309757, NA)), .Names = c("from", 
"to", "TC", "DEN", "PORO"), class = "data.frame", row.names = c(NA, 
-4L))

And there is some low-resolution categorical data, also with unequal depth ranges:

units <-
structure(list(from = c(0, 100, 450, 535, 617.89), to = c(100, 
450, 535, 617.89, 619.25), strat = structure(c(5L, 1L, 2L, 3L, 
4L), .Label = c("Formation A", "Formation B", 
"Group C", "Group D", "Unassigned"), class = "factor")), .Names = c("from", 
"to", "strat"), class = "data.frame", row.names = c(NA, -5L))

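The expected result is a data frame at the resolution of the first dataset (the logs, stored in data above), with the values from the second and third datasets merged in. In this case it would give this data frame: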

DEPTH   DT  GR  TC  DEN PORO    Unit
150.8760    435.69  13.83   NA  NA  NA  Formation A
151.0284    437.67  14.54   1.27    1.84    50.21   Formation A
151.1808    441.49  15.78   NA  NA  NA  Formation A
151.3332    444.65  16.91   NA  NA  NA  Formation A
151.4856    445.77  18.48   NA  NA  NA  Formation A
151.6380    444.46  18.86   NA  NA  NA  Formation A
151.7904    443.56  17.74   NA  NA  NA  Formation A
151.9428    444.50  16.71   NA  NA  NA  Formation A
152.0952    447.35  12.88   NA  NA  NA  Formation A
152.2476    450.43  11.73   NA  NA  NA  Formation A

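I tried merging the data frames and then using na.approx to fill the gaps, but the problem is that many of the log variables contain NaN or NA values that I do not want interpolated; they need to stay NA.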

score 1 · Accepted Answer

You can use merge, or join your data.frames with sqldf.

library(sqldf)

# If you know that each depth (in the first data.frame) 
# is in exactly one interval (in the second and third data.frames)
sqldf( "
  SELECT *
  FROM data A, analyses B, units C
  WHERE B.[from] <= A.DEPTH AND A.DEPTH < B.[to] -- Need to quote some of the column names
  AND   C.[from] <= A.DEPTH AND A.DEPTH < C.[to]
" )

# If each depth (in the first data.frame) 
# is in at most one interval (in the second and third data.frames)
sqldf( "
  SELECT *
  FROM data A
  LEFT JOIN analyses B ON B.[from] <= A.DEPTH AND A.DEPTH < B.[to]
  LEFT JOIN units    C ON C.[from] <= A.DEPTH AND A.DEPTH < C.[to]
  ORDER BY DEPTH
" )
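
If sqldf is not available, the same "at most one interval per depth" lookup can be sketched in base R. This is only an illustration, not part of the original answer: match_interval is a hypothetical helper name, the intervals are assumed to be half-open [from, to) as in the SQL above, and the data frames below are small subsets of the ones in the question.

```r
# Hypothetical helper (not from any package): for each depth, return the row
# index of the first interval [from, to) containing it, or NA if none does.
match_interval <- function(depth, from, to) {
  vapply(depth, function(d) {
    hit <- which(from <= d & d < to)
    if (length(hit)) hit[1L] else NA_integer_
  }, integer(1))
}

# Small subsets of the question's data, for illustration only
data <- data.frame(
  DEPTH = c(150.876, 151.0284, 151.1808),
  DT    = c(435.6977, 437.6732, 441.4934),
  GR    = c(13.8393, 14.549, 15.7866)
)
analyses <- data.frame(
  from = c(151, 198),
  to   = c(151.1, 198.1),
  TC   = c(1.276476312, 1.383553608),
  DEN  = c(1.842555733, 1.911724824),
  PORO = c(50.21947697, 44.26392579)
)
units <- data.frame(
  from  = c(0, 100, 450),
  to    = c(100, 450, 535),
  strat = c("Unassigned", "Formation A", "Formation B")
)

# Row of each lookup table that covers each log depth (NA where none does)
ia <- match_interval(data$DEPTH, analyses$from, analyses$to)
iu <- match_interval(data$DEPTH, units$from, units$to)

# Indexing with NA yields NA, so depths outside every interval stay NA
combined <- data
for (col in c("TC", "DEN", "PORO")) {
  combined[[col]] <- analyses[[col]][ia]
}
combined$Unit <- units$strat[iu]
combined
```

Nothing is interpolated here, so NA or NaN values already present in the logs are left untouched. The helper scans every interval for every depth, which is fine for a sketch but may be slow on very long logs.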
