我想从下图中的左表转到右表,但似乎无法找到背后的编码逻辑来使用 R 达到结果。
非常感谢您的帮助 !
我创建了一个最小的例子,应该做你想做的事。这里的主要问题是表达你的问题,因为我认为有比我更好的答案来将滞后值与模态匹配。
library(dplyr)
# --- v0 is your data simplified
v0 <- c("cA", "t1", "t2", "cB", "t3")
# --- indic tels us what are the groups
indic <- v0 %>% stringr::str_detect(string = ., pattern = "c") %>%
cumsum()
# --- here you can try the code line by line (without the %>% (pipe) operator to understand the code
dfr <- tibble(v0, indic)
dfr %>%
group_by(indic) %>%
mutate(v1 = v0[which(stringr::str_detect(v0, "c") )] ) %>%
ungroup() %>%
filter(! stringr::str_detect(v0, "c")) %>%
select(v1, v0)
#> # A tibble: 3 x 2
#> v1 v0
#> <chr> <chr>
#> 1 cA t1
#> 2 cA t2
#> 3 cB t3
# you could also use a loop
使用 base 的示例R
:
data <- c(
"cinema A", "17:45", "20:00", "cinema B", "13:00", "15:45", "16:00",
"cinema C", "08:20"
)
time_rows <- grep("cinema", data, invert = TRUE)
data.frame(
time = data[time_rows],
cinema = grep("cinema", data, value = TRUE)[cumsum(grepl("cinema", data))][time_rows]
)
如评论中所述,请为以后的帖子提供一些示例数据。在这种情况下,我根据您所附的图片为您制作了它。
有很多方法可以解决这个问题。这是一个三步法。
library(tidyverse)
library(stringr)
# Create the data
df <- tibble(
X1 = c("cinema A", 17.45, 20.00, "cinema B", 13.00, 15.45, 16.00, "cinema C", 8.20))
df
#> # A tibble: 9 x 1
#> X1
#> <chr>
#> 1 cinema A
#> 2 17.45
#> 3 20
#> 4 cinema B
#> 5 13
#> 6 15.45
#> 7 16
#> 8 cinema C
#> 9 8.2
# Step 1: detect where the cinema values are and copy them to a new column
df$cinema <- ifelse(str_detect(df$X1, "cinema"), df$X1, NA)
df
#> # A tibble: 9 x 2
#> X1 cinema
#> <chr> <chr>
#> 1 cinema A cinema A
#> 2 17.45 <NA>
#> 3 20 <NA>
#> 4 cinema B cinema B
#> 5 13 <NA>
#> 6 15.45 <NA>
#> 7 16 <NA>
#> 8 cinema C cinema C
#> 9 8.2 <NA>
# Step 2: replace NA values in the new column with the values above
df <- fill(df, cinema)
df
#> # A tibble: 9 x 2
#> X1 cinema
#> <chr> <chr>
#> 1 cinema A cinema A
#> 2 17.45 cinema A
#> 3 20 cinema A
#> 4 cinema B cinema B
#> 5 13 cinema B
#> 6 15.45 cinema B
#> 7 16 cinema B
#> 8 cinema C cinema C
#> 9 8.2 cinema C
# Step 3: remove the rows where X1 contains cinema information
df <- filter(df, !str_detect(df$X1, "cinema"))
df
#> # A tibble: 6 x 2
#> X1 cinema
#> <chr> <chr>
#> 1 17.45 cinema A
#> 2 20 cinema A
#> 3 13 cinema B
#> 4 15.45 cinema B
#> 5 16 cinema B
#> 6 8.2 cinema C
由reprex 包(v0.3.0)于 2019-11-26 创建
这是一个解决方案base R
。
假设输入作为数据框给出,即:
df <- data.frame(X = c("cinema A", 17.45, 20.00, "cinema B", 13.00, 15.45, 16.00, "cinema C", 8.20))
> df
X
1 cinema A
2 17.45
3 20
4 cinema B
5 13
6 15.45
7 16
8 cinema C
9 8.2
以下代码可以帮助您获得右侧的表格:
lst <- split(df,findInterval(seq(nrow(df)),grep("cinema",df$X)-1,left.open = T))
res <- Reduce(rbind,lapply(lst, function(v) data.frame(ViewingTime = v[-1,],CinemaName = v[1,])))
输出res
如下所示:
> res
ViewingTime CinemaName
1 17.45 cinema A
2 20 cinema A
3 13 cinema B
4 15.45 cinema B
5 16 cinema B
6 8.2 cinema C