Apologies if I'm not posting correctly. I'm new to R and this is my first post on Stack Overflow. I've read as much as I can to find a solution to my problem, but haven't been able to find something I can use.

I have some intensive longitudinal data that I'm trying to reshape. Currently it is in wide format and looks something like this:

Participant   D1_1_1   D1_1_2   D1_1_3   D1_1_4    D2_1_1   D2_1_2  etc...
P1               6        2        3        5        1         2
P2               4        9        3        6        4         1
P3               7        4        2        8        1         1
P4               1        5        1        1        6         7 
P5               2        0        8        2        1         4
etc..

Each column holds the responses to a specific survey item, made on a particular day, at a particular time of day.

So:

D1_1_1 = day 1, time 1, item 1

D1_1_2 = day 1, time 1, item 2

...

D4_3_7 = day 4, time 3, item 7

In total, the data I have covers: 60 participants who have responded to 11 items, 4 times in a day, for 10 days (a total of 440 data points per participant).

I'm looking to get help on being able to manipulate this effectively into long format, so it could look, for example, like this:

Participant     Day     time    item 1   item 2 ... item 11
P1               1        1        6        2
P1               1        2        X        X
P1               1        3        X        X
P1               1        4        X        X
P1               2        1        1        4
etc..

Where X is the participant's response to a given survey item, on a particular day, at a particular time.

Any help would be much appreciated!

Cheers

2 Answers

Ronak's answer works perfectly, but there's no need to use extract: pivot_longer can already split the column names into several columns:

library(tidyr)

# \\d+ (not \\d) so that two-digit days and items such as D10_4_11 also match
df %>%
  pivot_longer(cols = -Participant, names_to = c("day", "time", "item"), 
               names_pattern = "(D\\d+)_(\\d+)_(\\d+)") %>%
  pivot_wider(names_from = item, values_from = value, names_prefix = "Item")
#> # A tibble: 10 x 7
#>    Participant day   time  Item1 Item2 Item3 Item4
#>    <fct>       <chr> <chr> <int> <int> <int> <int>
#>  1 P1          D1    1         6     2     3     5
#>  2 P1          D2    1         1     2    NA    NA
#>  3 P2          D1    1         4     9     3     6
#>  4 P2          D2    1         4     1    NA    NA
#>  5 P3          D1    1         7     4     2     8
#>  6 P3          D2    1         1     1    NA    NA
#>  7 P4          D1    1         1     5     1     1
#>  8 P4          D2    1         6     7    NA    NA
#>  9 P5          D1    1         2     0     8     2
#> 10 P5          D2    1         1     4    NA    NA

Data:

df <- structure(list(Participant = structure(1:5, .Label = c("P1", 
"P2", "P3", "P4", "P5"), class = "factor"), D1_1_1 = c(6L, 4L, 
7L, 1L, 2L), D1_1_2 = c(2L, 9L, 4L, 5L, 0L), D1_1_3 = c(3L, 3L, 
2L, 1L, 8L), D1_1_4 = c(5L, 6L, 8L, 1L, 2L), D2_1_1 = c(1L, 4L, 
1L, 6L, 1L), D2_1_2 = c(2L, 1L, 1L, 7L, 4L)), class = "data.frame", 
row.names = c(NA, -5L))
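
The sample above only has single-digit days and items, while the full data in the question runs to 10 days and 11 items. The following is a minimal sketch of how the same pivot_longer call might handle that case. The data frame `wide` and its values are made up for illustration, and `names_transform` is assumed to be available (it was added in tidyr 1.1.0, which is newer than this answer).

```r
library(tidyr)

# Hypothetical wide data with a two-digit day and item (not the question's sample)
wide <- data.frame(
  Participant = c("P1", "P2"),
  D1_1_1   = c(6L, 4L),
  D1_1_11  = c(2L, 9L),
  D10_4_1  = c(3L, 3L),
  D10_4_11 = c(5L, 6L)
)

long <- wide %>%
  pivot_longer(
    cols = -Participant,
    names_to = c("day", "time", "item"),
    # \\d+ matches one or more digits, so D10 and item 11 are captured whole
    names_pattern = "D(\\d+)_(\\d+)_(\\d+)",
    # convert the captured day and time strings to integers
    names_transform = list(day = as.integer, time = as.integer)
  ) %>%
  pivot_wider(names_from = item, values_from = value, names_prefix = "Item")

long
```

Leaving the `D` outside the capture group and converting `day` and `time` to integers means the rows sort numerically (day 2 before day 10) rather than alphabetically.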
Answered 2020-01-08T07:26:59.393

Here is one way, using pivot_longer followed by pivot_wider:

library(dplyr)
library(tidyr)

pivot_longer(df, cols = -Participant, names_to = c("Day", "Time", "Item"), 
                 names_pattern = "D(\\d+)_(\\d+)_(\\d+)") %>%
    mutate(Item = paste0("Item",Item)) %>%
    pivot_wider(names_from = Item, values_from = value)

# A tibble: 10 x 7
#   Participant Day   Time  Item1 Item2 Item3 Item4
#   <fct>       <chr> <chr> <int> <int> <int> <int>
# 1 P1          1     1         6     2     3     5
# 2 P1          2     1         1     2    NA    NA
# 3 P2          1     1         4     9     3     6
# 4 P2          2     1         4     1    NA    NA
# 5 P3          1     1         7     4     2     8
# 6 P3          2     1         1     1    NA    NA
# 7 P4          1     1         1     5     1     1
# 8 P4          2     1         6     7    NA    NA
# 9 P5          1     1         2     0     8     2
#10 P5          2     1         1     4    NA    NA

We can also reshape to long format first and then use extract, with the same regex that was passed to names_pattern in pivot_longer:

pivot_longer(df, cols = -Participant) %>%
     extract(name, into = c("Day", "Time", "Item"), 
             regex = "D(\\d+)_(\\d+)_(\\d+)") %>%
     pivot_wider(names_from = Item, values_from = value)
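
As written, the extract version leaves Day, Time and Item as character columns, and the item columns end up named 1, 2, and so on. A possible variation, sketched here on a subset of the sample data, uses the `convert` argument of extract and the `names_prefix` argument of pivot_wider. The names `small` and `res` are only for illustration.

```r
library(tidyr)

# Subset of the sample data: two participants, two days, two items
small <- data.frame(
  Participant = c("P1", "P2"),
  D1_1_1 = c(6L, 4L),
  D1_1_2 = c(2L, 9L),
  D2_1_1 = c(1L, 4L),
  D2_1_2 = c(2L, 1L)
)

res <- pivot_longer(small, cols = -Participant) %>%
  # convert = TRUE runs type conversion on the new columns (here: to integer)
  extract(name, into = c("Day", "Time", "Item"),
          regex = "D(\\d+)_(\\d+)_(\\d+)", convert = TRUE) %>%
  # names_prefix gives Item1, Item2, ... instead of bare numbers
  pivot_wider(names_from = Item, values_from = value, names_prefix = "Item")

res
```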

data

df <- structure(list(Participant = structure(1:5, .Label = c("P1", 
"P2", "P3", "P4", "P5"), class = "factor"), D1_1_1 = c(6L, 4L, 
7L, 1L, 2L), D1_1_2 = c(2L, 9L, 4L, 5L, 0L), D1_1_3 = c(3L, 3L, 
2L, 1L, 8L), D1_1_4 = c(5L, 6L, 8L, 1L, 2L), D2_1_1 = c(1L, 4L, 
1L, 6L, 1L), D2_1_2 = c(2L, 1L, 1L, 7L, 4L)), class = "data.frame", 
row.names = c(NA, -5L))
Answered 2020-01-08T07:05:17.077