我很难弄清楚如何将一些宽数据转换为长格式。我有三列字符串数据(A1_R00_FillerNP
、、A1_R01_ADV
和A1_R02_1stEmbV
),我想将它们融合到一列(WordCountRegion
)中,这样对于每个主题和项目,正确的单词将从这三列之一映射到新的WordCountRegion
列.
使用melt
下面代码中的简单方法可以让我成功:
(注意:中的奇怪字符df
无关紧要-请忽略它们)
df <- structure(list(Subject = c(101L, 101L, 101L, 101L, 101L, 101L,
101L, 101L, 101L, 101L, 101L, 101L, 101L, 101L, 101L, 101L, 101L,
101L), condition = structure(c(2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L,
3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L), .Label = c("P", "R",
"S"), class = "factor"), item = c(101L, 102L, 103L, 101L, 102L,
103L, 101L, 102L, 103L, 101L, 102L, 103L, 101L, 102L, 103L, 101L,
102L, 103L), A1_R00_FillerNP = structure(c(3L, 2L, 1L, 3L, 2L,
1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L), .Label = c("SÌÇna d_r allvarliga konsekvenser",
"SÌÇna d_r fina _ppeltr_d", "SÌÇna d_r gamla skottk_rror"
), class = "factor"), A1_R01_ADV = structure(c(1L, 1L, 2L, 1L,
1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L), .Label = c("alltid",
"f_rresten"), class = "factor"), A1_R02_1stEmbV = structure(c(3L,
2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L,
1L), .Label = c("diskuterade", "stod", "tv_ttade"), class = "factor"),
RT = c(0L, 149L, 247L, 272L, 171L, 245L, 317L, 0L, 233L,
0L, 981L, 750L, 272L, 171L, 334L, 317L, 0L, 233L), Region = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L,
3L, 3L), .Label = c("R00", "R01", "R02"), class = "factor"),
RegionType = structure(c(3L, 3L, 3L, 2L, 2L, 2L, 1L, 1L,
1L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("1stEmbV",
"ADV", "FillerNP"), class = "factor"), DV = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L), .Label = c("FIRST_FIXATION_DURATION", "GAZE_DURATION"
), class = "factor")), .Names = c("Subject", "condition",
"item", "A1_R00_FillerNP", "A1_R01_ADV", "A1_R02_1stEmbV", "RT",
"Region", "RegionType", "DV"), class = "data.frame", row.names = c(NA,
-18L))
df1 = melt(df, measure.vars = c("A1_R00_FillerNP","A1_R01_ADV","A1_R02_1stEmbV"), var = "WordCountRegion")
问题是这段代码错误地打破了跨区域的单词。我最终得到如下输出,其中单词不会按照 的指定中断Region
,而是扩展到 的值Region
,如WordCountRegion
和所示value
。很明显,如果我要使用它,那么我需要某种额外的规范,以便 melt() 能够正确地破坏数据。我只是不确定如何做到这一点(或者是否可以在melt()中完成)。
Subject condition item RT Region RegionType DV WordCountRegion value
1 101 R 101 0 R00 FillerNP FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
2 101 P 102 149 R00 FillerNP FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
3 101 S 103 247 R00 FillerNP FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
4 101 R 101 272 R01 ADV FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
5 101 P 102 171 R01 ADV FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
6 101 S 103 245 R01 ADV FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
7 101 R 101 317 R02 1stEmbV FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
8 101 P 102 0 R02 1stEmbV FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
9 101 S 103 233 R02 1stEmbV FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
10 101 R 101 0 R00 FillerNP GAZE_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
11 101 P 102 981 R00 FillerNP GAZE_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
12 101 S 103 750 R00 FillerNP GAZE_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
13 101 R 101 272 R01 ADV GAZE_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
14 101 P 102 171 R01 ADV GAZE_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
15 101 S 103 334 R01 ADV GAZE_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
16 101 R 101 317 R02 1stEmbV GAZE_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
17 101 P 102 0 R02 1stEmbV GAZE_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
18 101 S 103 233 R02 1stEmbV GAZE_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
19 101 R 101 0 R00 FillerNP FIRST_FIXATION_DURATION A1_R01_ADV alltid
20 101 P 102 149 R00 FillerNP FIRST_FIXATION_DURATION A1_R01_ADV alltid
21 101 S 103 247 R00 FillerNP FIRST_FIXATION_DURATION A1_R01_ADV f_rresten
有没有一种方法可以修改melt()
以使它们对齐/匹配Region
,如下面的示例所示:
Subject condition item RT Region RegionType DV WordCountRegion value
1 101 R 101 0 R00 FillerNP FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
2 101 P 102 149 R00 FillerNP FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
3 101 S 103 247 R00 FillerNP FIRST_FIXATION_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
4 101 R 101 272 R01 ADV FIRST_FIXATION_DURATION A1_R01_ADV alltid
5 101 P 102 171 R01 ADV FIRST_FIXATION_DURATION A1_R01_ADV alltid
6 101 S 103 245 R01 ADV FIRST_FIXATION_DURATION A1_R01_ADV f_rresten
7 101 R 101 317 R02 1stEmbV FIRST_FIXATION_DURATION A1_R02_1stEmbV tv_ttade
8 101 P 102 0 R02 1stEmbV FIRST_FIXATION_DURATION A1_R02_1stEmbV stod
9 101 S 103 233 R02 1stEmbV FIRST_FIXATION_DURATION A1_R02_1stEmbV diskuterade
10 101 R 101 0 R00 FillerNP GAZE_DURATION A1_R00_FillerNP SÌÇna d_r gamla skottk_rror
11 101 P 102 981 R00 FillerNP GAZE_DURATION A1_R00_FillerNP SÌÇna d_r fina _ppeltr_d
12 101 S 103 750 R00 FillerNP GAZE_DURATION A1_R00_FillerNP SÌÇna d_r allvarliga konsekvenser
或者,如果我完全使用了错误的功能,有人可以指点我一个更好的解决方案吗?也许我需要一些可以进行实际查找的东西?