I have a dataframe that looks like this:

                    variable              Name Description value  SMTS
     GTEX-N7MS-0007-SM-2D7W1 ENSG00000223972.4     DDX11L1     0 Blood
     GTEX-N7MS-0007-SM-2D7W1 ENSG00000227232.4      WASH7P   158 Blood
     GTEX-N7MS-0008-SM-4E3JI ENSG00000223972.4     DDX11L1     0  Skin
     GTEX-N7MS-0008-SM-4E3JI ENSG00000227232.4      WASH7P   166  Skin
GTEX-N7MS-0011-R10A-SM-2HMJK ENSG00000223972.4     DDX11L1     0 Brain
GTEX-N7MS-0011-R10A-SM-2HMJK ENSG00000227232.4      WASH7P   209 Brain

I want to transform it such that the values in the Description column become the column names, and the values in the value column become the column values:

                    variable DDX11L1 WASH7P  SMTS
     GTEX-N7MS-0007-SM-2D7W1       0    158 Blood
     GTEX-N7MS-0008-SM-4E3JI       0    166  Skin
GTEX-N7MS-0011-R10A-SM-2HMJK       0    209 Brain

I tried using cast (e.g. dcast(final, value ~ Name), and other combinations too), but I don't want any function (such as mean or sum) applied during the transformation, and what I get back is the length of the objects instead of the values. I just want the values as they are. Any suggestions would be appreciated.

2 Answers

This seems to give the result you are looking for:

library(reshape2)
dcast(mydf, variable + SMTS ~ Description, value.var="value")
#                       variable  SMTS DDX11L1 WASH7P
# 1      GTEX-N7MS-0007-SM-2D7W1 Blood       0    158
# 2      GTEX-N7MS-0008-SM-4E3JI  Skin       0    166
# 3 GTEX-N7MS-0011-R10A-SM-2HMJK Brain       0    209
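
As for why the attempt in the question returned counts: when several rows land in the same cell of the formula, dcast has to aggregate them, and with no fun.aggregate supplied it falls back to length. With value ~ Name, the value 0 appears three times for the same Name, so that cell holds more than one row. The sketch below rebuilds the example data (the name mydf and the way the data frame is constructed are my assumptions, the values are copied from the question) and checks that the formula used above identifies each cell exactly once, so nothing gets aggregated.

```r
library(reshape2)

# Rebuild the example data from the question (values copied from the post).
mydf <- data.frame(
  variable = rep(c("GTEX-N7MS-0007-SM-2D7W1",
                   "GTEX-N7MS-0008-SM-4E3JI",
                   "GTEX-N7MS-0011-R10A-SM-2HMJK"), each = 2),
  Name = rep(c("ENSG00000223972.4", "ENSG00000227232.4"), times = 3),
  Description = rep(c("DDX11L1", "WASH7P"), times = 3),
  value = c(0, 158, 0, 166, 0, 209),
  SMTS = rep(c("Blood", "Skin", "Brain"), each = 2),
  stringsAsFactors = FALSE
)

# Every (variable, SMTS, Description) combination must occur only once;
# otherwise dcast would need an aggregation function.
stopifnot(!any(duplicated(mydf[c("variable", "SMTS", "Description")])))

# One row per variable/SMTS pair, one column per Description.
wide <- dcast(mydf, variable + SMTS ~ Description, value.var = "value")
print(wide)
```

If the duplicate check fails on the real data, the identifying columns on the left-hand side of the formula are probably incomplete.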
answered 2014-06-27T15:57:15.580

Try:

library(dplyr)
library(tidyr)

If dat is the dataset:

dat %>% select(-Name) %>% spread(Description, value)
#                       variable  SMTS DDX11L1 WASH7P
# 1      GTEX-N7MS-0007-SM-2D7W1 Blood       0    158
# 2      GTEX-N7MS-0008-SM-4E3JI  Skin       0    166
# 3 GTEX-N7MS-0011-R10A-SM-2HMJK Brain       0    209
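
Dropping Name first matters here: it changes together with Description, so leaving it in would keep every row distinct and the result would have six rows padded with NA. In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so a roughly equivalent sketch in the newer interface might look like this (the data frame is rebuilt here only to make the example self-contained, with values copied from the question):

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question; dat is the name used above.
dat <- data.frame(
  variable = rep(c("GTEX-N7MS-0007-SM-2D7W1",
                   "GTEX-N7MS-0008-SM-4E3JI",
                   "GTEX-N7MS-0011-R10A-SM-2HMJK"), each = 2),
  Name = rep(c("ENSG00000223972.4", "ENSG00000227232.4"), times = 3),
  Description = rep(c("DDX11L1", "WASH7P"), times = 3),
  value = c(0, 158, 0, 166, 0, 209),
  SMTS = rep(c("Blood", "Skin", "Brain"), each = 2),
  stringsAsFactors = FALSE
)

# Drop Name, then turn each Description into its own column filled from value.
wide <- dat %>%
  select(-Name) %>%
  pivot_wider(names_from = Description, values_from = value)

print(wide)
```

The result is a tibble rather than a plain data.frame; wrap it in as.data.frame() if that matters downstream.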
answered 2014-06-27T16:01:20.147