I have a data frame that looks like this:

   state1     state1_pp     state2     state2_pp    state3   state3_pp
   <chr>      <chr>         <chr>      <chr>         <chr>   <chr>  
 1 0          0.995614      F          0.004386      NA      0  
 2 0          1             NA         0             NA      0  
 3 0          1             NA         0             NA      0

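I want the values in the state columns to become the column names, and the matching _pp values to become the row values: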

   0             F             NA   
   <chr>         <chr>         <chr>     
 1 0.995614      0.004386      0        
 2 1             0             0        
 3 1             0             0

How can I do this in R?

Or, a more complicated scenario:

  state1     state1_pp     state2     state2_pp    state3   state3_pp
 1 0          0.995614      F          0.004386      NA      0  
 2 A          1             B          0             C       0  
 3 D          0.7           B          0.3           NA      0

This is what I want:

   0          A     D     F          B   C   NA
 1 0.995614   0     0     0.004386   0   0   0 
 2 0          1     0     0          0   0   0 
 3 0          0     0.7   0         0.3  0   0
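For anyone who wants to try this out, the second example can be rebuilt roughly as follows. This is only a sketch: the name df is an assumption, and every column is stored as character to match the <chr> types shown above.

```r
# Sketch of the more complicated example; `df` is an assumed name.
# All columns are character, matching the <chr> types in the printout.
df <- data.frame(
  state1    = c("0", "A", "D"),
  state1_pp = c("0.995614", "1", "0.7"),
  state2    = c("F", "B", "B"),
  state2_pp = c("0.004386", "0", "0.3"),
  state3    = c(NA_character_, "C", NA_character_),
  state3_pp = c("0", "0", "0"),
  stringsAsFactors = FALSE
)

str(df)
```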
2 Answers

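First, a warning: column names that are numbers (such as 1) or reserved R keywords (such as NA) can cause all kinds of errors. But if you must do this, I suggest the following: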

library(dplyr)

# extract title row
headers <- df %>%
  head(1) %>%
  select(state1, state2, state3) %>%
  unlist(use.names = FALSE) %>%
  as.character()
# replace missing values with the literal string "NA"
headers[is.na(headers)] <- "NA"

# drop columns that are not wanted
new_df <- df %>%
  select(-state1, -state2, -state3)
# replace column names
colnames(new_df) <- headers

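To refer to your new columns, you may need to use backticks (`).

So with your new column names 0, F and NA, you can call df$F but you cannot call df$NA or df$0. Instead, you will have to call df$`0` and df$`NA`.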
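A small sketch of what that looks like in practice. The data frame new_df and its values here are made up for illustration (taken from the first example in the question), and the placeholder names a, b, c are hypothetical.

```r
# Hypothetical stand-in for the renamed data frame.
new_df <- data.frame(
  a = c("0.995614", "1", "1"),
  b = c("0.004386", "0", "0"),
  c = c("0", "0", "0"),
  stringsAsFactors = FALSE
)
names(new_df) <- c("0", "F", "NA")

new_df$F       # syntactic name: no quoting needed
new_df$`0`     # non-syntactic name: backticks required
new_df$`NA`    # reserved word: backticks required
new_df[["0"]]  # indexing by string avoids backticks altogether
```

Writing new_df$0 or new_df$NA without backticks is a syntax error, so string indexing with [[ ]] is often the less error-prone choice in scripts.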

answered 2020-05-05T04:43:49.877
Here is an attempt using dplyr and tidyr:

library(dplyr)
library(tidyr)


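df %>%
  # remember which original row each value came from
  mutate(row = row_number()) %>%
  # make everything character so states and probabilities can share one column
  mutate_all(as.character) %>%
  pivot_longer(cols = -row) %>%
  # strip the digits: state1 -> state, state1_pp -> state_pp
  mutate(name = sub('\\d+', '', name)) %>%
  group_by(name, row) %>%
  mutate(row1 = row_number()) %>%
  # back to wide: one state / state_pp pair per line
  pivot_wider() %>%
  group_by(state, row) %>%
  mutate(row1 = row_number()) %>%
  # the values are character, so fill with "0" rather than numeric 0
  pivot_wider(names_from = state, values_from = state_pp, 
              values_fill = list(state_pp = "0")) %>%
  ungroup() %>%
  select(-row, -row1)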


# A tibble: 3 x 7
#  `0`      F        `NA`  A     B     C     D    
#  <chr>    <chr>    <chr> <chr> <chr> <chr> <chr>
#1 0.995614 0.004386 0     0     0     0     0    
#2 0        0        0     1     0     0     0    
#3 0        0        0     0     0.3   0     0.7  
answered 2020-05-05T06:59:57.847