
I have a data frame, rep, that looks like this:

> head(rep)
     position chrom  value label  
[1,] "17408"  "chr1" "0"   "miRNA"
[2,] "17409"  "chr1" "0"   "miRNA"
[3,] "17410"  "chr1" "0"   "miRNA"
[4,] "17411"  "chr1" "0"   "miRNA"
[5,] "17412"  "chr1" "0"   "miRNA"
[6,] "17413"  "chr1" "0"   "miRNA"

How can I remove the quotes from all of the elements?

Note: rep$position and rep$value should be of type numeric, while rep$chrom and rep$label should be of type character.


3 Answers


Two steps: 1) remove the quotes, 2) convert the columns accordingly:

Data

x <- read.table(text='
position chrom  value label  
"\\"17408\\""  "\\"chr1\\"" "\\"0\\""   "\\"miRNA\\""
"\\"17409\\""  "\\"chr1\\"" "\\"0\\""   "\\"miRNA\\""'
, header = TRUE)

1) Remove the quotes

library(stringr)
library(plyr)

del <- colwise(function(x) str_replace_all(x, '\"', ""))
x <- del(x)

2) Convert the columns accordingly

num <- colwise(as.numeric)    
x[c(1,3)] <- num(x[c(1,3)])
x

  position chrom value label
1    17408  chr1     0 miRNA
2    17409  chr1     0 miRNA
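
The same two steps can probably be done in base R alone, without stringr or plyr. The following is a minimal sketch, not the method used above: the sample data frame is hypothetical and built directly with literal quote characters in each value, and the names num_cols and col are illustrative.

```r
# Hypothetical sample data: every value carries literal quote characters
x <- data.frame(
  position = c('"17408"', '"17409"'),
  chrom    = c('"chr1"', '"chr1"'),
  value    = c('"0"', '"0"'),
  label    = c('"miRNA"', '"miRNA"'),
  stringsAsFactors = FALSE
)

# 1) Remove the literal quotes from every column;
#    x[] <- keeps x a data frame instead of turning it into a list
x[] <- lapply(x, function(col) gsub('"', "", col, fixed = TRUE))

# 2) Convert only the columns that should be numeric
num_cols <- c("position", "value")
x[num_cols] <- lapply(x[num_cols], as.numeric)

str(x)
```

After this, position and value are numeric, while chrom and label stay character.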
Answered 2014-02-09T11:06:18.117

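As @Roland pointed out, you have a matrix, not a data.frame, and the two have different default print methods. If you stick with the matrix, you can either set quote = FALSE explicitly in print or use noquote.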

Here is a basic example:

## Sample data
x <- matrix(c(17, "chr1", 0, "miRNA", 18, "chr1", 0, "miRNA"), nrow = 2, 
            byrow = TRUE, dimnames = list(
              NULL, c("position", "chrom", "value", "label")))

## Default printing
x
#      position chrom  value label  
# [1,] "17"     "chr1" "0"   "miRNA"
# [2,] "18"     "chr1" "0"   "miRNA"

## Two options to make the quotes disappear
print(x, quote = FALSE)
#      position chrom value label
# [1,] 17       chr1  0     miRNA
# [2,] 18       chr1  0     miRNA
noquote(x)
#      position chrom value label
# [1,] 17       chr1  0     miRNA
# [2,] 18       chr1  0     miRNA
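
To check that the quotes are only added by the print method and are not stored in the data, a quick sketch along these lines may help. It rebuilds the same sample matrix as above; the variable names has_quote and pos are illustrative.

```r
# Same sample matrix as above
x <- matrix(c(17, "chr1", 0, "miRNA", 18, "chr1", 0, "miRNA"), nrow = 2,
            byrow = TRUE, dimnames = list(
              NULL, c("position", "chrom", "value", "label")))

# A matrix holds a single type, so the numbers were coerced to character
typeof(x)

# "chr1" has 4 characters; stored quotes would make it 6
nchar(x[1, "chrom"])

# No element contains a literal double quote
has_quote <- any(grepl('"', x, fixed = TRUE))
has_quote

# The strings convert to numbers directly, with no cleanup needed
pos <- as.numeric(x[, "position"])
pos
```

Because nothing needs to be stripped, only the column types need fixing once the data are in a data.frame.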

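Also, as you found out yourself, converting your matrix to a data.frame makes the quotes disappear. A data.frame is a more appropriate structure for holding data when each column holds a different type of data (numeric, character, factor, and so on). However, converting a matrix to a data.frame does not automatically convert the columns for you. Instead, you can use type.convert (which is also used when creating a data.frame with read.table and family):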

y <- data.frame(x, stringsAsFactors = FALSE)
str(y)
# 'data.frame':  2 obs. of  4 variables:
#  $ position: chr  "17" "18"
#  $ chrom   : chr  "chr1" "chr1"
#  $ value   : chr  "0" "0"
#  $ label   : chr  "miRNA" "miRNA"
y[] <- lapply(y, type.convert)
str(y)
# 'data.frame':  2 obs. of  4 variables:
#  $ position: int  17 18
#  $ chrom   : Factor w/ 1 level "chr1": 1 1
#  $ value   : int  0 0
#  $ label   : Factor w/ 1 level "miRNA": 1 1
y
#   position chrom value label
# 1       17  chr1     0 miRNA
# 2       18  chr1     0 miRNA
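
The question asks for chrom and label to be character rather than factor. One way to get that, sketched here as a variation on the code above rather than as part of the original answer, is to pass as.is = TRUE to type.convert. Recent versions of R also warn when as.is is left unspecified.

```r
# Same sample matrix as above
x <- matrix(c(17, "chr1", 0, "miRNA", 18, "chr1", 0, "miRNA"), nrow = 2,
            byrow = TRUE, dimnames = list(
              NULL, c("position", "chrom", "value", "label")))

y <- data.frame(x, stringsAsFactors = FALSE)

# as.is = TRUE keeps non-numeric columns as character instead of factor
y[] <- lapply(y, type.convert, as.is = TRUE)

str(y)
```

With this, position and value become integer (which is.numeric treats as numeric); wrapping a column in as.numeric gives doubles if those are needed.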
Answered 2014-02-09T15:22:57.587

I think I found the answer. What I have is not a data.frame but a matrix. Converting it to a data.frame got rid of the quotes. I would still like to know why, though...

> rep <- data.frame(rep)
> head(rep)
  position chrom value label
1    17408  chr1     0 miRNA
2    17409  chr1     0 miRNA
3    17410  chr1     0 miRNA
4    17411  chr1     0 miRNA
5    17412  chr1     0 miRNA
6    17413  chr1     0 miRNA
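
As for the why: a matrix can hold only one type, so combining numbers with strings coerces everything to character, and the default print method shows character values in quotes, while the data.frame print method does not. The sketch below uses a small hypothetical matrix m, not the original rep, to show that the conversion hides the quotes but does not restore the numeric type.

```r
# Hypothetical two-column matrix mixing numbers and strings
m <- cbind(position = 17408:17409, chrom = "chr1")

# Everything was coerced to character when the matrix was built
typeof(m)

# Converting to a data.frame changes how values print, not what they are
df <- data.frame(m, stringsAsFactors = FALSE)
sapply(df, class)

# The column still has to be converted explicitly
df$position <- as.numeric(df$position)
str(df)
```

So the printed data frame looks numeric, but position and value remain character (or factor, depending on the R version and stringsAsFactors) until they are converted.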
Answered 2014-02-09T10:56:44.153