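The two main options that come to mind are `strsplit` (as mentioned in the comments and in @Ricardo's answer) and `read.fwf`. `read.fwf` won't work directly on your data, but it can process a column of data that has already been read in if you use the `textConnection()` function.

Here's a basic example: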
## Create a tab-separated file named "test.txt" in your working directory
cat("2001\tHAPLO1\tAAACAAGGAGGAGAAGGAAA\n",
    "2001\tHAPLO2\tCAACAAAGAGGAGAAGGAAA\n",
    "2002\tHAPLO1\tAAAAAAGGAGGAAAAGGAAA\n",
    "20020\tHAPLO2\tCAACAAGGAGGAAGCAGAGC\n",
    "20021\tHAPLO2\tCAACAAGGAGGAAGCAGAGC\n",
    file = "test.txt")
## Read it in with `read.delim`
mydata <- read.delim("test.txt", header = FALSE, stringsAsFactors = FALSE)
## Use `read.fwf` on the third column
## Replace "widths" with whatever the maximum width is for that column
## If max width is not known, you can use something like
## `widths = rep(1, max(nchar(mydata$V3)))`
cbind(mydata[-3],
      read.fwf(file = textConnection(mydata$V3), widths = rep(1, 20)))
# V1 V2 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
# 1 2001 HAPLO1 A A A C A A G G A G G A G A A G G A A A
# 2 2001 HAPLO2 C A A C A A A G A G G A G A A G G A A A
# 3 2002 HAPLO1 A A A A A A G G A G G A A A A G G A A A
# 4 20020 HAPLO2 C A A C A A G G A G G A A G C A G A G C
# 5 20021 HAPLO2 C A A C A A G G A G G A A G C A G A G C
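Note: If the third column was read in as a factor (that is, without `stringsAsFactors = FALSE`, which became the default only in R 4.0.0), you'll have to change the `file` argument to: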
file = textConnection(as.character(mydata$V3))
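
For comparison, the `strsplit` route mentioned at the top might look roughly like the sketch below. It is not taken from @Ricardo's answer. The small `mydata` data frame is a stand-in for the data read in above, the `P1`, `P2`, ... column names are made up for illustration, and the approach assumes every string in the column has the same length.

```r
## Stand-in for the data read in above (only the third column matters here)
mydata <- data.frame(V1 = c(2001, 2001),
                     V2 = c("HAPLO1", "HAPLO2"),
                     V3 = c("AAACAAGGAGGAGAAGGAAA", "CAACAAAGAGGAGAAGGAAA"),
                     stringsAsFactors = FALSE)

## Split each string into single characters; split = "" breaks between every character
chars <- strsplit(mydata$V3, split = "")

## Stack the pieces into a character matrix, one row per original string
## (assumes all strings have the same number of characters)
charmat <- do.call(rbind, chars)

## Hypothetical column names, chosen to avoid clashing with V1 and V2
colnames(charmat) <- paste0("P", seq_len(ncol(charmat)))

## Bind the split columns back onto the remaining columns
result <- cbind(mydata[-3], as.data.frame(charmat, stringsAsFactors = FALSE))
```

If the column is a factor, wrap it in `as.character()` before passing it to `strsplit`, for the same reason as in the note above.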