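R 2.13.1 on Mac OS X. I am trying to import a data file that uses a point as the thousands separator and a comma as the decimal mark, with a trailing minus sign for negative values.
Basically, I am trying to convert from: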
"A|324,80|1.324,80|35,80-"
to
V1 V2 V3 V4
1 A 324.80 1324.8 -35.80
Now, each of the following two steps works interactively:
gsub("\\.","","1.324,80")
[1] "1324,80"
gsub("(.+)-$","-\\1", "35,80-")
[1] "-35,80"
And combining them:
gsub("\\.", "", gsub("(.+)-$","-\\1","1.324,80-"))
[1] "-1324,80"
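For reference, the two substitutions can be wrapped in one vectorized helper. This is only a minimal sketch: the name parse_eu_number is hypothetical, and it assumes that every "." is a thousands separator and that at most one "," marks the decimals.

```r
# Hypothetical helper (not part of the original post): convert strings such
# as "1.324,80-" to numeric values.
parse_eu_number <- function(x) {
  x <- sub("^(.+)-$", "-\\1", x)       # move a trailing minus to the front
  x <- gsub(".", "", x, fixed = TRUE)  # drop the thousands separators
  x <- sub(",", ".", x, fixed = TRUE)  # turn the decimal comma into a point
  as.numeric(x)
}

parse_eu_number(c("324,80", "1.324,80", "35,80-", "1.100.123,80"))
```

Using fixed = TRUE avoids having to escape the point, which is the part of the pattern that is easiest to get wrong.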
However, I cannot get the thousands separator removed when reading with read.table:
setClass("num.with.commas")
setAs("character", "num.with.commas", function(from) as.numeric(gsub("\\.", "", sub("(.+)-$","-\\1",from))) )
mydata <- "A|324,80|1.324,80|35,80-"
mytable <- read.table(textConnection(mydata), header=FALSE, quote="", comment.char="", sep="|", dec=",", skip=0, fill=FALSE,strip.white=TRUE, colClasses=c("character","num.with.commas", "num.with.commas", "num.with.commas"))
Warning messages:
1: In asMethod(object) : NAs introduced by coercion
2: In asMethod(object) : NAs introduced by coercion
3: In asMethod(object) : NAs introduced by coercion
mytable
V1 V2 V3 V4
1 A NA NA NA
Note that if I change "\\." to "," in the function, things look a bit different:
setAs("character", "num.with.commas", function(from) as.numeric(gsub(",", "", sub("(.+)-$","-\\1",from))) )
mytable <- read.table(textConnection(mydata), header=FALSE, quote="", comment.char="", sep="|", dec=",", skip=0, fill=FALSE,strip.white=TRUE, colClasses=c("character","num.with.commas", "num.with.commas", "num.with.commas"))
mytable
V1 V2 V3 V4
1 A 32480 1.3248 -3580
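I think the problem is that read.table with dec="," converts the incoming "," to "." before calling as(from, "num.with.commas"), so the input string could be, for example, "1.324.80".
I want as("1.123,80-","num.with.commas") to return -1123.80 and as("1.100.123,80", "num.with.commas") to return 1100123.80.
How can I make my num.with.commas remove every point in the input string except the last one, the decimal point?
Update: First, I added a negative lookahead and got as() working in the console: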
setAs("character", "num.with.commas", function(from) as.numeric(gsub("(?!\\.\\d\\d$)\\.", "", gsub("(.+)-$","-\\1",from), perl=TRUE)) )
as("1.210.123.80-","num.with.commas")
[1] -1210124
as("10.123.80-","num.with.commas")
[1] -10123.8
as("10.123.80","num.with.commas")
[1] 10123.8
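The negative lookahead above depends on exactly two digits following the last point. One possible variation, sketched here on the assumption that the decimal mark has already become a point, is a positive lookahead that deletes every point that still has another point somewhere after it. The helper name strip_all_but_last_point is made up for this example.

```r
# Hypothetical helper: keep only the last "." and treat it as the decimal point.
strip_all_but_last_point <- function(x) {
  x <- sub("^(.+)-$", "-\\1", x)                 # trailing minus to the front
  x <- gsub("\\.(?=.*\\.)", "", x, perl = TRUE)  # drop points followed by a later point
  as.numeric(x)
}

strip_all_but_last_point(c("1.210.123.80-", "10.123.8", "324.80"))
```

A string with a single point, such as "1.324", stays ambiguous under this rule and is read as 1.324 rather than 1324.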
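However, read.table still has the same problem. Adding some print() calls to my function shows that num.with.commas actually receives the comma, not the point.
So my current solution is to replace "," with "." inside num.with.commas.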
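A way to make that check repeatable is to record what the coercion method receives instead of printing it. The following is a sketch only: the class name debug.num and the environment seen are invented for this illustration, and the conversion itself is deliberately left naive.

```r
library(methods)  # setClass/setAs; attached explicitly in case of Rscript

mydata <- "A|324,80|1.324,80|35,80-"
seen <- new.env()  # hypothetical store for the strings handed to the method

setClass("debug.num")
setAs("character", "debug.num", function(from) {
  seen$values <- c(seen$values, from)  # remember the raw input
  suppressWarnings(as.numeric(from))   # naive conversion, NA is fine here
})

con <- textConnection(mydata)
invisible(read.table(con, header = FALSE, sep = "|", quote = "",
                     comment.char = "", dec = ",", strip.white = TRUE,
                     colClasses = c("character", rep("debug.num", 3))))
close(con)

seen$values  # the strings exactly as read.table passed them on
```

If the observation above holds, the recorded strings still contain their commas, which means dec="," is not applied before a custom colClasses conversion.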
setAs("character", "num.with.commas", function(from) as.numeric(gsub(",","\\.",gsub("(?!\\.\\d\\d$)\\.", "", gsub("(.+)-$","-\\1",from), perl=TRUE))) )
mytable <- read.table(textConnection(mydata), header=FALSE, quote="", comment.char="", sep="|", dec=",", skip=0, fill=FALSE,strip.white=TRUE, colClasses=c("character","num.with.commas", "num.with.commas", "num.with.commas"))
mytable
V1 V2 V3 V4
1 A 324.8 1101325 -35.8
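An alternative that sidesteps the question of what read.table does with dec is to read every column as character and convert afterwards. This is a sketch rather than a recommendation: to_num is a hypothetical helper, and it assumes that only the first column is text and that all others use the "1.324,80-" format.

```r
mydata <- "A|324,80|1.324,80|35,80-"

# Hypothetical converter: trailing minus, "." as thousands, "," as decimal.
to_num <- function(x) {
  x <- sub("^(.+)-$", "-\\1", x)
  as.numeric(chartr(",", ".", gsub(".", "", x, fixed = TRUE)))
}

con <- textConnection(mydata)
mytable <- read.table(con, header = FALSE, sep = "|", quote = "",
                      comment.char = "", strip.white = TRUE,
                      colClasses = "character")  # no numeric parsing yet
close(con)

# Convert every column except the first one in place.
mytable[-1] <- lapply(mytable[-1], to_num)
str(mytable)
```

Because nothing is parsed as a number during the read, the dec argument no longer matters and the conversion can be tested on plain strings.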