I have some text files (CSV format) that are missing some text qualifiers, for example in the second line below, fifth column (AMM):

"A",4,"","","HIGH STREET, 22","","","L6","3AA"
"B",2957136105,98,"M12ASE7569",AMM",1,,,"F",,20010514,"CR"
"C","T","UNKNOWN","",19000101
"D",4

I managed to find the inconsistent rows by checking a column with the code below (just save the sample above to a txt file):

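library(plyr)

# path: location of the txt file saved from the sample above
lines <- readLines(path)
# Split each line on commas with quoting disabled, then pad rows to equal width
a <- rbind.fill(lapply(lines, function(x) read.table(text = x, sep = ",", as.is = TRUE, quote = "")))
# Rows whose 5th column contains exactly one double quote
which(sapply(gregexpr("\"", a[, 5]), length) == 1 & grepl("\"", a[, 5]))
# [1] 1 2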

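But in my files there are commas inside fields (because of addresses), so I also get false positives: row 1 above is flagged only because "HIGH STREET, 22" gets split at its comma.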
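
One direction that might avoid splitting on commas altogether is to validate each raw line against a pattern for well-formed fields. This is only a minimal sketch, and it rests on assumptions that may not hold for the real files: every field is either fully quoted or contains no double quote at all, quoted fields never contain an escaped (doubled) quote, and no field spans several lines. The names `field`, `well_formed` and `bad` are made up for illustration.

```r
# Sample lines from above; for a real file this would be readLines(path)
lines <- c(
  '"A",4,"","","HIGH STREET, 22","","","L6","3AA"',
  '"B",2957136105,98,"M12ASE7569",AMM",1,,,"F",,20010514,"CR"',
  '"C","T","UNKNOWN","",19000101',
  '"D",4'
)

# Assumption: a well-formed field is either "..." (commas allowed inside)
# or an unquoted run containing neither a double quote nor a comma
field <- '("[^"]*"|[^",]*)'

# A well-formed line is one field followed by zero or more ",field" groups
well_formed <- paste0("^", field, "(,", field, ")*$")

# Indices of lines that do not fit the pattern
bad <- which(!grepl(well_formed, lines))
bad
```

On the four sample lines this should report only line 2, since the comma in "HIGH STREET, 22" sits inside a properly quoted field. It reports line numbers only, not the offending column, and a field with a doubled quote inside it would also be flagged.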

Has anyone run into this kind of problem before? If so, do you have any ideas?
