I have a file with 1 million lines which, once read with readLines,
can be condensed to the following:
prob <- readLines("offendingFile.txt")
dput(prob)
c("000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime",
"", " ||90300105 |V-1 MUIMERP NALBOC |6.0000|30.820000|.0000|.00000000000000|6.0000|458114.67",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime",
"", " ||90400105 |V-2 MUIMERP NALBOC |3.0000|29.170000|.0000|.00000000000000|3.0000|169750.62",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime",
"", " ||90700101 |V-OCIMONOCE LOREMIPSUM |12.0000|5.980000|.0000|.00000000000000|12.0000|107118.18",
"000815004980|Odrareg Oinotna Namzug S. En C.S. |YUMBO |Rozo (Palmira) ALG 76520 |114|80041726|20140424|4132636|20140425|P|PED.ELE/100099-114 |Corregimiento de palmira"
)
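I want to remove the LFLF sequences followed by spaces that appear in the file. This would delete lines 2, 5 and 8 and append line 3 to line 1, line 6 to line 4, and line 9 to line 7 (original line numbering). So I tried: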
prob2 <- gsub("\n {2,}", "", prob) # didn't do anything
gsub("[\r\n] {2,}", "", prob)
gsub("\r?\n {2,}|\r {2,}", "", prob)
The last two lines were borrowed from this SO post.
How should I proceed?
Expected output:
dput(prob2)
c("000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime ||90300105 |V-1 MUIMERP NALBOC |6.0000|30.820000|.0000|.00000000000000|6.0000|458114.67",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime ||90400105 |V-2 MUIMERP NALBOC |3.0000|29.170000|.0000|.00000000000000|3.0000|169750.62",
"000005928484|Name Nmee Leonel |YUMBO |El Placer de El Cerrito ALG 76248 |114|80041725|20140424|4132638|20140425|P|PED.ELE/100098-114 |Corregimiento de amaime ||90700101 |V-OCIMONOCE LOREMIPSUM |12.0000|5.980000|.0000|.00000000000000|12.0000|107118.18",
"000815004980|Odrareg Oinotna Namzug S. En C.S. |YUMBO |Rozo (Palmira) ALG 76520 |114|80041726|20140424|4132636|20140425|P|PED.ELE/100099-114 |Corregimiento de palmira"
)
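One thing worth noting about the attempts above: readLines returns each line without its line terminator, so no element of prob contains "\n" or "\r", and patterns built around those characters have nothing to match. A minimal sketch of one possible way to get the expected output is to work on the vector itself instead. The helper name join_continuations and the shortened sample vector are hypothetical, and the sketch assumes that every continuation line starts with a space, follows exactly one empty element, and is not itself followed by a further continuation.

```r
# Hypothetical helper: append each continuation line (a line starting with a
# space that directly follows an empty element) to the line two positions
# before it, then drop the empty element and the continuation line.
join_continuations <- function(x) {
  n <- length(x)
  if (n < 3) return(x)
  i <- 3:n
  # positions of continuation lines
  cont <- i[x[i - 1] == "" & grepl("^ ", x[i])]
  if (length(cont) == 0) return(x)
  # glue each continuation onto the record it belongs to
  x[cont - 2] <- paste0(x[cont - 2], x[cont])
  # remove the empty elements and the now-merged continuation lines
  x[-c(cont - 1, cont)]
}

# Shortened, made-up stand-in for `prob` (same layout, less text)
x <- c("0001|Name A|amaime", "", " ||903|V-1|6.0", "0002|Name B|palmira")
prob2 <- join_continuations(x)
print(prob2)
```

Applied to the real data this would be prob2 <- join_continuations(prob). An alternative, if memory allows, would be to read the file as a single string (for example with readChar) so that the newline characters are still present for gsub, and to split the result on "\n" afterwards.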