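I have many txt files that contain the same kind of numeric data in columns separated by ;. However, some files have column headers with spaces and some do not (they were created by different people). Some also have extra columns that I don't want.
For example, one file might have a header like: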
ASomeName; BSomeName; C(someName%)
while another file's header might be
A Some Name; B Some Name; C(someName%); D some name
How can I strip the spaces from the names before calling the "read" command?
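# Provides the %>% pipe used below
library(magrittr)
# These are the files I have, as a character vector (pattern is a regex, not a glob)
filenames <- list.files(pattern = "\\.txt$", recursive = TRUE, full.names = TRUE)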
# These are the columns I would like:
colSelect <- c("Date", "Time", "Timestamp", "PM2_5(ug/m3)", "PM10(ug/m3)", "PM01(ug/m3)", "Temperature(C)", "Humidity(%RH)", "CO2(ppm)")
# This is how I read them if they have the same columns
ldf <- vroom::vroom(filenames, col_select = colSelect, delim = ";", id = "sensor") %>% janitor::clean_names()
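Header-cleaning script
I wrote a destructive script that reads each file in full, cleans the spaces out of the header, deletes the file, and rewrites it under the same file name (vroom sometimes complains that it can't open X thousands of files). This is not an efficient way of doing things.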
cleanHeaders <- function(filename) {
  d <- vroom::vroom(filename, delim = ";") %>% janitor::clean_names()
  # print(head(d))
  if (file.exists(filename)) {
    # Delete file if it exists
    file.remove(filename)
  }
  vroom::vroom_write(d, filename, delim = ";")
}
lapply(filenames, cleanHeaders)
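One non-destructive direction, as a rough sketch rather than a tested solution: read only the first line of each file, strip the whitespace from the header fields, and pass the cleaned names to vroom via col_names while skipping the original header row. The helper names read_header and read_one are hypothetical, and the sketch assumes every file has exactly one header line and that whitespace is the only difference between header variants.

```r
# Read only the first line of a file and return its column names with all
# whitespace removed, e.g. "A Some Name" becomes "ASomeName".
# (Hypothetical helper; assumes the header is the first line of the file.)
read_header <- function(path, delim = ";") {
  first_line <- readLines(path, n = 1, warn = FALSE)
  fields <- strsplit(first_line, delim, fixed = TRUE)[[1]]
  gsub("[[:space:]]+", "", fields)
}

# Read one file using the cleaned header as column names. skip = 1 drops the
# original header row, and any_of() keeps only the wanted columns that are
# actually present, so extra columns are ignored.
# (Hypothetical helper; requires the vroom and dplyr packages.)
read_one <- function(path, wanted, delim = ";") {
  vroom::vroom(
    path,
    delim = delim,
    col_names = read_header(path, delim),
    skip = 1,
    col_select = dplyr::any_of(wanted)
  )
}
```

Something like dplyr::bind_rows(lapply(stats::setNames(filenames, filenames), read_one, wanted = colSelect), .id = "sensor") could then combine the files without rewriting any of them, provided the guessed column types agree across files. Reading the files one at a time might also sidestep the "can't open X thousands of files" complaint, though that is a guess.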