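I often have to deal with csv files that were saved using a German locale and are therefore separated by semicolons rather than commas. That part is easily solved by defining the delimiter. But unlike, for example, fread, vroom does not seem to offer a way to also define the decimal separator. As a result, numeric values that use a comma as the decimal separator are either imported as character or imported incorrectly without any decimal separator, which produces very large numbers. Is there a way to define the decimal separator directly, similar to how it works in fread?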

library(vroom)
library(data.table)
   
df <- data.table(row.num = 1:10
                 , V1 = rnorm(10,10,5)
                 , V2 = rnorm(10,100,30))

fwrite(df, file = "vroom_test.csv", sep = ";", dec = ",")

fread(input = "vroom_test.csv", sep = ";", dec = ",")

vroom(file = "vroom_test.csv", delim = ";")
# defining a custom locale does allow setting the decimal separator
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))
1 Answer

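As already mentioned in the comments, the solution is fairly simple: the only thing needed is to pass a locale() object to the locale argument of the vroom call. The options that locale() accepts are listed in its documentation.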

vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))
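
To check that the locale has the intended effect, here is a minimal sketch that writes a small semicolon-delimited file with decimal commas and reads it back. The sample values, the temporary file, and the variable names `path`, `de` and `res` are made up for illustration and are not part of the question.

```r
library(vroom)

# Hypothetical sample data: semicolon-delimited, comma as decimal separator
path <- tempfile(fileext = ".csv")
writeLines(c("row.num;V1;V2", "1;10,5;100,25", "2;7,25;98,5"), path)

# Describe how numbers are written in this file
de <- locale(decimal_mark = ",", grouping_mark = ".")

# Read with the custom locale; column types are still guessed by vroom
res <- vroom(path, delim = ";", locale = de, show_col_types = FALSE)

# Inspect the resulting column classes; V1 and V2 should be numeric
sapply(res, class)
```

If the value columns still come back as character, that usually points to a mismatch between the locale and the way the file was actually written, so it is worth opening the raw file to check.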
Answered 2022-02-02T16:04:37.187