
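When I try to read a csv file with data.table's fread(fn, sep='\t', header=T), it fails with the error 'Unbalanced " observed on this line'. The data has 3 integer variables and 1 string variable. The strings in the csv file are not enclosed in ", and yes, some rows contain " inside the string variable, and those " characters are not paired.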

Is it possible to make fread ignore the unpaired " in the variable and carry on reading the data? Thanks.

Here is sample data (a single record):

N_ID    VISIT_DATE  REQ_URL REQType
175931  2013-3-8 23:40:30   http://aaa.com/rest/api2.do?api=getSetMobileSession&data={"imei":"60893ZTE-CN13cd","appkey":"android_client","content":"Z0JiRA0qPFtWM3BYVltmcx5MWF9ZS0YLdW1ydXoqPycuJS8idXdlY3R0TGBtU   1

1 Answer


UPDATE: Now implemented in v1.8.11

From NEWS:

fread now accepts quotes (both ' and ") in the middle of fields, whether the field starts with " or not, rather than the 'unbalanced quotes' error, #2694. Thanks to baidao for reporting. It was known and documented at the top of ?fread (text now removed). If a field starts with " it must end with " (necessary if the field separator itself is in the field contents). Embedded quotes can be in column names too. Newlines (\n) still can't be in quoted fields or quoted column names, yet.
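As a rough illustration of the behaviour described in that NEWS entry, here is a minimal sketch. The sample row is a shortened, hypothetical version of the record in the question, written to a temporary file. The quote = "" argument is an assumption on my part: it is not mentioned in the NEWS text above and only exists in later data.table releases, where it turns quote handling off entirely. On a version with the fix, leaving it out should read this row as well.

```r
library(data.table)

# Hypothetical, shortened version of the record from the question:
# tab-separated, and the REQ_URL field contains three (unpaired) double quotes.
lines <- c(
  "N_ID\tVISIT_DATE\tREQ_URL\tREQType",
  "175931\t2013-3-8 23:40:30\thttp://aaa.com/rest/api2.do?data={\"imei\":\"60893ZTE\t1"
)
tmp <- tempfile(fileext = ".tsv")
writeLines(lines, tmp)

# quote = "" (assumed available, i.e. a later data.table release) tells fread
# to treat double quotes as ordinary characters instead of field delimiters.
dt <- fread(tmp, sep = "\t", header = TRUE, quote = "")
print(dt)
```

If the data really does contain fields that are wrapped in quotes because they include the separator, disabling quotes this way would split those fields, so it only suits files like the one in the question where no field is quoted.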


Yes, as @agstudy said, embedded quotes are a known, documented limitation: support for them isn't implemented yet because fread is new. Strictly speaking, though, I suppose the quotes in your example aren't embedded, because the string doesn't start with a quote.

Anyway, I've filed this as a bug report so it doesn't get forgotten. To be done in the next release. Thanks for highlighting.

#2694 : Strings including quotes but not starting with quote in fread

answered 2013-04-19T11:46:16.013