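At our company we receive distribution data from a vendor who supplies it as CSV files. However, they do not escape quote characters inside text fields, which causes many lines to be skipped when we parse the files with TextFieldParser.
An example of a bad line: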
"CABLES TO GO","87029","5.0200","47","757120870296","87029", "WP SGL ALUM 1 1/2" GROMMET"
The corresponding code snippet is:
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using Microsoft.VisualBasic.FileIO;

    private static IEnumerable<string> ParseHelper(string line, int lineRead, Encoding enc)
    {
        // Wrap the single line in a stream so TextFieldParser can read it.
        using (var mem = new MemoryStream(enc.GetBytes(line)))
        using (var readerTemp = new TextFieldParser(mem, enc) { CommentTokens = new[] { "#" } })
        {
            readerTemp.SetDelimiters(new[] { "," });
            readerTemp.HasFieldsEnclosedInQuotes = true;
            readerTemp.TextFieldType = FieldType.Delimited;
            readerTemp.TrimWhiteSpace = true;
            try
            {
                return readerTemp.ReadFields();
            }
            catch (MalformedLineException ex)
            {
                // Rethrow with the line number; keep the original exception as InnerException.
                throw new MalformedLineException(String.Format(
                    "Line {0} is not valid and will be skipped: {1}\r\n{2}",
                    lineRead, readerTemp.ErrorLine, ex), lineRead, ex);
            }
        }
    }
Moreover, the vendor cannot change the source files to escape these quotes. What is the best workaround for lines like this?
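
For illustration, here is a minimal sketch of one possible pre-processing step: double every quote that does not sit at a field boundary before the line is handed to ParseHelper. The class name QuoteRepair and the method EscapeInteriorQuotes are hypothetical, not part of any library. The sketch assumes that every field is quoted, that the vendor never escapes quotes itself, and that no field text contains a quote directly next to a comma (such a line is ambiguous and this heuristic would split it wrongly).

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of TextFieldParser or any library.
public static class QuoteRepair
{
    // Doubles each quote that is neither an opening quote (preceded only by
    // whitespace back to a delimiter or the line start) nor a closing quote
    // (followed only by whitespace up to a delimiter or the line end).
    public static string EscapeInteriorQuotes(string line, char delimiter = ',')
    {
        var sb = new StringBuilder(line.Length + 8);
        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            sb.Append(c);
            if (c != '"') continue;

            // Look left past whitespace for a delimiter or the line start.
            int p = i - 1;
            while (p >= 0 && char.IsWhiteSpace(line[p])) p--;
            bool opens = p < 0 || line[p] == delimiter;

            // Look right past whitespace for a delimiter or the line end.
            int n = i + 1;
            while (n < line.Length && char.IsWhiteSpace(line[n])) n++;
            bool closes = n >= line.Length || line[n] == delimiter;

            // An interior quote is escaped CSV-style by doubling it.
            if (!opens && !closes) sb.Append('"');
        }
        return sb.ToString();
    }
}
```

Applied to the bad line above, only the quote after 1 1/2 is doubled, so the last field becomes "WP SGL ALUM 1 1/2"" GROMMET" and the repaired string can then be passed to ParseHelper in place of the raw line. Because the rule is a heuristic, it may be worth logging every line it changes so that mis-repaired rows can be reviewed.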