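I have been trying to parse a CSV file in Groovy, currently using the library org.apache.commons.csv 2.4. My requirement is this: when CSV cells contain invalid data values, such as invalid characters, I do not want an exception thrown at the first invalid row/cell. Instead, I want to collect these exceptions and keep iterating through the CSV file to the end, so that I end up with the complete list of invalid data in the file.
For this purpose I have tried several ways of using this Apache library, but unfortunately, because iteration goes through CSVParser.getNextRecord(), the iterator aborts as soon as that call throws.
The code looks like this: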
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser

def records = new CSVParser(reader, CSVFormat.EXCEL.withHeader().withIgnoreSurroundingSpaces())
// at this line, the iterator() inside CSVParser always uses getNextRecord() for its next() implementation, and it may throw an exception on an invalid char
records.each { record ->
    // if the exception is thrown from .each itself, the try/catch below is in vain
    try {
        // per-record processing goes here
    } catch (e) {
        // want to collect errors here
    }
}
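So, is there anything else in this library I should dig into? Or can someone point me to another, more workable solution? Many thanks, everyone!
Update: sample CSV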
"Company code for WBS element","WBS Element","PS: Short description (1st text line)","Responsible Cost Center for WBS Element","OBJNR","WBS Status"
"1001","RE-01768-011","Opex - To present a paper on Career con","0000016400","PR00031497","X"
"1001","RE-01768-011","Opex - To present a paper on "Career con","0000016400","PR00031497","X"
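The second data row contains an invalid character, an unescaped " (just before Career), which causes the parser to throw an exception.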
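One possible workaround, offered only as a minimal sketch: read the file line by line and give each line its own parser via CSVParser.parse(String, CSVFormat), so a failure on one line cannot abort the loop over the rest. This assumes that no quoted field spans several physical lines, and it treats the header as an ordinary row instead of using withHeader(). The helper name parseLeniently and the commons-csv version in @Grab are assumptions made for illustration, not something taken from the question.

```groovy
@Grab('org.apache.commons:commons-csv:1.10.0') // version is an assumption
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser

// Hypothetical helper: parses every physical line independently and
// collects failures instead of stopping at the first one.
// Assumption: no quoted field contains a line break.
def parseLeniently(Reader reader) {
    def format = CSVFormat.EXCEL.withIgnoreSurroundingSpaces()
    def rows = []    // successfully parsed lines, each as a List<String>
    def errors = []  // one map per line that could not be parsed
    reader.eachLine { String line, int lineNo ->
        if (!line.trim()) return  // skip blank lines (acts like 'continue')
        try {
            def parser = CSVParser.parse(line, format)
            rows.addAll(parser.records.collect { it.toList() })
        } catch (Exception e) {
            // Depending on the library version the failure may surface as an
            // IOException or as an unchecked wrapper, so catch broadly here.
            errors << [line: lineNo, message: e.message]
        }
    }
    [rows: rows, errors: errors]
}

// Usage with the sample CSV from the question.
def sample = [
    '"Company code for WBS element","WBS Element","PS: Short description (1st text line)","Responsible Cost Center for WBS Element","OBJNR","WBS Status"',
    '"1001","RE-01768-011","Opex - To present a paper on Career con","0000016400","PR00031497","X"',
    '"1001","RE-01768-011","Opex - To present a paper on "Career con","0000016400","PR00031497","X"'
].join('\n')

def result = parseLeniently(new StringReader(sample))
result.errors.each { println "line ${it.line}: ${it.message}" }
```

The trade-off is that line-by-line parsing gives up multi-line quoted fields in exchange for a clean restart after every bad line. Catching the exception around a single shared parser's iterator instead would leave the parser positioned in the middle of the broken row, and whether it can continue sensibly from there is not something this sketch relies on.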