I'm writing a custom loader for Pig. It is supposed to read delimited records that may span multiple lines. Everything works, except that sometimes an input split falls in the middle of a record and messes everything up. I know that RecordReader and InputFormat are involved in where the files get split, but I can't figure out how to make that work in my case. It looks to me like CSVExcelStorage should have the same problem, but I can't find any code in it that handles this.
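To make the failure concrete, here is a minimal sketch in plain Java (no Hadoop or Pig classes), not my actual loader. The class `SplitDemo`, the method `readSplit`, the sample data, and the use of double quotes to protect embedded newlines are all made up for illustration. It assumes the reader follows the usual line-oriented convention: when a split does not start at offset 0 it skips ahead to the first newline, and it reads past the end of its split to finish the last record.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Hypothetical reader: returns the records that the split [start, end] would produce.
    // A newline ends a record unless it sits inside double quotes.
    static List<String> readSplit(String data, int start, int end) {
        int pos = start;
        if (start > 0) {
            // Assumed convention: the previous split owns the line that crosses
            // the boundary, so skip to just past the first newline we see.
            // This is the weak spot: that newline may be inside a quoted field.
            int nl = data.indexOf('\n', start);
            pos = (nl < 0) ? data.length() : nl + 1;
        }
        List<String> records = new ArrayList<>();
        // Start a new record only while we are still within our split,
        // but let a started record run past the end of the split.
        while (pos <= end && pos < data.length()) {
            StringBuilder rec = new StringBuilder();
            boolean inQuotes = false;
            while (pos < data.length()) {
                char c = data.charAt(pos++);
                if (c == '"') {
                    inQuotes = !inQuotes;
                }
                if (c == '\n' && !inQuotes) {
                    break; // unquoted newline terminates the record
                }
                rec.append(c);
            }
            records.add(rec.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records; the first one contains a newline inside quotes.
        String data = "1,\"first\nsecond\"\n2,\"b\"\n";
        int boundary = 5; // arbitrary split point inside the first record

        System.out.println("whole file: " + readSplit(data, 0, data.length()));
        System.out.println("split A:    " + readSplit(data, 0, boundary));
        System.out.println("split B:    " + readSplit(data, boundary, data.length()));
    }
}
```

Reading the whole input as one split gives the two expected records. With the boundary at offset 5, the first split still returns the first record intact, but the second split resumes after the quoted newline, so it starts in the middle of the first record with its quote state inverted and returns a single garbled record beginning with `second"`. That is the kind of corruption I'm seeing.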