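I ran into the same problem. Its root cause is the class org.springframework.batch.item.excel.poi.PoiSheet, which PoiItemReader uses. The problem occurs in the method public String[] getRow(final int rowNumber), which fetches an org.apache.poi.ss.usermodel.Row object and converts it to an array of Strings after detecting the type of each column in the row. In this method, we have the code: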
switch (cellType) {
    case NUMERIC:
        if (DateUtil.isCellDateFormatted(cell)) {
            Date date = cell.getDateCellValue();
            cells.add(String.valueOf(date.getTime()));
        } else {
            cells.add(String.valueOf(cell.getNumericCellValue()));
        }
        break;
    case BOOLEAN:
        cells.add(String.valueOf(cell.getBooleanCellValue()));
        break;
    case STRING:
    case BLANK:
        cells.add(cell.getStringCellValue());
        break;
    case ERROR:
        cells.add(FormulaError.forInt(cell.getErrorCellValue()).getString());
        break;
    default:
        throw new IllegalArgumentException("Cannot handle cells of type '" + cell.getCellTypeEnum() + "'");
}
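Here, a cell identified as NUMERIC is handled by cells.add(String.valueOf(cell.getNumericCellValue())). In this line, the cell value is read as a double (cell.getNumericCellValue()) and that double is converted to a String (String.valueOf()). The problem lies in String.valueOf(): if the number is too large (>= 10000000) or too small (< 0.001), it produces scientific notation, and it appends ".0" to integer values.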
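To see the behavior in isolation, here is a minimal sketch that needs only the JDK. The class name DoubleToStringDemo and the helper render are made up for illustration; the only real API involved is String.valueOf(double).

```java
class DoubleToStringDemo {

    // Hypothetical helper: the same conversion PoiSheet applies to NUMERIC cells.
    static String render(double value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(render(42));        // 42.0       (integer value gets ".0")
        System.out.println(render(9999999));   // 9999999.0  (still plain notation)
        System.out.println(render(10000000));  // 1.0E7      (scientific notation from 10^7 up)
        System.out.println(render(0.001));     // 0.001      (still plain notation)
        System.out.println(render(0.0001));    // 1.0E-4     (scientific notation below 10^-3)
    }
}
```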
As an alternative to the line cells.add(String.valueOf(cell.getNumericCellValue())), you can use:
DataFormatter formatter = new DataFormatter();
cells.add(formatter.formatCellValue(cell));
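This returns the cell's value as a String exactly as it is displayed. However, it also means your decimal numbers become locale dependent (you will receive the string "2.5" from documents saved in an Excel configured for the UK or India, and the string "2,5" from ones configured for France or Brazil).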
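The effect of the locale on the decimal separator can be reproduced with plain java.text.NumberFormat, without POI. This is only a sketch of how the same value 2.5 renders under different locales; it does not show how DataFormatter picks its locale, and the class name LocaleSeparatorDemo and the method render are hypothetical.

```java
import java.text.NumberFormat;
import java.util.Locale;

class LocaleSeparatorDemo {

    // Hypothetical helper: format a number using the conventions of the given locale.
    static String render(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(render(2.5, Locale.UK));                       // 2.5
        System.out.println(render(2.5, Locale.forLanguageTag("en-IN")));  // 2.5
        System.out.println(render(2.5, Locale.FRANCE));                   // 2,5
        System.out.println(render(2.5, Locale.forLanguageTag("pt-BR")));  // 2,5
    }
}
```

DataFormatter also has a constructor that accepts a Locale, which may be enough if all you need is a fixed separator while keeping the cell's display format.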
To avoid this dependency, we can use the solution presented at https://stackoverflow.com/a/25307973/9184574:
DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
df.setMaximumFractionDigits(340);
cells.add(df.format(cell.getNumericCellValue()));
This converts the cell to a double and then formats it with English symbols, without scientific notation and without adding ".0" to integers.
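As a quick check outside of Spring Batch and POI, the same formatter can be run on plain doubles. This is a minimal sketch; the class name PlainNumberFormatDemo and the helper toPlainString are hypothetical names, not part of any library.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class PlainNumberFormatDemo {

    // Hypothetical helper wrapping the formatter from the linked answer.
    static String toPlainString(double value) {
        // Pattern "0" with English symbols: '.' as decimal separator, no grouping.
        DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
        // 340 is the largest number of fraction digits DecimalFormat uses for a double.
        df.setMaximumFractionDigits(340);
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(toPlainString(42));        // 42
        System.out.println(toPlainString(10000000));  // 10000000
        System.out.println(toPlainString(2.5));       // 2.5
        System.out.println(toPlainString(0.0001));    // 0.0001
    }
}
```

A new DecimalFormat is created on every call because DecimalFormat instances are not thread-safe; caching one per sheet or per thread is an option if the allocation ever shows up in profiling.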
My implementation of CustomPoiSheet (a small adaptation of the original PoiSheet) is:
// Imports assume Apache POI 4.x or later and the spring-batch-excel version that ships
// org.springframework.batch.item.excel.poi.PoiSheet; adjust the package names to your version.
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Date;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.ss.usermodel.FormulaError;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.springframework.batch.item.excel.Sheet;

class CustomPoiSheet implements Sheet {

    protected final org.apache.poi.ss.usermodel.Sheet delegate;
    private final int numberOfRows;
    private final String name;
    private FormulaEvaluator evaluator;

    /**
     * Constructor which takes the delegate sheet.
     *
     * @param delegate the apache POI sheet
     */
    CustomPoiSheet(final org.apache.poi.ss.usermodel.Sheet delegate) {
        super();
        this.delegate = delegate;
        this.numberOfRows = this.delegate.getLastRowNum() + 1;
        this.name = this.delegate.getSheetName();
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public int getNumberOfRows() {
        return this.numberOfRows;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public String getName() {
        return this.name;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public String[] getRow(final int rowNumber) {
        final Row row = this.delegate.getRow(rowNumber);
        if (row == null) {
            return null;
        }
        final List<String> cells = new LinkedList<>();
        final int numberOfColumns = row.getLastCellNum();
        for (int i = 0; i < numberOfColumns; i++) {
            // Relies on the workbook's CREATE_NULL_AS_BLANK policy, so cell is never null here.
            Cell cell = row.getCell(i);
            CellType cellType = cell.getCellType();
            if (cellType == CellType.FORMULA) {
                FormulaEvaluator evaluator = getFormulaEvaluator();
                if (evaluator == null) {
                    // No evaluator available: keep the raw formula and skip the switch below,
                    // which would otherwise reject the FORMULA type.
                    cells.add(cell.getCellFormula());
                    continue;
                } else {
                    cellType = evaluator.evaluateFormulaCell(cell);
                }
            }
            switch (cellType) {
                case NUMERIC:
                    if (DateUtil.isCellDateFormatted(cell)) {
                        Date date = cell.getDateCellValue();
                        cells.add(String.valueOf(date.getTime()));
                    } else {
                        // Return the numeric value as close as possible to both its stored value and
                        // the string shown in Excel, formatted with English symbols only. Integers come
                        // out without decimal places, other doubles without trailing zeros, and
                        // scientific notation is suppressed.
                        // Credit: https://stackoverflow.com/a/25307973/9184574
                        DecimalFormat df = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ENGLISH));
                        df.setMaximumFractionDigits(340);
                        cells.add(df.format(cell.getNumericCellValue()));
                        // Alternatives discussed above:
                        //DataFormatter formatter = new DataFormatter();
                        //cells.add(formatter.formatCellValue(cell));
                        //cells.add(String.valueOf(cell.getNumericCellValue()));
                    }
                    break;
                case BOOLEAN:
                    cells.add(String.valueOf(cell.getBooleanCellValue()));
                    break;
                case STRING:
                case BLANK:
                    cells.add(cell.getStringCellValue());
                    break;
                case ERROR:
                    cells.add(FormulaError.forInt(cell.getErrorCellValue()).getString());
                    break;
                default:
                    // Report the (possibly evaluated) type; avoids getCellTypeEnum(), which POI 5 removed.
                    throw new IllegalArgumentException("Cannot handle cells of type '" + cellType + "'");
            }
        }
        return cells.toArray(new String[0]);
    }

    // Lazily creates one formula evaluator per sheet.
    private FormulaEvaluator getFormulaEvaluator() {
        if (this.evaluator == null) {
            this.evaluator = delegate.getWorkbook().getCreationHelper().createFormulaEvaluator();
        }
        return this.evaluator;
    }
}
And my implementation of CustomPoiItemReader (a small adaptation of the original PoiItemReader), which uses CustomPoiSheet:
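// Imports assume the same spring-batch-excel package layout as
// org.springframework.batch.item.excel.poi.PoiItemReader; adjust to your version.
import java.io.InputStream;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.springframework.batch.item.excel.AbstractExcelItemReader;
import org.springframework.batch.item.excel.Sheet;

public class CustomPoiItemReader<T> extends AbstractExcelItemReader<T> {

    private Workbook workbook;

    public CustomPoiItemReader() {
        super();
    }

    @Override
    protected Sheet getSheet(final int sheet) {
        // The only functional change from PoiItemReader: wrap the sheet in CustomPoiSheet.
        return new CustomPoiSheet(this.workbook.getSheetAt(sheet));
    }

    @Override
    protected int getNumberOfSheets() {
        return this.workbook.getNumberOfSheets();
    }

    @Override
    protected void doClose() throws Exception {
        super.doClose();
        if (this.workbook != null) {
            this.workbook.close();
        }
        this.workbook = null;
    }

    /**
     * Open the underlying file using the {@code WorkbookFactory}. We keep track of the used {@code InputStream} so that
     * it can be closed cleanly on the end of reading the file. This to be able to release the resources used by
     * Apache POI.
     *
     * @param inputStream the {@code InputStream} pointing to the Excel file.
     * @throws Exception is thrown for any errors.
     */
    @Override
    protected void openExcelFile(final InputStream inputStream) throws Exception {
        this.workbook = WorkbookFactory.create(inputStream);
        // Missing cells are returned as blank cells, so CustomPoiSheet.getRow never sees null.
        this.workbook.setMissingCellPolicy(Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);
    }
}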