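If nothing else, this can serve as my notes on parsing/importing data from Open XML Excel files.
I wrote the code below with Open XML SDK 2.0. It successfully extracts Excel 2007 data from one Excel file, but on a different yet similar Excel file it fails and returns empty strings. I don't know whether I'm calling the right methods at the right level of the hierarchy, or whether this changes with the Excel file version (2007-compatible, 2010, 2003, etc.).
Is there a consistent way to extract data from Excel files using Open XML SDK 2/2.5?
I put together a LinqPad sample below and commented it with my findings and open questions.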
// Obtain a reference to the spreadsheet file
var doc = SpreadsheetDocument.Open(@"C:\MyExcelFile.xlsx", false);
// Only one WorkbookPart per spreadsheet
// Parts have a common Root property and differing methods to set them
// like WorkbookPart.Workbook sets the root for WorkbookPart
var workbookpart = doc.WorkbookPart;
// Find sheet names and ids
var sheets = doc.WorkbookPart.Workbook.Sheets;
var sheetname = sheets.Descendants<Sheet>().FirstOrDefault().Name.Value;
var sheetId = sheets.Descendants<Sheet>().FirstOrDefault().SheetId.Value;
var id = sheets.Descendants<Sheet>().FirstOrDefault().Id.Value;
// "Parts" hold collections even if that part only has one sub part
// How many Worksheet parts should we expect? One per Sheet?
var sheet1id = doc.WorkbookPart.Workbook.Descendants<Sheet>()
    .Where(p => p.Name.Value == "Sheet1")
    .Select(q => q.Id.Value).FirstOrDefault();
var worksheetpart = (WorksheetPart)workbookpart.GetPartById(sheet1id);
// Removed below because I don't know how to read WorksheetPart id
//var worksheetpart = workbookpart.WorksheetParts.FirstOrDefault();
// Worksheet is the root of WorkSheetPart
// Worksheet.Descendants<Column>() has usable min, max, width, customWidth
// Worksheet.Descendants<Row>() (or any other <type>) is empty
// Why is Worksheet.SheetDimension empty? How do you determine sheet size?
// Other than doc.WorkbookPart.GetPartById(sheetId) I have no idea
// how to determine this worksheet's id, name, or sheetid
// maybe their index as an Array matches the Sheets Id?
var worksheet = worksheetpart.Worksheet;
// Expect multiple SheetData? Why?
// .Descendants< ... >() retrieves objects cast to their proper type
var sheetdata = worksheet.Descendants<SheetData>().FirstOrDefault();
// Is this where we should access Rows or under Worksheet?
var row = sheetdata.Descendants<Row>().FirstOrDefault();
var cell = row.Descendants<Cell>().FirstOrDefault();
// Print out the Cell's values (Need to reference shared values elsewhere)
cell.CellReference.Value.Dump();
cell.DataType.Value.Dump();
cell.CellValue.Text.Dump();
// Close the spreadsheet
doc.Close();
(To make this work, add LinqPad query references to DocumentFormat.OpenXml and WindowsBase.dll, and namespace imports for DocumentFormat.OpenXml, DocumentFormat.OpenXml.Spreadsheet, and DocumentFormat.OpenXml.Packaging.)
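For reference, here is a minimal sketch of how the "shared values" mentioned in the last comment might be resolved. It is not a confirmed fix for the empty strings. It relies on two facts about the format: a cell's `DataType` is null for plain numbers, and a shared-string cell stores only an index into the workbook's shared string table. The class name `CellTextSketch` and the helper `GetCellText` are hypothetical names chosen for illustration, and the file path is taken from the command line.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class CellTextSketch
{
    // Hypothetical helper: returns the stored text of a cell, or "" if it has none.
    static string GetCellText(WorkbookPart workbookPart, Cell cell)
    {
        if (cell == null) return string.Empty;

        // DataType is null for plain numbers (and dates stored as numbers),
        // so guard before touching DataType.Value.
        if (cell.DataType == null)
            return cell.CellValue != null ? cell.CellValue.Text : string.Empty;

        if (cell.DataType.Value == CellValues.SharedString)
        {
            // CellValue holds a zero-based index into the shared string table.
            var tablePart = workbookPart.SharedStringTablePart;
            if (tablePart == null || cell.CellValue == null) return string.Empty;
            int index = int.Parse(cell.CellValue.Text);
            return tablePart.SharedStringTable
                            .Elements<SharedStringItem>()
                            .ElementAt(index)
                            .InnerText;
        }

        if (cell.DataType.Value == CellValues.InlineString)
            return cell.InlineString != null ? cell.InlineString.InnerText : string.Empty;

        // Booleans, errors, and formula strings: return the raw stored value.
        return cell.CellValue != null ? cell.CellValue.Text : string.Empty;
    }

    static void Main(string[] args)
    {
        // args[0] is assumed to be the path to an .xlsx file.
        using (var doc = SpreadsheetDocument.Open(args[0], false))
        {
            var workbookPart = doc.WorkbookPart;
            // Each Sheet element carries the relationship id of its WorksheetPart.
            foreach (var sheet in workbookPart.Workbook.Descendants<Sheet>())
            {
                var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id.Value);
                foreach (var row in worksheetPart.Worksheet.Descendants<Row>())
                {
                    foreach (var cell in row.Elements<Cell>())
                    {
                        var reference = cell.CellReference != null ? cell.CellReference.Value : "?";
                        Console.WriteLine("{0}!{1} = {2}",
                            sheet.Name.Value, reference, GetCellText(workbookPart, cell));
                    }
                }
            }
        }
    }
}
```

The `using` block stands in for the explicit `doc.Close()` call. The data type is compared with `==` rather than a `switch` so that the sketch does not depend on `CellValues` being an enum.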