7

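To forestall a "duplicate" close request: I know how to read Excel named ranges; examples are given in the code below. This question is about "real" tables in Excel.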

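Excel 2007 and later have the useful concept of tables: you can convert a range into a table and avoid hassles when sorting and rearranging. When you create a table in an Excel range, it gets a default name (Tabelle1 in the German version; TableName in the following example), but you can additionally give the table's range a plain name (TableAsRangeName). As the icons in Excel's range name editor show, the two seem to be treated differently.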

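I have not been able to read these tables (in the strict sense) from R. The only known workarounds are to go through a CSV intermediate, or to convert the table to a normal named range. The latter has a nasty, irreversible side effect when you use column names in cell references: these are converted to A1 notation.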

The example below shows the problem. Your mileage may vary with different combinations of 32/64-bit ODBC drivers and 32/64-bit Java.

# Read Excel Tables (not simply named ranges)
# Test Computer: 64 Bit Windows 7, R 32 bit  
# My ODBC drivers are 32 bit
library(RODBC)
# Test file has three ranges
# NonTable Simple named range
# TableName Name of table 
# TableAsRangeName Named Range covering the above table
sampleFile = "ExcelTables.xlsx"
if (!file.exists(sampleFile)){
  download.file("http://www.menne-biomed.de/uni/ExcelTables.xlsx",sampleFile)
  # Or do it manually, if this fails
}
# ODBC
channel = odbcConnectExcel2007(sampleFile)
sqlQuery(channel, "SELECT * from NonTable") # Ok
sqlQuery(channel, "SELECT * from TableName") # Could not find range
sqlQuery(channel, "SELECT * from TableAsRangeName") # Could not find range
close(channel)

# gdata has read.xls, but seems not to support named regions

library(xlsx)
wb = loadWorkbook(sampleFile)
getRanges(wb) # This one fails already with "TableName" does not exist
ws = getSheets(wb)[[1]]
readRange("NonTable",ws) # Invalid range address
readRange("TableName",ws) # Invalid range address
readRange("TableAsRangeName",ws) # Invalid range address

# my machine requires 64 bit for this one; depends on your Java installation
sampleFile = "ExcelTables.xlsx"
library(XLConnect) # requires Java
readNamedRegionFromFile(sampleFile,"NonTable") # OK
readNamedRegionFromFile(sampleFile,"TableName") # "TableName" does not exist
readNamedRegionFromFile(sampleFile,"TableAsRangeName") # NullPointerException

wb <- loadWorkbook(sampleFile)
readNamedRegion(wb,"NonTable") # Ok
readNamedRegion(wb,"TableName") # does not exist
readNamedRegion(wb,"TableAsRangeName") # Null Pointer
4

3 Answers

4

I have added some initial support for Excel tables to XLConnect. Please find the latest changes on GitHub at https://github.com/miraisolutions/xlconnect

Here is a small sample:

require(XLConnect)
sampleFile = "ExcelTables.xlsx"
wb = loadWorkbook(sampleFile)
readTable(wb, sheet = "ExcelTable", table = "TableName")

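Note that Excel tables are associated with a worksheet. As far as I can tell, it is possible to have multiple tables with the same name associated with different worksheets. For this reason, readTable has a sheet argument.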
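Because the same table name may occur on more than one sheet, one way to collect every match is to try each sheet in turn. The following is a minimal sketch and not part of XLConnect: read_from_each_sheet is a hypothetical helper, and the usage part assumes that readTable signals an ordinary R error when a sheet has no table of that name.

```r
# Hypothetical helper: call `reader` once per sheet name and keep only the
# sheets for which it succeeds. `reader` is any function of a sheet name.
read_from_each_sheet <- function(sheets, reader) {
  results <- lapply(sheets, function(s) {
    # A failing reader (e.g. table not on this sheet) yields NULL
    tryCatch(reader(s), error = function(e) NULL)
  })
  names(results) <- sheets
  # Drop the sheets where nothing could be read
  Filter(Negate(is.null), results)
}

# Usage with XLConnect; assumes the sample file from the question is present
if (requireNamespace("XLConnect", quietly = TRUE) &&
    file.exists("ExcelTables.xlsx")) {
  wb <- XLConnect::loadWorkbook("ExcelTables.xlsx")
  found <- read_from_each_sheet(
    XLConnect::getSheets(wb),
    function(s) XLConnect::readTable(wb, sheet = s, table = "TableName")
  )
  print(names(found))  # sheets that contain a table called "TableName"
}
```

The result is a named list with one data frame per sheet that holds the table, so a name clash across sheets stays visible instead of being resolved silently.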

answered 2013-07-17T20:04:32.383
3

You are correct that table definitions are stored in XML.

sampleFile = "ExcelTables.xlsx"
unzip(sampleFile, exdir = 'test')
library(XML)
tData <- xmlParse('test/xl/tables/table1.xml')
tables <- xpathApply(tData, "//*[local-name() = 'table']", xmlAttrs)
tables
# [[1]]
#             id           name    displayName            ref totalsRowShown 
#            "1"    "TableName"    "TableName"        "G1:I4"            "0" 

library(XLConnect)
# The table's cell range ("G1:I4") is passed as the region to read
readWorksheetFromFile(sampleFile, sheet = "ExcelTable", region = tables[[1]]['ref'], header = TRUE)
#     Name Age AgeGroup
# 1  Anton  44        4
# 2 Bertha  33        3
# 3  Cäsar  21        2

Depending on your situation, you could search the XML files for the appropriate quantities.
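To make that search concrete, here is a rough sketch that lists the name and cell range of every table part in a workbook. read_table_part and list_excel_tables are hypothetical helper names. The sketch assumes the table parts live under xl/tables/ as in the example above, and it does not resolve which worksheet each table belongs to.

```r
library(XML)

# Hypothetical helper: read the name and cell range from one table part,
# e.g. xl/tables/table1.xml. Both are attributes of the root <table> element.
read_table_part <- function(path) {
  doc <- xmlParse(path)
  attrs <- xmlAttrs(xmlRoot(doc))
  data.frame(name = attrs[["name"]],
             ref  = attrs[["ref"]],
             part = basename(path),
             stringsAsFactors = FALSE)
}

# Hypothetical helper: unzip the workbook to a temporary directory and
# collect the information from every table part found.
# Returns NULL when the workbook contains no table parts.
list_excel_tables <- function(file, exdir = tempfile("xlsx_")) {
  unzip(file, exdir = exdir)
  parts <- list.files(file.path(exdir, "xl", "tables"),
                      pattern = "\\.xml$", full.names = TRUE)
  do.call(rbind, lapply(parts, read_table_part))
}

# Usage; assumes the sample file from the question is present
if (file.exists("ExcelTables.xlsx")) {
  print(list_excel_tables("ExcelTables.xlsx"))
}
```

The ref column can then be passed as the region argument of readWorksheetFromFile, provided you know the sheet the table is on.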

answered 2013-07-17T09:31:14.093
0

A late addition:

readxl::read_excel can read "real" tables, and it is probably the least troublesome solution when you want to read into a data frame/tibble.

** After @Jamzy's comment ** I tried again, but could not read the named range. Was it a false positive back then, or is it a false negative now???
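For what it is worth, the range argument of readxl::read_excel is documented as taking a cell range such as "G1:I4", so one hedged fallback is to pass the table's cell range rather than its name. The sketch below assumes the sheet name "ExcelTable" and the range "G1:I4" reported in the XML-based answer; is_a1_range and read_table_by_ref are hypothetical helpers, and the pattern only covers plain ranges without sheet prefix or $ signs.

```r
# Hypothetical check: TRUE for a plain A1-style range such as "G1:I4",
# FALSE for anything else (for example a table or range name).
is_a1_range <- function(ref) {
  grepl("^[A-Z]+[0-9]+:[A-Z]+[0-9]+$", ref)
}

# Hypothetical wrapper: refuse names early, then read the cell range.
read_table_by_ref <- function(path, sheet, ref) {
  if (!is_a1_range(ref)) {
    stop("'ref' must be a cell range like \"G1:I4\", not a name: ", ref)
  }
  readxl::read_excel(path, sheet = sheet, range = ref)
}

# Usage; assumes readxl is installed and the sample file is present
if (requireNamespace("readxl", quietly = TRUE) &&
    file.exists("ExcelTables.xlsx")) {
  print(read_table_by_ref("ExcelTables.xlsx", "ExcelTable", "G1:I4"))
}
```

The early check turns a silent mismatch into a clear error when a table name is passed where a cell range is expected.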

answered 2017-03-21T07:52:08.820