python - Parsing osm.pbf data using the GDAL/OGR Python module

Question

I am trying to extract data from an OSM.PBF file using the Python GDAL/OGR module.

Currently my code looks like this:

from osgeo import gdal, ogr

osm = ogr.Open('file.osm.pbf')

# Select the multipolygons layer
layer = osm.GetLayer(3)
# Create list to store pubs
pubs = []
for feat in layer:
    if feat.GetField('amenity') == 'pub':
        pubs.append(feat)

This code works for small .pbf files (15 MB). However, when parsing files larger than 50 MB, I get the following error:

 ERROR 1: Too many features have accumulated in points layer. Use OGR_INTERLEAVED_READING=YES MODE

When I turn this mode on:

gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')

ogr no longer returns any features, even when parsing small files.

Does anyone know what is going on here?

score 4 · Accepted Answer

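Thanks to scai's answer, I was able to figure it out.

The special reading pattern required for interleaved reading, described at gdal.org/1.11/ogr/drv_osm.html, is translated into a working Python example below.

This is an example of how to extract all features tagged "amenity=pub" from an .osm.pbf file: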

from osgeo import gdal, ogr

# Enable interleaved reading before opening the file
gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')
osm = ogr.Open('file.osm.pbf')

# Number of layers available in the file
nLayerCount = osm.GetLayerCount()

thereIsDataInLayer = True

pubs = []

while thereIsDataInLayer:

    thereIsDataInLayer = False

    # Cycle through the available layers
    for iLayer in range(nLayerCount):

        lyr = osm.GetLayer(iLayer)

        # Get the first feature currently available in this layer
        feat = lyr.GetNextFeature()

        while feat is not None:

            thereIsDataInLayer = True

            # Do something with the feature; here matching ones are stored in a list.
            # The index check skips layers that have no 'amenity' field.
            if feat.GetFieldIndex('amenity') >= 0 and feat.GetField('amenity') == 'pub':
                # Store a copy, because the original is destroyed below
                pubs.append(feat.Clone())

            # The Destroy method is necessary for interleaved reading
            feat.Destroy()

            feat = lyr.GetNextFeature()

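As I understand it, a while loop is needed instead of a for loop because the feature count of a collection cannot be obtained when using interleaved reading.

Any further clarification of why this code works the way it does would be much appreciated.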
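
One way to picture why the nested loop is needed: in interleaved mode the file is read in chunks, and each chunk can fill several layers at once. A layer returning None then only means "nothing buffered for this layer right now", not "end of file". The sketch below is a toy model in pure Python and not the driver's actual implementation. ToyLayer, ToyDataset, read_interleaved and the sample chunks are all hypothetical names and data made up for illustration; only the method names GetLayerCount, GetLayer and GetNextFeature mirror the OGR calls used above.

```python
from collections import deque


class ToyLayer:
    """Toy stand-in for an OGR layer backed by a shared, chunked reader."""

    def __init__(self, dataset):
        self._dataset = dataset
        self.buffer = deque()

    def GetNextFeature(self):
        # Only read further into the file once every layer has been drained.
        if not self.buffer and self._dataset.all_buffers_empty():
            self._dataset.read_next_chunk()
        # None means "nothing buffered for this layer right now",
        # which is not the same as "end of file".
        return self.buffer.popleft() if self.buffer else None


class ToyDataset:
    """Toy dataset: each chunk maps a layer index to the features it holds."""

    def __init__(self, chunks, layer_count):
        self._chunks = deque(chunks)
        self._layers = [ToyLayer(self) for _ in range(layer_count)]

    def GetLayerCount(self):
        return len(self._layers)

    def GetLayer(self, index):
        return self._layers[index]

    def all_buffers_empty(self):
        return not any(layer.buffer for layer in self._layers)

    def read_next_chunk(self):
        # Distribute the next chunk (if any) over the layer buffers.
        if self._chunks:
            for index, features in self._chunks.popleft().items():
                self._layers[index].buffer.extend(features)


def read_interleaved(dataset):
    """Apply the nested-loop pattern from the answer and collect every feature."""
    collected = []
    data_seen = True
    while data_seen:
        data_seen = False
        for index in range(dataset.GetLayerCount()):
            layer = dataset.GetLayer(index)
            feature = layer.GetNextFeature()
            while feature is not None:
                data_seen = True
                collected.append(feature)
                feature = layer.GetNextFeature()
    return collected


# Made-up sample data: the first chunk only contains points (layer 0).
chunks = [
    {0: ["node-1", "node-2"]},
    {1: ["way-1"], 3: ["polygon-1"]},
    {3: ["polygon-2"]},
]

print(read_interleaved(ToyDataset(chunks, layer_count=4)))

# Reading only layer 3, as in the question, finds nothing on the first call.
only_polygons = ToyDataset(chunks, layer_count=4).GetLayer(3)
print(only_polygons.GetNextFeature())
```

In this toy model the nested loop collects all five features, while asking only layer 3 returns None straight away because the first chunk filled a different layer. If the real driver behaves similarly, that would match the symptom in the question, where iterating over a single layer returned no features once interleaved reading was switched on.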
