python - lxml stops when it encounters a missing tag

Question

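I'm parsing some XML files to extract specific tags. With a lot of help from here, it now works on my test file. I have a new problem, though: the next file my colleague asked me to test seems to be missing some tags.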

This is the code I have so far:

with open('output.log', 'w') as f:
   for info in root.xpath('//xmlns:ProgramInformation', namespaces=nsmap):
      crid = (info.get('programId')) # retrieve crid
      title = (info.find('.//xmlns:Title', namespaces=nsmap).text) # retrieve title
      genre = (info.find('.//xmlns:Genre/xmlns:Name', namespaces=nsmap).text) # retrieve genre
      f.write('{}|{}|{}\n'.format(crid, title, genre))

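'crid' will always be present, but there seems to be a problem where the title and/or genre is sometimes not generated. When that happens, everything stops.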

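Is there a way to make the code skip the missing tags (but still write the crid) and carry on with the next set, or alternatively a way to write an error into the output file in place of the missing title or genre?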

score 0 · Accepted Answer

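Unfortunately it's not as compact, but you have to split it into two steps: look the element up first, then read .text only if it was found: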

  # find() returns None when the tag is missing, so test before reading .text
  titlex = info.find('.//xmlns:Title', namespaces=nsmap)
  title = titlex.text if titlex is not None else ''
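
The same check applies to the genre, so it may be worth moving it into a small helper. The following is only a sketch: the sample XML, the namespace URI `urn:example:programs`, and the helper names `text_or_default` and `write_rows` are made up for illustration, and it uses the standard library's `xml.etree.ElementTree` as a stand-in so it runs without extra packages (lxml's `etree` accepts the same `find`/`findall` calls with `namespaces=`).

```python
import io
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree in this sketch

# Hypothetical sample data and namespace URI, invented for this example.
SAMPLE = """\
<Programs xmlns="urn:example:programs">
  <ProgramInformation programId="crid://example/1">
    <Title>First show</Title>
    <Genre><Name>Drama</Name></Genre>
  </ProgramInformation>
  <ProgramInformation programId="crid://example/2">
    <Genre><Name>News</Name></Genre>
  </ProgramInformation>
  <ProgramInformation programId="crid://example/3">
    <Title>Third show</Title>
  </ProgramInformation>
</Programs>
"""

nsmap = {'xmlns': 'urn:example:programs'}


def text_or_default(parent, path, default=''):
    """Return the text of the first match for path, or default if it is missing or empty."""
    node = parent.find(path, namespaces=nsmap)
    if node is None or node.text is None:
        return default
    return node.text


def write_rows(root, out, default=''):
    """Write one crid|title|genre line per ProgramInformation element."""
    for info in root.findall('.//xmlns:ProgramInformation', namespaces=nsmap):
        crid = info.get('programId')  # always present according to the question
        title = text_or_default(info, './/xmlns:Title', default)
        genre = text_or_default(info, './/xmlns:Genre/xmlns:Name', default)
        out.write('{}|{}|{}\n'.format(crid, title, genre))


root = ET.fromstring(SAMPLE)
buf = io.StringIO()
write_rows(root, buf)
print(buf.getvalue(), end='')
```

Passing something like `default='MISSING'` to `write_rows` would write a marker in place of the absent title or genre instead of an empty field. Both ElementTree and lxml also offer `findtext(path, default=..., namespaces=...)`, which should cover the missing-tag case in a single call if an empty string for an empty element is acceptable.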
