Hello, my XML file looks like this. Can someone help me get a specific tag from it?

    <A1>
        <A><B>TEST1</B></A>
        <A><B>TEST2</B></A>
        <A><B>TEST3</B></A>
    </A1>

    <A1>
        <A><B>TEST4</B></A>
        <A><B>TEST5</B></A>
        <A><B>TEST6</B></A>
    </A1>
    

So far, I am processing it in Python like this:

    for A in A1.findall('A'):
        B = A.find('B').text
        print(B)

This gives me output like this:

    TEST1
    TEST2
    TEST3
    TEST4
    TEST5
    TEST6


I want output from only the first tag in each block, like this:

    TEST1
    TEST4


What changes should I make to get this working?

1 Answer


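OK, let's try this again. After the revision, we want to search the whole document, and every time the parent tag (A1) appears, grab the contents of the first tag in that set.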

Let's try a recursive function:

    # read the whole file into one string
    with open('xml.txt') as f:
        xml = f.read()

    def grab(xml):
        """Recursively walk the XML text, printing the first <B> value in each <A1>."""
        # base case: if the parent tag (<A1>) isn't there, stop
        open_parent = xml.find('<A1>')
        if open_parent == -1:
            return
        close_parent = open_parent + len('<A1>')

        # find the first child tag after the parent tag
        # (tag names are case-sensitive, so match the upper-case <A><B>)
        open_child = xml.find('<A><B>', close_parent)
        close_child = xml.find('</B></A>', open_child)
        if open_child == -1 or close_child == -1:
            return

        # grab the data within that tag
        child_data = xml[open_child + len('<A><B>'):close_child]
        print(child_data)

        # recurse on the rest of the text, after the child we just read
        grab(xml[close_child:])

    grab(xml)
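As an alternative to string searching, here is a minimal sketch using the standard library's `xml.etree.ElementTree`, which the question already appears to be using. It assumes the file contains several top-level `<A1>` elements and no XML declaration, so the text is wrapped in a made-up `<root>` element to make it parseable. The function name `first_b_values` and the `sample` string are illustrative only.

```python
import xml.etree.ElementTree as ET


def first_b_values(xml_text):
    """Return the text of the first <A>/<B> inside every <A1> element."""
    # Assumption: the input has multiple top-level <A1> elements and no XML
    # declaration, so wrap it in a hypothetical <root> to make it well-formed.
    root = ET.fromstring('<root>' + xml_text + '</root>')
    values = []
    for a1 in root.iter('A1'):
        # find() returns only the first matching element, or None
        b = a1.find('A/B')
        if b is not None:
            values.append(b.text)
    return values


sample = """
<A1>
    <A><B>TEST1</B></A>
    <A><B>TEST2</B></A>
    <A><B>TEST3</B></A>
</A1>
<A1>
    <A><B>TEST4</B></A>
    <A><B>TEST5</B></A>
    <A><B>TEST6</B></A>
</A1>
"""

print(first_b_values(sample))  # ['TEST1', 'TEST4']
```

The key difference from the loop in the question is that `find()` stops at the first match, whereas `findall()` returns every `<A>` child.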

Out of interest, do you already have a solution that you wouldn't mind sharing?

answered 2013-03-18T09:48:47.097