
I'm trying to do the following in Python.

I have a file with the following contents...

<VirtualHost>
  ServerName blah.com
  DocumentRoot /var/www/blah.com
</Virtualhost>

<VirtualHost>
  ServerName blah2.com
  DocumentRoot /var/www/blah2.com
</Virtualhost>

... etc

I'd like to put each of these virtual host containers into a separate file (or a variable, I can work from there)...

I've been able to get the data between the two strings, but not the strings themselves. So the output I want would be...

<VirtualHost>
  ServerName blah2.com
  DocumentRoot /var/www/blah2.com
</Virtualhost>

...for each container in turn, and not just...
ServerName blah2.com
DocumentRoot /var/www/blah2.com

Please let me know if this can be done easily. Thanks!


2 Answers


A findall regex might work:

import re

d = """
<VirtualHost>
  ServerName blah.com
  DocumentRoot /var/www/blah.com
</Virtualhost>
<VirtualHost>
  ServerName blah2.com
  DocumentRoot /var/www/blah2.com
</Virtualhost>
"""

matches = re.findall(r'<VirtualHost>(.*?)</Virtualhost>', d, re.I|re.DOTALL)

#['\n  ServerName blah.com\n  DocumentRoot /var/www/blah.com\n',
# '\n  ServerName blah2.com\n  DocumentRoot /var/www/blah2.com\n']

Or, to include the <VirtualHost> tags themselves:

matches = re.findall(r'<VirtualHost>.*?</Virtualhost>', d, re.I|re.DOTALL)

#['<VirtualHost>\n  ServerName blah.com\n  DocumentRoot /var/www/blah.com\n</Virtualhost>',
# '<VirtualHost>\n  ServerName blah2.com\n  DocumentRoot /var/www/blah2.com\n</Virtualhost>']
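
Since the question asks for each container in its own file, the second pattern can be combined with a small write loop. This is only a sketch: the function name `split_vhosts`, the output directory argument, and the `<ServerName>.conf` naming scheme are assumptions for illustration and do not come from the question.

```python
import re
from pathlib import Path

# re.I lets the pattern match the mixed-case </Virtualhost> closing tag.
BLOCK_RE = re.compile(r'<VirtualHost>.*?</VirtualHost>', re.I | re.DOTALL)
NAME_RE = re.compile(r'^\s*ServerName\s+(\S+)', re.M)

def split_vhosts(text, out_dir):
    """Write each <VirtualHost> block found in text to its own file in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, block in enumerate(BLOCK_RE.findall(text)):
        name = NAME_RE.search(block)
        # Hypothetical naming scheme: "<ServerName>.conf", else a numbered fallback.
        filename = (name.group(1) if name else f"vhost{index}") + ".conf"
        path = out_dir / filename
        path.write_text(block + "\n")
        written.append(path)
    return written
```

Calling `split_vhosts(open('vhosts.conf').read(), 'out')` (both names are placeholders) would then leave one file per container in `out`.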
Answered 2012-08-19T01:44:33.473

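Assuming your input data is well-formed XML, you could use minidom (as @Aesthete suggested) or ElementTree. Note that the example wraps the hosts in a single root element and uses matching case in the opening and closing tags, both of which an XML parser requires: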

import xml.dom.minidom as MD
import xml.etree.ElementTree as ET

input = """
<Document>
    <VirtualHost>
        ServerName blah.com
        DocumentRoot /var/www/blah.com
    </VirtualHost>
    <VirtualHost>
        ServerName blah2.com
        DocumentRoot /var/www/blah2.com
    </VirtualHost>
</Document>"""

domDoc = MD.parseString(input)
etreeDoc = ET.fromstring(input)

# list() is needed on Python 3.x, where map() returns an iterator
miniDomOutput = list(map(lambda f: f.toxml(), domDoc.getElementsByTagName('VirtualHost')))
elementTreeOutput = list(map(lambda f: ET.tostring(f), etreeDoc.findall('VirtualHost')))

print(miniDomOutput)
print(elementTreeOutput)

Output:

#['<VirtualHost>\n        ServerName blah.com\n        DocumentRoot /var/www/blah.com\n    </VirtualHost>', '<VirtualHost>\n        ServerName blah2.com\n        DocumentRoot /var/www/blah2.com\n    </VirtualHost>']
#[b'<VirtualHost>\n        ServerName blah.com\n        DocumentRoot /var/www/blah.com\n    </VirtualHost>\n    ', b'<VirtualHost>\n        ServerName blah2.com\n        DocumentRoot /var/www/blah2.com\n    </VirtualHost>\n']
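
If the individual directives are needed rather than the serialized markup, the text of each element can be split further. The following is a minimal sketch under the same well-formed XML assumption; the helper name `directives` and the variable `xml_input` are made up for illustration.

```python
import xml.etree.ElementTree as ET

xml_input = """
<Document>
    <VirtualHost>
        ServerName blah.com
        DocumentRoot /var/www/blah.com
    </VirtualHost>
    <VirtualHost>
        ServerName blah2.com
        DocumentRoot /var/www/blah2.com
    </VirtualHost>
</Document>"""

def directives(element):
    """Turn the text of one <VirtualHost> element into a {directive: value} dict."""
    result = {}
    for line in (element.text or "").splitlines():
        parts = line.split(None, 1)  # directive name, then the rest of the line
        if len(parts) == 2:
            result[parts[0]] = parts[1].strip()
    return result

hosts = [directives(vh) for vh in ET.fromstring(xml_input).findall('VirtualHost')]
print(hosts)
```

Each entry in `hosts` maps a directive name such as `ServerName` to its value, so a repeated directive inside one container would overwrite the earlier one in this sketch.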
Answered 2012-08-19T15:40:01.080