
I have the following content in an HTML page:

<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>

How can I get all elements that have a data-name-en attribute?


2 Answers


I found the correct answer:

s = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

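from pyquery import PyQuery

html = PyQuery(s)
# Select every <span> inside an <li> that has a data-name-en attribute
items = html.find('li span[data-name-en]')

Iterating over the result yields raw lxml elements, so to get the attribute value you need to wrap each item in PyQuery again:

for item in items:
    print(PyQuery(item).attr("data-name-en"))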
answered 2018-11-13T20:52:49.417
from bs4 import BeautifulSoup as bs

s = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

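# 'html.parser' is the built-in HTML parser; the 'xml' parser is meant for XML documents and requires lxml
soup = bs(s, 'html.parser')
# Collect the attribute value from every <span> that has data-name-en
result = [x['data-name-en'] for x in soup('span') if x.has_attr('data-name-en')]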

print(result)
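
The same filter can also be written as a CSS attribute selector, which may read more naturally than the `has_attr` check. This is only a sketch, assuming `bs4` is installed and using the standard-library `html.parser`; the names `html_snippet`, `spans`, `names`, `texts`, and `same_spans` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (illustrative only).
html_snippet = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

soup = BeautifulSoup(html_snippet, "html.parser")

# CSS attribute selector: any <span> that carries data-name-en,
# regardless of the attribute's value.
spans = soup.select("span[data-name-en]")
names = [span["data-name-en"] for span in spans]
texts = [span.get_text() for span in spans]

# Equivalent lookup with find_all: True means "the attribute is present".
same_spans = soup.find_all("span", attrs={"data-name-en": True})

print(names)
print(texts)
```

Both `select` and `find_all` return the element objects themselves, so the attribute value and the text can be read from the same result.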
answered 2018-10-17T09:25:30.253