http://www.unc.edu/academics/

I am trying to use Jsoup to get all of the rendered HTML ul lists from the page above.

Here is my code:

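import java.util.ArrayList;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Note: get() throws IOException, so the enclosing method must declare or handle it.
Document doc = Jsoup.connect("http://www.unc.edu/academics/").get();
Elements lists = doc.select("ul");
for (Element list : lists) {
    // Collect the link text of every anchor inside this list.
    Elements li = list.select("li a");
    if (li.size() > 0) {
        ArrayList<String> anchors = new ArrayList<String>();
        for (Element e : li) {
            anchors.add(e.text());
        }
        System.out.println(anchors);
    }
}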

Here is the output:

[Calendar, Libraries, Maps, Departments, MyUNC]
[About UNC, Academics, Research, Public Service, Health Care, UNC Global, Arts, Athletics]
[Academic Departments, Continuing Education, Distance Education, Provost, Services and Resources]
[Academic Calendar, Courses, Libraries, Registrar, Sakai]
[College of Arts & Sciences, Dentistry, Education, Eshelman School of Pharmacy, Friday Center for Continuing Education, General College, Gillings School of Global Public Health, Graduate School, Kenan-Flagler Business School, Government, Information & Library Science, Journalism & Mass Communication, Law, Medicine, Nursing, Social Work, Summer School]
[Departments A-Z, Departments by Interest Area]
[American Indian Studies, APPLES Service-Learning, Applied Sciences & Engineering, Archaeology, Bioinformatics & Computational Biology Training, Biological & Biomedical Sciences, Burch Fellows, Business (Undergraduate), Carolina Entrepreneurial Initiative, Christianity & Culture, Cinema, Cognitive Science, Comparative Literature, Communication Studies, Creative Writing, Cultural Studies, Developmental Biology Training, Ethnicity, Culture & Health Outcomes, Environment & Ecology, European Studies, First Year Seminars, Folklore, Genetics & Molecular Biology, Global Studies, Honors, Humanities & Human Values, Institute for Environment, Jewish Studies, Johnston Center for Undergraduate Excellence, Languages Across Curriculum, Latin American Studies, Latina/o Studies, Management & Society, Mathematical Decision Sciences, Mathematical Sciences, Medieval & Early Modern Studies, Middle East/Muslim Civilizations, Molecular Biology & Biotechnology, Molecular/Cellular Biophysics, Morehead-Cain Scholarship, Neurobiology, Peace, War & Defense, Philosophy, Politics & Economics, Program on Health Outcomes, Public Administration, Public Health Leadership, Russian/East European Studies, Robertson Scholars, Sexuality Studies, Social & Economic Justice, SPIRE Postdoctoral Program, Stone Center, Study Abroad, SURE, Toxicology, Transatlantic Master’s Program, Undergraduate Curricula, World View, Writing for Screen & Stage]
[Alert Carolina, Contact, Departments, Directory, Employment, FAQs, ITS, Privacy Policy, Accessibility, RSS Feeds]

You may notice that the three lists shown in the screenshot below are merged into one, namely the fifth list in the output.

[screenshot]

As you can see in the page source, these three lists really are rendered by three ul tags. Could this be related to the JavaScript or CSS embedded in the page?

2 Answers

The source code does in fact contain a single list.

<ul class="col3">
<li><a href="http://artsandsci.unc.edu/">College of Arts &amp; Sciences</a></li>
<li><a href="http://www.dentistry.unc.edu/">Dentistry</a></li>
<li><a href="http://soe.unc.edu/">Education</a></li>
<li><a href="http://www.pharmacy.unc.edu/">Eshelman School of Pharmacy</a></li>
<li><a href="http://www.fridaycenter.unc.edu/">Friday Center for Continuing Education</a></li>
<li><a href="http://advising.unc.edu/">General College</a></li>
<li><a href="http://www.sph.unc.edu/">Gillings School of Global Public Health</a></li>
<li><a href="http://gradschool.unc.edu/">Graduate School</a></li>
<li><a href="http://www.kenan-flagler.unc.edu/">Kenan-Flagler Business School</a></li>
<li><a href="http://www.sog.unc.edu/">Government</a></li>
<li><a href="http://sils.unc.edu/">Information &amp; Library Science</a></li>
<li><a href="http://www.jomc.unc.edu/">Journalism &amp; Mass Communication</a></li>
<li><a href="http://www.law.unc.edu/">Law</a></li>
<li><a href="http://www.med.unc.edu/">Medicine</a></li>
<li><a href="http://nursing.unc.edu/">Nursing</a></li>
<li><a href="http://ssw.unc.edu/">Social Work</a></li>
<li><a href="http://summer.unc.edu/">Summer School</a></li>
</ul>

But JavaScript then splits it into three separate <ul> elements.

jQuery(document).ready(function($) {
    $('div.accordion > ul').makeacolumnlists({
        cols: 3,
        colWidth: '33%',
        equalHeight: false,
        startN: 1
    });
    $('div.accordion > div > ul').accordion({
        autoHeight: false,
        header:'> li > h4',
        collapsible: true,
        active: false
    });
    $('ul.col2').makeacolumnlists({
        cols: 2,
        colWidth: 0,
        equalHeight: false,
        startN: 1
    });
    $('ul.col3').makeacolumnlists({
        cols: 3,
        colWidth: 0,
        equalHeight: false,
        startN: 1
    });
});

That script is what does the trick.
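
Jsoup parses the HTML it downloads and does not execute JavaScript, so it only ever sees the single list. If three separate lists are needed on the Java side, the anchor texts can be split after parsing. Below is a minimal sketch using only the standard library. The class name ColumnSplitter and the method split are hypothetical, and the chunking rule (contiguous chunks of ceil(size / cols) items) is an assumption: it is not taken from the makeacolumnlists plugin, which may distribute items differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnSplitter {

    /**
     * Splits items into at most `cols` contiguous chunks.
     * Assumption: each chunk holds ceil(size / cols) items, except possibly
     * the last one. This is NOT taken from the jQuery plugin's source.
     */
    public static <T> List<List<T>> split(List<T> items, int cols) {
        if (cols < 1) {
            throw new IllegalArgumentException("cols must be >= 1");
        }
        List<List<T>> columns = new ArrayList<>();
        int perColumn = (items.size() + cols - 1) / cols; // ceiling division
        for (int start = 0; start < items.size(); start += perColumn) {
            int end = Math.min(start + perColumn, items.size());
            // Copy the view so each column is independent of the input list.
            columns.add(new ArrayList<>(items.subList(start, end)));
        }
        return columns;
    }

    public static void main(String[] args) {
        // Stand-in for the strings collected with e.text() in the question.
        List<String> anchors = Arrays.asList(
                "College of Arts & Sciences", "Dentistry", "Education",
                "Eshelman School of Pharmacy", "General College",
                "Graduate School", "Law");
        for (List<String> column : split(anchors, 3)) {
            System.out.println(column);
        }
    }
}
```

In the question's loop, the anchors list built for a ul with class col3 could be passed to split(anchors, 3) in place of printing it directly.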

answered 2012-11-30T17:04:49.670

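Well... they are not three lists, they are one list. This is the actual page code. As you can see, it has only one <ul> tag. It uses CSS to make it display as 3 columns (class="col3").

I assume that if Chrome is showing you something different, it is probably JavaScript that is throwing you off.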

<ul class="col3">
<li><a href="http://artsandsci.unc.edu/">College of Arts &amp; Sciences</a></li>
<li><a href="http://www.dentistry.unc.edu/">Dentistry</a></li>
<li><a href="http://soe.unc.edu/">Education</a></li>
<li><a href="http://www.pharmacy.unc.edu/">Eshelman School of Pharmacy</a></li>
<li><a href="http://www.fridaycenter.unc.edu/">Friday Center for Continuing Education</a></li>
<li><a href="http://advising.unc.edu/">General College</a></li>
<li><a href="http://www.sph.unc.edu/">Gillings School of Global Public Health</a></li>
<li><a href="http://gradschool.unc.edu/">Graduate School</a></li>
<li><a href="http://www.kenan-flagler.unc.edu/">Kenan-Flagler Business School</a></li>
<li><a href="http://www.sog.unc.edu/">Government</a></li>
<li><a href="http://sils.unc.edu/">Information &amp; Library Science</a></li>
<li><a href="http://www.jomc.unc.edu/">Journalism &amp; Mass Communication</a></li>
<li><a href="http://www.law.unc.edu/">Law</a></li>
<li><a href="http://www.med.unc.edu/">Medicine</a></li>
<li><a href="http://nursing.unc.edu/">Nursing</a></li>
<li><a href="http://ssw.unc.edu/">Social Work</a></li>
<li><a href="http://summer.unc.edu/">Summer School</a></li>
</ul>
answered 2012-11-30T16:58:11.373