How can I scrape Hacker News ( https://news.ycombinator.com/ ) with x-ray/nodejs?
I would like to get something like this out of it:
[
{title1, comment1},
{title2, comment2},
...
{"‘Minimal’ cell raises stakes in race to harness synthetic life", 48}
...
{title 30, comment 30}
]
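There is a news table, but I don't know how to scrape it... Each story on the site is made up of three table rows, and those rows have no parent element of their own. So the structure looks like this: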
<tbody>
<tr class="spacer"> //Markup 1
<tr class="athing"> //Headline 1 ('.deadmark+ a' contains title)
<tr class> //Meta Information 1 (.age+ a contains comments)
<tr class="spacer"> //Markup 2
<tr class="athing"> //Headline 2 ('.deadmark+ a' contains title)
<tr class> //Meta Information 2 (.age+ a contains comments)
...
<tr class="spacer"> //Markup 30
<tr class="athing"> //Headline 30 ('.deadmark+ a' contains title)
<tr class> //Meta Information 30 (.age+ a contains comments)
So far I have tried:
x("https://news.ycombinator.com/", "tr", [{
title: [".deadmark+ a"],
comments: ".age+ a"
}])
and
x("https://news.ycombinator.com/", {
title: [".deadmark+ a"],
comments: [".age+ a"]
})
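The second approach returns 30 titles and 29 comment counts... I don't see any way to map them to each other, because nothing tells me which of the 30 titles is the one without comments...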
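To make the mapping I am after concrete, here is a minimal sketch in plain JavaScript. It assumes the scraper can hand back one object per <tr>, in document order, with `title` set only on headline rows and `comments` set only on meta rows. Whether x-ray actually returns rows in that shape is an assumption on my part; `pairRows` and the sample data are hypothetical names made up for illustration.

```javascript
// Hypothetical helper: pairs each headline row with the meta row that follows it.
// `rows` is assumed to hold one object per <tr>, in document order.
function pairRows(rows) {
  const stories = [];
  let current = null; // most recent headline still waiting for its meta row

  for (const row of rows) {
    // `title` may arrive as a string or as an array; accept both.
    const title = Array.isArray(row.title) ? row.title[0] : row.title;

    if (title) {
      // Headline row: start a new story with no comment count yet.
      current = { title, comments: null };
      stories.push(current);
    } else if (current && row.comments) {
      // Meta row: take the first number in the link text; "discuss" becomes 0.
      const match = /\d+/.exec(row.comments);
      current.comments = match ? Number(match[0]) : 0;
      current = null;
    }
    // Spacer rows (and meta rows without a comments link) carry neither
    // field and are skipped, so the story keeps comments: null.
  }
  return stories;
}

// Made-up sample data standing in for scraped rows.
const sample = [
  { title: ["Story A"] },
  { comments: "48 comments" },
  {},
  { title: ["Job posting B"] },
  {}, // meta row without a comments link
  {},
  { title: ["Story C"] },
  { comments: "discuss" },
];

console.log(pairRows(sample));
```

Because the pairing follows row order rather than two separate lists, a story without a comments link keeps `comments: null` instead of shifting every later count up by one.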
Any help would be appreciated.