javascript - Find specific text and get the full text

Question

I am using a proxy to scrape data from this url: CNN article

I want to get the text of the whole article (the title is not required). So I tried this:

$(data).find("div:contains('Across the river from Cairo')");

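This finds the text, but when I do my thing with it, myThing = $(this).text();, it seems to return more than just the article. This probably has to do with how the HTML is structured. If I view the source, I see that the article text is confined to p elements, but changing div:contains to p:contains only gets me the first few lines (obviously).

So my question is: how do I get the article text regardless of the HTML structure? I am looking for something (code) that would say: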

find.('Across the river from Cairo') and get this text and all the text underneath this text();

score 2 · Accepted Answer

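I got the desired result from that article using the selector p.cnn_storypgraphtxt. To get the whole article, you can use $("p.cnn_storypgraphtxt").text() or

$("p.cnn_storypgraphtxt").map(function(){return $(this).text();}).get().join("\n");

To get the text that follows a certain expression, you can use .last() to get the last selected node (i.e. the deepest node in the DOM) and then .nextAll(), like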

$(":contains('Across the river from Cairo')").last().nextAll().text()

But this will include a lot of unwanted stuff.
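One way to avoid the unwanted nodes might be to combine the two ideas: collect the paragraph texts first, then keep only the paragraphs from the matching one onward. The following is a minimal sketch under that assumption, not a tested solution for the CNN page. The helper name textFromMarker and the sample array are made up for illustration; in practice the array could come from $("p.cnn_storypgraphtxt").map(function(){return $(this).text();}).get().

```javascript
// Hypothetical helper: given paragraph texts in document order, return the
// text from the first paragraph containing `marker` through the last one.
function textFromMarker(paragraphs, marker) {
  // Index of the first paragraph that contains the marker text.
  var start = paragraphs.findIndex(function (text) {
    return text.indexOf(marker) !== -1;
  });
  // No match: return an empty string rather than the whole article.
  if (start === -1) {
    return "";
  }
  // Keep the matching paragraph and everything after it, one per line.
  return paragraphs.slice(start).join("\n");
}

// Made-up sample data standing in for the scraped paragraphs.
var sample = [
  "Some unrelated teaser line",
  "Across the river from Cairo, the story begins.",
  "Second paragraph of the article."
];
console.log(textFromMarker(sample, "Across the river from Cairo"));
```

Because this works on plain strings, it does not depend on how deeply the paragraphs are nested, only on the selector returning them in document order.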

score 0 · Accepted Answer

Try using

$someString = $(data).find("div:contains('Across the river from Cairo')").html();

Then manipulate that string or do whatever else you need with it.
