This is a beginner question. The number of classes in the Foundation framework is simply overwhelming to comb through, so I hope the SO community has a ready answer to offer.

This is my use case:

I want to read in an HTML file and extract all the text inside its p tags.

I do not need to display the HTML markup, but if WebKit has a solution I am happy to use it.

In the Python world, the answer would be Beautiful Soup. I am looking for the OS X Foundation equivalent, or whatever classes can achieve the goal.

1 Answer

You can use NSXMLDocument and pass NSXMLDocumentTidyXML as one of the mask options.
This allows NSXMLDocument to parse non-XHTML documents (as long as they are not completely malformed).

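To get a list of all p element nodes, use the following XPath expression on your NSXMLDocument instance (note that p must be matched as an element; @p would select attributes named p instead):
NSArray* pNodes = [projectDocument nodesForXPath:@"//p" error:nil];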

To get the text content of a p node, use its stringValue property.
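
Putting the steps together, here is a minimal sketch of the whole flow. The helper name ParagraphTexts is hypothetical, not part of Foundation. The sketch also makes two assumptions of its own: it uses NSXMLDocumentTidyHTML, the HTML-oriented tidy flag, in place of NSXMLDocumentTidyXML, and it trims surrounding whitespace from each paragraph. If the tidied document ends up in the XHTML namespace and //p matches nothing, //*[local-name()='p'] may be worth trying instead.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns the trimmed text of
// every p element found in an HTML string, or nil if parsing fails.
static NSArray<NSString *> *ParagraphTexts(NSString *html, NSError **error) {
    // Assumption: NSXMLDocumentTidyHTML is used here so that ordinary,
    // non-XHTML markup is cleaned up into a well-formed tree before parsing.
    NSXMLDocument *document =
        [[NSXMLDocument alloc] initWithXMLString:html
                                         options:NSXMLDocumentTidyHTML
                                           error:error];
    // Check the returned object rather than the error: tidying may report
    // warnings through the error parameter even when parsing succeeds.
    if (document == nil) {
        return nil;
    }

    // "//p" selects every p element anywhere in the document.
    NSArray<NSXMLNode *> *pNodes = [document nodesForXPath:@"//p" error:error];
    if (pNodes == nil) {
        return nil;
    }

    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableArray<NSString *> *texts = [NSMutableArray array];
    for (NSXMLNode *node in pNodes) {
        // stringValue concatenates the text of the node and its descendants.
        NSString *text = [node.stringValue stringByTrimmingCharactersInSet:whitespace];
        [texts addObject:text ?: @""];
    }
    return texts;
}
```

To read from a file rather than a string, the same options can be passed to initWithContentsOfURL:options:error: with a URL built by [NSURL fileURLWithPath:...].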

answered 2013-05-29T10:55:20.833