regex - Simple parsing between two strings in R?

Question

Suppose I want to extract the string found between two defined strings. For example, a function we will call parse_between() would work in R like this:

> main_string <- "the quick brown fox>$ jumps over the lazy </ dog"
> substring <- parse_between(main_string, begin=">$", end="</")
> substring
[1] " jumps over the lazy "

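It would be even better if it could return a vector with one element for each occurrence. I have searched several string manipulation packages, such as "stringr", but have not found a function that does this as easily as the example shows. My motivation is parsing html files, though unfortunately I have not found an html parser for R despite searching.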

score 2 · Accepted Answer

First, please read this question and its answers carefully: RegEx match open tags except XHTML self-contained tags

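Then, if you are still not deterred, use regex or gsub, both of which have metacharacters for specifying the start (^) or end ($) of a line. What you can do is replace

{start_of_line through to ">$"}

with nothing, and then replace

{"</" through to end_of_line}

with nothing.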
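The two replacements described above could be sketched roughly as follows. This is only an illustration under some assumptions: parse_between() is the hypothetical function name from the question, parse_between_all() is a made-up companion that returns every occurrence, and both use perl = TRUE so that \Q...\E can quote the delimiters literally (otherwise the "$" in ">$" would be read as a metacharacter).

```r
# Hypothetical helper named after the question's example; not part of any package.
parse_between <- function(x, begin, end) {
  # \Q...\E (PCRE syntax, hence perl = TRUE) makes the delimiters literal.
  b <- paste0("\\Q", begin, "\\E")
  e <- paste0("\\Q", end, "\\E")
  # Replace everything from the start of the string through `begin` with nothing.
  x <- sub(paste0("^.*?", b), "", x, perl = TRUE)
  # Replace everything from `end` through the end of the string with nothing.
  sub(paste0(e, ".*$"), "", x, perl = TRUE)
}

# Hypothetical variant returning one element per begin/end pair in a single string.
parse_between_all <- function(x, begin, end) {
  # Lookbehind/lookahead keep the delimiters out of the match;
  # (?s) lets "." cross line breaks.
  pattern <- paste0("(?s)(?<=\\Q", begin, "\\E).*?(?=\\Q", end, "\\E)")
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

main_string <- "the quick brown fox>$ jumps over the lazy </ dog"
parse_between(main_string, begin = ">$", end = "</")
# [1] " jumps over the lazy "
parse_between_all("a>$one</b>$two</c", begin = ">$", end = "</")
```

Note that parse_between() returns the input unchanged when a delimiter is missing, and it does not look past line breaks, so a check for that may be needed on real input. As the linked question warns, this kind of matching is fragile for real html.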
