regex - Extract part of the text from a file using a regular expression

Question

I am trying to use the following code:

x <- scan("myfile.txt", what="", sep="\n")

b <- grep('/^one/(.*?)/^four/', x, ignore.case = TRUE, perl = TRUE, value = TRUE,
     fixed = FALSE, useBytes = FALSE, invert = FALSE)

to extract a portion of the text from myfile.txt:

zero
one
two
three
four
five

The output I expect is:

one
two
three
four

I want to include "one" and "four"; I don't want to drop them :)

But somehow the regex does not work: the console shows no error, but no text either...?

I am printing the result with print(b).

score 2 · Accepted Answer

I'm not quite sure what you are looking for, but just for fun...

R> x
[1] "zero"  "one"   "two"   "three" "four"  "five" 

R> grep("one|four", x) # get the position of "one" and "four"
[1] 2 5

Subset x to keep only the elements from "one" through "four":

R> x[do.call(seq, as.list(grep("one|four", x)))]
[1] "one"   "two"   "three" "four"
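
The do.call(seq, ...) idiom can look opaque, so here is a minimal sketch of what it expands to. It assumes x holds the six lines from the question and that exactly two elements match; the names pos and idx are introduced here only for illustration.

```r
# Assumed input: the six lines of myfile.txt from the question
x <- c("zero", "one", "two", "three", "four", "five")

# Positions of the elements that match either word (illustrative name)
pos <- grep("one|four", x)

# With two matches, do.call(seq, as.list(pos)) is the same as seq(pos[1], pos[2])
idx <- seq(pos[1], pos[2])

result <- x[idx]
print(result)
```

Note that "one|four" matches substrings, so a line such as "none" would also be picked up; a stricter pattern like "^(one|four)$" is probably safer on real data. If more than two elements match, seq receives extra arguments and the idiom no longer means "from first to last".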

score 1 · Accepted Answer

gsub('one(.*)four','\\1',paste(x,collapse=''))
[1] "zerotwothreefive"

Or, to get spaces between the words:

gsub('one(.*)four','\\1',paste(x,collapse=' '))
[1] "zero  two three  five"

Edit after Gsee's comment:

 gsub('.*(one.*four).*','\\1',paste(x,collapse=' '))
[1] "one two three four"
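
The gsub call returns a single string, whereas the question expects one word per line. A possible sketch, assuming the lines contain no spaces of their own, is to split the result back into a vector; the names joined and kept are illustrative.

```r
# Assumed input: the six lines of myfile.txt from the question
x <- c("zero", "one", "two", "three", "four", "five")

# Collapse to one string so the pattern can span several lines
joined <- paste(x, collapse = " ")

# Keep only the span from "one" through "four"
kept <- gsub(".*(one.*four).*", "\\1", joined)

# Split on the separator to recover one element per original line
result <- strsplit(kept, " ", fixed = TRUE)[[1]]
print(result)
```

This round trip only works when the separator never occurs inside a line, which is one reason the index-based approach is usually simpler.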

But I don't think a regular expression is needed here:

 x[seq(which(x == 'one'),which(x == 'four'))]
[1] "one"   "two"   "three" "four"

Of course, if the indices do not come out in the right order, you can use min.
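
One way to read the remark about min is sketched below. It assumes the marker words may appear in either order or more than once, and takes the smallest and largest matching index; the names hits, first and last are illustrative, not part of the original answer.

```r
# Assumed input: the six lines of myfile.txt from the question
x <- c("zero", "one", "two", "three", "four", "five")

# Indices of every element that is exactly "one" or "four"
hits <- which(x %in% c("one", "four"))

# Guard against the case where neither word is present
stopifnot(length(hits) >= 1)

# Span from the earliest to the latest match, whatever their order
first <- min(hits)
last <- max(hits)

result <- x[first:last]
print(result)
```

Using min and max together avoids a reversed sequence when "four" happens to come before "one" in the file.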
