My code for removing short and long words from some text is:

# Remove Words based on lowerCutOff & upperCutOff
removeByLength<- function(text,lowerCutOff=2,upperCutOff=12){
  text<- gsub("\\b[a-zA-Z0-9]{1,lowerCutOff}\\b|\\b[a-zA-Z0-9]{upperCutOff,}\\b"," ",text)
  return(text)
}

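As written, lowerCutOff and upperCutOff inside the string literal are treated as literal text rather than as the function arguments. How can I get the desired behaviour without hard-coding the lower and upper cutoffs in the pattern?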

1 Answer

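Use paste to concatenate the pieces of the pattern, so that the values of lowerCutOff and upperCutOff are substituted into the regular expression: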

removeByLength<- function(text,lowerCutOff=2,upperCutOff=12){
  pattern <- paste("\\b[a-zA-Z0-9]{1,",lowerCutOff,
                 "}\\b|\\b[a-zA-Z0-9]{",upperCutOff,",}\\b", sep="")
  text <- gsub(pattern, " ", text)
  return(text)
}
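
As a possible alternative, the same pattern can be built with sprintf instead of paste. This is only a sketch: the function name removeByLengthSprintf is made up for illustration, and it assumes both cutoffs are whole numbers, since %d rejects non-integer values.

```r
# Hypothetical variant of removeByLength that builds the pattern with sprintf.
# Assumes lowerCutOff and upperCutOff are whole numbers (required by %d).
removeByLengthSprintf <- function(text, lowerCutOff = 2, upperCutOff = 12) {
  # %d is replaced by the cutoff values; \\b marks a word boundary
  pattern <- sprintf("\\b[a-zA-Z0-9]{1,%d}\\b|\\b[a-zA-Z0-9]{%d,}\\b",
                     lowerCutOff, upperCutOff)
  # Each matching word is replaced by a single space
  gsub(pattern, " ", text)
}

# Words of length <= 1 or >= 4 are replaced; "bb" and "ccc" are kept
removeByLengthSprintf("a bb ccc dddd", lowerCutOff = 1, upperCutOff = 4)
```

Because every removed word is replaced by a space, the result can contain runs of spaces; these can be collapsed afterwards with something like gsub(" +", " ", result) if that matters.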
answered 2012-12-10T15:20:15.597