ruby - Modifying a regex match within the replacement

Question

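I'm trying to match certain strings in a text file against a regular expression and then modify every place where the pattern is found. It's like search and replace, except that I want to replace each match with a modified version of what was found (I'm sure there is a name for this, but I'm not familiar enough with it).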

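So I'm looking for strings that match [a-z]_[a-z] (for example, some_string), and I want to replace each one by removing the underscore and capitalizing the second lowercase word, basically camel case (someString).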

Any pointers on how to do this would be appreciated (the tricky part is that I don't even know how to Google for it).

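Edit

I tried to simplify the question a bit to make it more general, but I also only want to do this when the match does not appear inside quotes. That is, I don't want to match underscores inside quotes (so no match here: "this_is_a_string"... it should stay as it is). I probably should have included this when I first posted.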

score 4 · Accepted Answer

You can use gsub with a block as a callback, for example:

"some_thing_good".gsub(/_([a-z])/) {|m| m[1].upcase}

To skip strings inside double quotes, you can do this:

"\"look_at_me\" some_thing_good".gsub(/"[^"]+"|_[a-z]/) {|m| (m.length>2)? m : m[1].upcase }

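The idea is to match quoted strings first and replace them with themselves. Testing the match length tells me immediately which branch of the alternation matched, because the second branch always matches exactly 2 characters while the first matches at least 3.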
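As a variant of the length test, the two branches can also be told apart with a capture group: the group is nil whenever the quoted-string branch matched. This is only a minimal sketch and not part of the original answer; the method name camelize_outside_quotes is a hypothetical helper introduced for illustration.

```ruby
# Hypothetical helper (name is an assumption, not from the answer above).
def camelize_outside_quotes(text)
  # Branch 1 matches a whole double-quoted string and has no capture group.
  # Branch 2 matches an underscore and captures the lowercase letter after it.
  text.gsub(/"[^"]+"|_([a-z])/) do |match|
    letter = Regexp.last_match(1) # nil when the quoted-string branch matched
    letter ? letter.upcase : match
  end
end

puts camelize_outside_quotes('"look_at_me" some_thing_good')
# prints: "look_at_me" someThingGood
```

Checking the capture group avoids relying on the lengths of the two branches, so the check keeps working if either pattern is later changed.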

score 1 · Accepted Answer

I think a better approach is to wrap the part of the pattern you are interested in with parentheses.

In your case, I would use the following regular expression:

string.gsub(/(?<=[a-z])_([a-z]+)/) {|s| "#{s[1].upcase}#{s[2..-1]}"}

This regexp can be read in two parts: the lookbehind (?<=[a-z]) requires a lowercase letter immediately before the match without making it part of the match, and _([a-z]+) matches the underscore followed by a run of lowercase letters.

Inside the block, you can call Regexp.last_match, which returns the MatchData object; through it you can access each sub-pattern captured in parentheses, for example:

string.gsub(/(?<=[a-z])_([a-z]+)/) do |s| 
  p Regexp.last_match.to_a # this will print all sub-patterns found
  "#{s[1].upcase}#{s[2..-1]}" # return formatted string
end
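To illustrate what the capture group gives you, here is a small sketch (an illustration under the same pattern, not the answerer's exact code) that builds the replacement from Regexp.last_match(1) instead of indexing into the matched string. The sample input is an assumption.

```ruby
string = "some_thing_good" # sample input, chosen for illustration

result = string.gsub(/(?<=[a-z])_([a-z]+)/) do
  word = Regexp.last_match(1) # text captured by ([a-z]+), without the underscore
  word.capitalize             # "thing" -> "Thing"
end

puts result # prints: someThingGood
```

Because group 1 already excludes the underscore, there is no need for the s[1] and s[2..-1] offsets.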

As you mentioned, you are not interested in patterns inside quotes. I would nest one regular expression inside another: the first separates quoted strings from the rest of the text, and the second searches the unquoted text for the pattern:

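string.scan(/(\"[^\"]+\"|([^\"]+))/).map do |whole, unquoted|
  next whole unless unquoted # keep quoted data unchanged
  # convert snake case to camel case in the unquoted text
  unquoted.gsub(/(?<=[a-z])_([a-z]+)/) {|m| "#{m[1].upcase}#{m[2..-1]}"}
end.join # scan with a block discards the block's results, so map and join instead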
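Since the question is about a text file, the conversion could be applied line by line. The following is a rough sketch under stated assumptions, not a definitive implementation: camelize_line and camelize_file are hypothetical helper names, src and dest are placeholder paths, and quoted strings are assumed not to span multiple lines. It uses gsub rather than scan so that any text matched by neither branch passes through unchanged.

```ruby
# Hypothetical helper: camel-cases snake_case words on one line,
# leaving double-quoted strings untouched.
def camelize_line(line)
  line.gsub(/"[^"]*"|(?<=[a-z])_([a-z]+)/) do |match|
    # A match that starts with a quote is a quoted string: return it as is.
    next match if match.start_with?('"')
    Regexp.last_match(1).capitalize # "_thing" is replaced by "Thing"
  end
end

# Hypothetical helper: reads src line by line and writes the converted
# lines to dest. Both arguments are placeholder file paths.
def camelize_file(src, dest)
  File.open(dest, "w") do |out|
    File.foreach(src) { |line| out.write(camelize_line(line)) }
  end
end
```

Writing to a separate destination file keeps the original intact, which makes it easy to compare the two before replacing anything.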
