I have a regular expression that validates a string. Now I want to remove every character that does not match my regular expression.

For example:

regExpression = @"^([\w\'\-\+])"

text = "This is a sample text with some invalid characters -+%&()=?";

//Remove characters that do not match regExp.

result = "This is a sample text with some invalid characters -+";

Any ideas on how to use the regular expression to determine the valid characters and remove all the others?

Many thanks.

3 Answers

I believe you can do this in one line (whitelist the characters and replace everything else):

var result = Regex.Replace(text, @"[^\w\s\-\+]", "");

Technically, it produces "This is a sample text with some invalid characters - +", which differs slightly from your example (an extra space between the - and the +).
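
Note that the whitelist above adds \s (so spaces survive) and leaves out the apostrophe that the question's pattern allows. A minimal sketch of the same negated-class approach with the apostrophe put back; the class name, the method name KeepAllowed, and the input string are hypothetical and not taken from the thread:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitelistDemo
{
    // Removes every character that is not a word character, whitespace, ', - or +.
    public static string KeepAllowed(string text)
    {
        return Regex.Replace(text, @"[^\w\s'\-+]", "");
    }

    public static void Main()
    {
        // Hypothetical input, chosen to include an apostrophe.
        Console.WriteLine(KeepAllowed("it's a test -+%&()=?")); // prints: it's a test -+
    }
}
```

Keep in mind that \w in .NET also matches the underscore and non-ASCII letters and digits, so this is broader than [a-zA-Z0-9].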

answered 2011-05-27T15:40:34.617

It's as simple as this:

var match = Regex.Match(text, regExpression);
string result = "";
if(match.Success)
    result = match.Value;

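Removing the characters that don't match is the same as keeping the ones that do. That is what we are doing here.

If the expression can match more than once in your text, you can use: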

var result = Regex.Matches(text, regExpression).Cast<Match>()
                  .Aggregate("", (s, e) => s + e.Value, s => s);
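
One caveat when trying this with the pattern from the question: it starts with ^ and matches a single character, so it can only ever match the first character of the text. A sketch of the multi-match idea, assuming the anchor is dropped and \s is added so spaces are kept; KeepMatches and the input string are hypothetical, and string.Concat is swapped in for Aggregate:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeepMatchesDemo
{
    // Concatenates every match of the pattern, dropping whatever lies between matches.
    public static string KeepMatches(string text, string pattern)
    {
        return string.Concat(Regex.Matches(text, pattern).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        string text = "it's a test -+%&()=?"; // hypothetical input

        // Anchored single-character pattern: only the first character matches.
        Console.WriteLine(KeepMatches(text, @"^([\w'\-+])")); // prints: i

        // Unanchored class with a quantifier: every run of allowed characters is kept.
        Console.WriteLine(KeepMatches(text, @"[\w\s'\-+]+")); // prints: it's a test -+
    }
}
```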
answered 2011-05-27T15:29:57.540

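Thanks to the answer to "Replace chars if not match", I created a helper method that strips out characters that are not accepted.

The allowed pattern should be in regex format and is expected to be wrapped in square brackets. The function inserts a caret (^) after the opening square bracket. I expect it won't work for every regex that describes a set of valid characters, but it works for the relatively simple sets we are using.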

/// <summary>
/// Replaces every character that is not in the allowed set.
/// </summary>
/// <param name="text">The text.</param>
/// <param name="allowedPattern">The allowed characters as a regex character class, expected to be wrapped in square brackets.</param>
/// <param name="replacement">The replacement.</param>
/// <returns>The text with every character outside the allowed set replaced.</returns>
// See https://stackoverflow.com/questions/4460290/replace-chars-if-not-match
// and https://stackoverflow.com/questions/6154426/replace-remove-characters-that-do-not-match-the-regular-expression-net
public static string ReplaceNotExpectedCharacters(this string text, string allowedPattern, string replacement)
{
    allowedPattern = allowedPattern.StripBrackets("[", "]");
    // [^ ] at the start of a character class negates it: it matches characters not in the class.
    var result = Regex.Replace(text, "[^" + allowedPattern + "]", replacement);
    return result;
}

public static string RemoveNonAlphanumericCharacters(this string text)
{
    var result = text.ReplaceNotExpectedCharacters(NonAlphaNumericCharacters, "");
    return result;
}

// Despite its name, this constant lists the allowed (alphanumeric) characters.
public const string NonAlphaNumericCharacters = "[a-zA-Z0-9]";

A few functions from my StringHelper class (http://geekswithblogs.net/mnf/archive/2006/07/13/84942.aspx) are used here.

/// <summary>
/// StripBrackets checks whether the string starts with sStart and ends with sEnd (case sensitive).
/// If so, it removes sStart and sEnd.
/// Otherwise it returns the full string unchanged.
/// See also MidBetween.
/// </summary>
public static string StripBrackets(this string str, string sStart, string sEnd)
{
    if (CheckBrackets(str, sStart, sEnd))
    {
        str = str.Substring(sStart.Length, (str.Length - sStart.Length) - sEnd.Length);
    }
    return str;
}

public static bool CheckBrackets(string str, string sStart, string sEnd)
{
    bool flag1 = (str != null) && (str.StartsWith(sStart) && str.EndsWith(sEnd));
    return flag1;
}
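
To show how the helper behaves, here is a condensed, self-contained sketch of it with two calls. The class name StringHelperDemo and the input strings are hypothetical, and the bracket stripping is inlined rather than delegated to StripBrackets:

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringHelperDemo
{
    // Condensed version of the helper above: strip the outer brackets, then negate the class.
    public static string ReplaceNotExpectedCharacters(this string text, string allowedPattern, string replacement)
    {
        if (allowedPattern.Length >= 2
            && allowedPattern.StartsWith("[", StringComparison.Ordinal)
            && allowedPattern.EndsWith("]", StringComparison.Ordinal))
        {
            allowedPattern = allowedPattern.Substring(1, allowedPattern.Length - 2);
        }
        return Regex.Replace(text, "[^" + allowedPattern + "]", replacement);
    }

    public static void Main()
    {
        // Remove everything that is not a letter or digit.
        Console.WriteLine("Hello, World! 123".ReplaceNotExpectedCharacters("[a-zA-Z0-9]", ""));   // prints: HelloWorld123

        // Also allow spaces, and mark the removed characters with an underscore.
        Console.WriteLine("Hello, World! 123".ReplaceNotExpectedCharacters("[a-zA-Z0-9 ]", "_")); // prints: Hello_ World_ 123
    }
}
```

Because the pattern is spliced into a new character class as-is, this only works for a single simple class; something like "[a-z]|[0-9]" would not be negated correctly.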
answered 2012-10-28T02:50:01.173