asp.net - How to handle encoding of < or > characters in a string using a regular expression in C# 2.0

Question

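The regular expression below, written in C# 2.0, removes unwanted parameters from the page's query string: any parameter whose name appears in excludeList is dropped. It works fine for me.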

string querystring = string.Empty;                       
string excludeList = "cid,incid,h";                        
querystring = Regex.Replace(Regex.Replace(Regex.Replace(HttpContext.Current.Request.Url.Query, @"^\?", "&"), "&(" + excludeList.Replace(",", "|") + ")=[^&]*", "", RegexOptions.IgnoreCase), "^&", "?");
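For readability, the nested call above can be unfolded into its three steps. This is only a sketch of the same logic: the class and method names (`QueryStringFilter`, `RemoveParameters`) are hypothetical, and the query string is passed in as a parameter instead of being read from `HttpContext`.

```csharp
using System;
using System.Text.RegularExpressions;

static class QueryStringFilter
{
    // Hypothetical helper: the same three replacements as the nested call, one per step.
    public static string RemoveParameters(string query, string excludeList)
    {
        // 1. Turn the leading '?' into '&' so that every parameter starts with '&'.
        string normalized = Regex.Replace(query, @"^\?", "&");

        // 2. Remove every "&name=value" pair whose name is in excludeList.
        //    "cid,incid,h" becomes the alternation "cid|incid|h".
        string pattern = "&(" + excludeList.Replace(",", "|") + ")=[^&]*";
        string filtered = Regex.Replace(normalized, pattern, "", RegexOptions.IgnoreCase);

        // 3. Restore the leading '?'.
        return Regex.Replace(filtered, "^&", "?");
    }
}
```

For example, `QueryStringFilter.RemoveParameters("?faqid=123&cid=5434&des=dxb&incid=6565", "cid,incid,h")` returns `"?faqid=123&des=dxb"`. Because the names are spliced into the pattern unescaped, a name containing regex metacharacters would need `Regex.Escape` first.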

Now I want to modify my regular expression so that, when my excludeList looks like the one below, any < or > in the page query string gets encoded.

string excludeList = "cid,incid,h,<,>";

For example, if my page query string contains something like <script>, it should be encoded to a proper form such as #343script#545 (just an example).

Please suggest what modifications are needed to handle the encoding.

Thanks.

Edit:

Suppose

HttpContext.Current.Request.Url.Query = "http://localhost:80/faq.aspx?faqid=123&cid=5434&des=dxb&incid=6565&data=<sam>";
string excludeList = "cid,incid,h,<,>";

Now, when my regular expression above is applied to the query string above, it produces the following

string querystring = Regex.Replace(Regex.Replace(Regex.Replace(HttpContext.Current.Request.Url.Query, @"^\?", "&"), "&(" + excludeList.Replace(",", "|") + ")=[^&]*", "", RegexOptions.IgnoreCase), "^&", "?");

querystring = "?faqid=123&des=dxb&data=%3C%20sam%20%3E";

Everything works fine so far; now I want to encode "<" and ">" using the regular expression above.

score 1 · Accepted Answer

Try this

(?is)^(?<del>[^\?]+?)(?<retain>\?.+)$

Explanation

@"
(?is)         # Match the remainder of the regex with the options: case insensitive (i); dot matches newline (s)
^             # Assert position at the beginning of the string
(?<del>       # Match the regular expression below and capture its match into backreference with name “del”
   [^\?]         # Match any character that is NOT a ? character
      +?            # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
)
(?<retain>    # Match the regular expression below and capture its match into backreference with name “retain”
   \?            # Match the character “?” literally
   .             # Match any single character
      +             # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
$             # Assert position at the end of the string (or before the line break at the end of the string, if any)
"

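To make the two named groups concrete, here is a small sketch that runs the pattern against the URL from the question and prints what each group captures. The class name `GroupDemo` is only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupDemo
{
    static void Main()
    {
        // Sample URL taken from the question.
        string url = "http://localhost:80/faq.aspx?faqid=123&data=<sam>";

        Match m = Regex.Match(url, @"(?is)^(?<del>[^\?]+?)(?<retain>\?.+)$");

        // "del" is everything before the first '?'.
        Console.WriteLine(m.Groups["del"].Value);    // http://localhost:80/faq.aspx

        // "retain" is the '?' and everything after it.
        Console.WriteLine(m.Groups["retain"].Value); // ?faqid=123&data=<sam>
    }
}
```
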
Updated code

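// subjectString is assumed to hold the full URL, including the "?..." part
string resultString = null;
try {
    // Drop everything before the first '?' and keep only the "retain" group
    resultString = Regex.Replace(subjectString, @"(?is)^(?<del>[^?]+?)(?<retain>\?.+)$", "${retain}");
} catch (ArgumentException) {
    // Syntax error in the regular expression
}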
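The code above only strips the part of the URL before the query string; it does not encode anything. One possible way to encode the angle brackets, sketched here as an assumption and not as part of the original answer, is a separate `Regex.Replace` pass with a `MatchEvaluator` that maps `<` to `%3C` and `>` to `%3E`. The names `QueryStringEncoder` and `EncodeAngleBrackets` are hypothetical. With this approach `<` and `>` would stay out of excludeList, since that list is matched against parameter names.

```csharp
using System;
using System.Text.RegularExpressions;

static class QueryStringEncoder
{
    // Hypothetical helper: percent-encodes '<' and '>' and leaves every other character untouched.
    public static string EncodeAngleBrackets(string query)
    {
        // C# 2.0 anonymous method used as the MatchEvaluator.
        return Regex.Replace(query, "[<>]", delegate(Match m)
        {
            return m.Value == "<" ? "%3C" : "%3E";
        });
    }
}
```

For example, `QueryStringEncoder.EncodeAngleBrackets("?faqid=123&des=dxb&data=<sam>")` returns `"?faqid=123&des=dxb&data=%3Csam%3E"`. If whole parameter values need encoding, not just these two characters, `HttpUtility.UrlEncode` from `System.Web` may be a better fit.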
