c# - Regex Issue in C#

Question

I am trying to create a C# routine that removes all of the following prefixes and suffixes and returns just the root word of a domain:

var stripChars = new List<string> { "http://", "https://", "www.", "ftp.", ".com",  ".net", ".org", ".info", ".co", ".me", ".mobi", ".us", ".biz" };

I do this with the following code:

originalDomain = stripChars.Aggregate(originalDomain, (current, repl) => Regex.Replace(current, repl, @"", RegexOptions.IgnoreCase));

This seems to work in almost all cases. Today, however, I discovered that setting "originalDomain" to "NameCheap.com" does not return:

NameCheap

Like it should, but rather:

NCheap

Can anyone look at this and tell me what is going wrong? Any help would be appreciated.

score 13 · Accepted Answer

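This is expected: in a regular expression, a dot matches any character.

So ".me" matches the "ame" in "NameCheap" (the match is case-insensitive), and removing it leaves "NCheap".

Escape the dot with a backslash.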
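As a minimal sketch of that fix, the snippet below keeps the Aggregate approach from the question but passes each entry through Regex.Escape, which escapes the dot and any other metacharacters. The class name DomainStripper and the method name Strip are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DomainStripper
{
    // Same list as in the question.
    private static readonly List<string> StripChars = new List<string>
    {
        "http://", "https://", "www.", "ftp.", ".com", ".net", ".org",
        ".info", ".co", ".me", ".mobi", ".us", ".biz"
    };

    // Hypothetical helper: removes every listed prefix/suffix from the input.
    public static string Strip(string originalDomain)
    {
        // Regex.Escape turns "." into "\." so it matches only a literal dot.
        return StripChars.Aggregate(originalDomain, (current, repl) =>
            Regex.Replace(current, Regex.Escape(repl), "", RegexOptions.IgnoreCase));
    }
}
```

With this change, DomainStripper.Strip("NameCheap.com") yields "NameCheap". Note that the patterns are still unanchored, so an entry such as ".com" is removed wherever it occurs in the string, not only at the end.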

Also, you would be better off using a dedicated URI API for this kind of operation.

score 3 · Answer

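I know this doesn't directly answer your question, but given the specific task you are trying to accomplish, I'd suggest trying the following: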

Uri uri = new Uri(originalDomain);
originalDomain = uri.Host;

Edit:

If your input might not contain a scheme, you can use UriBuilder instead:

var hostName = new UriBuilder(input).Host;
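To make the UriBuilder suggestion concrete, here is a small sketch wrapped in a method. The names HostExtractor and GetHost are hypothetical; the only framework call is the UriBuilder(string) constructor, which defaults to the "http" scheme when the input has none.

```csharp
using System;

public static class HostExtractor
{
    // Hypothetical helper: returns the host part of the input,
    // whether or not the input includes a scheme.
    public static string GetHost(string input)
    {
        // UriBuilder assumes "http://" when the string has no scheme,
        // so "NameCheap.com" parses just like "http://NameCheap.com".
        return new UriBuilder(input).Host;
    }
}
```

Keep in mind that the host returned this way still contains any "www." prefix and the top-level domain, and that Uri normalizes host names to lowercase, so extracting just the root word would need a further step.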

Hope this helps.
