I am trying to create a C# routine that removes all of the following prefixes and suffixes and returns just the root word of a domain:

var stripChars = new List<string> { "http://", "https://", "www.", "ftp.", ".com",  ".net", ".org", ".info", ".co", ".me", ".mobi", ".us", ".biz" };

I do this with the following code:

originalDomain = stripChars.Aggregate(originalDomain, (current, repl) => Regex.Replace(current, repl, @"", RegexOptions.IgnoreCase));

This seems to work in almost all cases. Today, however, I discovered that setting originalDomain to "NameCheap.com" does not return:

NameCheap

Like it should, but rather:

NCheap

Can anyone look at this and tell me what is going wrong? Any help would be appreciated.

2 Answers

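This is expected: in a regular expression, a dot matches any character.

So the pattern .me matches the "ame" in "NameCheap", and removing that match leaves "NCheap".

Escape the dot with a backslash (for example \.me) so that it only matches a literal dot.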
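As a minimal sketch of that fix, the question's Aggregate call can pass each entry through Regex.Escape, which escapes the dots (and any other metacharacters) instead of requiring the list to be edited by hand. The class name DomainStripper and method name Strip are hypothetical, chosen only for this illustration; the list is the one from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the code from the question.
public static class DomainStripper
{
    // Same list as in the question; the entries are treated as literal text.
    private static readonly List<string> StripChars = new List<string>
    {
        "http://", "https://", "www.", "ftp.", ".com", ".net", ".org",
        ".info", ".co", ".me", ".mobi", ".us", ".biz"
    };

    public static string Strip(string originalDomain)
    {
        // Regex.Escape turns ".me" into "\.me", so the dot no longer
        // matches an arbitrary character such as the "a" in "Name".
        return StripChars.Aggregate(
            originalDomain,
            (current, repl) => Regex.Replace(
                current, Regex.Escape(repl), "", RegexOptions.IgnoreCase));
    }
}
```

With this change, DomainStripper.Strip("NameCheap.com") returns "NameCheap". The entries are still removed wherever they occur in the string, not only at the start or end, so this sketch fixes the dot problem but is not a general domain parser.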

Also, you would be better off using a dedicated URI API for this kind of operation.

Answered 2013-05-28T14:26:25.770

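I know this does not directly answer your question, but given the specific task you are trying to accomplish, I suggest trying the following: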

Uri uri = new Uri(originalDomain);
originalDomain = uri.Host;

Edit:

If your input might not include a scheme, you can use a UriBuilder, as described in this post:

var hostName = new UriBuilder(input).Host;
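The UriBuilder suggestion could be wrapped in a small helper along these lines. This is only a sketch: the class name HostExtractor and method name GetHost are hypothetical, and it relies on the documented behavior that the UriBuilder(string) constructor defaults to the http scheme when the input does not specify one.

```csharp
using System;

// Hypothetical helper illustrating the UriBuilder approach.
public static class HostExtractor
{
    public static string GetHost(string input)
    {
        // UriBuilder assumes "http" when no scheme is given, so both
        // "example.com" and "http://www.example.com/path" are accepted.
        // Input that cannot be parsed throws UriFormatException.
        return new UriBuilder(input).Host;
    }
}
```

Note that Host is the full host name, including any "www." prefix and the top-level domain, so further trimming would still be needed to get only the root word the question asks for.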

Hope this helps.

Answered 2013-05-28T14:29:35.847