I wrote the following regular expression:

(https?:\/\/)?([da-z\.-]+)\.([a-z]{2,6})(\/(\w|-)*)*\/?

You can see how it behaves here: http://gskinner.com/RegExr/?34b8m

I wrote the following JavaScript code:

var urlexp = new RegExp(
    '^(https?:\/\/)?([da-z\.-]+)\.([a-z]{2,6})(\/(\w|-)*)*\/?$', 'gi'
);
document.write(urlexp.test("blaaa"))

It returns true, even though the regular expression should not accept a single word as valid.

What am I doing wrong?

1 Answer

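Your problem is that JavaScript processes every escape sequence in a string literal as a string escape before the RegExp constructor ever sees it. So your regular expression ends up in memory looking like this: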

^(https?://)?([da-z.-]+).([a-z]{2,6})(/(w|-)*)*/?$
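If you want to check this yourself, here is a minimal sketch that inspects the string before it is compiled. The variable names are made up for illustration, and I dropped the g flag because it is not needed for a single test:

```javascript
// The string literal is processed first, so escapes such as \. \/ and \w
// lose their backslash before RegExp ever sees the pattern.
var pattern = '^(https?:\/\/)?([da-z\.-]+)\.([a-z]{2,6})(\/(\w|-)*)*\/?$';

// Print what the RegExp constructor will actually receive.
console.log(pattern);

// No backslash survives in the string.
var hasBackslash = pattern.indexOf('\\') !== -1;
console.log(hasBackslash); // false

// The unescaped dot now matches any character, so a bare word passes.
var broken = new RegExp(pattern, 'i');
console.log(broken.test('blaaa')); // true
```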

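You may notice that this causes a problem in the middle, where what you thought was a literal period has turned into the regular expression wildcard. You can fix this in a couple of ways. Either use the forward-slash regular expression literal syntax that JavaScript provides: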

var urlexp = /^(https?:\/\/)?([da-z\.-]+)\.([a-z]{2,6})(\/(\w|-)*)*\/?$/gi

Or by escaping your backslashes (and not your forward slashes, as you had been doing - that's exclusively for when you're using /regex/mod notation, just like you don't have to escape your single quotes in a double quoted string and vice versa):

var urlexp = new RegExp('^(https?://)?([da-z.-]+)\\.([a-z]{2,6})(/(\\w|-)*)*/?$', 'gi')

Please note the double backslash before the w - also necessary for matching word characters.
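As a quick sanity check, both fixes should behave the same way. This is only a sketch with a few sample inputs I picked myself, and it uses the i flag alone so that repeated test() calls stay independent:

```javascript
// Fix 1: regex literal. Forward slashes are escaped, backslashes are written once.
var fromLiteral = /^(https?:\/\/)?([da-z\.-]+)\.([a-z]{2,6})(\/(\w|-)*)*\/?$/i;

// Fix 2: string notation. Every backslash is doubled, forward slashes are left alone.
var fromString = new RegExp('^(https?://)?([da-z.-]+)\\.([a-z]{2,6})(/(\\w|-)*)*/?$', 'i');

// Sample inputs chosen for illustration only.
var samples = ['blaaa', 'example.com', 'http://example.com/foo'];

var results = samples.map(function (s) {
    return { input: s, literal: fromLiteral.test(s), string: fromString.test(s) };
});

console.log(results);
```

With either version, "blaaa" is rejected because the pattern now requires a literal dot.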

A couple of notes on your regular expression itself:

[da-z.-]

d is already contained in the a-z range. Unless you meant \d? In that case, the backslash is important.
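To illustrate the difference, here is a small sketch that tests the two character classes on their own. The anchors and the sample strings are my additions:

```javascript
// Without the backslash, d is just the letter d, so digits are not allowed.
var withoutDigits = /^[da-z.-]+$/i;

// With \d, the class also accepts 0-9.
var withDigits = /^[\da-z.-]+$/i;

console.log(withoutDigits.test('web'));  // true
console.log(withoutDigits.test('web2')); // false
console.log(withDigits.test('web2'));    // true
```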

(/(\w|-)*)*/?

My own misgivings about the nested Kleene stars aside, you can whittle that alternation down into a character class, and drop the terminating /? entirely, as a trailing slash will be matched by the group as you've given it. I'd rewrite as:

(/[\w-]*)*

Though maybe you'd just like to catch non-space characters?

(/[^/\s]*)*
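Here is a rough comparison of the three path patterns in regex literal notation. The anchors and the sample paths are assumptions of mine, added so the fragments can be tested in isolation:

```javascript
// Original path part: alternation plus an optional trailing slash.
var original = /^(\/(\w|-)*)*\/?$/;

// Same thing with a character class. A trailing slash is still matched,
// because the group can match a slash followed by zero characters.
var simplified = /^(\/[\w-]*)*$/;

// Looser variant: anything that is not a slash or whitespace.
var loose = /^(\/[^\/\s]*)*$/;

// Sample paths chosen for illustration only.
var paths = ['/foo/bar/', '/foo-bar', '/foo?x=1', '/a b'];

var comparison = paths.map(function (p) {
    return {
        path: p,
        original: original.test(p),
        simplified: simplified.test(p),
        loose: loose.test(p)
    };
});

console.log(comparison);
```

The first two patterns agree on every sample, while the looser one additionally accepts the path containing a query string.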

Anyway, modified this way your regular expression winds up looking more like:

^(https?://)?([\da-z.-]+)\.([a-z]{2,6})(/[\w-]*)*$
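Written out in regex literal notation, that might look like the sketch below. The looksLikeUrl helper is a hypothetical name, and I left off the g flag on purpose, since a global regex remembers its position between test() calls and that is not wanted in a reusable check:

```javascript
// The modified expression as a regex literal: forward slashes escaped,
// backslashes written once.
var urlexp = /^(https?:\/\/)?([\da-z.-]+)\.([a-z]{2,6})(\/[\w-]*)*$/i;

// Hypothetical helper wrapping the test.
function looksLikeUrl(text) {
    return urlexp.test(text);
}

console.log(looksLikeUrl('blaaa'));                                   // false
console.log(looksLikeUrl('example.com'));                             // true
console.log(looksLikeUrl('https://sub.example.co.uk/path/to-page/')); // true
console.log(looksLikeUrl('http://example.com/a b'));                  // false
```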

Remember, if you're going to use string notation: Double EVERY backslash. If you're going to use native /regex/mod notation (which I highly recommend), escape your forward slashes.

Answered 2013-03-30T09:11:35.783