17

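I'm trying to create a regex to verify that a given string contains only alphabetic characters a-z or A-Z. The string can be up to 25 letters long. (I'm not sure whether a regex can check the length of a string.)

Examples: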
1. "abcdef" = true;
2. "a2bdef" = false ;
3. "333" = false;
4. "j" = true;
5. "aaaaaaaaaaaaaaaaaaaaaaaaaa" = false; // 26 letters

This is what I have so far, though I can't figure out what's wrong with it:

Regex alphaPattern = new Regex("[^a-z]|[^A-Z]");

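I would think this means the string may contain only upper-case or lower-case letters from a-z, but when I match it against a string of all letters it returns false...

Also, any suggestions about the efficiency of using a regex versus other validation methods would be greatly appreciated.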

4

6 Answers

38
Regex lettersOnly = new Regex("^[a-zA-Z]{1,25}$");
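  • ^ means "start matching at the beginning of the string"
  • [a-zA-Z] means "match lowercase and uppercase letters a-z"
  • {1,25} means "match the previous item (the character class, see above) 1 to 25 times"
  • $ means "only match if the cursor is at the end of the string"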
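
As a quick check, here is a minimal sketch that wraps this pattern in a helper and can be run against the five examples from the question. The class name LettersOnlyCheck and the method IsLettersOnly are made up for illustration; only Regex and IsMatch come from System.Text.RegularExpressions.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class LettersOnlyCheck
{
    // Same pattern as above: 1 to 25 ASCII letters spanning the whole string.
    private static readonly Regex LettersOnly = new Regex("^[a-zA-Z]{1,25}$");

    public static bool IsLettersOnly(string input)
    {
        return LettersOnly.IsMatch(input);
    }
}
```

With this helper, IsLettersOnly("abcdef") and IsLettersOnly("j") return true, while "a2bdef", "333", and a string of 26 letters all return false.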
Answered 2009-06-13T09:15:10.250
19

I'm trying to create a regex to verify that a given string only has alpha characters a-z or A-Z.

Easily done, as many of the others have indicated, using what are known as "character classes". Essentially, these allow us to specify a range of values to use for matching: (NOTE: for simplification, I am assuming implicit ^ and $ anchors, which are explained later in this post)

[a-z] Match any single lower-case letter.
ex: a matches, 8 doesn't match

[A-Z] Match any single upper-case letter.
ex: A matches, a doesn't match

[0-9] Match any single digit zero to nine
ex: 8 matches, a doesn't match

[aeiou] Match only on a or e or i or o or u.
ex: o matches, z doesn't match

[a-zA-Z] Match any single lower-case OR upper-case letter.
ex: A matches, a matches, 3 doesn't match

These can, naturally, be negated as well:

[^a-z] Match anything that is NOT a lower-case letter.
ex: 5 matches, A matches, a doesn't match

[^A-Z] Match anything that is NOT an upper-case letter.
ex: 5 matches, A doesn't match, a matches

[^0-9] Match anything that is NOT a digit.
ex: 5 doesn't match, A matches, a matches

[^Aa69] Match anything as long as it is not A or a or 6 or 9.
ex: 5 matches, A doesn't match, a doesn't match, 3 matches

To see some common character classes, go to: http://www.regular-expressions.info/reference.html

The string can be up to 25 letters long. (I'm not sure if regex can check length of strings)

You can absolutely check "length", but not in the way you might imagine. Strictly speaking, we measure repetition, NOT length, using {}:

a{2} Match two a's together.
ex: a doesn't match, aa matches, aca doesn't match

4{3} Match three 4's together. ex: 4 doesn't match, 44 doesn't match, 444 matches, 4434 doesn't match

Repetition has values we can set to have lower and upper limits:

a{2,} Match on two or more a's together. ex: a doesn't match, aa matches, aaa matches, aba doesn't match, aaaaaaaaa matches

a{2,5} Match on two to five a's together. ex: a doesn't match, aa matches, aaa matches, aba doesn't match, aaaaaaaaa doesn't match

Repetition extends to character classes, so: [a-z]{5} Match any five lower-case characters together. ex: bubba matches, Bubba doesn't match, BUBBA doesn't match, asdjo matches

[A-Z]{2,5} Match two to five upper-case characters together. ex: bubba doesn't match, Bubba doesn't match, BUBBA matches, BUBBETTE doesn't match

[0-9]{4,8} Match four to eight numbers together. ex: bubba doesn't match, 15835 matches, 44 doesn't match, 3456876353456 doesn't match

[a3g]{2} Match two characters together, each of which is an a OR 3 OR g. ex: aa matches, ba doesn't match, 33 matches, 38 doesn't match, a3 matches (the two characters do not have to be the same)
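
To try the character-class and repetition examples in .NET, a minimal sketch could look like the following. The examples assume implicit ^ and $ anchors, which Regex.IsMatch does not add, so the helper wraps each pattern in explicit anchors. The names WholeStringMatch and Matches are hypothetical and exist only for this illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for trying out the examples; not part of the original answer.
public static class WholeStringMatch
{
    // Regex.IsMatch accepts a match anywhere in the input, so the pattern is
    // wrapped in a non-capturing group with explicit ^ and $ anchors to
    // emulate the "implicit anchors" assumed by the examples.
    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }
}
```

For instance, Matches("[a-z]{5}", "bubba") returns true and Matches("[a-z]{5}", "Bubba") returns false; Matches("a{2,5}", "aaaaaaaaa") returns false; Matches("[^Aa69]", "3") returns true. Matches("[a3g]{2}", "a3") returns true, because each of the two positions may independently be a, 3, or g.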

Now let's look at your regex:
[^a-z]|[^A-Z]
Translation: Match any single character that is NOT a lowercase letter, OR any single character that is NOT an upper-case letter. Every character satisfies at least one of those two conditions, so this matches any character at all.

To fix it so it meets your needs, we would rewrite it like this:

Step 1: Remove the negation
[a-z]|[A-Z]
Translation: Find any lowercase letter OR uppercase letter.

Step 2: While not strictly needed, let's clean up the OR logic a bit
[a-zA-Z]
Translation: Find any lowercase letter OR uppercase letter. Same as above but now using only a single set of [].

Step 3: Now let's indicate "length"
[a-zA-Z]{1,25}
Translation: Find any lowercase letter OR uppercase letter repeated one to twenty-five times.

This is where things get funky. You might think you were done here and you may well be depending on the technology you are using.

Strictly speaking the regex [a-zA-Z]{1,25} will match one to twenty-five upper or lower-case letters ANYWHERE on a line:

[a-zA-Z]{1,25} a matches, aZgD matches, BUBBA matches, 243242hello242552 MATCHES
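
The "anywhere on a line" behaviour is easy to see in .NET by asking for the matched text rather than a yes/no answer. This is only a sketch; UnanchoredDemo and FirstMatch are hypothetical names, while Regex.Match, Match.Success, and Match.Value are the standard System.Text.RegularExpressions members.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class UnanchoredDemo
{
    // Returns the first substring the pattern matches, or null if there is none.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }
}
```

FirstMatch("[a-zA-Z]{1,25}", "243242hello242552") returns "hello": the pattern succeeds on the letters in the middle and ignores the digits around them. FirstMatch("[a-zA-Z]{1,25}", "333") returns null, since there is no letter to match at all.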

In fact, every example I have given so far will do the same. If that is what you want then you are in good shape but based on your question, I'm guessing you ONLY want one to twenty-five upper or lower-case letters on the entire line. For that we turn to anchors. Anchors allow us to specify those pesky details:

^ beginning of a line
(I know, we just used this for negation earlier, don't get me started)

$ end of a line

We can use them like this:

^a{3} From the beginning of the line match a three times together ex: aaa matches, 123aaa doesn't match, aaa123 matches

a{3}$ Match a three times together at the end of a line ex: aaa matches, 123aaa matches, aaa123 doesn't match

^a{3}$ Match a three times together for the ENTIRE line ex: aaa matches, 123aaa doesn't match, aaa123 doesn't match

Notice that aaa matches in all cases because it has three a's at the beginning and end of the line technically speaking.

So the final, technically correct solution for finding a "word" that is "up to twenty-five characters long" on a line would be:

^[a-zA-Z]{1,25}$

The funky part is that some technologies implicitly put anchors in the regex for you and some don't. You just have to test your regex or read the docs to see if you have implicit anchors.
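
For .NET specifically, Regex.IsMatch adds no implicit anchors, so they have to be written out. One further detail that may matter for validation: without RegexOptions.Multiline, $ matches at the very end of the string and also just before a final newline, whereas \z matches only at the very end. The sketch below contrasts the two; AnchorDemo and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class AnchorDemo
{
    // $ tolerates a single trailing "\n"; \z does not.
    private static readonly Regex Loose = new Regex(@"^[a-zA-Z]{1,25}$");
    private static readonly Regex Strict = new Regex(@"\A[a-zA-Z]{1,25}\z");

    public static bool IsLooseMatch(string input)
    {
        return Loose.IsMatch(input);
    }

    public static bool IsStrictMatch(string input)
    {
        return Strict.IsMatch(input);
    }
}
```

Both methods accept "abc" and reject "abc1". They differ on "abc\n": IsLooseMatch returns true, IsStrictMatch returns false. Which one is appropriate depends on whether a trailing newline should count as valid input.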

Answered 2009-06-14T10:48:18.553
7
using System.Text.RegularExpressions;

/// <summary>
/// Checks that the string consists only of the letters a-z and A-Z and is 1 to 25 characters long
/// </summary>
/// <param name="value">String to be matched</param>
/// <returns>True if matches, false otherwise</returns>
public static bool IsValidString(string value)
{
    // ^ and $ anchor the match so the whole string must be 1 to 25 letters
    string pattern = @"^[a-zA-Z]{1,25}$";
    return Regex.IsMatch(value, pattern);
}
Answered 2009-06-15T08:14:55.100
6

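The string can be up to 25 letters long. (I'm not sure if regex can check length of strings)

Regexes certainly can check the length of a string, as can be seen from the answers posted by others.

However, when you are validating user input (say, a username), I would advise doing that check separately.

The problem is that a regex can only tell you whether a string matched or not. It won't tell you why it didn't match. Was the text too long, or did it contain disallowed characters? You can't tell. It is far from friendly when a program says: "The supplied username contained invalid characters or was too long". Instead, you should provide separate error messages for the different situations.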
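
One way to separate the checks is sketched below. It is only an illustration of the idea: the class name UsernameValidator, the method Validate, the message texts, and the choice to return null for valid input are all assumptions, not something the question or this answer prescribes.

```csharp
using System.Text.RegularExpressions;

// Hypothetical validator for illustration; names and messages are assumptions.
public static class UsernameValidator
{
    private const int MaxLength = 25;

    // Matches if the input contains at least one character outside a-z / A-Z.
    private static readonly Regex NonLetter = new Regex("[^a-zA-Z]");

    // Returns null when the value is valid, otherwise a message
    // naming the specific rule that was violated.
    public static string Validate(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return "The user name must not be empty.";
        }
        if (value.Length > MaxLength)
        {
            return "The user name must be at most 25 characters long.";
        }
        if (NonLetter.IsMatch(value))
        {
            return "The user name may only contain the letters a-z and A-Z.";
        }
        return null;
    }
}
```

Here the length is checked with string.Length and the regex is only responsible for the character rule, so each failure can be reported on its own. The order of the checks decides which message a string that breaks several rules receives.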

Answered 2009-06-13T09:42:12.463
4

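The regular expression you are using is an alternation of [^a-z] and [^A-Z]. And the expression [^…] means to match any character other than those described in the character set.

So overall, your expression means to match any single character that is not in a-z, or any single character that is not in A-Z.

But you rather need a regular expression that matches a-zA-Z only:

[a-zA-Z]

And to specify its length, anchor the expression with the start (^) and end ($) of the string and describe the length with the {n,m} quantifier, meaning at least n but no more than m repetitions: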

^[a-zA-Z]{0,25}$
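
Note that a lower bound of 0 also accepts the empty string, whereas {1,25} does not. Whether an empty string should be valid is not stated in the question, so this is a choice to make deliberately. A minimal sketch of the difference, with the hypothetical names EmptyInputDemo and AcceptsEmpty:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class EmptyInputDemo
{
    // Reports whether the given pattern matches the empty string.
    public static bool AcceptsEmpty(string pattern)
    {
        return Regex.IsMatch("", pattern);
    }
}
```

AcceptsEmpty("^[a-zA-Z]{0,25}$") returns true, while AcceptsEmpty("^[a-zA-Z]{1,25}$") returns false.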
Answered 2009-06-13T09:28:24.057
1

Do I understand correctly that it can only contain either uppercase or lowercase letters?

new Regex("^([a-z]{1,25}|[A-Z]{1,25})$")

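A regular expression seems to be the right thing to use in this case.

By the way, a caret ("^") in the first position inside a character class means "not", so your "[^a-z]|[^A-Z]" would mean "not any lowercase letter, or not any uppercase letter" (disregarding that a-z are not all the letters there are).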
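
To make the difference from [a-zA-Z]{1,25} concrete: the alternation in this answer accepts a string that is entirely lowercase or entirely uppercase, but not a mix of the two. A small sketch, where SingleCaseDemo and IsSingleCase are hypothetical names:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class SingleCaseDemo
{
    // 1 to 25 lowercase letters, or 1 to 25 uppercase letters, but not both.
    private static readonly Regex SingleCase =
        new Regex("^([a-z]{1,25}|[A-Z]{1,25})$");

    public static bool IsSingleCase(string input)
    {
        return SingleCase.IsMatch(input);
    }
}
```

IsSingleCase("abc") and IsSingleCase("ABC") return true, while IsSingleCase("aBc") returns false. If mixed case should be allowed, the single class [a-zA-Z] is the pattern to use instead.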

Answered 2009-06-13T09:17:14.553