
6 Answers

29

You can use the Pattern and Matcher classes here. Put all the filtered characters in a character class, and use the Matcher#find() method to check whether the pattern occurs anywhere in the string.

You can do it like this:

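import java.util.regex.Matcher;
import java.util.regex.Pattern;

public boolean containsIllegals(String toExamine) {
    // Character class listing every illegal character
    Pattern pattern = Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\_^]");
    Matcher matcher = pattern.matcher(toExamine);
    return matcher.find();
}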

The find() method returns true if the pattern is found anywhere in the string, even once.
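To make the difference between find() and matches() concrete, here is a minimal sketch. The class name FindVersusMatches and the helper isSingleIllegal are made up for illustration; only the character class comes from the answer above.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    // Same character class as above, compiled once and reused.
    static final Pattern ILLEGAL = Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\_^]");

    // find() succeeds if ANY part of the input matches the pattern.
    static boolean containsIllegals(String toExamine) {
        return ILLEGAL.matcher(toExamine).find();
    }

    // matches() succeeds only if the WHOLE input matches the pattern,
    // i.e. the input is exactly one illegal character (hypothetical helper).
    static boolean isSingleIllegal(String toExamine) {
        return ILLEGAL.matcher(toExamine).matches();
    }
}
```

With this pattern, matches() would reject "abc#def" because the whole string is not a single illegal character, which is why find() is the call to use for a "contains" check.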


Another way that has not yet been pointed out is using String#split(regex). We can split the string on the given pattern, and check the length of the array. If length is 1, then the pattern was not in the string.

public boolean containsIllegals(String toExamine) {
    String[] arr = toExamine.split("[~#@*+%{}<>\\[\\]|\"\\_^]", 2);
    return arr.length > 1;
}

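If arr.length > 1, the string contained at least one of the characters in the pattern, which is why it was split. I pass limit = 2 as the second parameter to split because a single split is enough. A positive limit also keeps trailing empty strings, so an illegal character at the very end of the string is still detected.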
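The limit argument matters for more than speed. As a rough sketch (the class and method names are hypothetical), compare the two-argument call with the one-argument split(regex), which discards trailing empty strings:

```java
class SplitLimitDemo {
    static final String ILLEGAL_CLASS = "[~#@*+%{}<>\\[\\]|\"\\_^]";

    // limit = 2: at most one split, and a trailing empty string is kept.
    static boolean containsIllegalsWithLimit(String toExamine) {
        return toExamine.split(ILLEGAL_CLASS, 2).length > 1;
    }

    // No limit: trailing empty strings are removed from the result,
    // so an illegal character at the end of the input goes unnoticed.
    static boolean containsIllegalsNoLimit(String toExamine) {
        return toExamine.split(ILLEGAL_CLASS).length > 1;
    }
}
```

For the input "abc#", the limited version sees two parts ("abc" and an empty string), while the unlimited version sees only "abc" and would wrongly report that nothing illegal was found.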

answered 2013-01-31T21:27:33.267
13

I need the method to scan every character in the string

If you must do it character-by-character, regexp is probably not a good way to go. However, since all characters on your "blacklist" have codes less than 128, you can do it with a small boolean array:

static final boolean blacklist[] = new boolean[128];

static {
    // Unassigned elements of the array are set to false
    blacklist[(int)'~'] = true;
    blacklist[(int)'#'] = true;
    blacklist[(int)'@'] = true;
    blacklist[(int)'*'] = true;
    blacklist[(int)'+'] = true;
    // ...
}

static boolean isBad(char ch) {
    return (ch < 128) && blacklist[(int)ch];
}
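A possible way to wire isBad into a scan of the whole string is sketched below. The full blacklist is an assumption here: it reuses the character set from the regex-based answers, since the list above is elided. The names BlacklistScan and BLACKLISTED are also made up for illustration.

```java
class BlacklistScan {
    // Assumption: same characters as in the regex-based answers.
    static final String BLACKLISTED = "~#@*+%{}<>[]|\"_^";

    // Lookup table indexed by character code; unset entries default to false.
    static final boolean[] BLACKLIST = new boolean[128];

    static {
        for (int i = 0; i < BLACKLISTED.length(); i++) {
            BLACKLIST[BLACKLISTED.charAt(i)] = true;
        }
    }

    static boolean isBad(char ch) {
        // The range check keeps characters >= 128 from indexing past the table.
        return ch < 128 && BLACKLIST[ch];
    }

    // Checks each character in turn and stops at the first blacklisted one.
    static boolean containsIllegals(String toExamine) {
        for (int i = 0; i < toExamine.length(); i++) {
            if (isBad(toExamine.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```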
answered 2013-01-31T21:27:25.413
10

Use a constant to avoid recompiling the regex on every validation.

private static final Pattern INVALID_CHARS_PATTERN =
        Pattern.compile("^.*[~#@*+%{}<>\\[\\]|\"\\_].*$");

And change your code to:

public boolean containsIllegals(String toExamine) {
    return INVALID_CHARS_PATTERN.matcher(toExamine).matches();
}

Compiling the pattern once is the main saving when using regex, since the compilation cost is no longer paid on every call.
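One caveat with the whole-input pattern is worth illustrating: by default `.` does not match line terminators, so matches() can miss an illegal character in multi-line input. The sketch below compares it with a precompiled character class used with find(). The class name, the CLASS_ONLY constant and both method names are hypothetical.

```java
import java.util.regex.Pattern;

class PrecompiledPatternDemo {
    // Pattern from the answer above: has to match the entire input.
    static final Pattern WHOLE_INPUT =
            Pattern.compile("^.*[~#@*+%{}<>\\[\\]|\"\\_].*$");

    // Alternative: only the character class, searched for with find().
    static final Pattern CLASS_ONLY =
            Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\_]");

    static boolean viaMatches(String toExamine) {
        return WHOLE_INPUT.matcher(toExamine).matches();
    }

    static boolean viaFind(String toExamine) {
        return CLASS_ONLY.matcher(toExamine).find();
    }
}
```

Both methods agree on single-line input such as "ab#cd". For "ab\n#cd", viaMatches returns false because `.*` cannot cross the newline, while viaFind still reports the `#`.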

answered 2013-01-31T21:49:12.463
8

If you can't use a matcher, then you can do something like this, which is cleaner than a bunch of different if statements or a byte array.

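public boolean containsIllegals(String toExamine) {
    for (int i = 0; i < toExamine.length(); i++) {
        char c = toExamine.charAt(i);
        // String.contains() does not accept a char; indexOf() returns -1 when c is absent
        if ("~#@*+%{}<>[]|\"_^".indexOf(c) >= 0) {
            return true;
        }
    }
    return false;
}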
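If Java 8 or later is available (an assumption, the question does not state a version), the same character-by-character check can be sketched with String.chars(). The class name and the ILLEGAL_CHARS constant are made up for illustration.

```java
class IndexOfScan {
    static final String ILLEGAL_CHARS = "~#@*+%{}<>[]|\"_^";

    static boolean containsIllegals(String toExamine) {
        // chars() yields each char as an int; indexOf(int) returns -1 when
        // the character is not in ILLEGAL_CHARS. anyMatch stops at the first hit.
        return toExamine.chars().anyMatch(c -> ILLEGAL_CHARS.indexOf(c) >= 0);
    }
}
```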
answered 2013-01-31T21:30:19.637
5

Try the negation of a character class containing all the blacklisted characters:

public boolean containsIllegals(String toExamine) {
    return !toExamine.matches("[^~#@*+%{}<>\\[\\]|\"\\_^]*");
}

The pattern matches only strings made up entirely of allowed characters, so negating the result returns true if the string contains illegals (your original function seemed to return false in that case).

The caret ^ just to the right of the opening bracket [ negates the character class. Note that in String.matches() you don't need the anchors ^ and $ because it automatically matches the whole string.

answered 2013-01-31T21:30:48.870
2

A pretty compact way of doing this would be to rely on the String.replaceAll method:

public boolean containsIllegal(final String toExamine) {
    return toExamine.length() != toExamine.replaceAll(
            "[~#@*+%{}<>\\[\\]|\"\\_^]", "").length();
}
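The length comparison works because replaceAll removes every illegal character, so the difference in length is the number of such characters. A small sketch of that idea, with a hypothetical class and method name:

```java
class IllegalCharCount {
    // Returns how many illegal characters the input contains:
    // each removed character shortens the cleaned copy by one.
    static int countIllegals(String toExamine) {
        String cleaned = toExamine.replaceAll("[~#@*+%{}<>\\[\\]|\"\\_^]", "");
        return toExamine.length() - cleaned.length();
    }
}
```

A non-zero count corresponds to containsIllegal returning true. Note that this approach always processes the whole string and builds a new one, even when the first character is already illegal.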
answered 2013-01-31T21:47:31.647