For example, if I wanted to remove non-alphabetic characters, I would do this:

for (int i = 0; i < s.length; i++) {
    s[i] = s[i].replaceAll("[^a-zA-Z]", "");
}

How can I completely exclude words that contain non-alphabetic characters from a string?

For example, given the initial input:

"a cat jumped jumped; on the table"

It should exclude "jumped;" because of the ";".

Output:

"a cat jumped on the table"

4 Answers

Edit: (in response to your edit)

You could do this:

String input = "a cat jumped jumped; on the table";
input = input.replaceAll("(^| )[^ ]*[^A-Za-z ][^ ]*(?=$| )", "");

Let's break down the regex:

  • (^| ) matches the start of a word: either the start of the string or a space.
  • [^ ]* matches any run of non-space characters, including the empty string (a space would end the word).
  • [^A-Za-z ] matches one character that is neither a letter nor a space, i.e. the non-alphabetical character that disqualifies the word.
  • [^ ]* then matches the rest of the word.
  • (?=$| ) asserts the end of the word, either the end of the string or the next space. Because it is a lookahead, it does not consume that space, so consecutive words can still match (e.g. "I want to say hello, world! everybody" becomes "I want to say everybody").
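To check the pattern above against the inputs mentioned in this answer, here is a minimal, self-contained sketch. The class name RegexWordFilter and the method removeNonAlphaWords are made up for illustration; only the regex itself comes from the answer.

```java
class RegexWordFilter {
    // Hypothetical helper: removes every space-delimited word that contains
    // at least one character other than A-Z or a-z.
    static String removeNonAlphaWords(String input) {
        return input.replaceAll("(^| )[^ ]*[^A-Za-z ][^ ]*(?=$| )", "");
    }

    public static void main(String[] args) {
        // prints: a cat jumped on the table
        System.out.println(removeNonAlphaWords("a cat jumped jumped; on the table"));
        // prints: I want to say everybody
        System.out.println(removeNonAlphaWords("I want to say hello, world! everybody"));
    }
}
```

Note that when the very first word is rejected, the space that followed it stays in the result (for example "hello, world" becomes " world"), so a trim() may be wanted afterwards.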

Note: if "a cat jumped off the table." should output "a cat jumped off the table", then use this:

input = input.replaceAll(" [^ ]*[^A-Za-z ][^ ]*(?= )", "").replaceAll("[^A-Za-z]$", "");

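Assuming you have one word per array element, you can do this to replace each offending word with the empty string:

// an indexed loop is needed: assigning to a for-each variable would not change the array
for (int i = 0; i < s.length; i++) {
    if (s[i].matches(".*[^A-Za-z].*")) {
        s[i] = "";
    }
}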

If you actually want to remove it, consider using an ArrayList:

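ArrayList<String> stringList = new ArrayList<>(); // requires import java.util.ArrayList;

for (int index = 0; index < s.length; index++) {
    // keep only the elements that do NOT contain a non-alphabetical character
    if (!s[index].matches(".*[^A-Za-z].*")) {
        stringList.add(s[index]);
    }
}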

And the ArrayList will have all the elements that don't have non-alphabetical characters in them.
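
Putting the array-based approach together, a minimal sketch might look like the following. The class WordArrayFilter and the method keepAlphabeticWords are hypothetical names; the sketch assumes one word per array element, as above, and keeps only the elements that contain no non-alphabetical character.

```java
import java.util.ArrayList;
import java.util.List;

class WordArrayFilter {
    // Hypothetical helper: returns the words that consist only of A-Z / a-z.
    static List<String> keepAlphabeticWords(String[] words) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            // matches() tests the whole string, so ".*[^A-Za-z].*" is true
            // when any character in the word is not a letter.
            if (!word.matches(".*[^A-Za-z].*")) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] s = "a cat jumped jumped; on the table".split(" ");
        // prints: a cat jumped on the table
        System.out.println(String.join(" ", keepAlphabeticWords(s)));
    }
}
```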

answered 2014-03-19T17:05:27.737

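You can call .toLowerCase() on each value in the array and then check every character against the range a-z, which should be faster than a regular expression. Assume your values are in an array named "myArray".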

List<String> newValues = new ArrayList<>();
for(String s : myArray) {
  if(containsOnlyLetters(s)) {
    newValues.add(s);
  }
}
// do this if you need an array again instead of a List;
// the no-argument toArray() returns Object[], and casting it to String[] fails at runtime
String[] newArray = newValues.toArray(new String[0]);

Here is the containsOnlyLetters method:

boolean containsOnlyLetters(String input) {
  char[] inputLetters = input.toLowerCase().toCharArray();
  for(char c : inputLetters) {
    if(c < 'a' || c > 'z') {
      return false;
    }
  }
  return true;
}
answered 2014-03-19T17:22:55.327

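Try this:

s = String.join(" ", s).replaceAll("\\S*[^A-Za-z\\s]\\S*", "").trim().split("\\s+");

This joins the array with spaces and then applies the regex. The regex matches any whitespace-delimited token (\S* on both sides) that contains at least one character that is neither a letter nor whitespace ([^A-Za-z\s]). The spaces around a removed word are not matched, so they stay in the string; trim() and split("\\s+") then turn the result back into an array without empty elements.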
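
The join, filter, split round trip described above can be sketched as a small runnable example. It assumes Java 8 or later for String.join, and it uses a whitespace-based pattern (\S*[^A-Za-z\s]\S*) chosen for this sketch; the class and method names are hypothetical.

```java
import java.util.Arrays;

class JoinFilterSplit {
    // Hypothetical helper: joins the words, strips rejected tokens, splits again.
    static String[] dropNonAlphaWords(String[] words) {
        String joined = String.join(" ", words);
        // Remove each whitespace-delimited token that contains a non-letter.
        String cleaned = joined.replaceAll("\\S*[^A-Za-z\\s]\\S*", "");
        // Removed tokens leave extra spaces behind, so trim and split on runs of whitespace.
        return cleaned.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] s = {"a", "cat", "jumped", "jumped;", "on", "the", "table"};
        // prints: [a, cat, jumped, on, the, table]
        System.out.println(Arrays.toString(dropNonAlphaWords(s)));
    }
}
```

If every word is rejected, the result is a one-element array holding the empty string, which callers may want to treat as a special case.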

answered 2014-03-19T16:58:30.937

public static void main(String[] args) {
    String[] str = {"123abass;[;[]", "abcde", "1234"};
    for (String s : str) {
        // the whole string must consist of one or more letters;
        // matches() already anchors at both ends, so ^ and $ are optional
        if (s.matches("^[a-zA-Z]+$")) {
            System.out.println(s);
        }
    }
}

Output: abcde
answered 2014-03-19T16:58:59.130