
How to remove all punctuation in Java while keeping whitespace and numbers

replaceAll("\\p{Punct}|\\d", "") // this works, but it also removes the digits, and I want to keep them

I am reading text and I need to remove punctuation.

String[] internal;
char ch = 'a';
int counter = 1;
int count;
int c;
Map<String, Set> dictionary = new HashMap<String, Set>();
BufferedReader in = new BufferedReader(new FileReader("yu.txt"));
while (in.ready()) {
    // Problem: this also strips the digits. I want to remove only the punctuation
    // and keep both the digits and the whitespace.
    internal = in.readLine().replaceAll("\\p{Punct}|\\d", "").toLowerCase().split(" ");
    for (count = 0; count < internal.length; count++) {
        if (!dictionary.containsKey(internal[count])) {
            dictionary.put(internal[count], new HashSet());
        }
        if (dictionary.get(internal[count]).size() < 10) {
            dictionary.get(internal[count]).add(counter);
        }
    }
    counter++;
}
Iterator iterator = dictionary.keySet().iterator();
while (iterator.hasNext()) {
    String key = iterator.next().toString();
    String value = dictionary.get(key).toString();
    System.out.println(key + ": " + value);
}

2 Answers


I am not aware of a built-in class that does this.

You will need to write logic that walks through the String character by character and checks whether each character is punctuation. If it is, cut the String just before that character and append the remaining part, which effectively removes the punctuation character.

Prefer using a StringBuilder or StringBuffer instead of directly manipulating the String.

Use the String.substring() method to cut the string.
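
The character-by-character idea can be sketched as follows. This is only one possible reading of it: rather than cutting with substring(), it appends the characters to keep to a StringBuilder, and it treats anything that is not a letter, digit, or whitespace as punctuation (so other symbols are dropped too). The class name PunctuationFilter and the method stripPunctuation are made up for illustration.

```java
final class PunctuationFilter {

    // Hypothetical helper: keeps letters, digits, and whitespace and drops
    // every other character. Assumption: "punctuation" means "anything else".
    static String stripPunctuation(String input) {
        StringBuilder kept = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c) || Character.isWhitespace(c)) {
                kept.append(c); // copy the characters we want to keep
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // prints: Hello world Its 2012
        System.out.println(stripPunctuation("Hello, world! It's 2012."));
    }
}
```

Building a new string from the kept characters avoids repeatedly creating intermediate Strings, which is the reason for preferring StringBuilder here.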


Alternatively, use String.replace() or String.replaceAll() to replace all punctuation with "" (with replaceAll() you will need to escape certain regex characters).
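
For the replaceAll() route, a minimal sketch is to keep the pattern from the question but drop the "|\\d" alternative, so that only punctuation is removed and the digits survive. In Java's default regex mode, \p{Punct} covers ASCII punctuation only. The class name RegexPunctuation and the method tokenize are hypothetical, and splitting on "\\s+" after trim() (instead of on a single space) is an assumption made here to avoid empty tokens.

```java
import java.util.Arrays;

final class RegexPunctuation {

    // Hypothetical helper mirroring the pipeline in the question:
    // remove punctuation, lower-case the line, split it into words.
    static String[] tokenize(String line) {
        return line.replaceAll("\\p{Punct}", "") // digits and whitespace are untouched
                   .toLowerCase()
                   .trim()
                   .split("\\s+");               // assumption: split on runs of whitespace
    }

    public static void main(String[] args) {
        // prints: [room, 101, please, its, urgent]
        System.out.println(Arrays.toString(tokenize("Room 101, please: it's urgent!")));
    }
}
```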

answered 2012-06-07T06:42:49.447

str = str.replaceAll("[^0-9a-zA-Z\\s]", ""); // keep ASCII letters, digits, and whitespace; remove everything else

answered 2012-06-07T06:41:31.120