I have a case where I need to split a string in Java that contains various escaped characters. The format will be something like:

id:"description",id:"description",....

id: numeric (int)
description: String escaped with EscapeUtils.escapeJava(input); it can contain any readable character, including : and , and even ", which is escaped to \".

So the String.split method doesn't seem appropriate, as it would run into trouble with descriptions that contain , or :. I know I could write an algorithm that works fine, and it would even be a nice exercise in Test Driven Development, but I was wondering whether there is a lazier way around it, using some kind of parser that can handle this kind of thing.
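
To illustrate the concern, here is a minimal sketch (the class and method names are made up for this example) of what happens when every comma is treated as a separator:

```java
import java.util.Arrays;
import java.util.List;

class SplitPitfall {
    // Naive approach: treat every comma as a record separator,
    // ignoring whether the comma sits inside a quoted description.
    static List<String> naiveSplit(String input) {
        return Arrays.asList(input.split(","));
    }
}
```

For an input with two entries such as 1:"a, b",2:"c", naiveSplit returns three pieces, because the comma inside the first description is also used as a split point.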

My other possible approach is to generate a JSONArray and avoid dealing with complexity I'm not interested in, but that would require one more library dependency, which I'm not convinced about including in this module...

So, what I'm asking for is ideas on how this kind of problem can be solved (libraries, with the Java API, etc.).

1 Answer

It sounds like your string should match this regular expression:

^(\d+:"([^"\\]|\\.)*"(,(?!$)|$))+$

In that case, you can extract the parts into a Map<Integer, String> by writing something like this:

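import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// StringEscapeUtils comes from Apache Commons Lang; import it from the
// package that matches the version you already use.

// Validates the whole input: one or more id:"description" entries, comma-separated,
// with no trailing comma.
private static final Pattern TOTAL_STRING_PATTERN =
    Pattern.compile("^(\\d+:\"([^\"\\\\]|\\\\.)*\"(,(?!$)|$))+$");
// Matches a single entry; group 1 is the id, group 2 the still-escaped description.
private static final Pattern PARTIAL_STRING_PATTERN =
    Pattern.compile("(\\d+):\"((?:[^\"\\\\]|\\\\.)*)\"");

public Map<Integer, String> parse(final String input) {
    // Reject malformed input up front so the loop below only sees valid entries.
    if (!TOTAL_STRING_PATTERN.matcher(input).matches()) {
        throw new IllegalArgumentException();
    }
    final Map<Integer, String> ret = new HashMap<Integer, String>();
    final Matcher m = PARTIAL_STRING_PATTERN.matcher(input);
    while (m.find()) {
        // Integer.valueOf throws NumberFormatException if the id does not fit in an int.
        final Integer id = Integer.valueOf(m.group(1));
        final String description = StringEscapeUtils.unescapeJava(m.group(2));
        ret.put(id, description);
    }
    return Collections.unmodifiableMap(ret);
}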

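(You might also want to check for identifiers that fall outside the range of int, and perhaps be a bit more lenient in some respects, for example by allowing whitespace around the colons and commas. But the above should be a good start.)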
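
A minimal sketch of those two refinements, under some assumptions: the names LenientParser and ENTRY are made up for this example, \G is used to chain the matches instead of a separate whole-string pattern, and the description is returned still escaped so that the block has no library dependency (pass it through StringEscapeUtils.unescapeJava if you need the original text).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LenientParser {
    // One entry, anchored with \G at the end of the previous match:
    // optional whitespace, id, colon, quoted description, then a comma or end of input.
    private static final Pattern ENTRY = Pattern.compile(
        "\\G\\s*(\\d+)\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"\\s*(,|\\z)");

    static Map<Integer, String> parse(String input) {
        Map<Integer, String> ret = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(input);
        int end = 0;                 // where the last successful match stopped
        boolean endedWithComma = false;
        while (m.find()) {
            final int id;
            try {
                id = Integer.parseInt(m.group(1));
            } catch (NumberFormatException e) {
                // The digits matched the pattern but do not fit in an int.
                throw new IllegalArgumentException("id out of int range: " + m.group(1), e);
            }
            ret.put(id, m.group(2)); // description is still escaped here
            end = m.end();
            endedWithComma = ",".equals(m.group(3));
        }
        // Reject empty input, unparsed leftovers, and a trailing comma.
        if (ret.isEmpty() || end != input.length() || endedWithComma) {
            throw new IllegalArgumentException("malformed input near index " + end);
        }
        return ret;
    }
}
```

With this variant an input such as 1:"a", 2 : "b,c" is accepted, while an id like 99999999999 or a trailing comma is rejected with an IllegalArgumentException.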

answered 2013-07-07T17:54:20.723