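I have a file in RDF N-Triples format that I need to read. However, I am not allowed to use a third-party API (such as Jena; that is a separate debate).
Basically, each line I get is one of two kinds of string: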
<foo 1> <bar 1> <foo bar> .
<foo 2> <bar 2> foobar .
So I want to write a method along these lines:
void ParseTriples(String s){
    setSubject(<foo> part)
    setPredicate(<bar> part)
    setObject(<foobar> or foobar)
}
What I have so far is a hack:
public void setNTriples(String text){
    Pattern pattern = Pattern.compile("<(.*?)>");
    //Pattern pattern = Pattern.compile("<([^>]*)>\\s+[<]?([^>]*)[>]?");
    //Pattern pattern = Pattern.compile("(<[a-zA-Z.\\d\\s]+>|\\w+)");
    Matcher matcher = pattern.matcher(text);
    int count = 0;
    int end = 0;
    int totalLength = text.length();
    while(matcher.find()) {
        if (count == 0){
            //System.out.println(matcher.group(1));
            setSubject(new Text(matcher.group(1)));
            //length += getSubject().toString().length();
            //System.out.println(length);
            count +=1;
        }
        else if (count == 1){
            setPredicate(new Text(matcher.group(1)));
            count +=1;
            end = matcher.end();
        }
        else if (count == 2){
            //System.out.println(matcher.group(1));
            setObject(new Text(matcher.group(1)));
            count +=1;
            //System.out.println(text.substring(length+5, totalLength));
        }
    }
    //System.out.println(count);
    // ugly hack
    if (count == 2){
        setObject(new Text(text.substring(end+1,totalLength-2)));
    }
}
How can I solve this properly?
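One possible direction, sketched here only for the two line shapes shown above: a single anchored regular expression with an alternation for the object, so that the bracketed and the bare form are handled in one pass and the substring arithmetic is not needed. The class name SimpleTriple and its parse method are hypothetical, made up for this illustration. The sketch does not cover the full N-Triples grammar (quoted literals with escapes, language tags, datatypes, blank nodes and comments are all ignored).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post: holds the three parts of one line.
final class SimpleTriple {
    // <subject> <predicate>, then either <object> or a bare object,
    // then the terminating dot with optional surrounding whitespace.
    private static final Pattern LINE = Pattern.compile(
        "^\\s*<([^>]*)>\\s+<([^>]*)>\\s+(?:<([^>]*)>|(.+?))\\s*\\.\\s*$");

    final String subject;
    final String predicate;
    final String object;

    private SimpleTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    // Parses one line; rejects anything that does not fit the two assumed shapes.
    static SimpleTriple parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a recognised triple line: " + line);
        }
        // Group 3 is set for a bracketed object, group 4 for a bare one.
        String object = m.group(3) != null ? m.group(3) : m.group(4);
        return new SimpleTriple(m.group(1), m.group(2), object);
    }
}
```

Inside setNTriples, the three fields of the result of SimpleTriple.parse(text) could then be passed to setSubject, setPredicate and setObject, wrapped in new Text(...) as in the code above. The bare-object branch is lazy, so it stops at the last dot on the line and leaves dots inside the object untouched.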