
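For one of my projects, I need to split a paragraph into sentences. I've found that the following code breaks a paragraph into separate sentences and prints them: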

BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
iterator.setText(content);
int start = iterator.first();
for (int end = iterator.next();
     end != BreakIterator.DONE;
     start = end, end = iterator.next()) {
    System.out.println(content.substring(start, end));
}

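where the variable `content` is a String defined earlier.

However, I'd like to store the split sentences as strings so that I can keep working with them.

How can I do that? I think it might involve a string array. Thanks for your help.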


2 Answers


I've never used BreakIterator; I assume you want it because it is locale-aware (FYI: here and here). Either way, you can save the sentences in an array or a List, as you mentioned.

BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
iterator.setText(content);
int start = iterator.first();

List<String> sentences = new ArrayList<String>();
for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
    //System.out.println(content.substring(start,end));
    sentences.add(content.substring(start,end));
}
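
Since the question mentions a string array, here is a minimal sketch that wraps the loop above in a helper and converts the resulting list to a `String[]`. The class name `SentenceSplitter` and the method `split` are made up for illustration; only `BreakIterator`, `ArrayList`, and `List.toArray` come from the JDK.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the name is illustrative only.
class SentenceSplitter {

    // Collects each sentence of `content` into a list, using the same loop as above.
    static List<String> split(String content) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
        iterator.setText(content);

        List<String> sentences = new ArrayList<>();
        int start = iterator.first();
        for (int end = iterator.next();
             end != BreakIterator.DONE;
             start = end, end = iterator.next()) {
            sentences.add(content.substring(start, end));
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> sentences = split("Hello world. How are you?");

        // Convert to a plain array if the rest of the code expects one.
        String[] asArray = sentences.toArray(new String[0]);
        for (String s : asArray) {
            System.out.println("[" + s + "]");
        }
    }
}
```

A List is usually the easier choice here because the number of sentences is not known up front; the array conversion is only needed if some other API insists on `String[]`.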
answered 2014-08-02T17:42:54.050

Try this, which I got from this link:

import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Locale;

public static void main(String[] args) {
    String content =
            "Line boundary analysis determines where a text " +
            "string can be broken when line-wrapping. The " +
            "mechanism correctly handles punctuation and " +
            "hyphenated words. Actual line breaking needs to " +
            "also consider the available line width and is " +
            "handled by higher-level software. ";

    BreakIterator iterator =
            BreakIterator.getSentenceInstance(Locale.US);

    // Keep the sentences in a list so they can be reused later.
    ArrayList<String> sentences = count(iterator, content);
}

// Splits source into sentences, prints each one, and returns them as a list.
private static ArrayList<String> count(BreakIterator bi, String source) {
    bi.setText(source);

    int lastIndex = bi.first();
    ArrayList<String> contents = new ArrayList<>();
    while (lastIndex != BreakIterator.DONE) {
        int firstIndex = lastIndex;
        lastIndex = bi.next();

        // next() returns DONE once the end of the text has been passed.
        if (lastIndex != BreakIterator.DONE) {
            String sentence = source.substring(firstIndex, lastIndex);
            System.out.println("sentence = " + sentence);
            contents.add(sentence);
        }
    }
    return contents;
}
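
One thing to watch for: each substring runs up to the next sentence boundary, so it normally carries the whitespace that follows the sentence. If that gets in the way, a possible variation is to trim each piece and skip empty ones. This is only a sketch; `TrimmedSentences` and `splitTrimmed` are hypothetical names, not part of any library.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the name is illustrative only.
class TrimmedSentences {

    // Returns the sentences of `source` with surrounding whitespace removed.
    static List<String> splitTrimmed(String source, Locale locale) {
        BreakIterator bi = BreakIterator.getSentenceInstance(locale);
        bi.setText(source);

        List<String> result = new ArrayList<>();
        int start = bi.first();
        for (int end = bi.next();
             end != BreakIterator.DONE;
             start = end, end = bi.next()) {
            String sentence = source.substring(start, end).trim();
            // Skip pieces that consisted only of whitespace.
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : splitTrimmed("First one. Second one. ", Locale.US)) {
            System.out.println("[" + s + "]");
        }
    }
}
```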
answered 2014-08-02T17:45:07.267