java - Splitting a .txt file into an array

Question

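I want to implement a program that reads a file (i.e. a .txt) and stores its lines in an array (I have already done this). Then I want a two-dimensional array in which I store just the words of each line.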

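For example, if the file contains two lines with two words each, I want array[0][0] to hold the first word of the first line, array[0][1] the second word of the first line, and so on.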

I have the following code:

for (int i=0; i < aryLines.length; i++) {
    String[] channels = aryLines[i].split(" ");

    System.out.println("line " + (i+1) + ": ");

    for (int j=0; j < channels.length; j++){
        System.out.println("word " + (j+1) + ": ");
        System.out.println(channels[j]);
    }

    System.out.println();
}

where aryLines contains all the lines, but I haven't found a way to do what I described.

score 1 · Accepted Answer

Let your 1-D array be:

String[] lines = new String[10];

You first need to declare an array of arrays:

String[][] words = new String[lines.length][];

Then iterate over it and, for each line, split the line and assign the result to the inner array:

for (int i = 0; i < words.length; i++) {
    words[i] = lines[i].split("\\s+");
}
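Put together, the steps above might look like the following minimal sketch. The class name `SplitLinesDemo`, the helper `splitIntoWords`, and the sample lines are assumptions made for illustration; in the question, the lines would come from `aryLines`. The extra `trim()` is also an addition, so that a line starting with spaces does not produce an empty first element.

```java
import java.util.Arrays;

class SplitLinesDemo {

    // Hypothetical helper: splits every line on runs of whitespace.
    static String[][] splitIntoWords(String[] lines) {
        // Outer dimension is known up front; inner arrays are sized by split().
        String[][] words = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            // trim() avoids an empty first element when a line starts with spaces.
            words[i] = lines[i].trim().split("\\s+");
        }
        return words;
    }

    public static void main(String[] args) {
        // Sample data standing in for the lines read from the file.
        String[] lines = { "hello world", "foo bar" };
        String[][] words = splitIntoWords(lines);

        System.out.println(words[0][1]);                 // second word of the first line
        System.out.println(Arrays.deepToString(words));  // whole 2-D array
    }
}
```

Note that the inner arrays can have different lengths, so loops over `words[i]` should use `words[i].length` rather than a fixed column count.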

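Now, the catch is that not all words are separated by just a space. Lines also contain punctuation that you need to account for. I'll leave splitting on all the punctuation to you.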

For example:

"This line: - has word separated by, : and -"

Now you need to find all the punctuation used in the sentence.
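If the set of punctuation is known, one option is to list it in the split pattern itself. The sketch below assumes the only separators are whitespace, commas, colons and hyphens, as in the example sentence; the class name `PunctuationSplitDemo` and the helper `splitOnPunctuation` are made up for illustration.

```java
import java.util.Arrays;

class PunctuationSplitDemo {

    // Hypothetical helper: splits on runs of whitespace, ',', ':' and '-'.
    // The character class is an assumption; extend it with whatever
    // punctuation actually appears in your file.
    static String[] splitOnPunctuation(String line) {
        return line.split("[\\s,:\\-]+");
    }

    public static void main(String[] args) {
        String str = "This line: - has word separated by, : and -";
        System.out.println(Arrays.toString(splitOnPunctuation(str)));
    }
}
```

One caveat: if a line begins with one of the separators, `split` returns an empty string as its first element, so such lines may need extra handling.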

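One thing you can do, if you are not sure about all the punctuation used in your lines, is to use a regex pattern that matches only the words, and add each matched word to an ArrayList.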

"\\w+"  // matches one or more word characters (letters, digits, underscore)

Let's see it work on the example above:

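    // Imports needed at the top of the file:
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "This line: - has word separated by, : and -";
    List<String> words = new ArrayList<String>();

    // Find every run of word characters and collect it
    Matcher matcher = Pattern.compile("\\w+").matcher(str);

    while (matcher.find()) {
        words.add(matcher.group());
    }

    System.out.println(words);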

Output:

[This, line, has, word, separated, by, and]

You can use this approach inside the loop I posted above.
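As a rough sketch of that combination, the matcher can run once per line and its matches can be copied into the corresponding inner array. The class name `WordMatrixDemo`, the helper `extractWords`, and the sample lines are assumptions for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordMatrixDemo {

    // Compiled once and reused for every line.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Hypothetical helper: builds the 2-D array from the words matched in each line.
    static String[][] extractWords(String[] lines) {
        String[][] words = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            List<String> found = new ArrayList<>();
            Matcher matcher = WORD.matcher(lines[i]);
            while (matcher.find()) {
                found.add(matcher.group());
            }
            // The list size is only known after matching, so convert at the end.
            words[i] = found.toArray(new String[0]);
        }
        return words;
    }

    public static void main(String[] args) {
        // Sample data standing in for the lines read from the file.
        String[] lines = { "This line: - has word", "separated by, : and -" };
        System.out.println(Arrays.deepToString(extractWords(lines)));
    }
}
```

With this variant, a line containing no words simply yields an empty inner array. Keep in mind that `\w` by default covers only ASCII letters, digits and the underscore, so text with accented or non-Latin letters would need a different pattern.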
