
I'll start by posting the data from the text file. This is only 4 lines of it; the actual file is a couple hundred lines long.

Friday, September 9, 2011
-STV 101--------05:00 - 23:59 SSB 4185 Report printed on September 8, 2011 at 2:37

0-AH 104--------07:00 - 23:00 AH GYM Report printed on September 8, 2011 at 2:37

-BG 105--------07:00 - 23:00 SH GREAT HALL Report printed on September 8, 2011 at 2:37

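What I want to do with this text file is ignore the first line with the date on it, then ignore the "-" on the next line but read in "STV 101", "5:00", and "23:59" and save them to variables, and then ignore all the other characters on that line. Every line after that should be handled the same way.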

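Here is how I'm currently reading the lines in their entirety. I call this function once the user has put the path in the scheduleTxt JTextField. It reads and prints each line just fine.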

public void readFile () throws IOException
{
    try
    {
        FileInputStream fstream = new FileInputStream(scheduleTxt.getText());
        DataInputStream in = new DataInputStream(fstream);
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        String strLine;

        while ((strLine = br.readLine()) != null)   
        {
            System.out.println (strLine);
        }
        in.close();
    }
    catch (Exception e){//Catch exception if any
        System.err.println("Error: " + e.getMessage());
    }
}

UPDATE: It turns out I also need to pull "Friday" out of the top line and put it in a variable. Thanks! Beef.


1 Answer


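I have not tested it thoroughly, but this regular expression captures the information you need in groups 2, 5, and 7 (assuming that in the "0-AH 104----" example you are only interested in "AH 104"): ^(\S)*-(([^-])*)(-)+((\S)+)\s-\s((\S)+)\s(.)*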

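    // Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
    // br and strLine are the reader and line variable from readFile() in the question.
    String regex = "^(\\S)*-(([^-])*)(-)+((\\S)+)\\s-\\s((\\S)+)\\s(.)*";
    Pattern pattern = Pattern.compile(regex); // compile once, outside the loop
    while ((strLine = br.readLine()) != null) {
        Matcher matcher = pattern.matcher(strLine);
        // Lines without the expected layout (e.g. the date header) simply do not match.
        if (matcher.find()) {
            String s1 = matcher.group(2); // e.g. "STV 101"
            String s2 = matcher.group(5); // start time, e.g. "05:00"
            String s3 = matcher.group(7); // end time, e.g. "23:59"
            System.out.println(s1 + " " + s2 + " " + s3);
        }
    }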

The expression can be tweaked with non-capturing groups so that it captures only the information you want.
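As a rough illustration of that tweak, the sketch below converts every helper group into a non-capturing `(?:...)` group, so the three wanted fields become groups 1, 2, and 3. The class name `ScheduleLineParser` and the `parse` method are hypothetical names chosen for this example, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScheduleLineParser {
    // Same structure as the original expression; only the three wanted
    // fields are still capturing groups, everything else is (?:...).
    private static final Pattern LINE = Pattern.compile(
            "^(?:\\S)*-((?:[^-])*)(?:-)+((?:\\S)+)\\s-\\s((?:\\S)+)\\s(?:.)*");

    // Returns {name, start, end}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] fields = parse("0-AH 104--------07:00 - 23:00 AH GYM");
        if (fields != null) {
            System.out.println(fields[0] + " " + fields[1] + " " + fields[2]);
        }
    }
}
```

The groups wrapped around a single token, such as `(?:-)+`, could also be dropped entirely (`-+`); they are kept here only to mirror the original expression one-to-one.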

Explanation of the regular expression's elements:

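  1. ^(\S)*- Matches a run of non-whitespace characters ending in -. Note: this does not work if there is whitespace before the first -. A greedy ^(.)*- is not a drop-in replacement: on the sample lines it would consume the leading field (e.g. STV 101) and most of the dash run, leaving group 2 empty.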
  2. (([^-])*) Matches any run of characters other than -. This is captured in group 2.
  3. (-)+ Matches one or more - characters.
  4. ((\S)+) Matches one or more non-whitespace characters. This is captured in group 5.
  5. \s-\s Matches a whitespace character, followed by -, followed by a whitespace character.
  6. ((\S)+) Same as 4. This is captured in group 7.
  7. \s(.)* Matches a whitespace character followed by anything; this part is ignored.
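Putting the pieces above to work on the whole file might look like the following sketch. It also covers the update in the question (keeping the weekday from the header line), under the assumption that the header starts with the weekday name followed by a comma, as in "Friday, September 9, 2011". The class name `ScheduleReader`, the `HEADER` pattern, and the `" | "` output format are all made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScheduleReader {
    // The expression from the answer, unchanged.
    private static final Pattern ENTRY = Pattern.compile(
            "^(\\S)*-(([^-])*)(-)+((\\S)+)\\s-\\s((\\S)+)\\s(.)*");

    // ASSUMPTION: a header line begins with a weekday name and a comma.
    private static final Pattern HEADER = Pattern.compile("^([A-Za-z]+day),");

    // Returns one "day | name | start | end" string per schedule entry.
    static List<String> read(BufferedReader br) throws IOException {
        List<String> rows = new ArrayList<>();
        String day = null; // stays null until a header line has been seen
        String strLine;
        while ((strLine = br.readLine()) != null) {
            Matcher header = HEADER.matcher(strLine);
            if (header.find()) {
                day = header.group(1);
                continue;
            }
            Matcher entry = ENTRY.matcher(strLine);
            if (entry.find()) {
                rows.add(day + " | " + entry.group(2) + " | "
                        + entry.group(5) + " | " + entry.group(7));
            }
            // Blank or unrecognized lines are skipped silently.
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String sample = "Friday, September 9, 2011\n"
                + "-STV 101--------05:00 - 23:59 SSB 4185\n"
                + "\n"
                + "0-AH 104--------07:00 - 23:00 AH GYM\n";
        for (String row : read(new BufferedReader(new StringReader(sample)))) {
            System.out.println(row);
        }
    }
}
```

In the question's readFile(), the same loop body could be used with the existing br in place of the StringReader used here for the demo.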

More information on regular expressions can be found in this tutorial. There are also several useful cheat sheets around. When designing or debugging an expression, a regex testing tool can prove quite useful, too.
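If no dedicated tool is at hand, a few lines of Java can serve the same purpose by listing what every group captured for a sample line, which makes it easy to confirm the group numbers used earlier. This is only a sketch; `GroupDump` and `groups` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDump {
    // Returns the text of groups 0..groupCount for the first match,
    // or an empty list if the expression does not match at all.
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i)); // null for groups that did not take part
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String regex = "^(\\S)*-(([^-])*)(-)+((\\S)+)\\s-\\s((\\S)+)\\s(.)*";
        List<String> g = groups(regex, "-STV 101--------05:00 - 23:59 SSB 4185");
        for (int i = 0; i < g.size(); i++) {
            System.out.println(i + ": " + g.get(i));
        }
    }
}
```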

answered 2011-09-14T16:43:57.260