I have the following two pieces of text.

1)v1.0 - 80 s200 + 2013-10-17T05:59:59-0700 1TZY6R5HERP7SJRRYDYV 69.71.202.109 7802 41587 495307 30595 HTTP/1.1 POST /gp/ppd

2)access-1080.2013-10-17-05.us-online-cpp-portlet-live-1d-i-752c3b12.us-east-1.phnew.com.gz

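I need to extract some data from them. From the first one I need a regular expression that captures 1TZY6R5HERP7SJRRYDYV; let's call it accessId. It is always 20 characters long and is a combination of the digits 0-9 and the uppercase letters [A-Z].

I tried [A-Z0-9]{20} with no luck.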

Pattern p = Pattern.compile([A-Z0-9]{20});  
Matcher m = p.matcher(myString);

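I am also looking for a Java API that tests a string against a pattern and, if it matches, gives me the matched text as the result.

From the second one I need us-online-cpp-portlet-live-1d-i-752c3b12.us-east-1.phnew.com. I am having a hard time cracking this one.

Any help would be appreciated.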

2 Answers

You need to call Matcher#find() followed by Matcher#group() to get the matched text:

Pattern p = Pattern.compile("[A-Z0-9]{20}");
Matcher m = p.matcher(myString);
String accessId = null;
if (m.find())
   accessId = m.group();
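
One possible reason the original attempt had "no luck" is that the pattern was tested against the whole line instead of being searched for inside it; the question does not say, so that is only a guess. The sketch below wraps the find()/group() calls in a small helper and shows the difference between find() and matches(). The class name AccessIdExtractor and the method extractAccessId are made up for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AccessIdExtractor {
    // 20 characters, each an uppercase letter or a digit (same pattern as above).
    private static final Pattern ACCESS_ID = Pattern.compile("[A-Z0-9]{20}");

    // Hypothetical helper: returns the first 20-character run found in the line, if any.
    static Optional<String> extractAccessId(String line) {
        Matcher m = ACCESS_ID.matcher(line);
        // find() searches for the pattern anywhere in the input.
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "v1.0 - 80 s200 + 2013-10-17T05:59:59-0700 1TZY6R5HERP7SJRRYDYV "
                + "69.71.202.109 7802 41587 495307 30595 HTTP/1.1 POST /gp/ppd";

        System.out.println(extractAccessId(line).orElse("no match"));

        // matches() only succeeds if the ENTIRE input matches the pattern,
        // so it returns false for a full log line.
        System.out.println(ACCESS_ID.matcher(line).matches());
    }
}
```

Run against the sample line from the question, this prints 1TZY6R5HERP7SJRRYDYV followed by false.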
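answered 2013-10-17T13:48:40.520

There are a few problems with your code, for example the double quotes are missing in the Pattern initialization.

Here is an example of what you are looking for: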

// text for 1st pattern
String text1 = "v1.0 - 80 s200 + 2013-10-17T05:59:59-0700 1TZY6R5HERP7SJRRYDYV 69.71.202.109 7802 41587 495307 30595 HTTP/1.1 POST /gp/ppd";
// text for 2nd pattern
String text2 = "access-1080.2013-10-17-05.us-online-cpp-portlet-live-1d-i-752c3b12.us-east-1.phnew.com.gz";
// 1st pattern - note that the "word" boundary separators are useless here, 
// but they might come in handy if you had alphanumeric Strings longer than 20 characters
Pattern accessIdPattern = Pattern.compile("\\b[A-Z0-9]{20}\\b");
Matcher m = accessIdPattern.matcher(text1);
while (m.find()) {
    System.out.println(m.group());
}
// this is trickier. I assume for your 2nd pattern you want something delimited on the
// left by a dot and starting with 2 lowercase characters, followed by a hyphen, 
// followed by a number of alnums, followed by ".com"
Pattern otherThingie = Pattern.compile("(?<=\\.)[a-z]{2}-[a-z0-9\\-.]+\\.com");
m = otherThingie.matcher(text2);
while (m.find()) {
    System.out.println(m.group());
}

Output:

1TZY6R5HERP7SJRRYDYV
us-online-cpp-portlet-live-1d-i-752c3b12.us-east-1.phnew.com
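
As an alternative to the lookbehind, the host name could also be pulled out with a capturing group that describes the whole file name. This is only a sketch: it assumes every file name follows the layout access-<number>.<yyyy-MM-dd-HH>.<host>.gz, which is inferred from the single sample in the question. The class name LogFileHost and the method extractHost are invented for the example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogFileHost {
    // Assumed layout: access-<number>.<yyyy-MM-dd-HH>.<host>.gz
    // Group 1 captures everything between the timestamp and the ".gz" suffix.
    private static final Pattern FILE_NAME =
            Pattern.compile("access-\\d+\\.\\d{4}-\\d{2}-\\d{2}-\\d{2}\\.(.+)\\.gz");

    // Hypothetical helper: returns the host part, or empty if the name has another shape.
    static Optional<String> extractHost(String fileName) {
        Matcher m = FILE_NAME.matcher(fileName);
        // matches() is appropriate here because the pattern describes the whole name.
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String name = "access-1080.2013-10-17-05."
                + "us-online-cpp-portlet-live-1d-i-752c3b12.us-east-1.phnew.com.gz";
        System.out.println(extractHost(name).orElse("no match"));
    }
}
```

With the sample file name this prints us-online-cpp-portlet-live-1d-i-752c3b12.us-east-1.phnew.com. Unlike the lookbehind version, it does not require the host to start with two lowercase letters or to end in ".com", but it does reject names that lack the access- prefix or the .gz suffix.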
answered 2013-10-17T13:57:09.313