java - Extracting words that contain a symbol from a string in Java

Question

The basic idea is that I want to extract any part of a string that has the form "text1.text2". Some examples of the input and output I am after:

"employee.first_name" ==> "employee.first_name"
"2 * employee.salary AS double_salary" ==> "employee.salary"

So far all I have is .split(" "), then finding the piece I need and calling .split(".") on it. Is there a cleaner way?

score 2 · Accepted Answer

I would use an actual Pattern with an iterative find, rather than splitting the String.

For example:

String test = "employee.first_name 2 * ... employee.salary AS double_salary blabla e.s blablabla";
// searching for one or more word characters or punctuation, followed by a dot,
// followed by one or more word characters or punctuation
// note also we're avoiding the "..." pitfall by excluding the dot from both classes
Pattern p = Pattern.compile("[\\w\\p{Punct}&&[^\\.]]+\\.[\\w\\p{Punct}&&[^\\.]]+");
Matcher m = p.matcher(test);
while (m.find()) {
    System.out.println(m.group());
}

Output:

employee.first_name
employee.salary
e.s

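Note: to simplify the Pattern, you could list only the punctuation characters you actually allow in your "."-separated words inside the character classes.

For example: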

Pattern p = Pattern.compile("[\\w_]+\\.[\\w_]+");

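This way, foo.bar*2 will be matched as foo.bar. (\w already includes the underscore, so the explicit _ in the class is redundant but harmless.)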
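
To make the find loop above easier to try out, here is a minimal self-contained sketch. The class name DottedWords and the method extract are hypothetical names introduced for illustration, and the pattern is the simplified one with the redundant underscore dropped, not the only reasonable choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the answer.
class DottedWords {
    // \w covers letters, digits and the underscore, so no extra "_" is needed.
    private static final Pattern DOTTED = Pattern.compile("\\w+\\.\\w+");

    // Collects every "word.word" occurrence, in order of appearance.
    static List<String> extract(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = DOTTED.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("employee.first_name"));                  // [employee.first_name]
        System.out.println(extract("2 * employee.salary AS double_salary")); // [employee.salary]
        System.out.println(extract("foo.bar*2"));                            // [foo.bar]
    }
}
```

Because \w also matches digits, a numeric literal such as 1.5 would be reported too; if that matters for your input, the character classes would need tightening.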

score 1 · Accepted Answer

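You need to use split to break the string into fragments, then use the contains method to search each fragment for a "." and pick out the fragment you want.

Here you go: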

public static void main(String[] args) {
    String str = "2 * employee.salary AS double_salary";
    // Split on whitespace, then print every token that contains a dot.
    String[] arr = str.split("\\s");
    for (String token : arr) {
        if (token.contains(".")) {
            System.out.println(token);
        }
    }
}
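
As a rough sketch of how this split-and-contains approach behaves on other inputs, the version below returns the matching tokens as a list. SplitContains and tokensWithDot are hypothetical names, and splitting on runs of whitespace ("\\s+") is an assumption made here so that repeated spaces do not produce empty tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class SplitContains {
    // Splits on runs of whitespace and keeps every token that contains a dot.
    static List<String> tokensWithDot(String input) {
        List<String> found = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (token.contains(".")) {
                found.add(token);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(tokensWithDot("2 * employee.salary AS double_salary")); // [employee.salary]
        System.out.println(tokensWithDot("2*employee.salary AS double_salary"));   // [2*employee.salary]
    }
}
```

The second call shows the trade-off: the whole whitespace-delimited token comes back, so anything attached to the dotted word without a space stays attached.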

score 0 · Accepted Answer

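I am not a Java expert, but since I have used regular expressions in Python, and going by tutorials on the internet, I would suggest r'(\S*)\.(\S*)' as the pattern. I tried it in Python and it works well on your examples.

It does have a problem if you want to handle several dots in a row, though. If you try to match something like first.second.third, this pattern reports ('first.second', 'third') as the matched groups. I think this comes from greedy matching: \S also matches a dot, so the first group grabs as much as it can.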
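
The same behavior can be reproduced in Java. The sketch below uses hypothetical names (GreedyDots, greedyGroups, firstChain). The second pattern is not from the answer; it is one assumed way to capture a whole dotted chain by excluding the dot from each segment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class GreedyDots {
    // Java spelling of the Python pattern r'(\S*)\.(\S*)'.
    private static final Pattern GREEDY = Pattern.compile("(\\S*)\\.(\\S*)");
    // Assumed alternative: segments free of whitespace and dots, joined by single dots.
    private static final Pattern CHAIN = Pattern.compile("[^\\s.]+(?:\\.[^\\s.]+)+");

    // Returns the two capture groups of the first match, or null if nothing matches.
    static String[] greedyGroups(String input) {
        Matcher m = GREEDY.matcher(input);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    // Returns the first whole dotted chain, or null if nothing matches.
    static String firstChain(String input) {
        Matcher m = CHAIN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] g = greedyGroups("first.second.third");
        System.out.println(g[0] + " | " + g[1]);              // first.second | third
        System.out.println(firstChain("first.second.third")); // first.second.third
    }
}
```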

score 0 · Accepted Answer

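String mydata = "2 * employee.salary AS double_salary";
// One or more word characters, a literal dot, then one or more word characters.
Pattern pattern = Pattern.compile("(\\w+\\.\\w+)");
Matcher matcher = pattern.matcher(mydata);
// find() locates only the first occurrence; use a while loop to get all of them.
if (matcher.find())
{
  System.out.println(matcher.group(1));
}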
