In Java, I am trying to automatically paraphrase text using regular expressions.

So I need a way to replace the first match of a regular expression with a randomly generated match of that same regular expression, like this:

public static String paraphraseUsingRegularExpression(String textToParaphrase, String regexToUse){
    //In textToParaphrase, replace the first match of regexToUse with a randomly generated match of regexToUse, and return the modified string.
}

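So how can I replace the first match of a regular expression in a string with a randomly generated match of that regular expression? (Perhaps a library called xeger could be used for this purpose.)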

For example, paraphraseUsingRegularExpression("I am very happy today", "(very|extremely) (happy|joyful) (today|at this (moment|time|instant in time))"); would replace the first match of the regular expression with a randomly generated match of the regular expression, which could produce the output "I am extremely joyful at this instant in time" or "I am very happy at this time".

1 Answer

You can do it with the following steps:

First, split the textToParaphrase string by regexToUse. You will get an array containing the parts of textToParaphrase that do not match the supplied expression. For example, if

 textToParaphrase = "I am very happy today for you";
 regexToUse = "(very|extremely) (happy|joyful) (today|at this (moment|time|instant in time))";

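the output will be {"I am ", " for you"}. Then build a regular expression from these resulting strings, such as "(I am | for you)". Now split textToParaphrase again, this time by the generated expression, and you will get an array of the parts that matched the given regular expression. Finally, replace each matched part with a randomly generated string.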
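To make the two splits concrete, here is a minimal sketch that only prints what each split returns for the example above. The class and method names (SplitDemo, unmatchedParts, matchedParts) are hypothetical and used just for illustration. Pattern.quote is an addition that keeps the unmatched pieces from being read as regex syntax.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    static final String REGEX =
        "(very|extremely) (happy|joyful) (today|at this (moment|time|instant in time))";

    // Step 1: the parts of the text that do NOT match the regex.
    static String[] unmatchedParts(String text, String regex) {
        return text.split(regex);
    }

    // Step 2: split by an alternation of the unmatched parts (as literals)
    // to recover the parts that DID match.
    static String[] matchedParts(String text, String[] unmatched) {
        StringBuilder alternation = new StringBuilder();
        for (String piece : unmatched) {
            if (piece.isEmpty()) continue; // an empty alternative would match everywhere
            if (alternation.length() > 0) alternation.append('|');
            alternation.append(Pattern.quote(piece));
        }
        return text.split(alternation.toString());
    }

    public static void main(String[] args) {
        String text = "I am very happy today for you";
        String[] unmatched = unmatchedParts(text, REGEX);
        System.out.println(Arrays.toString(unmatched));
        System.out.println(Arrays.toString(matchedParts(text, unmatched)));
    }
}
```

Note that the second array begins with an empty string, because the text starts with an unmatched part. Any code that loops over the matched parts should skip empty entries.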

The code is as follows:

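import java.util.regex.Pattern;
import nl.flotsam.xeger.Xeger;

public static String paraphraseUsingRegularExpression(String textToParaphrase, String regexToUse){
    // Parts of the text that do NOT match regexToUse.
    String[] unMatchedPortionArray = textToParaphrase.split(regexToUse);
    // Build an alternation of the unmatched parts, quoted so they are treated as literals.
    StringBuilder regExToFilter = new StringBuilder();
    for (String unMatchedPortion : unMatchedPortionArray){
        if (unMatchedPortion.isEmpty()) continue; // an empty alternative would match everywhere
        if (regExToFilter.length() > 0) regExToFilter.append("|");
        regExToFilter.append(Pattern.quote(unMatchedPortion));
    }

    // If nothing is unmatched, the whole text is the single matched part.
    String[] matchedPortionArray = regExToFilter.length() == 0
            ? new String[]{textToParaphrase}
            : textToParaphrase.split(regExToFilter.toString());
    Xeger generator = new Xeger(regexToUse);
    for (String matchedSegment : matchedPortionArray){
        if (matchedSegment.isEmpty()) continue; // split() yields "" when the text starts with an unmatched part
        String result = generator.generate(); // random string that matches regexToUse
        textToParaphrase = textToParaphrase.replace(matchedSegment, result);
    }
    return textToParaphrase;
}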
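If only the first match should be replaced, as the question asks, a shorter route is to locate it with a Matcher and splice the replacement in. This is a minimal sketch and not the approach used above. FirstMatchReplacer and replaceFirstMatch are hypothetical names, and the Supplier<String> parameter is an assumed stand-in for a random generator such as Xeger's generate(), so the example runs without the library.

```java
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatchReplacer {
    // Replaces only the first match of regex in text with the string
    // returned by the supplier; returns text unchanged if there is no match.
    static String replaceFirstMatch(String text, String regex, Supplier<String> generator) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find()) {
            return text; // nothing to paraphrase
        }
        // Splice by index so the replacement is inserted literally
        // (no special handling of '$' or '\' is needed).
        return text.substring(0, matcher.start())
                + generator.get()
                + text.substring(matcher.end());
    }

    public static void main(String[] args) {
        String regex =
            "(very|extremely) (happy|joyful) (today|at this (moment|time|instant in time))";
        // A fixed replacement keeps the result predictable here; with Xeger
        // the supplier could be () -> new Xeger(regex).generate().
        String result = replaceFirstMatch("I am very happy today", regex,
                () -> "extremely joyful at this time");
        System.out.println(result); // I am extremely joyful at this time
    }
}
```

Splicing by index avoids the second split entirely, and it leaves later matches in the text untouched.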

Cheers!

Answered 2013-05-18T18:07:03.723