
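I am writing a program that applies many principles of computational linguistics. My current problem is with the code below, which implements a method that "flexibilizes two definitions". That is, it compares two different definitions of the same word and inserts blank spaces into each definition, so that the modified definitions (with the blanks added) can be used later. Suppose we have the following two definitions of the term "free fall".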

1) Free fall descent  of a body subjected only to            the   action of  gravity.
2) Free fall movement of a body in        a    gravitational field under  the influence of gravity

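There is a list of words called stoplist, which contains the following words: "of", "a", "in", "to", and "under". After the process, every word in a definition that is also in the stoplist must correspond either to a blank or to another stoplist word in the other definition. So after running the process, the previous definitions, represented as two different lists, should look like this: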

1) Free fall descent  of a body ____ ____ subjected     only  to     the action    of gravity.
2) Free fall movement of a body in   a    gravitational field under  the influence of gravity.
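The requirement can also be written as a check on the two word lists. This is only a minimal sketch and not part of my program: the class `FlexCheck`, the method `isFlexAligned`, and the decision to treat a position past the end of a list as a blank are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class FlexCheck {
    static final String BLANK = " ";

    // True if every stoplist word in one list faces a blank or a stoplist
    // word at the same index in the other list. A position past the end of
    // a list is treated as a blank (an assumption).
    static boolean isFlexAligned(List<String> def1, List<String> def2, List<String> stopList) {
        int longest = Math.max(def1.size(), def2.size());
        for (int i = 0; i < longest; i++) {
            String w1 = i < def1.size() ? def1.get(i) : BLANK;
            String w2 = i < def2.size() ? def2.get(i) : BLANK;
            if (stopList.contains(w1) && !faces(w2, stopList)) return false;
            if (stopList.contains(w2) && !faces(w1, stopList)) return false;
        }
        return true;
    }

    // A word is an acceptable counterpart if it is a blank or a stoplist word.
    private static boolean faces(String other, List<String> stopList) {
        return other.equals(BLANK) || stopList.contains(other);
    }

    public static void main(String[] args) {
        List<String> stop = Arrays.asList("of", "a", "in", "to", "under");
        List<String> raw1 = Arrays.asList(
                "Free fall descent of a body subjected only to the action of gravity".split(" "));
        List<String> raw2 = Arrays.asList(
                "Free fall movement of a body in a gravitational field under the influence of gravity".split(" "));
        List<String> flex1 = Arrays.asList("Free", "fall", "descent", "of", "a", "body", BLANK, BLANK,
                "subjected", "only", "to", "the", "action", "of", "gravity");
        System.out.println(isFlexAligned(raw1, raw2, stop));  // prints false
        System.out.println(isFlexAligned(flex1, raw2, stop)); // prints true
    }
}
```

The untouched definitions fail at index 6, where "subjected" faces the stoplist word "in"; the expected result passes.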

The code I wrote to achieve this is as follows:

[...]

String[] sList = STOPLIST.split(" ");  //this is the stoplist
String[] definition1 = defA1.split(" ");  //this is the array of words of the first definition
String[] definition2 = defA2.split(" ");  //this is the array of words of the second definition
List<String> def1 = new ArrayList<String>();  
List<String> def2 = new ArrayList<String>();
List<String> stopList = new ArrayList<String>();

for(String word : definition1){
    def1.add(word); //I transform arrays into lists this way because I used to think that using .asList() was the problem.
}
for(String word : definition2){
    def2.add(word);
}
for(String word : sList){
    stopList.add(word);
}

int mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); //here mdef will have the value of the length of the shortest definition, and we are going to use the value of mdef to iterate later on.

for(int i = 0; i < mdef; i++){
   if (stopList.contains(def1.get(i))) {  //here I check if the first word of the first definition is also found in the stoplist.
        if (!stopList.contains(def2.get(i))) {  //If the word of def1 previously checked is in the stoplist, as well as the corresponding word in the second definition, then we won't add a " "(blank) space in the corresponding position of the second definition.
           def2.add(i , " "); //here I add that blank space, only if the stoplist word in def1 corresponds to a non-stoplist word in def2. Again, we do this so the stoplist word in def1 corresponds to a blank space OR another stoplist word in def2.
           if(mdef == def2.size())
               mdef++; //In case the shortest definition is the definition to which we just added spaces, we increment mdef++, because that space added increases the length of the shortest definition, and to iterate in this recently extended definition, we have to increment the index with which we iterate.
        }
    } else if (stopList.contains(def2.get(i))) { //this else if does the same as the previous one, but checks the second definition instead of the first one, and adds blanks to def1 instead of def2 if necessary.
        if (!stopList.contains(def1.get(i))) {
            def1.add(i , " ");
            if(mdef == def1.size())
                mdef++;
        }
    }
}

[...]

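Now, if you analyze the code carefully, you will notice that not all the words of the longest list get checked, because we use the length of the shortest definition as the bound when iterating over the definitions. This is fine: the remaining words of the longest definition need not be checked, since they will correspond to blanks in the other definition (in case the lists do not end up with the same length after the blanks are added, as in the previous example).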
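To make that last point concrete: leftover words of the longer list can be paired with blanks by padding the shorter list. A minimal sketch, assuming a helper named `padToSameLength` (hypothetical, it is not in my code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class PadDemo {
    // Appends blanks to the shorter list until both lists have the same size.
    static void padToSameLength(List<String> def1, List<String> def2) {
        while (def1.size() < def2.size()) def1.add(" ");
        while (def2.size() < def1.size()) def2.add(" ");
    }

    public static void main(String[] args) {
        List<String> def1 = new ArrayList<>(Arrays.asList("Free", "fall"));
        List<String> def2 = new ArrayList<>(Arrays.asList("Free", "fall", "movement", "of"));
        padToSameLength(def1, def2);
        System.out.println(def1.size() + " " + def2.size()); // prints "4 4"
    }
}
```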

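Now, after that explanation, the problem is the following: when I run the main class, which calls the method containing the code above, a runtime exception pops up: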

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 1, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:571)
    at java.util.ArrayList.get(ArrayList.java:349)
    at main2.main(main2.java:75)

I don't understand why it finds any of the lists to be "empty". I have tried too many ways to solve it, and I hope I have given a good explanation.

It may help to know that if I assign mdef the size of the longest list instead of the shortest, that is:

int mdef = (def1.size() >= def2.size()) ? def1.size() : def2.size();

the error changes to:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 15, Size: 15
    at java.util.ArrayList.rangeCheck(ArrayList.java:571)
    at java.util.ArrayList.get(ArrayList.java:349)
    at asmethods.lcc.turnIntoFlex(lcc.java:55)
    at asmethods.lcc.calLcc(lcc.java:99)
    at main2.main(main2.java:73)

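where lcc is the class that contains the method turnIntoFlex, which in turn contains the code segment I am showing. Line 55 of lcc.java, inside turnIntoFlex, corresponds to the first line of the loop, that is: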

if (stopList.contains(def1.get(i))) { [...]

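Notes: the values of defA1 and defA2 are the two definitions, respectively. That is, def1 and def2 are initially lists in which each separate element is a word. I cannot check whether these lists are being populated by printing them, because the IndexOutOfBoundsException pops up the moment the loop starts. However, I did print the values of mdef, def1.size() and def2.size(), and they turn out to be 13 or 15, which shows that no list is empty before the "for" loop starts.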

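The mdef++ is something I added recently, and not exactly to solve this specific problem; the error had been showing up before I added the mdef++ part. As I explained, the intention is to increment mdef when the shortest list is extended (but only when the short list is extended), so that we iterate through all the words of the short list, and no more.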


3 Answers

0

Man, I think I got it. I modified the code, but I hope you can see what I did:

static public void main(String[] argv) {
    String[] sList = "of a in to under".split(" ");
    String[] definition1 = "Free fall descent of a body subjected only to the action of gravity"
            .split(" ");
    String[] definition2 = "Free fall movement of a body in a gravitational field under the influence of gravity"
            .split(" ");
    List<String> def1 = new ArrayList<String>();
    List<String> def2 = new ArrayList<String>();
    List<String> stopList = new ArrayList<String>();

    for (String word : definition1) {
        def1.add(word);
    }
    for (String word : definition2) {
        def2.add(word);
    }
    for (String word : sList) {
        stopList.add(word);
    }

    int mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); // Shortest
                                                                            // length

    for (int i = 0; i < mdef; i++) {
        System.out.println(i);
        if (!stopList.contains(def1.get(i)) && !stopList.contains(def2.get(i))) {
            continue;
        }

        else if (stopList.contains(def1.get(i)) && stopList.contains(def2.get(i))) {
            continue;
        }

        else if (!stopList.contains(def1.get(i)) && stopList.contains(def2.get(i))) {
            def1.add(i, " ");
            mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); // define mdef again
        }

        else if (stopList.contains(def1.get(i)) && !stopList.contains(def2.get(i))) {
            def2.add(i, " ");
            mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); // define mdef again
        }

    }

    for (String word : def1) {

        if (word.equals(" "))
            System.out.print("_ ");
        else
            System.out.print(word+" ");
    }

    System.out.println();

    for (String word : def2) {
        if (word.equals(" "))
            System.out.print("_ ");
        else
            System.out.print(word+" ");
    }           
}
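
The same loop can also be wrapped in a method so it is easier to test. This is only a sketch of the same idea (one shared index, with the bound re-read from the current list sizes on every pass); the names `FlexAlign`, `align`, and `BLANK` are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the loop above, for illustration only.
final class FlexAlign {
    static final String BLANK = " ";

    // Inserts blanks so that a stoplist word never faces a non-stoplist word
    // at the same index. The loop bound is recomputed from the current sizes
    // on every pass, so it can never exceed either list.
    static void align(List<String> def1, List<String> def2, List<String> stopList) {
        for (int i = 0; i < Math.min(def1.size(), def2.size()); i++) {
            boolean stop1 = stopList.contains(def1.get(i));
            boolean stop2 = stopList.contains(def2.get(i));
            if (stop1 && !stop2) {
                def2.add(i, BLANK);
            } else if (stop2 && !stop1) {
                def1.add(i, BLANK);
            }
        }
    }

    public static void main(String[] args) {
        List<String> stop = Arrays.asList("of", "a", "in", "to", "under");
        List<String> def1 = new ArrayList<>(Arrays.asList(
                "Free fall descent of a body subjected only to the action of gravity".split(" ")));
        List<String> def2 = new ArrayList<>(Arrays.asList(
                "Free fall movement of a body in a gravitational field under the influence of gravity".split(" ")));
        align(def1, def2, stop);
        System.out.println(String.join(",", def1));
        System.out.println(String.join(",", def2));
    }
}
```

For the two definitions from the question, def1 gets two blanks after "body" and def2 is left unchanged, so both lists end up with 15 elements.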
answered 2013-10-27T05:58:32.040
0

One problem with your code is that when you increment mdef, you do not check whether it now exceeds the length of the other list.

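For example, suppose def1 has 3 words and def2 has 4 words. mdef will start at 3. But suppose you add two blanks in a row to def1 and increment mdef twice, to 5. That now exceeds the length of def2, and if you keep iterating up to 5 it will cause an index-out-of-bounds exception on def2 in the else condition.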
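
This kind of overrun can be reproduced with the two definitions from your question when mdef starts at the longer length. The sketch below is my reconstruction, not your exact program: the class `MdefOverrun` and the method `firstFailingIndex` are invented for the demonstration, and the definitions are assumed to be split on single spaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical reproduction, for illustration only.
final class MdefOverrun {
    // Runs the loop from the question with mdef starting at the LONGER size.
    // Returns the index at which get() throws, or -1 if the loop completes.
    static int firstFailingIndex(List<String> def1, List<String> def2, List<String> stopList) {
        int mdef = Math.max(def1.size(), def2.size());
        int i = 0;
        try {
            for (; i < mdef; i++) {
                if (stopList.contains(def1.get(i))) {
                    if (!stopList.contains(def2.get(i))) {
                        def2.add(i, " ");
                        if (mdef == def2.size()) mdef++;
                    }
                } else if (stopList.contains(def2.get(i))) {
                    if (!stopList.contains(def1.get(i))) {
                        def1.add(i, " ");
                        // size() was already increased by add(), so this can
                        // push mdef past the size of both lists
                        if (mdef == def1.size()) mdef++;
                    }
                }
            }
        } catch (IndexOutOfBoundsException e) {
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> stop = Arrays.asList("of", "a", "in", "to", "under");
        List<String> def1 = new ArrayList<>(Arrays.asList(
                "Free fall descent of a body subjected only to the action of gravity".split(" ")));
        List<String> def2 = new ArrayList<>(Arrays.asList(
                "Free fall movement of a body in a gravitational field under the influence of gravity".split(" ")));
        System.out.println(firstFailingIndex(def1, def2, stop)); // prints 15
    }
}
```

After the second blank is added to def1, its size is 15 and equals mdef, so mdef becomes 16 while both lists hold only 15 elements. That is consistent with the "Index: 15, Size: 15" message in your second stack trace.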


Added later:

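Another serious problem with your code (which I thought of later) is that when you add a blank to a list (def1 or def2), this shifts the indices of all subsequent elements up by 1. So, for example, if you add a blank to def1 at position 0 when i is 0, then on the next pass through the loop, after i has been incremented to 1, you will be looking at the same word in def1 as on the previous pass. This may be the source of some of your exceptions (since it leads to continued looping until you exceed the length of the other list: problem #1 above).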
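
The shift itself is easy to see in isolation. A minimal sketch, assuming a throwaway class `ShiftDemo` and helper `wordAtNextIndexAfterInsert` (both hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of List.add(int, E), for illustration only.
final class ShiftDemo {
    // Inserts a blank at index i and returns the element now found at i + 1.
    static String wordAtNextIndexAfterInsert(List<String> list, int i) {
        list.add(i, " "); // the old element at i, and everything after it, moves up by one
        return list.get(i + 1);
    }

    public static void main(String[] args) {
        List<String> def1 = new ArrayList<>(Arrays.asList("body", "subjected", "only"));
        String before = def1.get(0);                           // "body"
        String after = wordAtNextIndexAfterInsert(def1, 0);    // also "body"
        System.out.println(before.equals(after));              // prints true
    }
}
```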


To correct both problems, you need to change your code to:

int i = 0;
int j = 0;
while (i < def1.size()  &&  j < def2.size()) {
    if (stopList.contains(def1.get(i)) && !stopList.contains(def2.get(j)))
        def2.add(j++, " ");
    else if (stopList.contains(def2.get(j)) && !stopList.contains(def1.get(i)))
        def1.add(i++, " ");
    ++i;
    ++j;
}

Note that you no longer need mdef at all in this implementation.

answered 2013-10-27T04:47:28.013
-2

Is this the exact code you are using? I just ran it and it works fine. This is what I used:

import java.util.*;

public class HelloWorld {

    public static void main(String []args) {
        String stoplist= "of a in to and under";
        String defA1 = "Free fall descent  of a body subjected only to            the   action of  gravity";
        String defA2 = "Free fall movement of a body in        a    gravitational field under  the influence of gravity";

        String[] sList = stoplist.split(" ");  //this is the stoplist
        String[] definition1 = defA1.split(" ");  //this is the array of words of the first definition
        String[] definition2 = defA2.split(" ");  //this is the array of words of the second definition
        List<String> def1 = new ArrayList<String>();
        List<String> def2 = new ArrayList<String>();
        List<String> stopList = new ArrayList<String>();

        for (String word : definition1) {
            def1.add(word); //I transform arrays into lists this way because I used to think that using .asList() was the problem.
        }
        for (String word : definition2) {
            def2.add(word);
        }
        for (String word : sList) {
            stopList.add(word);
        }

        int mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); //here mdef will have the value of the length of the shortest definition, and we are going to use the value of mdef to iterate later on.

        for (int i = 0; i < mdef; i++) {
            if (stopList.contains(def1.get(i))) {  //here I check if the first word of the first definition is also found in the stoplist.
                if (!stopList.contains(def2.get(i))) {  //If the word of def1 previously checked is in the stoplist, as well as the corresponding word in the second definition, then we won't add a " "(blank) space in the corresponding position of the second definition.
                    def2.add(i , " "); //here I add that blank space, only if the stoplist word in def1 corresponds to a non-stoplist word in def2. Again, we do this so the stoplist word in def1 corresponds to a blank space OR another stoplist word in def2.
                    if (mdef == def2.size())
                        mdef++; //In case the shortest definition is the definition to which we just added spaces, we increment mdef++, because that space added increases the length of the shortest definition, and to iterate in this recently extended definition, we have to increment the index with which we iterate.
                }
            } else if (stopList.contains(def2.get(i))) { //this else if does the same as the previous one, but checks the second definition instead of the first one, and adds blanks to def1 instead of def2 if necessary.
                if (!stopList.contains(def1.get(i))) {
                    def1.add(i , " ");
                    if (mdef == def1.size())
                        mdef++;
                }
            }
        }

        for (String word : def1) {
            System.out.print(word+",");
        }

        System.out.println();

        for (String word : def2) {
            System.out.print(word+",");
        }
    }
}
answered 2013-10-27T05:43:04.303