1

I have an Iterable<MyRecord> records. I iterate over the records and add each one to a LinkedList, as shown below.

for (MyRecord record: records){
    sortedList.addLast(record);
}

My iterable has 3 records, all with different values. But in the end, although sortedList contains 3 records, all three are the same. How come?

When I printed out the memory location, it's the same for all 3. What am I doing wrong?


If sortedList contains three identical records, all equal to the last element of the original records, it's possible that the iterator is reusing a temporary reference. You need to check the implementation of 'records'.

4

4 Answers

3

Actually, your comment reveals the missing link to why this is going wrong. You're using this in a Hadoop mapper or reducer. The trick with Hadoop is that it reuses the objects it hands you, so that it goes easy on the garbage collector. Every element you add to the list is therefore the same instance, which ends up holding the last record's values. What you have to do is make a copy of each of the objects in your source iterable (the MyRecords), and add those copies to your LinkedList.
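To make the effect concrete, here is a minimal sketch that reproduces it without Hadoop. Everything in it is an assumption for illustration: the MyRecord class (with a single int field and a copy constructor), the reusingIterable helper that imitates an iterator handing back one shared instance, and the collect helper. The real MyRecord is not shown in the question, so how you copy it will differ; if it is a Hadoop Writable, a copy constructor or something like WritableUtils.clone may be an option.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.NoSuchElementException;

class ReuseDemo {
    // Hypothetical stand-in for MyRecord; the real class is not shown in the question.
    static class MyRecord {
        int value;
        MyRecord(int value) { this.value = value; }
        // Copy constructor: creates an independent instance with the same contents.
        MyRecord(MyRecord other) { this.value = other.value; }
    }

    // Imitates an iterator that reuses one mutable instance for every element.
    static Iterable<MyRecord> reusingIterable(int... values) {
        return () -> new Iterator<MyRecord>() {
            private final MyRecord shared = new MyRecord(0);
            private int index = 0;

            @Override
            public boolean hasNext() { return index < values.length; }

            @Override
            public MyRecord next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.value = values[index++]; // overwrite the shared instance in place
                return shared;                  // same reference on every call
            }
        };
    }

    // Collects records into a LinkedList, optionally copying each one first.
    static LinkedList<MyRecord> collect(Iterable<MyRecord> records, boolean copy) {
        LinkedList<MyRecord> sortedList = new LinkedList<>();
        for (MyRecord record : records) {
            sortedList.addLast(copy ? new MyRecord(record) : record);
        }
        return sortedList;
    }

    // Extracts the values so the result is easy to print and compare.
    static List<Integer> values(List<MyRecord> list) {
        List<Integer> out = new ArrayList<>();
        for (MyRecord r : list) out.add(r.value);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(values(collect(reusingIterable(1, 2, 3), false))); // [3, 3, 3]
        System.out.println(values(collect(reusingIterable(1, 2, 3), true)));  // [1, 2, 3]
    }
}
```

Without the copy, the list holds three references to one object, so all entries show the last value written to it. With the copy, each entry keeps the value it had at the time it was added.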

answered 2012-12-20T08:02:46.970
1

Your question is clear and so is the code (also after reading the comments). This may not help, but maybe you can do a contains check before your add, like: if (!sortedList.contains(record)) sortedList.add...

I admit this might not really help: as far as I know, contains does not look at the memory location of the elements but checks for presence in the list using equals.

answered 2012-12-20T07:58:09.380
1

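If sortedList contains three identical records, all equal to the last element of the original records, the iterator may be reusing a temporary reference. You need to check the implementation of 'records'.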

answered 2012-12-20T07:51:23.133
0

When you are adding items to the list, put in a check:

if (sortedList.contains(record)) {
    System.out.println("Record is already available " + record);
} else {
    sortedList.addLast(record);
}

This will tell you whether the problem is due to identical records or something else.
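One caveat with this check is that contains compares with equals, so if MyRecord overrides equals it reports equal values, not a shared instance. A small sketch of an identity-based variant follows; the MyRecord class and the containsSameInstance helper are hypothetical names introduced only for this illustration.

```java
import java.util.LinkedList;
import java.util.List;

class IdentityCheckDemo {
    // Hypothetical stand-in for MyRecord; it does not override equals.
    static class MyRecord {
        int value;
        MyRecord(int value) { this.value = value; }
    }

    // Returns true only if the list already holds this exact object (reference comparison).
    static boolean containsSameInstance(List<?> list, Object candidate) {
        for (Object element : list) {
            if (element == candidate) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LinkedList<MyRecord> sortedList = new LinkedList<>();
        MyRecord shared = new MyRecord(1);
        sortedList.addLast(shared);

        // Simulate the source overwriting the same object for the next record.
        shared.value = 2;

        System.out.println(containsSameInstance(sortedList, shared));          // true
        System.out.println(containsSameInstance(sortedList, new MyRecord(2))); // false
    }
}
```

If the identity check reports true for a record that should be new, the source is handing out the same object repeatedly and each record needs to be copied before it is stored.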

answered 2012-12-20T08:47:41.033