java - 构建更好的 ArrayList

Question

当您ArrayList在列表的开头或中间插入时，您总是会有O(n)性能，但末尾的插入能够保持O(1)，因为您只是将它附加到列表的末尾。

我对如何在开头O(1)进行O(n)插入并平均进行插入有一个高级的想法，O(n/2)但我不确定底层数组结构是否可行。

我的点子

由于ArrayList实际上只是一个更大的数组，当它达到阈值时它会“调整”自身的大小，为什么它不能在列表的开头和结尾而不是仅在结尾处调整自己的大小？这将允许在开头插入，O(1)同时还允许您在中间重写插入以采用最短路径插入自身。（例如，如果您在 index 处插入，则移动比移动到1更容易）。02n

我认为 Sun 最初为什么不这样做的答案是因为Arrays.copyOfand by extensionSystem.arrayCopy的编写方式。两者都从列表的开头开始并复制到更大的数组中。我已经查看了这两种方法的 Sun 源代码，但对于我的 Java 技能水平来说，它有点先进。

问题

在我看来，这是否可行有两个主要问题：

1）首先，是否可以Arrays.copyOf重写System.arrayCopy/重载以将子数组插入到更大数组的中间？

2) Big O 这样做的后果是什么？

那么这是一个好主意还是我错过了一些明显会使这不可行的东西？

提前致谢！

score 2 · Accepted Answer

您的想法是完全可行的，并且对于双端队列实现很有意义，其中常见的情况是在两端添加和删除元素。这对一个没有多大意义，ArrayList因为它的 99% 的用例只会添加到最后。

至于你关于性质的问题System.arraycopy，我不确定我是否完全理解你，因为

a）是的，您可以使用另一个数组的内容覆盖一个数组的中间arraycopy，并且

b)arraycopy不插入数组，因为该操作对 Java 的固定大小数组没有意义。

最接近“插入数组”的想法是创建一个全新的、更大的数组，并 arraycopy从两个输入数组中获取适当的范围。由此产生的性能是你能得到的最好的。

score 1 · Accepted Answer

您所描述的与ArrayList.

an 元素的ArrayList顺序由它们插入的顺序定义（在这种情况下，您将始终将它们添加到末尾，当然除非您指定索引）。

如果顺序是由对象成员的值而不是插入顺序定义的，那么您需要一个LinkedList.

score 1 · Accepted Answer

是的，您可以在开头插入和在结尾插入一样有效。您只需要在中间而不是索引 0 处开始填充后备数组。为此，您不需要任何不同的 System.arraycopy，请参见下面的示例代码。

不仅存储后备数组 ( data) 和 current size，还存储当前start偏移量，并相应地考虑它。

BetterArrayList<T> extends AbstractList<T> {
  private int size;   // Used elements in data.
  private int start;  // Position of element 0 in data.
  private T[] data;   // data.length is the current total capacity.

  public BetterArrayList<T>(int initialCapacity) {
    data = new Object[initialCapacity];
    start = initialCapacity / 2;
    size = 0;
  }

  public void add(int index, T value) {
    if (index < size / 2) {  // Insert at the front or end?
      if (start == 0) {
        realloc(data.length * 2);
      }
      System.arraycopy(data, start, data, start - 1, index);
      start--;
    } else {
      if (size + start == data.length) {
        realloc(data.length * 2);
      }
      System.arraycopy(data, start + index, 
                       data, start + index + 1, size - index);
    }
    data[start + index] = value;
    size++;
  }    

  public T get(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException();
    }
    return data[start + index];
  }

  private void realloc(int newCapacity) {
    T[] newData = new Object[newCapacity];
    int newStart = newCapacity - size / 2;
    System.arraycopy(data, start, newData, newStart, size);
    start = newStart;
    data = newData;
  }

  public T set(int index, T value) {
    Value old = get(index);
    data[size + index] = value;
    return old;
  }

  public int size() {
    return size;
  }
}

我猜 Sun 没有走这条路，因为它在数组列表的典型用例中浪费了更多内存（在末尾追加）。而且您只会在前面获得更有效的插入，在其他任何地方插入仍然是 O(n) 因为内容仍然需要移动。

顺便说一句：取决于您的用例，但您可能对 Rope 数据结构感兴趣：http ://en.wikipedia.org/wiki/Rope_%28data_structure%29

java - 构建更好的 ArrayList

3 回答 3

Related

Reference