您可以让您的对象记住它是否已放入该哈希集中。如果将其添加到哈希集中,只需存储一个布尔字段即可。然后,您无需在 HashSet 上调用 contains,只需读取对象的字段值即可。此方法仅在对象被放入一个将检查布尔字段的哈希集中时才有效。
它可以扩展到使用java.util.BitSet
包含在哈希集中的对象中的恒定数量的哈希集,其中当哈希集的数量在算法开始之前已知时,每个哈希集可以由唯一的整数标识。
由于您说您contains
经常调用,因此用相等的现有对象(对象池)替换新生成的对象是有意义的,因为它的开销将通过仅包含单个字段读取来分摊。
根据要求,这里是一些示例代码。特殊集的实现比我机器上的普通哈希集快约 4 倍。然而,问题是这段代码在多大程度上反映了您的用例。
public class FastSetContains {
public static class SetContainedAwareObject {
private final int state;
private boolean contained;
public SetContainedAwareObject(int state) {
this.state = state;
}
public void markAsContained() {
contained = true;
}
public boolean isContained() {
return contained;
}
public void markAsRemoved() {
contained = false;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + state;
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
SetContainedAwareObject other = (SetContainedAwareObject) obj;
if (state != other.state)
return false;
return true;
}
}
public static class FastContainsSet extends
HashSet<SetContainedAwareObject> {
@Override
public boolean contains(Object o) {
SetContainedAwareObject obj = (SetContainedAwareObject) o;
if (obj.isContained()) {
return true;
}
return super.contains(o);
}
@Override
public boolean add(SetContainedAwareObject e) {
boolean add = super.add(e);
e.markAsContained();
return add;
}
@Override
public boolean addAll(Collection<? extends SetContainedAwareObject> c) {
boolean addAll = super.addAll(c);
for (SetContainedAwareObject o : c) {
o.markAsContained();
}
return addAll;
}
@Override
public boolean remove(Object o) {
boolean remove = super.remove(o);
((SetContainedAwareObject) o).markAsRemoved();
return remove;
}
@Override
public boolean removeAll(Collection<?> c) {
boolean removeAll = super.removeAll(c);
for (Object o : c) {
((SetContainedAwareObject) o).markAsRemoved();
}
return removeAll;
}
}
private static final Random random = new Random(1234L);
private static final int additionalObjectsPerIteration = 10;
private static final int iterations = 100000;
private static final int differentObjectCount = 100;
private static final int containsCountPerIteration = 50;
private static long nanosSpentForContains;
public static void main(String[] args) {
Map<SetContainedAwareObject, SetContainedAwareObject> objectPool = new HashMap<>();
// switch comment use different Set implementaiton
//Set<SetContainedAwareObject> set = new FastContainsSet();
Set<SetContainedAwareObject> set = new HashSet<>();
//warm up
for (int i = 0; i < 100; i++) {
addAdditionalObjects(objectPool, set);
callSetContainsForSomeObjects(set);
}
objectPool.clear();
set.clear();
nanosSpentForContains = 0L;
for (int i = 0; i < iterations; i++) {
addAdditionalObjects(objectPool, set);
callSetContainsForSomeObjects(set);
}
System.out.println("nanos spent for contains: " + nanosSpentForContains);
}
private static void callSetContainsForSomeObjects(
Set<SetContainedAwareObject> set) {
int containsCount = set.size() > containsCountPerIteration ? set.size()
: containsCountPerIteration;
int[] indexes = new int[containsCount];
for (int i = 0; i < containsCount; i++) {
indexes[i] = random.nextInt(set.size());
}
Object[] elements = set.toArray();
long start = System.nanoTime();
for (int index : indexes) {
set.contains(elements[index]);
}
long end = System.nanoTime();
nanosSpentForContains += (end - start);
}
private static void addAdditionalObjects(
Map<SetContainedAwareObject, SetContainedAwareObject> objectPool,
Set<SetContainedAwareObject> set) {
for (int i = 0; i < additionalObjectsPerIteration; i++) {
SetContainedAwareObject object = new SetContainedAwareObject(
random.nextInt(differentObjectCount));
SetContainedAwareObject pooled = objectPool.get(object);
if (pooled == null) {
objectPool.put(object, object);
pooled = object;
}
set.add(pooled);
}
}
}
另一个编辑:
使用以下作为 Set.contains 实现使其比普通哈希集快约 8 倍:
@Override
public boolean contains(Object o) {
SetContainedAwareObject obj = (SetContainedAwareObject) o;
return obj.isContained();
}
编辑:
这种技术与 OpenJPA 的类增强有一些共同点。OpenJPA 的增强使类能够跟踪其由实体管理器使用的持久状态。建议的方法使对象能够跟踪其自身是否包含在算法使用的集合中。