1

假设我有一个具有以下签名的方法:

public int indexOf(byte[] bytes, byte toFind, int offset, int length) {
  ...
}

此方法执行一些简单的操作,例如toFind在 [offset, offset+length) in 范围内查找字节bytes。我想预先检查偏移量和长度是否对bytes. 也就是说,偏移量和偏移量+长度以字节为单位。

显式检查类似于:

if (offset < 0 || offset > bytes.length - length) {
  throw ...;  // bad santa!
}

似乎我可以通过执行“虚拟”数组访问来更便宜地执行此操作(就发出的字节码而言,也许还有运行时性能):

public int indexOf(byte[] bytes, byte toFind, int offset, int length) {
  int dummy = bytes[offset] + bytes[offset + length - 1];
  ...
}

int dummy如果可以的话,我想摆脱+它们,或者降低它们的成本。编译器不喜欢独立访问bytes[offset];,大概是因为这样的表达式通常没有副作用并且毫无意义(但在这种情况下并非如此)。使用 dummy int 还会导致必须抑制的编译器警告。

关于如何用最少的字节码进行更改的任何建议(运行时性能在这里也很重要,但我怀疑大多数解决方案都被优化为与丢弃未使用部分相同的东西)。

4

4 回答 4

2

这个怎么样 ?

if ((bytes[offset] | bytes[offset+length-1])==0) { }
于 2013-01-03T23:11:37.263 回答
1

I hope that just executing bytes[offset] and bytes[offset + length - 1] is the cheapest way. The shortest way in JVM bytecode would be just to execute these expressions and leave it on the operand stack.

However, you can't do it in Java. You also can't use pop2 instruction (or two pop instructions), because bytes[something] is not a valid Java command. There are three potentially best ways:

  1. Use a method call like int java.lang.Math.max(int, int). This adds one 3-byte invokestatic instruction and one 1-byte pop instruction. So, it is a 4-byte overhead. You can save one byte, if you write a static dummy method with two int arguments and void result. An intelligent JVM optimizer would probably reduce this code to one pop2 instruction, since Math.max(...) has no side effect and you discard the result by pop instruction. However, I am not sure if this applies for Hotspot.
  2. Assign it to a local variable. One assign means one istore instruction. If you have five parameters (including this, because the method is not static), you use the generic 2-byte istore version instead of 1-byte istore_<n> (for n in {0, 1, 2, 3}). If you had at most three parameters, you would probably save something by reducing scope of the dummy variable.
  3. Compare it (=> generate boolean) and use an empty branch, i.e. if ((bytes[offset] == bytes[offset+length-1])) { }. In this case, you don't need any extra method (like max or pop2) or any extra local variable (which enlarges local variable table).

If you don't use any further optimizer and you don't modify the method signature to use less variables, the third way is probably a winner. In my simple test, it requires only 16 bytes for instrutions (some other implementations are equal, but not better) and does not require anything more in local variable table or constant pool. You probably can save several bytes by manual bytecode optimisations or by Proguard. But be careful, Proguard may optimize it too much and remove the array access. (I am not sure, but it claims in documentation that it may remove some NullPointerExceptions.)

See https://gist.github.com/4523924

于 2013-01-13T12:49:49.707 回答
1

就字节码长度而言,最便宜的方法是多个 JRE 类使用的方法,例如ByteBufferArrayList使用专用检查方法

从 Java 9 开始,有一个用于此目的的标准方法,它也开始取代这些内部检查方法,成为此类检查的中心位置,如果 JVM 的优化器有一种优化方法,它也知道它查看:
Objects.checkFromIndexSize​(offset, length, bytes.length);


与其他方法相比:

  • 使用带有虚拟变量的数组访问:

    public int indexOf1(byte[] bytes, byte toFind, int offset, int length) {
        int dummy = bytes[offset] + bytes[offset + length - 1];
        //...
    }
    

    编译为

     0: aload_1
     1: iload_3
     2: baload
     3: aload_1
     4: iload_3
     5: iload         4
     7: iadd
     8: iconst_1
     9: isub
    10: baload
    11: iadd
    12: istore        5
    14: ...
    
  • 使用数组访问和虚拟分支指令

    public int indexOf2(byte[] bytes, byte toFind, int offset, int length) {
        if ((bytes[offset] | bytes[offset+length-1])==0) { }
        //...
    }
    

    编译为

     0: aload_1
     1: iload_3
     2: baload
     3: aload_1
     4: iload_3
     5: iload         4
     7: iadd
     8: iconst_1
     9: isub
    10: baload
    11: ior
    12: ifne          15
    15: ...
    
  • 使用专用检查方法

    public int indexOf3(byte[] bytes, byte toFind, int offset, int length) {
        checkIndex(bytes, offset, length);
        //...
    }
    
    private void checkIndex(byte[] bytes, int offset, int length) {
       //...
    }
    

    编译为

     0: aload_0
     1: aload_1
     2: iload_3
     3: iload         4
     5: invokespecial #23                 // Method checkIndex:([BII)V
     8: ...
    

因此,在考虑使用它的代码时,委托会获胜。它也没有在调用方引入局部变量。只要一种以上的方法使用它,实现方法所需的额外空间就会得到回报。这样的方法通常是privateor static,由没有动态调度的指令调用并在运行时内联,因此不会有性能差异。JVM 通常会内联那么小的方法,无论它们是否是热点。

在将隐式数组边界检查的性能与显式比较进行比较时,没有任何理由应该比另一个更快。它们基本上是在做同样的事情,在任何一种情况下,如果 JVM 可以证明调用者总是只传递有效数字,它就可以忽略它们。

顺便说一句,Buffer.checkBounds引导您实现只有一个条件:

private void checkIndex(byte[] bytes, int offset, int length) {
    if((offset | length | (offset+length) | (bytes.length-(offset+length))) < 0)
        throw new IndexOutOfBoundsException();
}

与数组访问变体不同,这也处理length负数的情况(但offset+length会产生有效的索引)。

于 2017-01-19T13:36:54.057 回答
0

不确定字节码长度,但如何:

bytes[offset] |= bytes[offset];
bytes[offset + length - 1] |= bytes[offset + length - 1];
于 2013-01-03T23:07:35.623 回答