java - How to reference a part of an array?

Question

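Given a byte[] object, when we want to operate on it we often need only a part of it. In my particular case I get a byte[] from the wire, where the first 4 bytes describe the length of the message, the next 4 bytes the type of the message (an integer that maps to a concrete protobuf class), and the remaining bytes are the actual content of the message... like this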

length|type|content

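To parse this message I have to pass the content part to the specific class that knows how to parse an instance from it... The problem is that often no method is provided that lets you specify where in the array the parser should start reading...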

So we end up copying the remaining chunks of that array, which is inefficient...
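
For concreteness, the copying described here might look like the following minimal sketch. The names `FrameCopyDemo` and `contentOf` are made up for illustration, and the sketch assumes big-endian header fields and a length field that counts only the content bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FrameCopyDemo {
    static final int HEADER_SIZE = 8; // 4 bytes length + 4 bytes type

    // Hypothetical helper: returns a COPY of the content part of a frame.
    // Assumption: big-endian header, length counts the content bytes only.
    static byte[] contentOf(byte[] frame) {
        int length = ByteBuffer.wrap(frame).getInt(0); // absolute read at index 0
        return Arrays.copyOfRange(frame, HEADER_SIZE, HEADER_SIZE + length);
    }

    public static void main(String[] args) {
        byte[] frame = {0, 0, 0, 3, 0, 0, 0, 10, 'a', 'b', 'c'};
        byte[] content = contentOf(frame); // allocates a new 3-byte array
        System.out.println(new String(content, StandardCharsets.US_ASCII));
    }
}
```

Every call allocates a fresh array and copies the content into it, which is the cost the question is about.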

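As far as I know, in Java it is not possible to create another byte[] reference that actually refers to some original, bigger byte[] array with just 2 indexes (this was the approach String used, which led to memory leaks)...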

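I wonder how we solve situations like this? I suppose giving up on protobuf just because it does not provide some parseFrom(byte[], int, int) makes no sense... protobuf is only an example; anything could lack that API...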

So does this force us to write inefficient code, or is there something that can be done? (apart from adding that method)...

score 2 · Accepted Answer

Normally you would tackle this kind of thing with streams.

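A stream is an abstraction for reading just what you need to process the current block of data. So you can read the correct number of bytes into a byte array and pass that to your parse function.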

You ask "So does this force us to write inefficient code, or is there something that can be done?"

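Usually you get your data in the form of a stream, and then using the technique demonstrated below will be more performant, since you skip making one copy. (Two copies instead of three: one by the OS and one by you. You skip making a copy of the total byte array before you start parsing.) If you actually start out with a byte[] but it is constructed by yourself, then you may want to construct an object such as { int length, int type, byte[] contentBytes } instead and pass contentBytes to your parse function.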
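
A minimal sketch of such an object, for the case where you build the message yourself. The class name `WireMessage` and its `toWire()` helper are hypothetical and not part of any library; the length is derived from the content array rather than stored separately.

```java
import java.nio.ByteBuffer;

// Hypothetical holder: header fields and content are kept apart, so the
// content array can be handed to a parser without slicing a bigger array.
final class WireMessage {
    final int type;
    final byte[] contentBytes;

    WireMessage(int type, byte[] contentBytes) {
        this.type = type;
        this.contentBytes = contentBytes;
    }

    int length() {
        return contentBytes.length;
    }

    // Serialize to length|type|content (big-endian) only when the bytes
    // actually have to go on the wire.
    byte[] toWire() {
        return ByteBuffer.allocate(8 + contentBytes.length)
                .putInt(contentBytes.length)
                .putInt(type)
                .put(contentBytes)
                .array();
    }
}
```

With this shape, `contentBytes` can be passed to a `parseFrom(byte[])`-style function directly.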

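If you really, really have to start out with a byte[], then the technique below is just a more convenient way to parse it; it will not be more performant.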

So suppose you got a buffer of bytes from somewhere and you want to read the contents of that buffer. First you convert it to a stream:

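// The snippets in this answer need java.io.* and java.util.* imported.
private static List<Content> read(byte[] buffer) {
    try {
        ByteArrayInputStream bytesStream = new ByteArrayInputStream(buffer);
        return read(bytesStream);
    } catch (IOException e) {
        // Rethrow unchecked so every code path either returns or throws.
        throw new UncheckedIOException(e);
    }
}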

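The above function wraps the byte array in a stream and passes it to the function that does the actual reading. If you can start out from a stream, then obviously you can skip the above step and pass that stream directly into the function below: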

private static List<Content> read(InputStream bytesStream) throws IOException {
    List<Content> results = new ArrayList<Content>();
    try {
        // read the content...
        Content content1 = readContent(bytesStream);
        results.add(content1);

        // I don't know if there's more than one content block but assuming
        // that there is, you can just continue reading the stream...
        //
        // If it's a fixed number of content blocks then just read them one
        // after the other... Otherwise make this a loop
        Content content2 = readContent(bytesStream);
        results.add(content2);
    } finally {
        bytesStream.close();
    }
    return results;
}

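Since your byte array contains contents, you will want to read Content blocks from your stream. Since you have a length and a type field, I am assuming that you have different kinds of content blocks. The next function reads the length and the type and, depending on the type it read, passes the processing of the content bytes on to the proper class: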

private static Content readContent(InputStream stream) throws IOException {
    final int CONTENT_TYPE_A = 10;
    final int CONTENT_TYPE_B = 11;

    // wrap the InputStream in a DataInputStream because the latter has
    // convenience functions to convert bytes to integers, etc.
    // Note that DataInputStream handles the stream in a BigEndian way,
    // so check that your bytes are in the same byte order. If not you'll
    // have to find another stream reader that can convert to ints from
    // LittleEndian byte order.
    DataInputStream data = new DataInputStream(stream);
    int length = data.readInt();
    int type = data.readInt();

    // I'm assuming that above length field was the number of bytes for the
    // content. So, read length number of bytes into a buffer and pass that
    // to your `parseFrom(byte[])` function. readFully keeps reading until
    // the buffer is full and throws EOFException if the stream ends early;
    // a single read() call may legitimately return fewer bytes.
    if (length < 0)
        throw new IOException("Negative content length: " + length);
    byte[] contentBytes = new byte[length];
    data.readFully(contentBytes);

    switch (type) {
        case CONTENT_TYPE_A:
            return ContentTypeA.parseFrom(contentBytes);
        case CONTENT_TYPE_B:
            return ContentTypeB.parseFrom(contentBytes);
        default:
            throw new UnsupportedOperationException();
    }
}
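
If the wire format turns out to be little-endian, one standard-library option is `java.nio.ByteBuffer` with `ByteOrder.LITTLE_ENDIAN`. The following is a minimal sketch; the class and helper names (`LittleEndianDemo`, `readIntLittleEndian`) are made up for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianDemo {
    // Hypothetical helper: reads exactly 4 bytes from the stream and
    // interprets them as a little-endian int.
    static int readIntLittleEndian(InputStream stream) throws IOException {
        byte[] raw = new byte[4];
        new DataInputStream(stream).readFully(raw); // EOFException if short
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }
}
```

An equivalent shortcut is `Integer.reverseBytes(data.readInt())`, which swaps the byte order of the big-endian value that `DataInputStream` returns.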

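I have made up the following content classes. I don't know what protobuf is, but it apparently can convert from a byte array to an actual object with its parseFrom(byte[]) function, so take this as pseudocode: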

class Content {
    // common functionality
}

class ContentTypeA extends Content {
    public static ContentTypeA parseFrom(byte[] contentBytes) {
        return null; // do the actual parsing of a type A content 
    }
}

class ContentTypeB extends Content {
    public static ContentTypeB parseFrom(byte[] contentBytes) {
        return null; // do the actual parsing of a type B content
    }
}
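
The `read(InputStream)` function above reads a fixed two blocks and notes that an unknown number of blocks calls for a loop. One way such a loop could look is sketched below. It is self-contained rather than reusing `readContent`, the names `FrameLoopDemo` and `readAllContents` are hypothetical, and it assumes big-endian headers and a stream that ends exactly on a block boundary.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class FrameLoopDemo {
    // Hypothetical helper: returns the raw content bytes of every
    // length|type|content block until the stream ends on a block boundary.
    static List<byte[]> readAllContents(InputStream stream) throws IOException {
        DataInputStream data = new DataInputStream(stream);
        List<byte[]> contents = new ArrayList<>();
        int first;
        // read() returns -1 only at end of stream, so a clean end is
        // detected before a new header starts.
        while ((first = data.read()) >= 0) {
            // Rebuild the big-endian length from the byte already consumed
            // plus three more; a truncated header throws EOFException.
            int length = (first << 24)
                    | (data.readUnsignedByte() << 16)
                    | (data.readUnsignedByte() << 8)
                    | data.readUnsignedByte();
            if (length < 0)
                throw new IOException("Negative content length: " + length);
            int type = data.readInt(); // dispatch on this as readContent does
            byte[] content = new byte[length];
            data.readFully(content);
            contents.add(content);
        }
        return contents;
    }
}
```

In real code the loop body would switch on `type` and call the matching `parseFrom(byte[])`, as `readContent` does.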

score 1 · Accepted Answer

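In Java, an array is not just a section of memory; it is an object with some additional fields (at least the length). So you cannot link to a part of an array. You should either: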

use array-copy functions, or
implement and use some algorithm that works on only a part of the byte array.
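
Both options can be sketched with the standard library. The class and method names below (`RangeDemo`, `copyRange`, `checksumOfRange`) are made up for illustration; `Arrays.copyOfRange` and `CRC32.update(byte[], int, int)` are real JDK calls, the latter being one example of an API that already accepts an offset and a length.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

class RangeDemo {
    // Option 1: copy the wanted range into a new array.
    static byte[] copyRange(byte[] source, int offset, int length) {
        return Arrays.copyOfRange(source, offset, offset + length);
    }

    // Option 2: process the range in place, without copying.
    // CRC32.update takes (array, offset, length) directly.
    static long checksumOfRange(byte[] source, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(source, offset, length);
        return crc.getValue();
    }
}
```

The second form only helps when the consuming code is written, or can be changed, to accept the extra two parameters.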

score 1 · Accepted Answer

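I'm afraid there seems to be no way to create a view over an array (e.g. the array equivalent of List#subList()). A workaround might be to make your parsing methods take a reference to the entire array and two indices (or an index and a length) that specify the sub-array the method should work on.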

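This would not prevent the methods from reading or modifying sections of the array they should not touch. If that is an issue, perhaps a ByteArrayView class could add a little safety: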

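public class ByteArrayView {
  private final byte[] array;
  private final int start;
  private final int length;

  public ByteArrayView(byte[] array, int start, int length) {
    if (start < 0 || length < 0 || start > array.length - length) {
      throw new IndexOutOfBoundsException();
    }
    this.array = array;
    this.start = start;
    this.length = length;
  }

  // Returns the byte at the given index, relative to the start of the view.
  public byte get(int index) {
    if (index < 0 || index >= length) {
      throw new ArrayIndexOutOfBoundsException(index);
    }
    return array[start + index];
  }
}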

On the other hand, if performance is a concern, a method call to get() for fetching each byte is probably undesirable.

The code is for illustration; it has not been tested or anything.

EDIT

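On a second reading of my own answer, I realized that I should point this out: ByteArrayView copies each byte you read from the original array, just byte by byte rather than as a chunk. It does not adequately address the OP's concerns.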
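
For byte arrays specifically, the standard library's `java.nio.ByteBuffer` comes close to the `ByteArrayView` idea: `ByteBuffer.wrap(array, offset, length)` shares the backing array instead of copying it. The catch is that the consumer must accept a `ByteBuffer` rather than a `byte[]`. A minimal sketch, with the made-up names `BufferViewDemo` and `viewOf`:

```java
import java.nio.ByteBuffer;

class BufferViewDemo {
    // Returns a read-only view of array[offset .. offset + length) that
    // shares the original storage; no bytes are copied.
    static ByteBuffer viewOf(byte[] array, int offset, int length) {
        return ByteBuffer.wrap(array, offset, length).slice().asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        byte[] frame = {0, 0, 0, 3, 0, 0, 0, 10, 'a', 'b', 'c'};
        ByteBuffer content = viewOf(frame, 8, 3);
        frame[8] = 'z'; // change the original array...
        System.out.println((char) content.get(0)); // ...and the view reflects it
    }
}
```

After `slice()`, index 0 of the view maps to `offset` in the original array, and `asReadOnlyBuffer()` keeps the consumer from writing through the view.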
