I'm fairly familiar with how objects are laid out on the heap in HotSpot, but not with how Android does it.

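For example, in a 32-bit HotSpot JVM, an object on the heap is implemented as an 8-byte header, followed by the object's fields (one byte for a boolean, four bytes for a reference, and everything else as expected), laid out in a specific order (with some special rules for fields from superclasses), and padded out to a multiple of 8 bytes.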
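
To make that arithmetic concrete, here is a rough sketch of the size estimate I have in mind for 32-bit HotSpot. The function name and parameters are made up for illustration, and the sketch assumes the fields pack without internal gaps, which ignores per-field alignment and the superclass rules mentioned above.

```cpp
#include <cstddef>

// Round n up to the next multiple of 8 bytes.
inline std::size_t roundUp8(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Rough size estimate for an object on a 32-bit HotSpot JVM, following the
// description above: 8-byte header, 1 byte per boolean, 4 bytes per int or
// reference, 8 bytes per long, total padded to a multiple of 8.
// Assumption: fields pack with no internal alignment gaps.
inline std::size_t hotspotSizeEstimate(std::size_t booleans,
                                       std::size_t ints,
                                       std::size_t longs,
                                       std::size_t references) {
    const std::size_t kHeaderBytes = 8;
    std::size_t fieldBytes = booleans * 1 + ints * 4 + longs * 8 + references * 4;
    return roundUp8(kHeaderBytes + fieldBytes);
}
```

By this estimate, an object with one boolean, one int and one reference needs 8 + 9 = 17 bytes, which is padded to 24.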

I've done some research, but I can't find any Android-specific information.

(I'm interested in optimizing some extremely widely used data structures to minimize memory consumption on Android.)

1 Answer

dalvik/vm/oo/Object.h is your friend here. The comment for struct Object says:

/*
 * There are three types of objects:
 *  Class objects - an instance of java.lang.Class
 *  Array objects - an object created with a "new array" instruction
 *  Data objects - an object that is neither of the above
 *
 * We also define String objects.  At present they're equivalent to
 * DataObject, but that may change.  (Either way, they make some of the
 * code more obvious.)
 *
 * All objects have an Object header followed by type-specific data.
 */

java.lang.Class objects are special; their layout is defined by the ClassObject struct in Object.h. Array objects are simple:

struct ArrayObject : Object {
    /* number of elements; immutable after init */
    u4              length;

    /*
     * Array contents; actual size is (length * sizeof(type)).  This is
     * declared as u8 so that the compiler inserts any necessary padding
     * (e.g. for EABI); the actual allocation may be smaller than 8 bytes.
     */
    u8              contents[1];
};

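For arrays, the element widths are defined in vm/oo/Array.cpp. Booleans have width 1, object references have width sizeof(Object*) (usually 4), and all other primitive types have their expected (packed) width.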
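
As a rough illustration of what those widths mean for memory use, the sketch below estimates the footprint of an array on a 32-bit VM. None of these names come from Dalvik. The 16-byte contents offset is an assumption: an 8-byte Object header (class pointer plus lock word), the 4-byte length, and 4 bytes of padding so that the u8-typed contents are 8-byte aligned, as on ARM EABI. Rounding the total up to 8 bytes is also an assumption, based on the source comment that objects are allocated on 64-bit boundaries.

```cpp
#include <cstddef>

// Hypothetical element categories for this sketch (not a Dalvik type).
enum class ElemType { Boolean, Byte, Char, Short, Int, Float, Long, Double, Reference };

// Element width in bytes, following the description above.
inline std::size_t elementWidth(ElemType t) {
    switch (t) {
        case ElemType::Boolean:
        case ElemType::Byte:      return 1;
        case ElemType::Char:
        case ElemType::Short:     return 2;
        case ElemType::Int:
        case ElemType::Float:     return 4;
        case ElemType::Long:
        case ElemType::Double:    return 8;
        case ElemType::Reference: return 4;  // sizeof(Object*) on a 32-bit VM
    }
    return 0;  // not reached for valid enum values
}

// Assumption: header (8) + length (4) + alignment padding (4).
const std::size_t kArrayContentsOffset = 16;

// Estimated bytes occupied by an array with `length` elements of type `t`,
// rounded up to an 8-byte allocation boundary (assumed).
inline std::size_t arrayBytesEstimate(ElemType t, std::size_t length) {
    std::size_t raw = kArrayContentsOffset + length * elementWidth(t);
    return (raw + 7) & ~static_cast<std::size_t>(7);
}
```

Under these assumptions a boolean[10] takes 16 + 10 = 26 bytes, rounded up to 32, while an int[10] takes 56.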

Data objects are really simple:

/*
 * Data objects have an Object header followed by their instance data.
 */
struct DataObject : Object {
    /* variable #of u4 slots; u8 uses 2 slots */
    u4              instanceData[1];
};

The layout of a DataObject (that is, every instance of a class other than Class) is governed by computeFieldOffsets in vm/oo/Class.cpp. According to the comment there:

/*
 * Assign instance fields to u4 slots.
 *
 * The top portion of the instance field area is occupied by the superclass
 * fields, the bottom by the fields for this class.
 *
 * "long" and "double" fields occupy two adjacent slots.  On some
 * architectures, 64-bit quantities must be 64-bit aligned, so we need to
 * arrange fields (or introduce padding) to ensure this.  We assume the
 * fields of the topmost superclass (i.e. Object) are 64-bit aligned, so
 * we can just ensure that the offset is "even".  To avoid wasting space,
 * we want to move non-reference 32-bit fields into gaps rather than
 * creating pad words.
 *
 * In the worst case we will waste 4 bytes, but because objects are
 * allocated on >= 64-bit boundaries, those bytes may well be wasted anyway
 * (assuming this is the most-derived class).
 *
 * Pad words are not represented in the field table, so the field table
 * itself does not change size.
 *
 * The number of field slots determines the size of the object, so we
 * set that here too.
 *
 * This function feels a little more complicated than I'd like, but it
 * has the property of moving the smallest possible set of fields, which
 * should reduce the time required to load a class.
 *
 * NOTE: reference fields *must* come first, or precacheReferenceOffsets()
 * will break.
 */

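So, superclass fields come first (as usual), followed by reference-type fields, followed by a single 32-bit field (if one is available, and if padding is needed because there is an odd number of 32-bit reference fields), followed by 64-bit fields. The regular 32-bit fields follow. Note that every field is either 32 or 64 bits wide (shorter primitives are padded). In particular, at this time the VM does not store byte/char/short/boolean fields in fewer than 4 bytes, though in theory it could certainly support this.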
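
The ordering can be sketched roughly as follows. This is a simplified model written for illustration, not the Dalvik code: Kind, Field, Layout and layoutFields are hypothetical names, the 8-byte starting offset for a direct subclass of Object is an assumption, and the real computeFieldOffsets may pick a different 32-bit field to fill the alignment gap.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical field categories for this sketch (not Dalvik types).
enum class Kind {
    Reference,  // object reference, one u4 slot on a 32-bit VM
    Narrow,     // boolean/byte/char/short/int/float, one u4 slot each
    Wide        // long/double, two adjacent u4 slots
};

struct Field {
    std::string name;
    Kind kind;
    std::size_t offset;  // byte offset from the start of the object
    Field(std::string n, Kind k) : name(std::move(n)), kind(k), offset(0) {}
};

struct Layout {
    std::vector<Field> fields;  // fields in their final order
    std::size_t instanceSize;   // bytes used, before allocator rounding
};

// superSize: bytes taken by the header plus all superclass fields
// (assumed to be 8 for a class that directly extends Object).
Layout layoutFields(const std::vector<Field>& declared, std::size_t superSize) {
    Layout out;
    std::size_t offset = superSize;
    std::vector<bool> placed(declared.size(), false);

    auto place = [&](std::size_t i, std::size_t width) {
        Field f = declared[i];
        f.offset = offset;
        out.fields.push_back(f);
        offset += width;
        placed[i] = true;
    };

    // 1. Reference fields come first.
    bool hasWide = false;
    for (std::size_t i = 0; i < declared.size(); ++i) {
        if (declared[i].kind == Kind::Reference) place(i, 4);
        if (declared[i].kind == Kind::Wide) hasWide = true;
    }

    // 2. If the 64-bit fields would start misaligned, fill the gap with one
    //    32-bit field, or with a pad word when none is available.
    if (hasWide && offset % 8 != 0) {
        bool filled = false;
        for (std::size_t i = 0; i < declared.size() && !filled; ++i) {
            if (declared[i].kind == Kind::Narrow) {
                place(i, 4);
                filled = true;
            }
        }
        if (!filled) offset += 4;  // pad word, not recorded as a field
    }

    // 3. The 64-bit fields, then 4. the remaining 32-bit fields.
    for (std::size_t i = 0; i < declared.size(); ++i)
        if (declared[i].kind == Kind::Wide) place(i, 8);
    for (std::size_t i = 0; i < declared.size(); ++i)
        if (!placed[i]) place(i, 4);

    out.instanceSize = offset;
    return out;
}
```

For a hypothetical class that directly extends Object and declares `Object next; long key; int hash; boolean flag;`, this model puts next at offset 8, hash at 12, key at 16 and flag at 24, for 28 bytes in total, which the allocator would presumably round up to 32. The boolean costs a full 4 bytes here, which is the point to keep in mind when trimming fields.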

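Note that all of this is based on reading the Dalvik source code as of commit 43241340 (Feb 6, 2013). Since this aspect of the VM does not appear to be publicly documented, you should not rely on it as a stable description of the VM's object layout: it may change over time.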

Answered 2013-02-08T22:18:00.063