2

我有以下课程。

class MyObject implements Serializable {
    private String key;
    private String val;
    private int num;

    MyObject(String a, String b, int c) {
        this.key = a;
        this.val = b;
        this.num = c;
    }
}

我需要创建一个对象列表,重复调用以下方法(比如 10K 次或更多)

public void addToIndex(String a, String b, int c) {
    MyObject ob = new MyObject(a,b,c);
    list.add(ob); // List<MyObject>
}

我使用分析器来查看内存占用,并且由于每次创建对象而增加了很多。有没有更好的方法来做到这一点?我正在将列表写入磁盘。

编辑:这就是我在列表完全填充后编写的方式。一旦内存超过一个值(列表的大小),有没有办法追加。

ObjectOutputStream oos = new ObjectOutputStream(
                        new DeflaterOutputStream(new FileOutputStream(
                                list)));
                oos.writeObject(list);
                oos.close();
4

2 回答 2

5

我使用分析器来查看内存占用,并且由于每次创建对象而增加了很多。有没有更好的方法来做到这一点?

在您的情况下,Java 序列化不会使用那么多内存。这样做会产生大量垃圾,远远超出您的想象。它还有一个非常详细的输出,可以像你一样使用压缩来改进。

改善这种情况的一个简单方法是使用 Externalizable 而不是 Serializable。这可以显着减少产生的垃圾并使其更紧凑。使用较低的开销也可以更快。

顺便说一句,如果您对列表本身使用自定义序列化,您可以获得更好的性能。

public class Main {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        List<MyObject> list = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            list.add(new MyObject("key-" + i, "value-" + i, i));
        }

        for (int i = 0; i < 10; i++) {
            timeJavaSerialization(list);
            timeCustomSerialization(list);
            timeCustomSerialization2(list);
        }
    }

    private static void timeJavaSerialization(List<MyObject> list) throws IOException, ClassNotFoundException {
        File file = File.createTempFile("java-serialization", "dz");
        long start = System.nanoTime();
        ObjectOutputStream oos = new ObjectOutputStream(
                new DeflaterOutputStream(new FileOutputStream(file)));
        oos.writeObject(list);
        oos.close();
        ObjectInputStream ois = new ObjectInputStream(
                new InflaterInputStream(new FileInputStream(file)));
        Object o = ois.readObject();
        ois.close();
        long time = System.nanoTime() - start;
        long size = file.length();
        System.out.printf("Java serialization uses %,d bytes and too %.3f seconds.%n",
                size, time / 1e9);
    }

    private static void timeCustomSerialization(List<MyObject> list) throws IOException {
        File file = File.createTempFile("custom-serialization", "dz");
        long start = System.nanoTime();
        MyObject.writeList(file, list);
        Object o = MyObject.readList(file);
        long time = System.nanoTime() - start;
        long size = file.length();
        System.out.printf("Faster Custom serialization uses %,d bytes and too %.3f seconds.%n",
                size, time / 1e9);
    }

    private static void timeCustomSerialization2(List<MyObject> list) throws IOException {
        File file = File.createTempFile("custom2-serialization", "dz");
        long start = System.nanoTime();
        {
            DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(
                    new DeflaterOutputStream(new FileOutputStream(file))));
            dos.writeInt(list.size());
            for (MyObject mo : list) {
                dos.writeUTF(mo.key);
            }
            for (MyObject mo : list) {
                dos.writeUTF(mo.val);
            }
            for (MyObject mo : list) {
                dos.writeInt(mo.num);
            }
            dos.close();
        }
        {
            DataInputStream dis = new DataInputStream(new BufferedInputStream(
                    new InflaterInputStream(new FileInputStream(file))));
            int len = dis.readInt();
            String[] keys = new String[len];
            String[] vals = new String[len];
            List<MyObject> list2 = new ArrayList<>(len);
            for (int i = 0; i < len; i++) {
                keys[i] = dis.readUTF();
            }
            for (int i = 0; i < len; i++) {
                vals[i] = dis.readUTF();
            }
            for (int i = 0; i < len; i++) {
                list2.add(new MyObject(keys[i], vals[i], dis.readInt()));
            }
            dis.close();
        }
        long time = System.nanoTime() - start;
        long size = file.length();
        System.out.printf("Compact Custom serialization uses %,d bytes and too %.3f seconds.%n",
                size, time / 1e9);
    }


    static class MyObject implements Serializable {
        private String key;
        private String val;
        private int num;

        MyObject(String a, String b, int c) {
            this.key = a;
            this.val = b;
            this.num = c;
        }

        MyObject(DataInput in) throws IOException {
            key = in.readUTF();
            val = in.readUTF();
            num = in.readInt();
        }

        public void writeTo(DataOutput out) throws IOException {
            out.writeUTF(key);
            out.writeUTF(val);
            out.writeInt(num);
        }

        public static void writeList(File file, List<MyObject> list) throws IOException {
            DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(
                    new DeflaterOutputStream(new FileOutputStream(file))));
            dos.writeInt(list.size());
            for (MyObject mo : list) {
                mo.writeTo(dos);
            }
            dos.close();
        }

        public static List<MyObject> readList(File file) throws IOException {
            DataInputStream dis = new DataInputStream(new BufferedInputStream(
                    new InflaterInputStream(new FileInputStream(file))));
            int len = dis.readInt();
            List<MyObject> list = new ArrayList<>(len);
            for (int i = 0; i < len; i++) {
                list.add(new MyObject(dis));
            }
            dis.close();
            return list;
        }
    }
}

终于打印出来了

Java serialization uses 61,168 bytes and too 0.061 seconds.
Faster Custom serialization uses 62,519 bytes and too 0.024 seconds.
Compact Custom serialization uses 68,225 bytes and too 0.020 seconds.

如您所见,我尝试使文件更紧凑,而不是使其更快,这是您应该测试性能改进的一个很好的例子。

于 2013-10-05T16:42:23.380 回答
0

考虑使用快速序列化。它在源代码级别与 JDK 序列化兼容,并减少了臃肿。此外,它击败了大多数手工制作的“可外部化”序列化,因为它不仅是 JDK 序列化实现本身,而且库存 JDK 的低效输入/输出流实现也会损害性能。

http://code.google.com/p/fast-serialization/

于 2013-10-08T11:25:40.810 回答