
Inspired by this question, I created a small benchmark program to compare ProtoBuf, BinaryFormatter and Json.NET. The benchmark itself is a small console app based on https://github.com/sidshetye/SerializersCompare. Feel free to add/improve it; adding a new serializer is trivial. Anyway, my results are:
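
For reference, here is a minimal sketch of the measurement shape - not the actual SerializersCompare code; the Person type and the loop counts are placeholders - timing repeated serialize => deserialize round-trips with a Stopwatch and reporting the average per iteration:

    using System;
    using System.Diagnostics;
    using System.IO;
    using ProtoBuf; // protobuf-net

    [ProtoContract]
    class Person
    {
        [ProtoMember(1)] public int Id { get; set; }
        [ProtoMember(2)] public string Name { get; set; }
    }

    static class Bench
    {
        static void Main()
        {
            var item = new Person { Id = 1, Name = "test" };
            foreach (var loops in new[] { 1, 2, 4, 8, 16, 32 })
            {
                var sw = Stopwatch.StartNew();
                for (int i = 0; i < loops; i++)
                {
                    using (var ms = new MemoryStream())
                    {
                        Serializer.Serialize(ms, item);
                        ms.Position = 0;
                        var copy = Serializer.Deserialize<Person>(ms);
                    }
                }
                sw.Stop();
                // average time per serialize => deserialize round-trip
                Console.WriteLine("{0,5}  {1:F4} ms", loops, sw.Elapsed.TotalMilliseconds / loops);
            }
        }
    }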

        Binary Formatter         ProtoBuf          Json.NET     ServiceStackJson   ServiceStackJSV
 Loop     Size:512 bytes    Size:99 bytes    Size:205 bytes      Size:205 bytes     Size:181 bytes
    1         16.1242 ms      151.6354 ms       277.2085 ms         129.8321 ms        146.3547 ms
    2          0.0673 ms        0.0349 ms         0.0727 ms           0.0343 ms          0.0370 ms
    4          0.0292 ms        0.0085 ms         0.0303 ms           0.0145 ms          0.0148 ms
    8          0.0255 ms        0.0069 ms         0.0017 ms           0.0216 ms          0.0129 ms
   16          0.0011 ms        0.0064 ms         0.0282 ms           0.0114 ms          0.0120 ms
   32          0.0164 ms        0.0061 ms         0.0334 ms           0.0112 ms          0.0120 ms
   64          0.0347 ms        0.0073 ms         0.0296 ms           0.0121 ms          0.0013 ms
  128          0.0312 ms        0.0058 ms         0.0266 ms           0.0062 ms          0.0117 ms
  256          0.0256 ms        0.0097 ms         0.0448 ms           0.0087 ms          0.0116 ms
  512          0.0261 ms        0.0058 ms         0.0307 ms           0.0127 ms          0.0116 ms
 1024          0.0258 ms        0.0057 ms         0.0309 ms           0.0113 ms          0.0122 ms
 2048          0.0257 ms        0.0059 ms         0.0297 ms           0.0125 ms          0.0121 ms
 4096          0.0247 ms        0.0060 ms         0.0290 ms           0.0119 ms          0.0120 ms
 8192          0.0247 ms        0.0060 ms         0.0286 ms           0.0115 ms          0.0121 ms

Disclaimers

  1. The results above are from a Windows VM - stopwatch/timer values for very small intervals may not be 100% accurate compared to a bare-metal OS, so ignore the ultra-low values in the table above.

  2. For ServiceStack, the Json and JSV scores are from two separate runs. Because they share the same underlying ServiceStack library, running one right after the other affects the "cold start" loop-1 score of the next run (its "warm start" is fast).

BinaryFormatter produces the largest output, but it is also the fastest for a single serialize => deserialize loop. However, once we run the serialize => deserialize code in a tight loop, ProtoBuf is super fast.

Question #1: Why is ProtoBuf that much slower for a single serialization => deserialization loop?

Question #2: From a practical perspective, what can we do to get past that "cold start"? Run at least one object (of any type) through it? Run every (critical) object type through it?


1 Answer


Question #1: Why is ProtoBuf that much slower for a single serialization => deserialization loop?

Because it does a metric ton of work to analyse the model and prepare the strategy; I've spent a lot of time making the generated strategy as insanely fast as possible, but it could be that I've skimped on optimizations in the meta-programming layer. I'm happy to add that as an item to look at, to reduce the time on the first pass. Of course, on the other hand, the meta-programming layer is still twice as fast as Json.NET's equivalent pre-processing ;p
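
To make that concrete, here is a hedged sketch (reusing the hypothetical Person type from the benchmark sketch above): the first call pays the model-analysis and compilation cost, while the second call reuses the prepared strategy.

    var sw = Stopwatch.StartNew();
    using (var ms = new MemoryStream())
        Serializer.Serialize(ms, new Person { Id = 1, Name = "warmup" }); // pays the meta-programming cost
    Console.WriteLine("first call:  {0:F2} ms", sw.Elapsed.TotalMilliseconds);

    sw.Restart();
    using (var ms = new MemoryStream())
        Serializer.Serialize(ms, new Person { Id = 1, Name = "warm" }); // strategy already prepared
    Console.WriteLine("second call: {0:F2} ms", sw.Elapsed.TotalMilliseconds);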

Question #2: From a practical perspective, what can we do to get past that "cold start"? Run at least one object (of any type) through it? Run every (critical) object type through it?

Various options:

  1. use the "precompile" tool as part of your build process, to generate the compiled serializer as a separate fully-static compiled dll that you can reference and use like normal: exactly zero meta-programming then happens
  2. explicitly tell the model about the "root" types at startup, and store the output of Compile()

    // field to hold the compiled, fully-static model
    static TypeModel serializer;
    ...
    // register the root types up front, then compile once at startup
    RuntimeTypeModel.Default.Add(typeof(Foo), true);
    RuntimeTypeModel.Default.Add(typeof(Bar), true);
    serializer = RuntimeTypeModel.Default.Compile();
    

    (the Compile() method will analyse from the root types, adding in any additional types it needs as it goes, and return a compiled, generated instance; see the usage sketch after this list)

  3. explicitly tell the model about the "root" types at startup, and call CompileInPlace() "a few times"; CompileInPlace() will not fully expand the model - but calling it a few times should cover most bases, since compiling one layer will bring other types into the model

    RuntimeTypeModel.Default.Add(typeof(Foo), true);
    RuntimeTypeModel.Default.Add(typeof(Bar), true);
    // compiling one layer can pull further types into the model,
    // so run CompileInPlace() a few times to cover most of it
    for (int i = 0; i < 5; i++) {
        RuntimeTypeModel.Default.CompileInPlace();
    }
    

Separately, I should probably:

  1. add a method to fully expand a model for the CompileInPlace scenario
  2. spend some time optimizing the meta-programming layer

Final thought: the main difference between Compile and CompileInPlace here is what happens if you've forgotten to add some types; CompileInPlace works against the existing model, so you can still add new types (implicitly or explicitly) later, and it will "just work"; Compile is more rigid: once you've generated a type via that, it is fixed and can handle only the types it could deduce at the time it was compiled.
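
A hedged sketch of that rigidity difference (Baz is a hypothetical type that was never registered before Compile() ran):

    var fixedModel = RuntimeTypeModel.Default.Compile(); // snapshot of the model as it stands now
    RuntimeTypeModel.Default.Add(typeof(Baz), true);     // the in-place default model picks Baz up and "just works"
    // ...but fixedModel was generated before Baz was added, so asking it to
    // serialize a Baz should be expected to fail at runtime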

Answered on 2012-12-06T12:18:18.387