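I am trying to serialize an object containing a list of very large composite object graphs (about 200000 nodes or more) using Protobuf-net. Basically, what I want to achieve is to save the complete object into a single file, as fast and as compactly as possible.
My problem is that I get an out-of-memory exception while trying to serialize the object. On my machine the exception is thrown when the file size is around 1.5 GB. I am running a 64-bit process and passing the BaseStream of a StreamWriter to protobuf-net. Since I am writing directly to a file, I suspect that some kind of buffering is taking place inside protobuf-net and causing the exception. I have tried using the DataFormat = DataFormat.Group attribute, but with no luck so far.
I can avoid the exception by serializing each composite in the list to a separate file, but I would prefer to do it all in one go if possible.
Am I doing something wrong, or is it simply not possible to achieve what I want?
Code to illustrate the problem: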
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

class Program
{
    static void Main(string[] args)
    {
        int numberOfTrees = 250;
        int nodesPrTree = 200000;
        var trees = CreateTrees(numberOfTrees, nodesPrTree);
        var forest = new Forest(trees);
        using (var writer = new StreamWriter("model.bin"))
        {
            Serializer.Serialize(writer.BaseStream, forest);
        }
        Console.ReadLine();
    }

    // Builds the requested number of trees, each with roughly nodesPrTree nodes.
    private static Tree[] CreateTrees(int numberOfTrees, int nodesPrTree)
    {
        var trees = new Tree[numberOfTrees];
        for (int i = 0; i < numberOfTrees; i++)
        {
            var root = new Node();
            CreateTree(root, nodesPrTree, 0);
            var binTree = new Tree(root);
            trees[i] = binTree;
        }
        return trees;
    }

    // Grows a binary tree breadth-first until the node budget is reached.
    private static void CreateTree(INode tree, int nodesPrTree, int currentNumberOfNodes)
    {
        Queue<INode> q = new Queue<INode>();
        q.Enqueue(tree);
        while (q.Count > 0 && currentNumberOfNodes < nodesPrTree)
        {
            var n = q.Dequeue();
            n.Left = new Node();
            q.Enqueue(n.Left);
            currentNumberOfNodes++;
            n.Right = new Node();
            q.Enqueue(n.Right);
            currentNumberOfNodes++;
        }
    }
}
[ProtoContract]
[ProtoInclude(1, typeof(Node), DataFormat = DataFormat.Group)]
public interface INode
{
    [ProtoMember(2, DataFormat = DataFormat.Group, AsReference = true)]
    INode Parent { get; set; }
    [ProtoMember(3, DataFormat = DataFormat.Group, AsReference = true)]
    INode Left { get; set; }
    [ProtoMember(4, DataFormat = DataFormat.Group, AsReference = true)]
    INode Right { get; set; }
}

[ProtoContract]
public class Node : INode
{
    INode m_parent;
    INode m_left;
    INode m_right;

    public INode Left
    {
        get
        {
            return m_left;
        }
        set
        {
            m_left = value;
            m_left.Parent = null;
            m_left.Parent = this;
        }
    }

    public INode Right
    {
        get
        {
            return m_right;
        }
        set
        {
            m_right = value;
            m_right.Parent = null;
            m_right.Parent = this;
        }
    }

    public INode Parent
    {
        get
        {
            return m_parent;
        }
        set
        {
            m_parent = value;
        }
    }
}

[ProtoContract]
public class Tree
{
    [ProtoMember(1, DataFormat = DataFormat.Group)]
    public readonly INode Root;

    public Tree(INode root)
    {
        Root = root;
    }
}

[ProtoContract]
public class Forest
{
    [ProtoMember(1, DataFormat = DataFormat.Group)]
    public readonly Tree[] Trees;

    public Forest(Tree[] trees)
    {
        Trees = trees;
    }
}
Stack trace when the exception is thrown:
at System.Collections.Generic.Dictionary`2.Resize(Int32 newSize, Boolean forceNewHashCodes)
at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
at ProtoBuf.NetObjectCache.AddObjectKey(Object value, Boolean& existing) in NetObjectCache.cs:line 154
at ProtoBuf.BclHelpers.WriteNetObject(Object value, ProtoWriter dest, Int32 key, NetObjectOptions options) BclHelpers.cs:line 500
at proto_5(Object , ProtoWriter )
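I am now trying a workaround in which I use the SerializeWithLengthPrefix method to serialize the trees in the array one at a time into a single file. Serialization seems to work: I can see the file size grow after each tree in the list is added to the file. However, when I try to deserialize the trees I get an "Invalid wire-type" exception. I create a new file when I serialize the trees, so the file should be free of garbage, unless I am writing garbage myself, of course ;-). My serialization and deserialization methods are listed below: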
using (var writer = new FileStream("model.bin", FileMode.Create))
{
    foreach (var tree in trees)
    {
        Serializer.SerializeWithLengthPrefix(writer, tree, PrefixStyle.Base128);
    }
}

using (var reader = new FileStream("model.bin", FileMode.Open))
{
    var trees = Serializer.DeserializeWithLengthPrefix<Tree[]>(reader, PrefixStyle.Base128);
}
Am I using the method in an incorrect way?
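One thing that may be worth checking in the snippet above is symmetry. Each SerializeWithLengthPrefix call writes one record, so a reader would normally make one DeserializeWithLengthPrefix call per record and ask for the element type rather than an array type. The following is only a minimal sketch of that pattern under stated assumptions: `Item` and `LengthPrefixSketch` are hypothetical names that are not part of the model in the question, and the sketch uses a MemoryStream instead of a file. Whether the same loop works unchanged for `Tree` (which has no parameterless constructor and relies on AsReference) is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical stand-in for Tree: a trivial contract with a default constructor.
[ProtoContract]
public class Item
{
    [ProtoMember(1)]
    public int Id { get; set; }
}

public static class LengthPrefixSketch
{
    // Writes `count` items as separate length-prefixed records, then reads
    // them back one record at a time and returns the ids in read order.
    public static List<int> RoundTrip(int count)
    {
        var ids = new List<int>();
        using (var stream = new MemoryStream())
        {
            for (int i = 1; i <= count; i++)
            {
                // One record per item, same call shape as in the question.
                Serializer.SerializeWithLengthPrefix(stream, new Item { Id = i }, PrefixStyle.Base128);
            }

            // Rewind so the records can be read from the start.
            stream.Position = 0;

            // Read with the element type (Item), once per record written.
            while (stream.Position < stream.Length)
            {
                var item = Serializer.DeserializeWithLengthPrefix<Item>(stream, PrefixStyle.Base128);
                ids.Add(item.Id);
            }
        }
        return ids;
    }
}
```

Calling `LengthPrefixSketch.RoundTrip(3)` should return one id per record written. The point of the sketch is only that the number and type of the read calls mirror the write calls; it says nothing about the out-of-memory problem with the full forest.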