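I am trying to deserialize messages posted by a script on a certain website. Looking at the script, I noticed that it uses protobufjs. The message structure is loaded from a JSON file on the server, which looks like this when stringified: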
{
  "nested": {
    "InteractionCollection": {
      "fields": {
        "interactions": { "rule": "repeated", "type": "Interaction", "id": 1 },
        "mouseMovements": { "rule": "repeated", "type": "MouseMovement", "id": 2 },
        "url": { "type": "string", "id": 3 },
        "flags": { "rule": "repeated", "type": "Flag", "id": 4 }
      },
      "nested": {
        "Interaction": {
          "fields": {
            "type": { "type": "string", "id": 1 },
            "time": { "type": "int64", "id": 2 },
            "elementId": { "type": "string", "id": 3 },
            "elementType": { "type": "string", "id": 4 },
            "additionalInfo": { "type": "string", "id": 5 }
          }
        },
        "MouseMovement": {
          "fields": {
            "time": { "type": "int64", "id": 1 },
            "x": { "type": "int64", "id": 2 },
            "y": { "type": "int64", "id": 3 },
            "wx": { "type": "int64", "id": 4 },
            "wy": { "type": "int64", "id": 5 }
          }
        },
        "Flag": {
          "fields": {
            "time": { "type": "int64", "id": 1 },
            "name": { "type": "string", "id": 2 }
          }
        }
      }
    }
  }
}
It then creates a new instance of the "InteractionCollection" message and pushes new interactions, mouseMovements and flags onto it.
instance = message.create({
    url : "someurl",
    interactions : [],
    mouseMovements : []
});
var some_interaction = interaction_message.create({
    time : Date.now(),
    elementId : "idstring",
    elementType : "typestring",
    type : "anotherstring",
    additionalInfo : "infostring"
});
instance.interactions.push(some_interaction);
At the end of the script, it posts the data to the server in serialized form, like this:
navigator.sendBeacon("someserverpath", message.encode(instance).finish());
I am working in C#, so I installed the official Google Protocol Buffers package (Google.Protobuf) via NuGet and wrote a proto file that mirrors the JSON descriptor above:
syntax = "proto3";
option csharp_namespace = "Proto2"; //C# Project Name: Proto2
message InteractionCollection {
    repeated Interaction interactions = 1;
    repeated MouseMovement mouse_movements = 2;
    string url = 3;
    repeated Flag flags = 4;
    message Flag {
        int64 time = 1;
        string name = 2;
    }
    message Interaction {
        string type = 1;
        int64 time = 2;
        string element_id = 3;
        string element_type = 4;
        string additional_info = 5;
    }
    message MouseMovement {
        int64 time = 1;
        int64 x = 2;
        int64 y = 3;
        int64 wx = 4;
        int64 wy = 5;
    }
}
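I then compiled the proto file with the protoc.exe that ships with the NuGet package and included the generated classes in my project. Next I created a test InteractionCollection and serialized it: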
InteractionCollection collection = new InteractionCollection
{
    Url = "/",
    Interactions = {
        new Interaction{ Time = 1508602241363, ElementId = "DOM", ElementType = "DOM", Type = "tohru", AdditionalInfo = "" },
        new Interaction{ Time = 1508602243075, Type = "focus", AdditionalInfo = "" },
    },
};
using (var output = File.Create("csharp_out.dat"))
{
    collection.WriteTo(output);
}
The website serialized the same message:
{
    "interactions":[
        {"type":"tohru","time":"1508602241363","elementId":"DOM","elementType":"DOM","additionalInfo":""},
        {"type":"focus","time":"1508602243075","additionalInfo":""}
    ],
    "url":"/"
}
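However, the data I get from my C# project differs slightly from the data posted by the website. Is my proto file wrong, or is something else going on? Obviously, this also means I cannot deserialize the data coming from the website.
C#: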
tohruÓÂÍýó+DOM"DOM
focusƒÐÍýó+/
or, as an escaped byte string:
\x0A\x18\x0A\x05\x74\x6F\x68\x72\x75\x10\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x03\x44\x4F\x4D\x22\x03\x44\x4F\x4D\x0A\x0E\x0A\x05\x66\x6F\x63\x75\x73\x10\xC6\x92\xC3\x90\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x01\x2F
JS:
tohruÓÂÍýó+DOM"DOM*
focusÐÍýó+*/
or, as an escaped byte string:
\x1A\x0A\x05\x74\x6F\x68\x72\x75\x10\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x03\x44\x4F\x4D\x22\x03\x44\x4F\x4D\x2A\x0A\x10\x0A\x05\x66\x6F\x63\x75\x73\x10\xC2\x83\xC3\x90\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x2A\x1A\x01\x2F
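A note on reading the dumps: the `\xC3\x93\xC3\x82...` runs look as if the raw bytes were re-encoded as UTF-8 when they were copied (`C3 93` is the UTF-8 encoding of U+00D3), so the first timestamp is probably `D3 C2 CD FD F3 2B` on the wire. The following is a minimal JavaScript sketch, under that assumption, that decodes such a varint. `decodeVarint` is a hypothetical helper written for illustration, not a protobufjs function.

```javascript
// Decode a base-128 varint starting at `offset`; the value is returned as a BigInt
// because int64 values can exceed the 32-bit range of JavaScript bit operations.
// `decodeVarint` is an illustrative helper, not part of protobufjs.
function decodeVarint(bytes, offset = 0) {
  let value = 0n;
  let shift = 0n;
  let pos = offset;
  while (pos < bytes.length) {
    const byte = bytes[pos++];
    value |= BigInt(byte & 0x7f) << shift; // the low 7 bits carry data
    if ((byte & 0x80) === 0) {
      return { value, next: pos };         // high bit clear: this was the last byte
    }
    shift += 7n;
  }
  throw new Error("truncated varint");
}

// Assumed raw bytes behind "\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B".
const timeBytes = Uint8Array.from([0xd3, 0xc2, 0xcd, 0xfd, 0xf3, 0x2b]);
console.log(decodeVarint(timeBytes).value.toString()); // 1508602241363
```

The decoded value matches the `Time` of the first test interaction, which suggests the timestamps themselves are encoded identically on both sides.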
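I apologize for such a long question, but I have been fiddling with this for over a week and really cannot figure out what the problem is. Thank you for your patience!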
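Update:
After some testing with protobufjs running locally, I noticed that the C# version of Protobuf treats an empty string as null and therefore leaves the whole field out (just as when you omit an optional field), whereas protobufjs serializes it as: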
\x12\x00
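These two bytes are what an explicitly written empty string in field 2 would look like: a tag byte `(2 << 3) | 2 = 0x12` (field number 2, length-delimited wire type) followed by a length of 0. Which message the local test used is not stated here, so the field number is an assumption. A small sketch with a hypothetical `encodeStringField` helper (not a protobufjs API) reproduces the bytes:

```javascript
// Encode one length-delimited (wire type 2) string field.
// Hypothetical helper for illustration only: it assumes field numbers below 16
// and payloads shorter than 128 bytes, so tag and length each fit in one byte.
function encodeStringField(fieldNumber, text) {
  const payload = new TextEncoder().encode(text);
  if (fieldNumber < 1 || fieldNumber > 15 || payload.length > 127) {
    throw new RangeError("sketch only handles single-byte tag and length");
  }
  const tag = (fieldNumber << 3) | 2;
  return Uint8Array.from([tag, payload.length, ...payload]);
}

// Render bytes in the same \xNN notation used in this question.
const toHex = (bytes) =>
  Array.from(bytes, (b) => "\\x" + b.toString(16).toUpperCase().padStart(2, "0")).join("");

console.log(toHex(encodeStringField(2, ""))); // \x12\x00
console.log(toHex(encodeStringField(5, ""))); // \x2A\x00
```

By the same arithmetic, an empty `additionalInfo` (field 5) would appear as `\x2A\x00`, which would account for the extra `\x2A` in the JS dump.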
I also ran the same test with protobuf-net instead of the Google version, and it serializes the empty string as:
\x12\x20
Is there a way to change this behavior to match what protobufjs does?
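One way to check whether the difference matters at all is to decode both variants and compare the results. The sketch below is a deliberately tiny reader for the Interaction message (hypothetical `decodeInteraction`, not protobufjs or Google.Protobuf code). It assumes that the `\xC6\x92` / `\xC2\x83` sequences in the dumps stand for the single byte 0x83, and that a proto3 reader fills absent string fields with "".

```javascript
// Minimal reader for the Interaction message from the descriptor in this question.
// Illustrative only: supports varint (0) and length-delimited (2) wire types and
// assumes single-byte tags and lengths.
function decodeInteraction(bytes) {
  const names = { 1: "type", 2: "time", 3: "elementId", 4: "elementType", 5: "additionalInfo" };
  // Start from defaults, so a field that is absent on the wire reads as "" or 0.
  const msg = { type: "", time: 0n, elementId: "", elementType: "", additionalInfo: "" };
  let pos = 0;
  while (pos < bytes.length) {
    const tag = bytes[pos++];
    const field = tag >> 3;
    const wireType = tag & 7;
    if (wireType === 0) {
      let value = 0n, shift = 0n, byte;
      do {
        byte = bytes[pos++];
        value |= BigInt(byte & 0x7f) << shift;
        shift += 7n;
      } while (byte & 0x80);
      msg[names[field]] = value;
    } else if (wireType === 2) {
      const len = bytes[pos++];
      msg[names[field]] = new TextDecoder().decode(bytes.subarray(pos, pos + len));
      pos += len;
    } else {
      throw new Error("unsupported wire type " + wireType);
    }
  }
  return msg;
}

// The "focus" interaction with the empty field 5 omitted (14 bytes, as in the C# dump) ...
const omitted = Uint8Array.from([0x0a, 0x05, 0x66, 0x6f, 0x63, 0x75, 0x73,
                                 0x10, 0x83, 0xd0, 0xcd, 0xfd, 0xf3, 0x2b]);
// ... and with an explicit empty additionalInfo (tag 0x2A, length 0) appended.
const explicit = Uint8Array.from([...omitted, 0x2a, 0x00]);

const fromOmitted = decodeInteraction(omitted);
const fromExplicit = decodeInteraction(explicit);
console.log(fromOmitted.additionalInfo === fromExplicit.additionalInfo); // true
console.log(fromOmitted.time === fromExplicit.time);                     // true
```

If the real parsers behave like this sketch, the two encodings differ in bytes but not in meaning, so matching the protobufjs output byte for byte may not be necessary in order to read the website's data.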