
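I am trying to deserialize messages posted by a script on a particular website. I looked at the script and noticed that it uses protobufjs. The message structure is loaded from a JSON file on the server and, stringified, looks like this: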

{
"nested":{
    "InteractionCollection":{
        "fields":{
            "interactions":{
                "rule":"repeated",
                "type":"Interaction",
                "id":1
            },
            "mouseMovements":{
                "rule":"repeated",
                "type":"MouseMovement",
                "id":2
            },
            "url":{
                "type":"string",
                "id":3
            },
            "flags":{
                "rule":"repeated",
                "type":"Flag",
                "id":4
            }
        },
        "nested":{
            "Interaction":{
                "fields":{
                    "type":{
                        "type":"string",
                        "id":1
                    },
                    "time":{
                        "type":"int64",
                        "id":2
                    },
                    "elementId":{
                        "type":"string",
                        "id":3
                    },
                    "elementType":{
                        "type":"string",
                        "id":4
                    },
                    "additionalInfo":{
                        "type":"string",
                        "id":5
                    }
                }
            },
            "MouseMovement":{
                "fields":{
                    "time":{
                        "type":"int64",
                        "id":1
                    },
                    "x":{
                        "type":"int64",
                        "id":2
                    },
                    "y":{
                        "type":"int64",
                        "id":3
                    },
                    "wx":{
                        "type":"int64",
                        "id":4
                    },
                    "wy":{
                        "type":"int64",
                        "id":5
                    }
                }
            },
            "Flag":{
                "fields":{
                    "time":{
                        "type":"int64",
                        "id":1
                    },
                    "name":{
                        "type":"string",
                        "id":2
                    }
                }
            }
        }
    }
}
}
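To cross-check a hand-written proto file against a descriptor like this, the descriptor can be rendered as .proto text mechanically. The following is only a minimal sketch: `descriptorToProto` and `renderMessage` are hypothetical helper names of mine, not part of protobufjs. It assumes every nested entry is a message with `fields`, and the `syntax = "proto3"` header is my assumption, since the JSON itself does not state a syntax.

```javascript
// Hypothetical helper: render one message definition (and its nested messages) as .proto text.
function renderMessage(name, def, indent) {
  const pad = "    ".repeat(indent);
  const lines = [pad + "message " + name + " {"];
  // Each field becomes "[rule] type name = id;".
  for (const [fieldName, field] of Object.entries(def.fields || {})) {
    const rule = field.rule ? field.rule + " " : "";
    lines.push(pad + "    " + rule + field.type + " " + fieldName + " = " + field.id + ";");
  }
  // Nested message types are rendered one indentation level deeper.
  for (const [nestedName, nestedDef] of Object.entries(def.nested || {})) {
    lines.push(renderMessage(nestedName, nestedDef, indent + 1));
  }
  lines.push(pad + "}");
  return lines.join("\n");
}

// Hypothetical helper: render a whole JSON descriptor.
// Assumption: proto3 syntax; enums, services and options are ignored in this sketch.
function descriptorToProto(descriptor) {
  const parts = ['syntax = "proto3";'];
  for (const [name, def] of Object.entries(descriptor.nested || {})) {
    parts.push(renderMessage(name, def, 0));
  }
  return parts.join("\n\n");
}

// Small stand-in descriptor with the same shape as the one above.
const sampleDescriptor = {
  nested: {
    InteractionCollection: {
      fields: {
        flags: { rule: "repeated", type: "Flag", id: 4 },
        url: { type: "string", id: 3 }
      },
      nested: {
        Flag: { fields: { time: { type: "int64", id: 1 }, name: { type: "string", id: 2 } } }
      }
    }
  }
};
console.log(descriptorToProto(sampleDescriptor));
```

Field names come out exactly as they are spelled in the JSON (camelCase). That does not affect the binary format, which only carries field numbers.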

It then creates a new instance of the "InteractionCollection" message and pushes new interactions, mouseMovements, and flags onto it.

var instance = message.create({
    url : "someurl",
    interactions : [],
    mouseMovements : []
});

var some_interaction = interaction_message.create({
    time : Date.now(),
    elementId : "idstring",
    elementType : "typestring",
    type : "anotherstring",
    additionalInfo : "infostring"
});
instance.interactions.push(some_interaction);

At the end of the script, it posts the data to the server in serialized form, like this:

navigator.sendBeacon("someserverpath", message.encode(instance).finish());

I am using C#, so I installed the official Google Protocol Buffers package via NuGet (Google.Protobuf) and created a proto file that replicates the JSON descriptor above:

syntax = "proto3";
option csharp_namespace = "Proto2"; //C# Project Name: Proto2

message InteractionCollection {
    repeated Interaction interactions = 1;
    repeated MouseMovement mouse_movements = 2;
    string url = 3;
    repeated Flag flags = 4;

    message Flag {
        int64 time = 1;
        string name = 2;
    }

    message Interaction {
        string type = 1;
        int64 time = 2;
        string element_id = 3;
        string element_type = 4;
        string additional_info = 5;
    }

    message MouseMovement {
        int64 time = 1;
        int64 x = 2;
        int64 y = 3;
        int64 wx = 4;
        int64 wy = 5;
    }
}

I then compiled the proto file with the protoc.exe that ships with the NuGet package and included the generated class in my project. Then I created a test InteractionCollection and serialized it:

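// Assumes these using directives at the top of the file:
//   using System.IO;
//   using Google.Protobuf;                            // WriteTo(Stream) extension method
//   using static Proto2.InteractionCollection.Types;  // nested Interaction type
InteractionCollection collection = new InteractionCollection
{
    Url = "/",
    Interactions = {
        new Interaction { Time = 1508602241363, ElementId = "DOM", ElementType = "DOM", Type = "tohru", AdditionalInfo = "" },
        new Interaction { Time = 1508602243075, Type = "focus", AdditionalInfo = "" },
    },
};
// Write the binary wire format to disk so it can be compared with the browser output.
using (var output = File.Create("csharp_out.dat"))
{
    collection.WriteTo(output);
}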

On the website, the script serialized the same message:

{
"interactions":[
    {"type":"tohru","time":"1508602241363","elementId":"DOM","elementType":"DOM","additionalInfo":""},
    {"type":"focus","time":"1508602243075","additionalInfo":""}
],
"url":"/"
}
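As a sanity check on the `time` values: int64 fields are written as base-128 varints, seven bits per byte with the high bit set on every byte except the last. The sketch below is my own illustration, not protobufjs code. `encodeVarint` and `toHex` are hypothetical helper names, and negative values are deliberately not handled.

```javascript
// Hypothetical helper: encode a non-negative integer as a protobuf base-128 varint.
function encodeVarint(value) {
  let v = BigInt(value);
  if (v < 0n) throw new RangeError("this sketch handles non-negative values only");
  const bytes = [];
  do {
    let b = Number(v & 0x7fn); // take the lowest 7 bits
    v >>= 7n;
    if (v > 0n) b |= 0x80;     // set the continuation bit if more bytes follow
    bytes.push(b);
  } while (v > 0n);
  return bytes;
}

// Hypothetical helper: format bytes as lowercase hex for comparison with a dump.
function toHex(bytes) {
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");
}

console.log(toHex(encodeVarint("1508602241363"))); // d3 c2 cd fd f3 2b
console.log(toHex(encodeVarint("1508602243075"))); // 83 d0 cd fd f3 2b
```

Read as Latin-1, `d3 c2 cd fd f3 2b` is the `ÓÂÍýó+` run that shows up in both dumps further down, so the timestamps themselves appear to be encoded identically on both sides.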

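However, the data I get from the C# project differs slightly from the data the website posts. Is my proto file wrong, or is there some other cause? Obviously, this also means that I cannot deserialize the data coming from the website. C#: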

tohruÓÂÍýó+DOM"DOM

focusƒÐÍýó+/

or

\x0A\x18\x0A\x05\x74\x6F\x68\x72\x75\x10\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x03\x44\x4F\x4D\x22\x03\x44\x4F\x4D\x0A\x0E\x0A\x05\x66\x6F\x63\x75\x73\x10\xC6\x92\xC3\x90\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x01\x2F

JS:

tohruÓÂÍýó+DOM"DOM*

focusÐÍýó+*/

or

\x1A\x0A\x05\x74\x6F\x68\x72\x75\x10\xC3\x93\xC3\x82\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x1A\x03\x44\x4F\x4D\x22\x03\x44\x4F\x4D\x2A\x0A\x10\x0A\x05\x66\x6F\x63\x75\x73\x10\xC2\x83\xC3\x90\xC3\x8D\xC3\xBD\xC3\xB3\x2B\x2A\x1A\x01\x2F
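To make the dumps easier to compare, the tag bytes can be split into field number and wire type, since a tag is `(field_number << 3) | wire_type`. This is a rough sketch under the assumption that every tag fits in one byte, which holds for field numbers 1 to 15. `decodeTag` is a hypothetical helper name of mine.

```javascript
// Names for the wire types that can occur in this schema (0 and 2) plus the fixed-width ones.
const WIRE_TYPES = { 0: "varint", 1: "64-bit", 2: "length-delimited", 5: "32-bit" };

// Hypothetical helper: split a single-byte tag into field number and wire type.
// Assumption: field numbers 1..15, so the tag is not itself a multi-byte varint.
function decodeTag(byte) {
  return { fieldNumber: byte >> 3, wireType: byte & 0x07 };
}

// Tag bytes that occur in the dumps above.
for (const b of [0x0a, 0x10, 0x1a, 0x22, 0x2a]) {
  const { fieldNumber, wireType } = decodeTag(b);
  console.log(
    "0x" + b.toString(16).padStart(2, "0") +
    " -> field " + fieldNumber + ", " + WIRE_TYPES[wireType]
  );
}
```

Inside an Interaction, fields 1 to 5 are type, time, elementId, elementType and additionalInfo. As far as I can tell, the only tag that occurs in the JS dump but not in the C# dump is `0x2A` (the `*` character), which is field 5, additionalInfo.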

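Sorry for such a long question, but I have been fiddling with this for over a week and really cannot figure out what the problem is. Thanks for your patience!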

Update:

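After some testing with protobufjs running locally, I noticed that the C# version of Protobuf treats an empty string like null and therefore leaves the whole field out (just as when an optional field is omitted), whereas protobufjs serializes it as: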

\x12\x00
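Those two bytes read as a tag for field 2 with wire type 2, followed by a length of 0. The sketch below reproduces both behaviours side by side. It is only an illustration: `encodeStringField` and its `omitEmpty` flag are hypothetical, and `omitEmpty` stands in for the proto3 rule that values equal to the default are not written. Tags and lengths are limited to a single byte.

```javascript
// Hypothetical helper: encode one length-delimited string field.
// omitEmpty = true mimics skipping default (empty) values; false always writes the field.
function encodeStringField(fieldNumber, value, omitEmpty) {
  const payload = Array.from(new TextEncoder().encode(value)); // UTF-8 bytes of the string
  if (omitEmpty && payload.length === 0) return [];            // field left off the wire
  if (fieldNumber > 15 || payload.length > 127) {
    throw new RangeError("sketch only handles single-byte tags and lengths");
  }
  const tag = (fieldNumber << 3) | 2;                          // wire type 2 = length-delimited
  return [tag, payload.length, ...payload];
}

console.log(encodeStringField(2, "", false));    // [ 18, 0 ], i.e. \x12\x00
console.log(encodeStringField(2, "", true));     // []
console.log(encodeStringField(3, "DOM", false)); // [ 26, 3, 68, 79, 77 ], i.e. \x1A\x03DOM
```

Both forms should decode to the same value in a proto3 reader, because a missing string field is read back as the empty string. The difference is only in the bytes on the wire.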

I also ran the same test with protobuf-net instead of the Google version, and it serialized the empty string as:

\x12\x20

Is there a way to change this behavior to match what protobufjs does?
