4

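I'm looking for a fast serialization method (XML is too slow) that can be used from both Perl and Python.

Unfortunately, I can't use JSON (or many other formats), because it always changes the type of dict keys from integer to string. I need serialization/deserialization that preserves the key types.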

Python:

>>> import json
>>> dict_before = {1:'one', 20: 'twenty'}
>>> data = json.dumps(dict_before)
>>> dict_after = json.loads(data)

>>> dict_before
{1: 'one', 20: 'twenty'}            #integer keys
>>> dict_after
{u'1': u'one', u'20': u'twenty'}    #string keys

Any suggestions are welcome.

4

3 Answers

4

You can use YAML.

>>> import yaml
>>> dict_before = {1:'one', 20: 'twenty'}
>>> data = yaml.safe_dump(dict_before)
>>> dict_after = yaml.safe_load(data)
>>> dict_after
{1: 'one', 20: 'twenty'}

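I had a similar problem. I wanted to share a configuration file between Perl and Python, and I ended up having to use YAML.

You can install the yaml module for Python with the following command: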

pip install PyYAML

Note, however, that the integer keys will still be converted to strings on the Perl side (see "Legal values for Perl hash keys").

answered 2013-08-22T12:29:42.827
2

Mu. You have started with the wrong premise.

Perl does not have a meaningful type system that distinguishes between numbers and strings. Any given value can be both. It can't be determined using only the Perl language whether a given value is considered to be a number only (although you can use modules like Devel::Peek). It is utterly impossible to know what type a given value originally was.

use feature 'say';  # enables say()
my $x = 1;     # an integer (IV), right?
say "x = $x";  # not any more! It's a PVIV now (string and integer)

Furthermore, in a hash map (“dictionary”), the key type is always coerced to a string. In arrays, the key is always coerced to an integer. Other types can only be faked.

This is wonderful when parsing text, but of course introduces endless pain when serializing a data structure. JSON maps perfectly to Perl data structures, so I suggest you stick to that (or YAML, as it is a superset of JSON) to protect yourself from the delusion that a serialization could infer information that isn't possibly there.

What do we take from this?

  • If interop is important, refrain from using creative dictionary types in Python.

  • You can always encode type information in the serialization should it really be important (hint: it probably isn't): {"type":"integer dict", "data":{"1":"foo","2":"bar"}}

  • It would also be premature to dismiss XML as too slow. See this recent article, although I disagree with the methods, and it restricts itself to JS (last week's HN thread for perspective).

    If the parser is native, it will probably be fast enough, so obviously don't use any pure-Perl or pure-Python implementations. This also holds for JSON parsers, YAML parsers, and whatnot.
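
The type-tagging idea from the second bullet can be sketched on the Python side roughly as follows. This is only a minimal sketch: the helper names `dump_int_dict` and `load_tagged` are made up for illustration, and the envelope layout simply follows the `{"type": ..., "data": ...}` example above. It relies only on the standard `json` module.

```python
import json

# Hypothetical tag value; any string both sides agree on would do.
INT_DICT_TAG = "integer dict"

def dump_int_dict(d):
    """Serialize a dict with integer keys inside a tagged envelope."""
    # json.dumps turns the integer keys into strings; the tag records
    # that they were integers originally.
    return json.dumps({"type": INT_DICT_TAG, "data": d})

def load_tagged(text):
    """Deserialize an envelope, restoring integer keys when tagged."""
    envelope = json.loads(text)
    data = envelope["data"]
    if envelope.get("type") == INT_DICT_TAG:
        return {int(key): value for key, value in data.items()}
    return data

dict_before = {1: "one", 20: "twenty"}
wire = dump_int_dict(dict_before)   # plain JSON, readable from Perl too
dict_after = load_tagged(wire)
print(dict_after == dict_before)
```

On the Perl side the same payload decodes to an ordinary hash with string keys, which is all a Perl hash can hold anyway, so the tag only matters to the Python reader.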

answered 2013-08-22T19:22:08.277
1

Try msgpack. It's compact and fast. There is a Perl implementation, but I have never used it. The Python implementation works, though:

>>> import msgpack
>>> x=msgpack.dumps({1:'aaa',2:'bbb'})
>>> x
'\x82\x01\xa3aaa\x02\xa3bbb'
>>> len(x)
11
>>> print msgpack.loads(x)
{1: 'aaa', 2: 'bbb'}

answered 2013-08-22T15:53:00.837