I am trying to understand BSON by reading the specification at http://bsonspec.org/#/specification, but I still have some questions.
Let's take an example from that site:
{"hello": "world"} → "\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
Question 1
In the example above, the double quotes around the encoded bytes are not actually part of the result, right?
Question 2
I understand that the first 4 bytes, \x16\x00\x00\x00, are the size of the whole BSON document, stored in little-endian format. But why? Why not use big-endian?
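To make the byte order concrete, here is a small Python sketch of how I read those four bytes with the standard `struct` module. The `<i` format (little-endian signed 32-bit) is my reading of the spec's int32 type; the big-endian line is only there for contrast.

```python
import struct

size_bytes = b"\x16\x00\x00\x00"

# "<i" = little-endian signed 32-bit integer (least significant byte first)
little = struct.unpack("<i", size_bytes)[0]

# ">i" = big-endian interpretation of the same bytes, for contrast only
big = struct.unpack(">i", size_bytes)[0]

print(little)  # 22
print(big)     # 369098752
```

So the same four bytes only make sense as a document size when read as little-endian.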
Question 3
How does the size of the example document come out to \x16, i.e. 22?
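Here is my attempt at tallying the bytes by hand in Python, assuming I am reading the spec's grammar correctly: an int32 size, one type byte, a NUL-terminated key, an int32 string length that counts the trailing NUL, the string itself, and a final 0x00 terminator. The labels in `parts` are my own wording, not names from the spec.

```python
doc = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"

# (label, number of bytes) for each piece of the encoded document
parts = [
    ("document size (int32, counts itself)", 4),
    ("element type 0x02 (UTF-8 string)", 1),
    ("key 'hello' plus NUL terminator", 6),
    ("string length (int32, value 6)", 4),
    ("value 'world' plus NUL terminator", 6),
    ("document terminator 0x00", 1),
]

total = sum(n for _, n in parts)
print(total)     # 22
print(len(doc))  # 22
```

If this breakdown is right, the size field includes its own four bytes and the trailing terminator.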
Question 4
In general, if I want to encode a document myself, how do I calculate its size? I think my main difficulty is how to determine the size of a UTF-8 string.
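A minimal sketch of what I have in mind, assuming the string length is counted in encoded bytes (not characters) and includes the trailing NUL. `encode_string_doc` is a hypothetical helper of my own that only handles a document with a single string field; it is not part of any BSON library.

```python
import struct

def encode_string_doc(key: str, value: str) -> bytes:
    """Encode {key: value} where value is a string (hypothetical helper)."""
    key_bytes = key.encode("utf-8") + b"\x00"      # cstring: bytes plus NUL
    value_bytes = value.encode("utf-8") + b"\x00"  # string payload plus NUL
    element = (
        b"\x02"                                    # type byte: UTF-8 string
        + key_bytes
        + struct.pack("<i", len(value_bytes))      # byte length, NUL included
        + value_bytes
    )
    body = element + b"\x00"                       # document terminator
    # The size prefix counts its own 4 bytes plus everything after it.
    return struct.pack("<i", 4 + len(body)) + body

expected = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
encoded = encode_string_doc("hello", "world")
print(encoded == expected)          # True

# Non-ASCII characters take more than one byte in UTF-8.
print(len("é"), len("é".encode("utf-8")))  # 1 2
```

The point of the last line is that the size has to be computed from the encoded bytes, since the character count and the byte count differ for non-ASCII text.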
Let's take another example:
{"BSON": ["awesome", 5.05, 1986]} → "\x31\x00\x00\x00\x04BSON\x00\x26\x00\x00\x00\x020\x00\x08\x00\x00\x00awesome\x00\x011\x00\x33\x33\x33\x33\x33\x33\x14\x40\x102\x00\xc2\x07\x00\x00\x00\x00"
Question 5
This example contains an array. According to the specification, an array is actually encoded as a document, i.e. a list of {key, value} pairs whose keys are 0, 1, etc. So my question is: the 0 and 1 here are strings too, right?
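To check my reading, I wrote a rough Python walk over the embedded array document. It is only a sketch: `read_cstring` and `array_elements` are hypothetical helpers of mine, they handle just the three element types in this example (0x01 double, 0x02 string, 0x10 int32), and the start offset 10 assumes the outer layout is 4 size bytes, 1 type byte, then the key BSON plus its NUL.

```python
import struct

data = (
    b"\x31\x00\x00\x00\x04BSON\x00\x26\x00\x00\x00\x020\x00\x08\x00\x00"
    b"\x00awesome\x00\x011\x00\x33\x33\x33\x33\x33\x33\x14\x40\x102\x00\xc2\x07\x00\x00"
    b"\x00\x00"
)

def read_cstring(buf, pos):
    """Return (text, position just past the NUL terminator)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1

def array_elements(buf, pos):
    """Yield (key, value) pairs of the embedded document starting at pos."""
    (size,) = struct.unpack_from("<i", buf, pos)
    end = pos + size - 1              # index of the 0x00 terminator
    pos += 4
    while pos < end:
        etype = buf[pos]
        key, pos = read_cstring(buf, pos + 1)
        if etype == 0x01:             # 64-bit float
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
        elif etype == 0x02:           # int32 length, then bytes, then NUL
            (n,) = struct.unpack_from("<i", buf, pos)
            value = buf[pos + 4:pos + 4 + n - 1].decode("utf-8")
            pos += 4 + n
        elif etype == 0x10:           # 32-bit integer
            (value,) = struct.unpack_from("<i", buf, pos)
            pos += 4
        else:
            raise ValueError(f"unhandled element type {etype:#x}")
        yield key, value

elements = list(array_elements(data, 10))
print(elements)  # [('0', 'awesome'), ('1', 5.05), ('2', 1986)]
```

With this reading, each key comes out as a NUL-terminated text string ('0', '1', '2'), which is what makes me think the array indices are stored as strings rather than integers.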