So BSON is JSON serialized right?
{"hello": "world"}
→ "\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
But why is it called Binary JSON? What does "binary" stand for?
I always tend to associate binary with 10101010101, but the BSON serialization above isn't in 101010101010 form.
Could someone explain to me what "Binary" means here, so I understand why it's called Binary JSON?
It's binary as opposed to text. Whereas JSON is human-readable text, BSON is binary data (just bytes). You could write it out as 1001010 etc, but it's more common to show each byte at a time (so \x16 is just hex 16, i.e. the decimal byte 22). Basically "binary" here is used to compare it with textual data, not to say that it's actually base 2 in particular.
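To make that concrete, here is a minimal sketch that hand-encodes the one-field document from the question and prints the same bytes in several notations. The helper name `encode_string_doc` is made up for illustration and only handles a single string field; a real application would use a BSON library instead.

```python
import struct

def encode_string_doc(key: str, value: str) -> bytes:
    """Hand-encode a BSON document with exactly one UTF-8 string field."""
    # String value: UTF-8 bytes plus a terminating NUL, prefixed by its length.
    val = value.encode("utf-8") + b"\x00"
    # Element: type byte 0x02 (string), NUL-terminated key, int32 length, value.
    element = b"\x02" + key.encode("utf-8") + b"\x00" + struct.pack("<i", len(val)) + val
    # Document: int32 total size (including these 4 bytes), elements, final NUL.
    body = element + b"\x00"
    return struct.pack("<i", len(body) + 4) + body

doc = encode_string_doc("hello", "world")

print(len(doc))                                # total size in bytes; doc[0] holds the same number
print(doc)                                     # Python's escaped byte notation
print(doc.hex(" "))                            # each byte as two hex digits
print(" ".join(f"{b:08b}" for b in doc[:4]))   # the first four bytes in base 2
```

All four lines describe the same data; the base 2 form is just the least convenient way to read it.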
This means that you can only use BSON in situations where you can transport arbitrary binary data. For example, if you wanted to embed BSON in an XML document (for whatever reason!) you'd have to base64 encode it first, because XML is a text-based representation.
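As a rough illustration of the XML case, the sketch below base64-encodes the BSON bytes from the question, places them in an element, and recovers them. The element name `payload` and the `encoding` attribute are arbitrary choices for this example, not part of any standard.

```python
import base64
import xml.etree.ElementTree as ET

# The BSON bytes for {"hello": "world"} from the question.
bson_bytes = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"

# Raw NUL bytes cannot appear in XML text, so encode to base64 (plain ASCII) first.
encoded = base64.b64encode(bson_bytes).decode("ascii")

# "payload" and "encoding" are hypothetical names chosen for this sketch.
root = ET.Element("payload", {"encoding": "base64"})
root.text = encoded
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# The receiver parses the XML and reverses the base64 step to get the bytes back.
decoded = base64.b64decode(ET.fromstring(xml_text).text)
print(decoded == bson_bytes)
```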
Binary is really a misnomer, since everything on your computer is "binary" at some level. Binary, when it comes to file or network stream formats, means not-easily-human-understandable. It also tends to be compact.
Examples of textual or "human readable" (human understandable) file and stream formats:
Examples of "binary" file and stream formats:
- jpeg
- mp3
- avi
- dll
- Quake (a videogame) network protocol
The thing of most note here is that human understandable formats need a lot less explanation if you simply crack them open and start reading. Binary file formats might need whole books to explain :)
A format isn't necessarily purely "binary" or purely human understandable, though. For example, you could probably understand a run of single-digit numbers with no spaces that represents an array of single-digit numbers. You probably couldn't understand a run of 48 numbers (with no spaces) that represents the x, y, and z values of 16 3D vertices, even though you can "read" them. There is also Skeet's example of encoded "binary" data, especially when it is embedded in a more human understandable format.
The reason it is called 'binary' has been explained already: basically, it is not textual, hence the Unix-style distinction (binary vs. text files).
But the JSON part is odd as well, since BSON is NOT JSON: it is neither a subset nor a superset. It has many more datatypes, so it is sort of a superset; but it also cannot represent every legal JSON document, because of restrictions on things like property names and string value lengths.
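One such restriction can be sketched as follows. BSON stores element names as NUL-terminated strings, so a name cannot itself contain a NUL byte, while JSON allows `\u0000` inside a key. The helper `encode_cstring` is a hypothetical stand-in for what an encoder has to check, not the API of any particular library. (String values and whole documents are likewise bounded by their signed 32-bit length prefixes.)

```python
import json

def encode_cstring(name: str) -> bytes:
    """Encode a BSON element name: UTF-8 bytes followed by a terminating NUL."""
    raw = name.encode("utf-8")
    if b"\x00" in raw:
        # An embedded NUL would be read as the end of the name.
        raise ValueError("BSON element names cannot contain a NUL byte")
    return raw + b"\x00"

# Legal JSON: the key contains an escaped NUL character.
parsed = json.loads('{"a\\u0000b": 1}')
key = next(iter(parsed))

try:
    encode_cstring(key)
    accepted = True
except ValueError:
    accepted = False

print(repr(key), "accepted as a BSON element name:", accepted)
```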