Binary data in JSON?

jesse on 2008-05-06T01:54:56

JSON doesn't have native support for binary data? Really? Someone, please tell me I'm wrong. (But only if you can back it up with documentation.)


well

rjbs on 2008-05-06T02:37:08

Can you fudge it?

{ "bytestring": [ 3,2,12,3,12,31,210,2,18] }

Re:well

jesse on 2008-05-06T02:47:15

I can manually marshal into base64, which will be a lot more compact than something like the array bytestring hack. But then everyone consuming my format needs to come up with the same extension and needs to sniff values for encoding. And that makes me cry.
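To put rough numbers on the size difference, here is a minimal sketch in Python. The field name "bytestring" is borrowed from the array example above; the sample payload and the variable names are made up for illustration.

```python
import base64
import json

# Sample bytes from the array example, repeated so the size difference is visible.
payload = bytes([3, 2, 12, 3, 12, 31, 210, 2, 18]) * 100

# The "array of integers" workaround: one JSON number per byte.
as_array = json.dumps({"bytestring": list(payload)}, separators=(",", ":"))

# Base64 text carried inside an ordinary JSON string.
as_base64 = json.dumps({"bytestring": base64.b64encode(payload).decode("ascii")})

# The receiver has to know out of band that "bytestring" holds base64;
# nothing in the JSON itself says so.
decoded = base64.b64decode(json.loads(as_base64)["bytestring"])

print(len(payload), len(as_array), len(as_base64))
```

Base64 always costs four characters per three bytes, while the array form costs between two and four characters per byte depending on the values.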

Sorry to tell you this ...

Ovid on 2008-05-06T08:20:16

Nope, it doesn't. I believe that's because JSON is valid JavaScript syntax and there's no "natural" way to represent binary in a simple string format.

What do you really mean by 'binary data'?

Tim Bunce on 2008-05-06T09:27:40

A JavaScript string is defined as a "sequence of zero or more 16-bit unsigned integer values." (http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf)

and http://json.org/ says "A string is a collection of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. [...]. A string is very much like a C or Java string."

Does that mean JSON "doesn't have native support for binary data"? Depends on your needs I guess (and whether you control the sending and/or receiving ends).

If you have a series of bytes that you want to get from A to B via JSON, I'd assume you could treat them as a series of Unicode code points for the sake of encoding. If the destination is JavaScript then they'll end up as a series of 16-bit code units internally, but that shouldn't matter as all the code points are 256. Basically, think in terms of the logical "sequence of integers that happen to be 256" rather than the physical "sequence of adjacent 8-bit bytes".

My experience with JSON & JavaScript is minimal (currently) and I've not actually needed to do this yet, but I may do soon so I'm keen for someone to point out any flaws in my thinking.
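One way to try the idea out is the sketch below, in Python. It assumes both ends agree that every character in the string stands for one byte, which is exactly what Latin-1 decoding does for code points below 256. The key name "file" and the variable names are only for illustration.

```python
import json

raw = bytes(range(256))  # every possible byte value

# Map each byte to the Unicode code point with the same number.
text = raw.decode("latin-1")

# json.dumps escapes non-ASCII characters by default, so the wire form is plain ASCII.
wire = json.dumps({"file": text})

back = json.loads(wire)["file"]

# Reverse the mapping; this raises UnicodeEncodeError if any code point is 256 or above.
restored = back.encode("latin-1")
```

The round trip only works if nothing in between re-encodes the string as something other than one byte per character.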

Re:What do you really mean by 'binary data'?

Tim Bunce on 2008-05-06T09:29:23

"that happen to be 256" should be "that happen to be < 256"

png &

Proclus on 2008-05-06T09:30:32

Not related to JSON directly, but I was reading Ajaxian the other day, and one guy managed to put Prototype.js into a compressed PNG and read it back with the canvas methods. The whole 124K lib shrank to 30K:

http://ajaxian.com/archives/compression-using-canvas-and-png

JSON::XS makes it easy

ask on 2008-05-09T05:06:35

No, you have to encode it as ASCII (or as Latin-1 if the other end of the transaction can be told about that).

$ perl -MJSON::XS -MFile::Slurp -e '$g=read_file("add.gif"); print JSON::XS->new->ascii(1)->encode({file=>$g})' > json

$ perl -MJSON::XS -MFile::Slurp -e '$g=read_file("json"); $d = decode_json($g); print $d->{file}' > gif

$ diff add.gif gif
$

Re:JSON::XS makes it easy

jesse on 2008-05-09T12:29:50

Encoding one property as ASCII is portable? I didn't think JSON had a way to mark encoding for individual nodes.

Re:JSON::XS makes it easy

ask on 2008-05-10T07:21:00

I didn't read the ECMAScript spec, but according to JSON.org, \u followed by four hex digits is a valid way to specify a "character", so yes, I think so. It's still Unicode, but you "munge" it so that it doesn't have any high bits.
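A small sketch in Python of what a consumer sees. The JSON text below is hand-written for illustration and is not output from the commands above. The escape yields a character, not a byte, so the receiver still has to pick the right mapping back to bytes.

```python
import json

# JSON text containing only ASCII characters; \u00d2 is the escape for code point 210.
wire = '{"file": "GIF\\u00d2"}'

value = json.loads(wire)["file"]
code_points = [ord(ch) for ch in value]

# One byte per character gives back the intended bytes.
as_bytes = value.encode("latin-1")

# Encoding the same string as UTF-8 turns code point 210 into two bytes instead.
as_utf8 = value.encode("utf-8")
```

So the escaped form travels safely, but a receiver that writes the string out as UTF-8 will not get the original bytes back.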