Working with Bytes
Binary Serialization & Elm
Working with Bytes
- Introduction to binary serialization
- Bytes vs Json
- Decoding binary file formats
Working with Bytes
A byte is 8 bits of information
Bytes vs Json
Why Bytes?
Send fewer bytes over the wire?
Bytes vs Json
Take the number `2019`
Format | Representation | Size |
---|---|---|
Json | 2019 | 4 bytes |
Bytes | 00000111 11100011 | 2 bytes |
50% smaller, and the gain grows with the number of digits: a 32-bit integer always takes 4 bytes, but up to 10 bytes as decimal text
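The size comparison can be checked with a minimal sketch using `elm/bytes`. The module and function names (`Size`, `binarySize`, `textSize`) are made up for illustration, and a big-endian unsigned 16-bit layout is assumed.

```elm
module Size exposing (binarySize, textSize)

import Bytes exposing (Endianness(..))
import Bytes.Encode as Encode


-- 2019 written as an unsigned 16-bit big-endian integer: 0x07 0xE3.
binarySize : Int
binarySize =
    Encode.unsignedInt16 BE 2019
        |> Encode.encode
        |> Bytes.width


-- 2019 written as decimal text: one byte per ASCII digit.
textSize : Int
textSize =
    String.fromInt 2019
        |> Encode.getStringWidth
```

Here `binarySize` is 2 and `textSize` is 4.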
Bytes vs Json
{
"title": "foo",
"subject": "spam"
}
Json stores a lot of structure: key names, {}, "", []
Bytes vs Json
3 f o o 4 s p a m
03 666f6f 04 7370616d
Field | # of bytes | type |
---|---|---|
titleLength | 1 | uint8 |
title | variable | string |
subjectLength | 1 | uint8 |
subject | variable | string |
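The layout in the table could be written as an encoder and decoder pair along these lines. This is a sketch, not a fixed format: `Message`, `sizedString` and `sizedStringDecoder` are hypothetical names, while the `Bytes.Encode` and `Bytes.Decode` calls come from `elm/bytes`.

```elm
module Message exposing (Message, decoder, encoder)

import Bytes.Decode as Decode exposing (Decoder)
import Bytes.Encode as Encode exposing (Encoder)


type alias Message =
    { title : String, subject : String }


-- One length byte (uint8), then the UTF-8 bytes of the string.
-- A uint8 length limits each string to 255 bytes.
sizedString : String -> Encoder
sizedString str =
    Encode.sequence
        [ Encode.unsignedInt8 (Encode.getStringWidth str)
        , Encode.string str
        ]


encoder : Message -> Encoder
encoder message =
    Encode.sequence
        [ sizedString message.title
        , sizedString message.subject
        ]


-- Read the length first, then read exactly that many bytes as a string.
sizedStringDecoder : Decoder String
sizedStringDecoder =
    Decode.unsignedInt8
        |> Decode.andThen Decode.string


decoder : Decoder Message
decoder =
    Decode.map2 Message sizedStringDecoder sizedStringDecoder
```

Encoding `{ title = "foo", subject = "spam" }` with `Encode.encode (encoder message)` produces 9 bytes, and `Decode.decode decoder` turns them back into the same record.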
Working with Bytes
The Bytes type is a sequence of bytes
Bytes is to binary serialization what String is to Json decoders (and parsers in general)
Working with Bytes
The API looks a lot like Json.Decode
primitives for
- integers
- floats
- strings
- Bytes
and combinators like map/map2/andThen
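To show the resemblance, here is the same record decoded with `Json.Decode` and with `Bytes.Decode`. The `Point` type and the binary layout (two signed 32-bit big-endian integers) are assumptions made for this sketch.

```elm
module Point exposing (Point, fromBytes, fromJson)

import Bytes exposing (Endianness(..))
import Bytes.Decode
import Json.Decode


type alias Point =
    { x : Int, y : Int }


-- Json: fields are looked up by name.
fromJson : Json.Decode.Decoder Point
fromJson =
    Json.Decode.map2 Point
        (Json.Decode.field "x" Json.Decode.int)
        (Json.Decode.field "y" Json.Decode.int)


-- Bytes: fields are read in order, each with an explicit width and endianness.
fromBytes : Bytes.Decode.Decoder Point
fromBytes =
    Bytes.Decode.map2 Point
        (Bytes.Decode.signedInt32 BE)
        (Bytes.Decode.signedInt32 BE)
```

The combinators have the same shape. The main difference is that a bytes decoder has no field names to rely on, so order and widths carry all the structure.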
Bytes vs Json
So binary serialization is more compact; we should use it, right?
Bytes vs Json
type Posix
    = Posix Int

type alias Item =
    { title : String
    , link : String
    , media : String
    , dateTaken : Posix
    , description : String
    , published : Posix
    , author : String
    , authorId : String
    , tags : List String
    }
Bytes vs Json
For this string-heavy record, Json is 2 times faster!
- number of bytes is similar
- utf-8 decoding
- string slicing
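A sketch of what decoding the string-heavy parts of such a record involves on the bytes side. The layout is an assumption for illustration (a uint16 length before each string, a uint32 count before the list), not the format that was benchmarked, and `sizedString`, `stringList` and `step` are hypothetical helpers.

```elm
module Tags exposing (sizedString, stringList)

import Bytes exposing (Endianness(..))
import Bytes.Decode as Decode exposing (Decoder, Step(..))


-- Every string costs a length read plus a UTF-8 decode of that many bytes.
sizedString : Decoder String
sizedString =
    Decode.unsignedInt16 BE
        |> Decode.andThen Decode.string


-- A list is a count followed by that many elements, read with Decode.loop.
stringList : Decoder (List String)
stringList =
    Decode.unsignedInt32 BE
        |> Decode.andThen (\count -> Decode.loop ( count, [] ) step)


step : ( Int, List String ) -> Decoder (Step ( Int, List String ) (List String))
step ( remaining, acc ) =
    if remaining <= 0 then
        -- Elements were consed onto the front, so restore the original order.
        Decode.succeed (Done (List.reverse acc))

    else
        Decode.map (\str -> Loop ( remaining - 1, str :: acc )) sizedString
```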
Bytes vs Json
type alias Vec3 =
    { x : Float, y : Float, z : Float }

type alias Triangle =
    { normal : Vec3, p1 : Vec3, p2 : Vec3, p3 : Vec3 }
Here bytes are much faster
- parsing performance
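A decoder for these types could look like the following sketch. It assumes each coordinate is stored as a 32-bit little-endian float, so a triangle is 12 floats (48 bytes) read back to back. The width and endianness are assumptions, not something the slides specify.

```elm
module Triangle exposing (Triangle, Vec3, triangle, vec3)

import Bytes exposing (Endianness(..))
import Bytes.Decode as Decode exposing (Decoder)


type alias Vec3 =
    { x : Float, y : Float, z : Float }


type alias Triangle =
    { normal : Vec3, p1 : Vec3, p2 : Vec3, p3 : Vec3 }


-- Three consecutive floats, no separators or field names.
vec3 : Decoder Vec3
vec3 =
    Decode.map3 Vec3
        (Decode.float32 LE)
        (Decode.float32 LE)
        (Decode.float32 LE)


-- Four consecutive vectors: the normal, then the three corners.
triangle : Decoder Triangle
triangle =
    Decode.map4 Triangle vec3 vec3 vec3 vec3
```

No text has to be scanned or converted into numbers here, which fits the parsing-performance point above.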
Bytes vs Json
- Performance gain is in decoding speed, not number of bytes sent
- Bytes are faster for numbers
- Json is faster for strings
Decoding Binary Files
Most file types are binary encoded. We can now use them from Elm.
Examples include zip, tar, png, mp3, otf
Decoding Binary Files
I decode font files
- font files are segmented into tables
- tables are still lists of fields, sizes, types
- a header table stores info about the other tables
Decoding Binary Files
We have tables A and B
table A contains a variable-length array, but
table B contains the length of that array
circular dependency
Decoding Binary Files
Similar to Json, but not the same. With Json
- we decode in one pass
- we decode everything
Decoding Binary Files
Solution: decode in 2 passes
- first pass: split the file into a Dict String Bytes (table name to raw table bytes)
- second pass: create decoders for individual tables
- data dependencies are decoder arguments
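The second pass might be sketched as follows for the A/B example. The table names `"A"` and `"B"`, the uint16 element type, and all function names are hypothetical. The first pass, which builds the `Dict String Bytes` from the file header, is taken as given.

```elm
module Tables exposing (decodeArray)

import Bytes exposing (Bytes, Endianness(..))
import Bytes.Decode as Decode exposing (Decoder, Step(..))
import Dict exposing (Dict)


-- Table "B" (hypothetical name) holds the array length as a uint16.
tableB : Decoder Int
tableB =
    Decode.unsignedInt16 BE


-- Table "A" (hypothetical name) holds the array itself.
-- Its length is a data dependency, so it is passed in as an argument.
tableA : Int -> Decoder (List Int)
tableA count =
    Decode.loop ( count, [] ) step


step : ( Int, List Int ) -> Decoder (Step ( Int, List Int ) (List Int))
step ( remaining, acc ) =
    if remaining <= 0 then
        Decode.succeed (Done (List.reverse acc))

    else
        Decode.map (\x -> Loop ( remaining - 1, x :: acc )) (Decode.unsignedInt16 BE)


-- Second pass: decode B first, then feed its result into the decoder for A.
decodeArray : Dict String Bytes -> Maybe (List Int)
decodeArray tables =
    Dict.get "B" tables
        |> Maybe.andThen (Decode.decode tableB)
        |> Maybe.andThen
            (\count ->
                Dict.get "A" tables
                    |> Maybe.andThen (Decode.decode (tableA count))
            )
```

Because each table is its own `Bytes` value, tables can be decoded in whatever order the dependencies require, and tables that are not needed are never decoded.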
Conclusion
Bytes create many new possibilities
- efficiently load numerical data from the backend
- read, visualize and manipulate new types of data stored in binary files
Working with Bytes
By Folkert de Vries