Working with Bytes

Binary Serialization & Elm

Introduction to binary serialization
Bytes vs Json
Decoding binary file formats

Working with Bytes

A byte is 8 bits of information

$$
0 \cdot 2^0 + 1 \cdot 2^1 + 0 \cdot 2^2 + 1 \cdot 2^3 + 0 \cdot 2^4 + 1 \cdot 2^5 + 0 \cdot 2^6 + 0 \cdot 2^7 = 2 + 8 + 32 = 42
$$

Working with Bytes

A byte is 8 bits of information, often written as two hexadecimal digits

$$
42 = 2 \cdot 16 + 10 \cdot 1 = \mathrm{2A}_{16}
$$
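
Both notations can be computed with a few lines of Elm. This is a minimal sketch: the module and function names are made up for illustration, and the input is assumed to be in the range 0 to 255.

```elm
module ByteNotation exposing (toBits, toHex)

import Bitwise


-- Bits of a byte, least significant bit first (the order used above).
toBits : Int -> List Int
toBits byte =
    List.range 0 7
        |> List.map (\i -> Bitwise.and 1 (Bitwise.shiftRightBy i byte))


-- One hexadecimal digit for a value between 0 and 15.
hexDigit : Int -> String
hexDigit n =
    String.slice n (n + 1) "0123456789ABCDEF"


-- High nibble first, then low nibble.
toHex : Int -> String
toHex byte =
    hexDigit (byte // 16) ++ hexDigit (modBy 16 byte)
```

`toBits 42` gives `[ 0, 1, 0, 1, 0, 1, 0, 0 ]` and `toHex 42` gives `"2A"`.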

Bytes vs Json

Why Bytes?

Send fewer bytes over the wire?

Bytes vs Json

Take the number `2019`

| Format | Representation | Size |
|---|---|---|
| Json | `2019` | 4 bytes |
| Bytes | `00000111 11100011` | 2 bytes |

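50% smaller. The gain grows with the number of digits: Json spends one byte per decimal digit, so `4294967295` takes 10 bytes as Json but 4 bytes as an unsigned 32-bit integer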
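
The two sizes can be checked with `elm/bytes`. A small sketch, assuming the number is stored as a big-endian unsigned 16-bit integer (the byte order is a choice made here, not something the slide fixes):

```elm
module Sizes exposing (asBytes, asJson)

import Bytes exposing (Endianness(..))
import Bytes.Encode as Encode


-- Width in bytes of the decimal text "2019", as Json would send it.
asJson : Int
asJson =
    Encode.getStringWidth (String.fromInt 2019)


-- Width in bytes of 2019 encoded as a 16-bit unsigned integer.
asBytes : Int
asBytes =
    Bytes.width (Encode.encode (Encode.unsignedInt16 BE 2019))
```

`asJson` is 4 and `asBytes` is 2.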

Bytes vs Json

{ 
    "title": "foo",
    "subject": "spam"
}

Json stores a lot of structure alongside the data: key names, {}, "", []

Bytes vs Json

 3  f o o  4  s p a m 
 03 666f6f 04 7370616d

| Field | Size (bytes) | Type |
|---|---|---|
| titleLength | 1 | uint8 |
| title | variable | string |
| subjectLength | 1 | uint8 |
| subject | variable | string |
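
A layout like this maps directly onto `elm/bytes` encoders and decoders. The sketch below is one possible reading of the table; the `Article` record and the helper names are hypothetical.

```elm
module Article exposing (Article, decoder, encode)

import Bytes exposing (Bytes)
import Bytes.Decode as Decode exposing (Decoder)
import Bytes.Encode as Encode exposing (Encoder)


type alias Article =
    { title : String
    , subject : String
    }


-- A uint8 length followed by the UTF-8 bytes of the string.
-- Only valid for strings of at most 255 bytes.
sizedString : String -> Encoder
sizedString string =
    Encode.sequence
        [ Encode.unsignedInt8 (Encode.getStringWidth string)
        , Encode.string string
        ]


encode : Article -> Bytes
encode article =
    Encode.encode
        (Encode.sequence
            [ sizedString article.title
            , sizedString article.subject
            ]
        )


-- Read the length first, then read that many bytes as a string.
sizedStringDecoder : Decoder String
sizedStringDecoder =
    Decode.unsignedInt8
        |> Decode.andThen Decode.string


decoder : Decoder Article
decoder =
    Decode.map2 Article sizedStringDecoder sizedStringDecoder
```

Encoding `{ title = "foo", subject = "spam" }` produces the 9 bytes shown above, and `Decode.decode decoder` reads them back into the record.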

Working with Bytes

The Bytes type is a sequence of bytes

Bytes is to binary serialization what String is to json decoders (and parsers in general)

Working with Bytes

The API looks a lot like Json.Decode

primitives for

integers
floats
strings
Bytes

and combinators like map, map2 and andThen
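
To illustrate the similarity, here is the same hypothetical `Point` record decoded from Json and from Bytes. The field names and the choice of big-endian 32-bit floats are assumptions made for this sketch.

```elm
module ApiComparison exposing (Point, bytesPoint, jsonPoint)

import Bytes exposing (Endianness(..))
import Bytes.Decode as BD
import Json.Decode as JD


type alias Point =
    { x : Float
    , y : Float
    }


-- Json: values are found by key name.
jsonPoint : JD.Decoder Point
jsonPoint =
    JD.map2 Point
        (JD.field "x" JD.float)
        (JD.field "y" JD.float)


-- Bytes: values are found by position, so the order of decoders matters.
bytesPoint : BD.Decoder Point
bytesPoint =
    BD.map2 Point
        (BD.float32 BE)
        (BD.float32 BE)
```

The main difference is that a Bytes decoder has no keys to look up: it consumes the input front to back.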

Bytes vs Json

So binary serialization is more compact. We should use it, right?

Bytes vs Json

-- mirrors Time.Posix from elm/time: milliseconds since the epoch
type Posix
    = Posix Int


type alias Item =
    { title : String
    , link : String
    , media : String
    , dateTaken : Posix
    , description : String
    , published : Posix
    , author : String
    , authorId : String
    , tags : List String
    }
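
Most of these fields are strings, and `tags` is a list of them. A binary decoder for such a list could look like the sketch below. The layout (a uint32 count, then strings each prefixed by a uint32 byte length) is an assumption; the slides do not say how the benchmark encoded the data.

```elm
module Tags exposing (tags)

import Bytes exposing (Endianness(..))
import Bytes.Decode as Decode exposing (Decoder, Step(..))


-- Assumed layout: uint32 byte length, then the UTF-8 bytes.
sizedString : Decoder String
sizedString =
    Decode.unsignedInt32 BE
        |> Decode.andThen Decode.string


-- Assumed layout: uint32 element count, then that many sized strings.
tags : Decoder (List String)
tags =
    Decode.unsignedInt32 BE
        |> Decode.andThen (\count -> Decode.loop ( count, [] ) tagsStep)


tagsStep : ( Int, List String ) -> Decoder (Step ( Int, List String ) (List String))
tagsStep ( remaining, acc ) =
    if remaining <= 0 then
        -- Elements were consed onto the front, so restore the order.
        Decode.succeed (Done (List.reverse acc))

    else
        Decode.map (\tag -> Loop ( remaining - 1, tag :: acc )) sizedString
```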

Bytes vs Json

For this record, Json is 2 times faster!

the number of bytes is similar, because most fields are strings
UTF-8 decoding
string slicing

Bytes vs Json

type alias Vec3 =
    { x : Float, y : Float, z : Float }


type alias Triangle =
    { normal : Vec3, p1 : Vec3, p2 : Vec3, p3 : Vec3 }

Here Bytes are much faster: a Triangle is 12 floats and no strings

parsing performance
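
A decoder for this numeric data is short. In this sketch each float is assumed to be a little-endian 32-bit value; the width and byte order depend on the actual file format and are not stated in the slides.

```elm
module Triangle exposing (Triangle, Vec3, triangle, vec3)

import Bytes exposing (Endianness(..))
import Bytes.Decode as Decode exposing (Decoder)


type alias Vec3 =
    { x : Float, y : Float, z : Float }


type alias Triangle =
    { normal : Vec3, p1 : Vec3, p2 : Vec3, p3 : Vec3 }


-- Three consecutive floats: x, y, z.
vec3 : Decoder Vec3
vec3 =
    Decode.map3 Vec3
        (Decode.float32 LE)
        (Decode.float32 LE)
        (Decode.float32 LE)


-- Four consecutive vectors: 12 floats, 48 bytes in total.
triangle : Decoder Triangle
triangle =
    Decode.map4 Triangle vec3 vec3 vec3 vec3
```

No text has to be parsed here: every float is read from a fixed number of bytes.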

Bytes vs Json

Performance gain is in decoding speed, not number of bytes sent
Bytes are faster for numbers
Json is faster for strings

Decoding Binary Files

Most file formats are binary encoded. We can now read them from Elm

examples include zip, tar, png, mp3, otf

Decoding Binary Files

I decode font files because

font files are segmented into tables

tables are still lists of fields, sizes, types

a header table stores where each table starts and how long it is

Decoding Binary Files

we have table A and B

table A contains a variable length array, but

table B contains the length of that array

circular dependency

Decoding Binary Files

Similar to Json decoding, but not the same. With Json

we decode in one pass
we decode everything

Decoding Binary Files

solution: decode in 2 passes

first split the file into a Dict String Bytes, keyed by table name

create decoders for individual tables

data dependencies are decoder arguments
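
The two-pass idea can be sketched as follows. The table names "A" and "B", their contents (a uint16 count in B, that many uint16 values in A) and the `slice` helper are all hypothetical; real font tables are more involved.

```elm
module TwoPass exposing (decodeTables, slice, tableA, tableB)

import Bytes exposing (Bytes, Endianness(..))
import Bytes.Decode as Decode exposing (Decoder, Step(..))
import Dict exposing (Dict)


-- Pass 1 helper: cut `length` bytes out of the file, starting at `offset`.
slice : Int -> Int -> Bytes -> Maybe Bytes
slice offset length file =
    Decode.decode
        (Decode.bytes offset |> Decode.andThen (\_ -> Decode.bytes length))
        file


-- Table B only holds the length of the array stored in table A.
tableB : Decoder Int
tableB =
    Decode.unsignedInt16 BE


-- The data dependency (the count) is an ordinary function argument.
tableA : Int -> Decoder (List Int)
tableA count =
    Decode.loop ( count, [] ) step


step : ( Int, List Int ) -> Decoder (Step ( Int, List Int ) (List Int))
step ( remaining, acc ) =
    if remaining <= 0 then
        Decode.succeed (Done (List.reverse acc))

    else
        Decode.map (\x -> Loop ( remaining - 1, x :: acc )) (Decode.unsignedInt16 BE)


-- Pass 2: decode B first, then feed its result into the decoder for A.
decodeTables : Dict String Bytes -> Maybe (List Int)
decodeTables tables =
    Dict.get "B" tables
        |> Maybe.andThen (Decode.decode tableB)
        |> Maybe.andThen
            (\count ->
                Dict.get "A" tables
                    |> Maybe.andThen (Decode.decode (tableA count))
            )
```

Because each table is its own `Bytes` value, tables can be decoded in whatever order the dependencies require, and tables that are not needed are never decoded.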

Conclusion

Bytes create many new possibilities

efficiently load numerical data from the backend
read, visualize and manipulate new types of data stored in binary files