Immutable Data Structures

Agenda for today

10:30 - 11:30 Lecture: Functional Programming
11:30 - 1:00 Workshop: Immutable Linked List
1:00 - 2:30 Break
2:30 - 3:00 Review: Immutable Linked List
3:00 - 4:00 Lecture: Git Internals
4:00 - 5:30 Workshop: FVS
5:30 - 6:00 Review: FVS

Programming paradigms

Programming paradigms describe a particular approach to problem solving
Some paradigms you may have heard of:
- Procedural, Imperative, Object oriented, Declarative, Functional
Paradigms/styles/techniques/patterns - these are all just tools for solving particular problems
- Some tools are better suited for certain jobs

OO programming

Object oriented programming: main unit of computation is an object, which has attributes and methods that are defined in a blueprint called a class
Great for modeling real world entities

function Dog (name, breed) {
  this.name = name;
  this.breed = breed;
}

Dog.prototype.bark = function () {
  console.log('Arf, my name is', this.name);
};

const cody = new Dog('Cody', 'pug');
cody.bark();

Functional programming

Functional programming: main unit of computation is a function, which must produce the same output for the same input
Avoids mutation of state and side effects
Great for modeling transforms of data (ex. map and filter operations)

const arr = [1, 2, 3];

const newArray = arr
  .map(item => item * 3)
  .filter(item => item % 2 === 0);

console.log(newArray) // [6]

Other Features of Functional programming

Pure functions
"First class" functions
Higher order functions
Function composition
Emphasis on recursion
Immutability

Pure Functions

Same input === same output
No side effects
- Logging to the console, file IO, network requests...you know, pretty useful stuff
- We need effects to write useful applications - the trick is to know when to use a pure function vs. an impure function
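
To make the distinction concrete, here is a minimal sketch. The names add, total and addToTotal are made up for illustration and do not come from the workshop code.

```javascript
// Pure: the result depends only on the arguments.
const add = (a, b) => a + b;

// Impure: reads and changes state that lives outside the function.
let total = 0;
const addToTotal = n => {
  total += n;
  return total;
};

console.log(add(2, 3)); // 5
console.log(add(2, 3)); // 5 (same input, same output)
console.log(addToTotal(2)); // 2
console.log(addToTotal(2)); // 4 (same input, different output)
```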

First class functions

Functions can be treated the same way as any other value (can be stored in a variable, passed around, etc)

Higher order functions

Functions that return functions, and/or take a function as an argument
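
A small sketch of both ideas together. The helpers addOne and twice are hypothetical names chosen for this example.

```javascript
// First class: a function stored in a variable, like any other value.
const addOne = n => n + 1;

// Higher order: takes a function as an argument and returns a new function.
const twice = fn => x => fn(fn(x));

const addTwo = twice(addOne);

console.log(addTwo(5)); // 7
console.log([1, 2, 3].map(addOne)); // [ 2, 3, 4 ]
```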

Composition

Two or more functions can be composed together into a new function
The output of one becomes the input of the other
Opens the door to reasoning about your functions using established mathematical laws
Requires using unary functions (functions that accept one and only one argument)

const composedFunction = compose(f, g)

composedFunction(x) === f(g(x))
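
JavaScript has no built-in compose, so the snippet above assumes one exists. One possible two-function version, with double and increment as made-up example functions, might look like this:

```javascript
// Hypothetical helper: compose two unary functions, right to left.
const compose = (f, g) => x => f(g(x));

const double = n => n * 2;
const increment = n => n + 1;

// double runs first, then increment receives its output.
const doubleThenIncrement = compose(increment, double);

console.log(doubleThenIncrement(5)); // 11
```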

Recursion

Replaces for/while with recursion
Any problem that can be solved iteratively can be solved recursively

// Iterative version
function factorial (x) {
  let result = 1;
  for (let i = 2; i <= x; i++) {
    result *= i;
  }
  return result;
}

// Recursive version (base case covers 0 and 1)
function factorial (x) {
  return x <= 1 ? 1 : x * factorial(x - 1);
}

JavaScript & Fn Programming

Some languages (like Java) were designed to adhere very strictly to the object-oriented paradigm, so until recently, writing functional code in them was difficult
Other languages, like Haskell, are purely functional; Erlang and Elixir are functional-first, though not pure
JavaScript supports both! This is great because that means you can be flexible, but challenging because you have to be careful not to "break the rules"

Immutability & JS

Immutability: State cannot be modified after creation
In JavaScript, primitive types like numbers, strings and booleans are immutable
Objects, including Functions and Arrays, are mutable
- You can push, pop, attach properties and methods, etc.

Numbers, strings, booleans

Objects, arrays

let obj = {a: 'someData'};

lolFunction(obj);

console.log(obj) // what are we going to get?
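
The slides never define lolFunction, which is the point: without reading its source we cannot know what obj holds afterwards. Assuming a hypothetical implementation that mutates its argument, the result could be:

```javascript
// Hypothetical implementation, for illustration only.
function lolFunction (o) {
  o.a = 'lol';        // overwrites existing data
  o.surprise = true;  // attaches a new property
}

const obj = { a: 'someData' };

lolFunction(obj);

console.log(obj); // { a: 'lol', surprise: true }
```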

Immutable Data Structures

What if, instead of mutating an array, using push or pop behaved like map or filter, and returned a new array?
What if assigning a new key-value pair in an object was an operation that returned a new copy of the object?

let obj = {a: 'someData'};

let newObj = lolFunction(obj);

// now I know that if I want to use obj, I'm using obj
// and if I want to use newObj, I'm getting newObj
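
A rough sketch of what that could look like in plain JavaScript, assuming a hypothetical lolFunction that copies instead of mutating. Object.freeze is used here only to show that the original cannot be changed.

```javascript
// Hypothetical non-mutating version: returns a changed copy.
function lolFunction (o) {
  return { ...o, a: 'lol' };
}

const obj = Object.freeze({ a: 'someData' });
const newObj = lolFunction(obj);

console.log(obj.a); // someData
console.log(newObj.a); // lol
console.log(obj === newObj); // false
```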

Advantages of immutable data

Predictability, of course
Easier to debug, since you could have physical access to the history of state changes
- "Undo" becomes a trivial operation
Gain the ability to treat a collection of data like a value
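
One way to picture the "undo" claim: if every state is an immutable snapshot, the history is just a list of snapshots. The names states, current, update and undo below are assumptions for this sketch, not part of any library.

```javascript
// Hypothetical state history: each entry is a frozen snapshot.
const states = [Object.freeze({ count: 0 })];

const current = () => states[states.length - 1];

// Record a new snapshot instead of changing the current one.
const update = changes => {
  states.push(Object.freeze({ ...current(), ...changes }));
};

// Undo: drop the latest snapshot; earlier ones are untouched.
const undo = () => {
  if (states.length > 1) states.pop();
};

update({ count: 1 });
update({ count: 2 });
console.log(current().count); // 2
undo();
console.log(current().count); // 1
```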

Performance and Memory

Memory
- Structural sharing: re-use existing nodes via references - don't just copy/paste all the time
- Some dependency on garbage collection in your environment
Performance
- Performance depends on what you're trying to do, and is often comparable to that of mutable data structures

Immutable Linked List Operations & Structural Sharing

Prepend

Append

Insert

This is structural sharing!
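
The three operations can be sketched as follows. This is one possible shape, not the workshop solution: it assumes a node is a frozen { value, next } object, that null is the empty list, and that insertAt is given an index within range.

```javascript
// Hypothetical node shape: { value, next }, with null as the empty list.
const node = (value, next = null) => Object.freeze({ value, next });

// Prepend: one new node; the whole old list is shared as the tail.
const prepend = (list, value) => node(value, list);

// Append: every existing node is copied, because each node's
// `next` would otherwise have to change.
const append = (list, value) =>
  list === null ? node(value) : node(list.value, append(list.next, value));

// Insert: nodes before the index are copied; nodes after it are shared.
const insertAt = (list, index, value) =>
  index === 0
    ? node(value, list)
    : node(list.value, insertAt(list.next, index - 1, value));

const list = node(2, node(3));
const withOne = prepend(list, 1);
const withFour = append(list, 4);
const inserted = insertAt(list, 1, 2.5);

console.log(withOne.next === list); // true: the old list is reused
console.log(withFour.next === list.next); // false: nodes were copied
console.log(inserted.next.next === list.next); // true: the tail is shared
```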

Copy (Mutation)

[Diagram: arr1 occupies memory cells 0x70-0x72; the copy, arr2, is allocated in new cells 0x73-0x75]

var arr1 = ["green", "pink", "blue"];
var arr2 = ["green", "pink", "blue"];

arr1 === arr2 // false

// remember that === between two objects compares 
// address in memory

Copy (Immutable)

[Diagram: arr1 occupies memory cells 0x70-0x72; no further cells are allocated for the copy]

..........um, we're done?

const arr1 = [1, 2, 3];
const arr2 = [1, 2, 3];

arr1 === arr2 // true, if arrays were immutable
// (in real JavaScript, where arrays are mutable, this is false)

// we could just re-use the memory
// we've already allocated
Mutable v Immutable

Mutable singly linked list (assuming front, back and insertion nodes are known):
- Prepend: O(1)
- Append: O(1)
- Insert: O(1)
- Find: O(n)
- Copy: O(n)
Immutable singly linked list (assuming front, back and insertion nodes are known):
- Prepend: O(1)
- Append: O(n)
- Insert: O(n)
- Find: O(n)
- Copy: O(1)

Git & Immutability

Git

Each commit is a small text file that references a directory (also called a "tree" - don't get confused), and the commit that came before

Each "tree" (directory) references the files in that directory, or the subdirectories (other "trees")
Each file is identified by a hash of its contents
By using structural sharing, Git creates the minimum number of objects necessary
- Git is an immutable linked structure!

Commit chain: Commit C -> Commit B -> Commit A (each commit references the commit that came before)

/my-prj
  /js
    app.js
    utils.js
  /css
    style.css

Commit B (message: "I just committed some changes!")
  Tree (/my-prj)
    Tree (my-prj/js)
      Blob (my-prj/js/app.js)
      Blob (my-prj/js/utils.js)
    Tree (my-prj/css)
      Blob (my-prj/css/style.css)

/my-prj
  /js
    app.js
    utils.js
  /css
    style.css (CHANGED!)

Commit C (message: "I just changed style.css!")
  Tree` (/my-prj)
    Tree (my-prj/js), shared with Commit B
    Tree` (my-prj/css)
      Blob` (prj/css/style.css)

Content Addressable

How does git know when to create a new node?
Git "objects" (commits, trees and file "blobs") are content addressable
- Identified using a hash of its contents
- Objects with the same content will have the same hash
- Collision is highly unlikely
Remember: hashing algorithms return a fixed-size value no matter how long the input is; for a good hash, that value identifies the input for all practical purposes, even though collisions are theoretically possible

SHA1

"Secure Hash Algorithm 1"
The primary hashing algorithm used by Git
Exposed in Node via the crypto module
- Look out! Feb 23 2017: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
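
A short sketch of content addressing with Node's crypto module. The helper names sha1 and blobHash are made up for this example; blobHash assumes Git's blob format, in which the content is prefixed with a "blob <byte length>" header and a null byte before hashing.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-1 digest of a string or Buffer.
const sha1 = data => crypto.createHash('sha1').update(data).digest('hex');

// Assumption: Git hashes a blob as "blob <byte length>\0<content>".
const blobHash = content =>
  sha1(`blob ${Buffer.byteLength(content)}\0${content}`);

const original = 'body { color: red; }';
const changed = 'body { color: blue; }';

console.log(sha1('abc')); // a9993e364706816aba3e25717850c26c9cd0d89d
console.log(blobHash(original) === blobHash(original)); // true
console.log(blobHash(original) === blobHash(changed)); // false
```

Same content, same hash: that is how an unchanged file can be recognized and re-used rather than stored again.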

Workshop

Your task

Create an immutable linked list
Specs: 1-functional-linked-list/list.spec.js
Work in 1-functional-linked-list/list.js
Hint: I highly recommend recursion

git clone https://github.com/FullstackAcademy/fvs

Immutables

By Tom Kelly
