Immutable Data Structures

Agenda for today

  • 10:30 - 11:30  Lecture: Functional Programming
  • 11:30 - 1:00   Workshop: Immutable Linked List
  • 1:00 - 2:30    Break
  • 2:30 - 3:00    Review: Immutable Linked List
  • 3:00 - 4:00    Lecture: Git Internals
  • 4:00 - 5:30    Workshop: FVS
  • 5:30 - 6:00    Review: FVS

Programming paradigms

  • A programming paradigm describes a particular approach to problem solving
  • Some paradigms you may have heard of:
    • Procedural, Imperative, Object oriented, Declarative, Functional
  • Paradigms/styles/techniques/patterns - these are all just tools for solving particular problems
    • Some tools are better suited for certain jobs

OO programming

  • Object oriented programming: main unit of computation is an object, which has attributes and methods that are defined in a blueprint called a class
  • Great for modeling real world entities
function Dog (name, breed) {
  this.name = name;
  this.breed = breed;
}

Dog.prototype.bark = function () {
  console.log('Arf, my name is', this.name);
};

const cody = new Dog('Cody', 'pug');
cody.bark();

Functional programming

  • Functional programming: main unit of computation is a function, which must produce the same output for the same input
  • Avoids mutation of state and side effects
  • Great for modeling transforms of data (e.g. map and filter operations)
const arr = [1, 2, 3];

const newArray = arr
  .map(item => item * 3)
  .filter(item => item % 2 === 0);

console.log(newArray) // [6]

Other Features of Functional programming

  • Pure functions
  • "First class" functions
  • Higher order functions
  • Function composition
  • Emphasis on recursion
  • Immutability

Pure Functions

  • Same input === same output
  • No side effects
    • Logging to the console, file IO, network requests...you know, pretty useful stuff
    • We need effects to write useful applications - the trick is to know when to use a pure function vs. an impure function
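
To make the distinction concrete, here is a minimal sketch of a pure function next to an impure one. The names addTax, addToTotal and runningTotal are made up for illustration.

```javascript
// Pure: the result depends only on the arguments, and nothing outside is touched.
function addTax (price, rate) {
  return price * (1 + rate);
}

// Impure: reads and changes state that lives outside the function (a side effect).
let runningTotal = 0;

function addToTotal (price) {
  runningTotal += price;
  return runningTotal;
}

addTax(100, 0.5); // 150, every single time

addToTotal(10); // 10
addToTotal(10); // 20 - same input, different output
```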

First class functions

  • Functions can be treated the same way as any other value (can be stored in a variable, passed around, etc)

Higher order functions

  • Functions that return functions, and/or take a function as an argument
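
A small sketch of both ideas together. The helpers double, applyTwice and multiplyBy are hypothetical names chosen for this example.

```javascript
// First class: a function stored in a variable, like any other value.
const double = n => n * 2;

// Higher order: takes a function as an argument...
function applyTwice (fn, value) {
  return fn(fn(value));
}

// ...or returns a new function.
function multiplyBy (factor) {
  return n => n * factor;
}

const triple = multiplyBy(3);

applyTwice(double, 5); // 20
triple(4); // 12
```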

Composition

  • Two or more functions can be composed together into a new function
  • The output of one becomes the input of the other
  • Opens the door to reasoning about your functions using established mathematical laws
  • Requires using unary functions (functions that accept one and only one argument)
const composedFunction = compose(f, g)

composedFunction(x) === f(g(x))
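
compose is not built into JavaScript, so the snippet above assumes it comes from somewhere. One possible sketch of it, applying unary functions from right to left (addOne and square are illustrative names):

```javascript
// compose(f, g)(x) === f(g(x)): the rightmost function runs first.
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);

const addOne = n => n + 1;
const square = n => n * n;

const squareThenAddOne = compose(addOne, square);

squareThenAddOne(3); // 10, same as addOne(square(3))
```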

Recursion

  • Replaces for/while with recursion
  • Any problem that can be solved iteratively can be solved recursively
// Iterative
function factorial (x) {
  let result = 1;
  for (let i = x; i > 1; i--) {
    result *= i;
  }
  return result;
}

// Recursive (the base case also covers 0, so factorial(0) is 1)
function factorial (x) {
  return x <= 1 ? 1 : x * factorial(x - 1);
}

JavaScript & Fn Programming

  • Some languages (like Java) were designed to adhere very strictly to the object-oriented paradigm, so until very recently, functional programming has been extremely difficult
  • Other languages, like Haskell, Erlang and Elixir, are functional first (Haskell is purely functional)
  • JavaScript supports both! This is great because that means you can be flexible, but challenging because you have to be careful not to "break the rules"

Immutability & JS

  • Immutability: State cannot be modified after creation
  • In JavaScript, primitive types like numbers, strings and booleans are immutable
  • Objects, including Functions and Arrays, are mutable
    • You can push, pop, attach properties and methods, etc.
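
A quick sketch of the difference. Note that const only prevents re-assigning the variable; it does not make the array itself immutable.

```javascript
const str = 'cody';
const upper = str.toUpperCase(); // returns a brand new string
// str is still 'cody'

const arr = [1, 2, 3];
arr.push(4); // changes arr in place
// arr is now [1, 2, 3, 4]

const alias = arr; // same array, second name
alias.pop();
// arr is back to [1, 2, 3], even though we never touched "arr" directly
```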

Numbers, strings, booleans

Objects, arrays

let obj = {a: 'someData'};

lolFunction(obj);

console.log(obj) // what are we going to get?

Immutable Data Structures

  • What if, instead of mutating an array, using push or pop behaved like map or filter, and returned a new array?
  • What if assigning a new key-value pair in an object was an operation that returned a new copy of the object?
let obj = {a: 'someData'};

let newObj = lolFunction(obj);

// now I know that if I want to use obj, I'm using obj
// and if I want to use newObj, I'm getting newObj
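
One way this could look in plain JavaScript, using spread and slice. The helpers push, pop and set are hypothetical, and the copies they make are shallow (nested objects are still shared).

```javascript
// Each helper returns a new value and leaves its input alone.
const push = (arr, item) => [...arr, item];
const pop = arr => arr.slice(0, -1);
const set = (obj, key, value) => ({ ...obj, [key]: value });

const arr = [1, 2, 3];
const longer = push(arr, 4); // [1, 2, 3, 4]
const shorter = pop(arr); // [1, 2]
// arr is still [1, 2, 3]

const obj = { a: 'someData' };
const newObj = set(obj, 'b', 'moreData'); // { a: 'someData', b: 'moreData' }
// obj is still { a: 'someData' }
```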

Advantages of immutable data

  • Predictability, of course
  • Easier to debug, since you can keep the full history of state changes
    • "Undo" becomes a trivial operation
  • Gain the ability to treat a collection of data like a value
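
A minimal sketch of why "undo" gets easy: if every change produces a new state instead of editing the old one, the history is just a list of states. The names update, undo, current and timeline are invented for this example.

```javascript
// Every update adds a new state to the end of the timeline.
const update = (timeline, changes) =>
  [...timeline, { ...timeline[timeline.length - 1], ...changes }];

// Undo drops the latest state (but always keeps the first one).
const undo = timeline =>
  timeline.length > 1 ? timeline.slice(0, -1) : timeline;

const current = timeline => timeline[timeline.length - 1];

let timeline = [{ count: 0 }];
timeline = update(timeline, { count: 1 });
timeline = update(timeline, { count: 2 });
current(timeline).count; // 2

timeline = undo(timeline);
current(timeline).count; // 1
```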

Performance and Memory

  • Memory
    • Structural sharing: re-use existing nodes via references - don't just copy/paste all the time
    • Some dependency on garbage collection in your environment
  • Performance
    • Performance depends on what you're trying to do, and is often comparable to mutable data structures

Immutable Linked List Operations & Structural Sharing

Prepend

Append

Insert

This is structural sharing!
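
A rough sketch of the three operations on an immutable singly linked list, to show which nodes get re-used. This is one possible shape, not the workshop solution: the node representation (a frozen { value, next } object, with null as the empty list) is an assumption.

```javascript
// A node is a frozen { value, next } pair; null marks the end of the list.
const node = (value, next = null) => Object.freeze({ value, next });

// Prepend: the new head points at the existing list, which is shared whole.
const prepend = (list, value) => node(value, list);

// Append: every node is copied, because each one leads to the node that changes.
const append = (list, value) =>
  list === null ? node(value) : node(list.value, append(list.next, value));

// Insert: copies the nodes before the index, shares everything after it.
const insert = (list, index, value) =>
  index === 0 || list === null
    ? node(value, list)
    : node(list.value, insert(list.next, index - 1, value));

const abc = prepend(prepend(prepend(null, 'c'), 'b'), 'a'); // a -> b -> c

const zabc = prepend(abc, 'z'); // z -> a -> b -> c
zabc.next === abc; // true: the whole original list is shared

const axbc = insert(abc, 1, 'x'); // a -> x -> b -> c
axbc.next.next === abc.next; // true: b -> c is shared

const abcd = append(abc, 'd'); // a -> b -> c -> d
abcd.next === abc.next; // false: nothing could be shared
```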

Copy (Mutation)

[Diagram: arr1 occupies memory addresses 0x70-0x72; copying it into arr2 allocates new memory at 0x73-0x75]

var arr1 = ["green", "pink", "blue"];
var arr2 = ["green", "pink", "blue"];

arr1 === arr2 // false

// remember that === between two objects compares 
// address in memory

Copy (Immutable)

[Diagram: arr1 occupies memory addresses 0x70-0x72; no new memory is allocated for the copy]

..........um, we're done?

const arr1 = [1, 2, 3];
const arr2 = arr1; // the "copy" is just another reference

arr1 === arr2 // true

// if arr1 can never change, a copy can just re-use
// the memory we've already allocated

Mutable vs. Immutable

  • Mutable singly linked list (assuming front, back and insertion nodes are known):
    • Prepend: O(1)
    • Append: O(1)
    • Insert: O(1)
    • Find: O(n)
    • Copy: O(n)
  • Immutable singly linked list (assuming front, back and insertion nodes are known):
    • Prepend: O(1)
    • Append: O(n)
    • Insert: O(n)
    • Find: O(n)
    • Copy: O(1)

Git & Immutability

Git

  • Each commit is a small text file that references a directory (also called a "tree" - don't get confused), and the commit that came before
  • Each "tree" (directory) references the files in that directory, or the subdirectories (other "trees")
  • Each file is identified by a hash of its contents
  • By using structural sharing, Git creates the minimum number of objects necessary
    • Git is an immutable linked structure!

Commit A

Commit B (message: "I just committed some changes!")
  Tree (/my-prj)
    Tree (my-prj/js)
      Blob (my-prj/js/app.js)
      Blob (my-prj/js/utils.js)
    Tree (my-prj/css)
      Blob (my-prj/css/style.css)

Commit C

/my-prj
  /js
    app.js
    utils.js
  /css
    style.css

Commit B (message: "I just committed some changes!")
  Tree (/my-prj)
    Tree (my-prj/js)
      Blob (my-prj/js/app.js)
      Blob (my-prj/js/utils.js)
    Tree (my-prj/css)
      Blob (my-prj/css/style.css)

Commit C (message: "I just changed style.css!")
  Tree` (/my-prj)
    Tree (my-prj/js) - unchanged, shared with Commit B
    Tree` (my-prj/css)
      Blob` (prj/css/style.css)

/my-prj
  /js
    app.js
    utils.js
  /css
    style.css (CHANGED!)

Content Addressable

  • How does git know when to create a new node?
  • Git "objects" (commits, trees and file "blobs") are content addressable
    • Identified using a hash of its contents
    • Objects with the same content will have the same hash
    • Collision is highly unlikely
  • Remember: hashing algorithms use modular math to return a fixed-length value that, for practical purposes, uniquely identifies the input, no matter how long the input is
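
A toy sketch of a content addressable store, to show why identical content never needs a second node. The store and put names are hypothetical, and this is not how Git lays out its object database; it only illustrates the "key = hash of contents" idea using Node's crypto module.

```javascript
const { createHash } = require('crypto');

const sha1 = content => createHash('sha1').update(content).digest('hex');

// In-memory store: the key is always the hash of the content.
const store = new Map();

function put (content) {
  const key = sha1(content);
  if (!store.has(key)) store.set(key, content); // same content, same key: stored once
  return key;
}

const a = put('body { color: red; }');
const b = put('body { color: red; }');
const c = put('body { color: blue; }');

a === b; // true: same content, same hash
a === c; // false: changed content, new hash
store.size; // 2
```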

SHA1

  • "Secure Hash Algorithm 1"
  • The primary hashing algorithm used by Git
  • Exposed in Node via the crypto module
    • Look out! The first SHA-1 collision was announced on Feb 23, 2017: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
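
As a sketch of how the crypto module can reproduce a Git blob ID: Git hashes a short header ("blob", the content's byte length, and a null byte) followed by the content itself. gitBlobHash is a made-up helper name, and this assumes a repository using Git's default SHA-1 object format.

```javascript
const { createHash } = require('crypto');

// Hash "blob <byte length>\0<content>", the way Git does for file contents.
function gitBlobHash (content) {
  const body = Buffer.from(content, 'utf8');
  const header = Buffer.from(`blob ${body.length}\0`, 'utf8');
  return createHash('sha1').update(header).update(body).digest('hex');
}

gitBlobHash(''); // 'e69de29bb2d1d6434b8b29ae775ad8c2e48c5391' (Git's well-known empty blob)
```

You can compare the result for any file against the output of git hash-object <file>.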

Workshop

Your task

  • Create an immutable linked list
  • Specs: 1-functional-linked-list/list.spec.js
  • Work in 1-functional-linked-list/list.js
  • Hint: I highly recommend recursion

 

git clone https://github.com/FullstackAcademy/fvs

Immutables

By Tom Kelly
