Immutable Data Structures
Agenda for today
- 10:30 - 11:30 Lecture: Functional Programming
- 11:30 - 1:00 Workshop: Immutable Linked List
- 1:00 - 2:30 Break
- 2:30 - 3:00 Review: Immutable Linked List
- 3:00 - 4:00 Lecture: Git Internals
- 4:00 - 5:30 Workshop: FVS
- 5:30 - 6:00 Review: FVS
Programming paradigms
- Programming paradigms describe a particular approach to problem solving
- Some paradigms you may have heard of:
- Procedural, Imperative, Object oriented, Declarative, Functional
- Paradigms/styles/techniques/patterns - these are all just tools for solving particular problems
- Some tools are better suited for certain jobs
OO programming
- Object oriented programming: main unit of computation is an object, which has attributes and methods that are defined in a blueprint called a class
- Great for modeling real world entities
function Dog (name, breed) {
  this.name = name;
  this.breed = breed;
}
Dog.prototype.bark = function () {
  // console.log already separates its arguments with a space
  console.log('Arf, my name is', this.name);
};
const cody = new Dog('Cody', 'pug');
cody.bark();
Functional programming
- Functional programming: main unit of computation is a function, which must produce the same output for the same input
- Avoids mutation of state and side effects
- Great for modeling transforms of data (ex. map and filter operations)
const arr = [1, 2, 3];
const newArray = arr
  .map(item => item * 3)
  .filter(item => item % 2 === 0);
console.log(newArray); // [6]
Other Features of Functional programming
- Pure functions
- "First class" functions
- Higher order functions
- Function composition
- Emphasis on recursion
- Immutability
Pure Functions
- Same input === same output
- No side effects
- Logging to the console, file IO, network requests...you know, pretty useful stuff
- We need effects to write useful applications - the trick is to know when to use a pure function vs. an impure function
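To make the distinction concrete, here is a minimal sketch. The names addTax, addToTotal and total are made up for illustration and are not part of the workshop code.

```javascript
// Pure: the result depends only on the arguments, and nothing outside is touched.
function addTax (price, rate) {
  return price * (1 + rate);
}

// Impure: reads and mutates state that lives outside the function.
let total = 0;
function addToTotal (price) {
  total += price; // side effect: changes `total`
  return total;
}

console.log(addTax(100, 0.5)); // 150
console.log(addTax(100, 0.5)); // 150 (same input, same output)
console.log(addToTotal(10)); // 10
console.log(addToTotal(10)); // 20 (same input, different output)
```

The impure version is not wrong; it is just harder to test and reason about, because its result depends on when it is called.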
First class functions
- Functions can be treated the same way as any other value (can be stored in a variable, passed around, etc)
Higher order functions
- Functions that return functions, and/or take a function as an argument
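A small sketch of both ideas together. The helper names (double, applyTwice, multiplyBy) are hypothetical and chosen only for this example.

```javascript
// First class: a function is a value that can be stored in a variable.
const double = x => x * 2;

// Higher order: takes a function as an argument.
function applyTwice (fn, value) {
  return fn(fn(value));
}

// Higher order: returns a new function.
function multiplyBy (factor) {
  return x => x * factor;
}

const triple = multiplyBy(3);
console.log(applyTwice(double, 5)); // 20
console.log(triple(4)); // 12
```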
Composition
- Two or more functions can be composed together into a new function
- The output of one becomes the input of the other
- Opens the door to reasoning about your functions using established mathematical laws
- Requires using unary functions (functions that accept one and only one argument)
const composedFunction = compose(f, g)
composedFunction(x) === f(g(x))
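The slides do not define compose, so the following is one possible implementation, assuming every function passed in is unary. addOne and square are made-up helpers.

```javascript
// A hypothetical compose for unary functions, applied right to left,
// so compose(f, g)(x) evaluates f(g(x)).
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);

const addOne = x => x + 1;
const square = x => x * x;

const addOneThenSquare = compose(square, addOne);
console.log(addOneThenSquare(3)); // 16
console.log(addOneThenSquare(3) === square(addOne(3))); // true
```

Libraries such as Ramda and lodash/fp ship their own versions; this sketch only shows the idea.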
Recursion
- Replaces for/while with recursion
- Any problem that can be solved iteratively can be solved recursively
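// Iterative version
function factorial (x) {
  let result = 1; // starting at 1 also makes factorial(0) return 1
  for (let i = 2; i <= x; i++) {
    result *= i;
  }
  return result;
}
// Recursive version
function factorial (x) {
  // base case covers 0 and 1, so factorial(0) no longer recurses forever
  return x <= 1 ? 1 : x * factorial(x - 1);
}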
JavaScript & Fn Programming
- Some languages (like Java) were designed to adhere very strictly to the object-oriented paradigm, so until very recently, functional programming has been extremely difficult
- Other languages are built around functional programming: Haskell is purely functional, while Erlang and Elixir are functional-first
- JavaScript supports both! This is great because that means you can be flexible, but challenging because you have to be careful not to "break the rules"
Immutability & JS
- Immutability: State cannot be modified after creation
- In JavaScript, primitive types like numbers, strings and booleans are immutable
- Objects, including Functions and Arrays, are mutable
- You can push, pop, attach properties and methods, etc.
- Immutable: numbers, strings, booleans
- Mutable: objects, arrays
let obj = {a: 'someData'};
lolFunction(obj);
console.log(obj) // what are we going to get?
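The slides never show what lolFunction does, which is exactly the problem: the caller cannot know without reading it. Below is one hypothetical implementation that mutates its argument, contrasted with an immutable primitive.

```javascript
// Hypothetical implementation: the slides never define lolFunction.
function lolFunction (o) {
  o.a = 'lol'; // mutates the caller's object through the shared reference
}

let obj = {a: 'someData'};
lolFunction(obj);
console.log(obj); // { a: 'lol' }

let str = 'someData';
str.toUpperCase(); // returns a new string; str itself is unchanged
console.log(str); // someData
```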
Immutable Data Structures
- What if, instead of mutating an array, using push or pop behaved like map or filter, and returned a new array?
- What if assigning a new key-value pair in an object was an operation that returned a new copy of the object?
let obj = {a: 'someData'};
let newObj = lolFunction(obj);
// now I know that if I want to use obj, I'm using obj
// and if I want to use newObj, I'm getting newObj
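A sketch of what such a non-mutating lolFunction could look like, assuming object spread for the copy and Object.freeze to guard the original. Note that both are shallow: nested objects would still be shared and mutable.

```javascript
// Hypothetical non-mutating lolFunction: returns a shallow copy with one key changed.
function lolFunction (o) {
  return {...o, a: 'lol'};
}

// Object.freeze makes accidental writes to obj fail (silently, or with a
// TypeError in strict mode).
const obj = Object.freeze({a: 'someData'});
const newObj = lolFunction(obj);

console.log(obj.a); // someData
console.log(newObj.a); // lol
console.log(obj === newObj); // false
```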
Advantages of immutable data
- Predictability, of course
- Easier to debug, since you can keep every previous state and inspect the full history of changes
- "Undo" becomes a trivial operation
- Gain the ability to treat a collection of data like a value
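As a rough illustration of why history and undo come almost for free, here is a minimal sketch. The names states, current, update and undo are invented for this example.

```javascript
// Every state is a frozen object; "changing" state adds a new version
// instead of overwriting the old one.
let states = [Object.freeze({count: 0})];

const current = () => states[states.length - 1];

function update (changes) {
  states = [...states, Object.freeze({...current(), ...changes})];
}

// Undo just drops the latest version; older versions were never modified.
function undo () {
  if (states.length > 1) states = states.slice(0, -1);
  return current();
}

update({count: 1});
update({count: 2});
console.log(states.map(s => s.count)); // [ 0, 1, 2 ]
console.log(undo().count); // 1
```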
Performance and Memory
- Memory
- Structural sharing: re-use existing nodes via references - don't just copy/paste all the time
- Some dependency on garbage collection in your environment
- Performance
- Performance depends on what you're trying to do, and is often comparable to that of mutable data structures
Immutable Linked List Operations & Structural Sharing
[Diagrams: prepend, append and insert on an immutable linked list, each re-using the nodes that did not change]
This is structural sharing!
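The three operations can be sketched in a few lines. This is one possible shape, not the workshop's spec: the node representation (a frozen {value, next} pair with null as the empty list) and all function names are assumptions.

```javascript
// A node is a frozen {value, next} pair; null marks the empty list.
const node = (value, next = null) => Object.freeze({value, next});

// Prepend: O(1). The new head points at the existing list, which is shared as-is.
const prepend = (list, value) => node(value, list);

// Append: O(n). Every existing node is copied, because each copy's
// `next` must point at a new node.
const append = (list, value) =>
  list === null ? node(value) : node(list.value, append(list.next, value));

// Insert at index: copies the nodes before the index and shares the rest.
const insert = (list, index, value) =>
  index === 0 || list === null
    ? node(value, list)
    : node(list.value, insert(list.next, index - 1, value));

const toArray = list => (list === null ? [] : [list.value, ...toArray(list.next)]);

const abc = prepend(prepend(prepend(null, 'c'), 'b'), 'a'); // a -> b -> c
const zabc = prepend(abc, 'z');
const abxc = insert(abc, 2, 'x'); // a -> b -> x -> c

console.log(toArray(zabc)); // [ 'z', 'a', 'b', 'c' ]
console.log(zabc.next === abc); // true: the whole original list is shared
console.log(toArray(abxc)); // [ 'a', 'b', 'x', 'c' ]
console.log(abxc.next.next.next === abc.next.next); // true: the 'c' node is shared
console.log(toArray(abc)); // [ 'a', 'b', 'c' ]: the original is untouched
```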
Copy (Mutation)
[Diagram: arr1 occupies memory addresses 0x70 to 0x72; the copy arr2 occupies new addresses 0x73 to 0x75]
const arr1 = ["green", "pink", "blue"];
const arr2 = ["green", "pink", "blue"];
arr1 === arr2 // false
// remember that === between two objects compares
// address in memory
Copy (Immutable)
[Diagram: arr1 occupies memory addresses 0x70 to 0x72; no new memory is allocated for the copy]
..........um, we're done?
const arr1 = [1, 2, 3];
const arr2 = arr1;
arr1 === arr2 // true
// if the data can never change, a "copy" can just re-use
// the memory we've already allocated
Mutable v Immutable
Mutable singly linked list (assuming front, back and insertion nodes are known):
- Prepend: O(1)
- Append: O(1)
- Insert: O(1)
- Find: O(n)
- Copy: O(n)
Immutable singly linked list (assuming front, back and insertion nodes are known):
- Prepend: O(1)
- Append: O(n)
- Insert: O(n)
- Find: O(n)
- Copy: O(1)
Git & Immutability
Git
- Each commit is a small text file that references a directory (also called a "tree" - don't get confused), and the commit that came before
- Each "tree" (directory) references the files in that directory, or the subdirectories (other "trees")
- Each file is identified by a hash of its contents
- By using structural sharing, Git creates the minimum number of objects necessary
- Git is an immutable linked structure!
[Diagram: Commit A <- Commit B <- Commit C, each commit referencing the one before it]
Project files:
/my-prj
  /js
    app.js
    utils.js
  /css
    style.css (changed in Commit C)
Commit B (message: "I just committed some changes!")
  Tree (/my-prj)
    Tree (my-prj/js)
      Blob (my-prj/js/app.js)
      Blob (my-prj/js/utils.js)
    Tree (my-prj/css)
      Blob (my-prj/css/style.css)
Commit C (message: "I just changed style.css!")
  Tree' (/my-prj)
    Tree (my-prj/js), shared with Commit B
    Tree' (my-prj/css)
      Blob' (my-prj/css/style.css)
Content Addressable
- How does git know when to create a new node?
- Git "objects" (commits, trees and file "blobs") are content addressable
- Identified using a hash of its contents
- Objects with the same content will have the same hash
- Collision is highly unlikely
- Remember: hashing algorithms use modular arithmetic to reduce input of any length to a fixed-length value that, for practical purposes, uniquely identifies it
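A toy content-addressable store shows why identical content never creates a second object. This is a sketch of the idea only, not Git's actual storage format; store and put are hypothetical names, and the hashing uses Node's built-in crypto module.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory object store keyed by the SHA-1 of each object's content.
const store = new Map();

function put (content) {
  const hash = crypto.createHash('sha1').update(content).digest('hex');
  if (!store.has(hash)) store.set(hash, content); // identical content is stored once
  return hash;
}

const first = put('body { color: red; }');
const second = put('body { color: red; }');
const changed = put('body { color: blue; }');

console.log(first === second); // true: same content, same address
console.log(first === changed); // false: new content, new object
console.log(store.size); // 2
```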
SHA1
- "Secure Hash Algorithm 1"
- The primary hashing algorithm used by Git
- Exposed in Node via the crypto module
- Look out! Feb 23 2017: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
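For reference, here is a sketch of hashing a file the way Git hashes a blob, using Node's crypto module. Git prefixes the content with the header "blob <byte length>" and a NUL byte before hashing; gitBlobHash is a made-up name for this example.

```javascript
const crypto = require('crypto');

// SHA-1 over the header "blob <byte length>\0" followed by the file content.
function gitBlobHash (content) {
  const body = Buffer.from(content);
  const header = Buffer.from(`blob ${body.length}\0`);
  return crypto.createHash('sha1').update(Buffer.concat([header, body])).digest('hex');
}

// The empty blob has a well-known id in every Git repository.
console.log(gitBlobHash('')); // e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

You can compare the result for any file against the output of git hash-object <file>.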
Workshop
Your task
- Create an immutable linked list
- Specs: 1-functional-linked-list/list.spec.js
- Work in 1-functional-linked-list/list.js
- Hint: I highly recommend recursion
git clone https://github.com/FullstackAcademy/fvs
Immutables
By Tom Kelly