Speeding Up The Distributed File System
Jowanza Joseph
@jowanza
Agenda
About Me
A Brief History of the Distributed File System
Back-Breaking Scale
Alluxio
Demo Spark + Alluxio
Questions
About Me
Senior Software Engineer, One Click Retail
Scala / Java, Distributed Systems
Husband / Father
The Need For Distributed Storage
Data Powers Applications
Slow Data Is Hard To Act On
Streaming Is Valuable
Data Storage Is Expensive
Disk-Based Analytics Is Expensive
The Hero
Reliability
Scalability
Distributed
Fault Tolerant
Operationally Difficult
Why Hadoop?
Hadoop Ecosystem
SQL On Hadoop
Lowest Common Denominator
What is Alluxio
Architecture
Works With
Demo
Reading Files From S3
Multiple Test Performance
Cache Performance
Mount Alluxio
Cache Performance
Questions?