Data Pipelines With Apache Spark and Couchbase
Jowanza Joseph
@jowanza
www.jowanza.com
Agenda
About Me ~ 1 Min
What are data pipelines? ~ 3 Min
Challenges ~ 3 Min
Couchbase ~ 3 Min
Spark ~ 3 Min
Marshalling ~ 2 Min
Demo ~ 15 Min
Questions
About Me
Software Engineer at One Click Retail
Spark, Scala / Java
Apache Spark Field Book
Cycling / Golf
Father
What are data pipelines?
Challenges
Volume
Drowning
Latency
Swiss Army Knife
Flexibility
CAP Theorem
Why Spark?
Glue Framework
Type Safety
Distributed
Fault Tolerant
Why Couchbase?
Batch & Stream
Full-text search
Flexible Schema
N1QL
Easy Distribution
Minimal Operational Overhead
Marshalling
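A minimal sketch of what marshalling means in this setting: converting between a typed object (the "Type Safety" side) and a schemaless, JSON-like document (the "Flexible Schema" side). It is plain Java with no Spark or Couchbase dependency, so it does not show either library's API. The names MarshalSketch, Product, toDocument, and fromDocument are hypothetical and chosen only for illustration; a Map stands in for a JSON document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MarshalSketch {

    /** Hypothetical typed record, used only for illustration. */
    public static final class Product {
        public final String id;
        public final String name;
        public final double price;

        public Product(String id, String name, double price) {
            this.id = id;
            this.name = name;
            this.price = price;
        }
    }

    /** Marshal: typed object -> schemaless document (field name -> value). */
    public static Map<String, Object> toDocument(Product p) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", p.id);
        doc.put("name", p.name);
        doc.put("price", p.price);
        return doc;
    }

    /** Unmarshal: document -> typed object, rejecting missing or mistyped fields. */
    public static Product fromDocument(Map<String, Object> doc) {
        Object id = doc.get("id");
        Object name = doc.get("name");
        Object price = doc.get("price");
        if (!(id instanceof String) || !(name instanceof String) || !(price instanceof Number)) {
            throw new IllegalArgumentException("document does not match Product: " + doc);
        }
        return new Product((String) id, (String) name, ((Number) price).doubleValue());
    }

    public static void main(String[] args) {
        Product original = new Product("sku-1", "Bike helmet", 49.5);
        Map<String, Object> doc = toDocument(original);   // object -> document
        Product roundTripped = fromDocument(doc);         // document -> object
        System.out.println(doc);
        System.out.println(roundTripped.name + " " + roundTripped.price);
    }
}
```

The check in fromDocument is the design point: a flexible-schema store accepts any document shape, so the boundary where documents become typed objects is where mismatches should surface, rather than deep inside the pipeline.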
Demo