Svetlana Filimonova, 20 August 2015
“data is a primary challenge: the quantity of data, the complexity of data, or the speed at which it is changing”
Excerpt from: Martin Kleppmann, “Designing Data-Intensive Applications”
result = dataSource.map(element =>
    computeMagic(element)
)
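The map pattern above can be sketched in plain Python without a Spark installation. This is only an illustration: `compute_magic` and `data_source` are hypothetical stand-ins for the slide's `computeMagic` and `dataSource`, and squaring is an arbitrary placeholder for the real per-element work.

```python
def compute_magic(element):
    # Hypothetical per-element computation (assumption: squaring as a placeholder).
    return element * element

# Hypothetical in-memory data source; in Spark this would be a distributed dataset.
data_source = [1, 2, 3, 4]

# map applies compute_magic to every element independently of the others,
# which is what makes the pattern easy to spread across many machines.
result = list(map(compute_magic, data_source))

print(result)  # [1, 4, 9, 16]
```

In PySpark the same shape would typically be written as `rdd.map(compute_magic)` on an RDD, with the function shipped to the workers that hold each partition.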
$YOUR_SPARK_PATH/bin/pyspark
or
$YOUR_SPARK_PATH/bin/spark-shell