Introduction
Written by: Igor Korotach
Parallel computing is a type of computation where many calculations or the execution of processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time.
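As a minimal sketch of "divide a large problem into smaller ones and solve them at the same time", the example below splits a list of numbers into chunks, sums each chunk on a worker, and combines the partial results. The helper `chunked`, the chunk size, and the worker count are illustrative assumptions, not something taken from the talk. A thread pool is used only to keep the sketch short; for CPU-bound work in Python a process pool would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    # Hypothetical helper: split a list into consecutive slices of `size`.
    return [items[i:i + size] for i in range(0, len(items), size)]


numbers = list(range(1, 101))   # the "large problem": sum 1..100
chunks = chunked(numbers, 25)   # four smaller, independent problems

# Each chunk is summed by a separate worker; pool.map keeps input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(sum, chunks))

# Combine the partial results into the final answer.
total = sum(partial_sums)
print(partial_sums, total)
```

The chunks are independent of each other, which is what makes it safe to process them simultaneously.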
MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.
A MapReduce program is composed of a map procedure, which performs filtering and sorting (such as sorting students by first name into queues, one queue for each name), and a reduce method, which performs a summary operation (such as counting the number of students in each queue, yielding name frequencies).
A MapReduce framework (or system) is usually composed of three operations (or steps): map, shuffle, and reduce.
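The student-name example can be sketched in a few lines of plain Python, with one function per step (map, shuffle, reduce). This is a single-process illustration of the model, not a distributed implementation; the function names and the sample data are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(students):
    # Map: emit one (key, value) pair per record, here (first name, 1).
    return [(name, 1) for name in students]


def shuffle_phase(pairs):
    # Shuffle: group all values that share a key (one "queue" per name).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Reduce: summarize each group, here by counting its entries.
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input data.
students = ["Anna", "Boris", "Anna", "Olga", "Boris", "Anna"]

name_counts = reduce_phase(shuffle_phase(map_phase(students)))
print(name_counts)
```

In a real framework the map and reduce calls run on many machines at once, and the shuffle step moves data between them so that all values for one key end up on the same worker.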
Excellent tool for parallelizing
Good for high performance engineering
Good for working with real-time stream data
Good for sharding data
Excellent tool for planning
Good for scheduling ETL jobs
Good for working with different data sources
Good for monitoring
Presentation link: https://slides.com/emulebest/parallel-computing
Mail: igorkorotach@gmail.com
Telegram: @emulebest