Pavel Kudinov
Engineering manager, Data Scientist, Software Architect
Product Camp vol.6 meetup, Singapore
August 27, 2016
Pavel Kudinov, RedMart
79% of data scientists say that most of their time is spent on data munging
In 2009, data scientist Mike Driscoll popularized the term “data munging,” describing the “painful process of cleaning, parsing, and proofing one’s data”
What was the previous name? Who updated this field?
What was the previous birthday? Who updated this field?
Not reliable. Owned by DevOps and ALWAYS considered unimportant after a couple of months
Depends on the engineer. After refactoring or due to a wrong configuration, the log might not be written
Unstructured format. Requires parsing and special care
If it is structured (audit trail), what about atomicity and transactions?
...
...
...
As a conference host I want to be able to set the maximum number of participants
As a customer I want to be able to reserve a spot at the conference
Set capacity to 100
Add reservation for 1
Add reservation for 5
Add reservation for 2
Remove reservation for 1
No Persisted State.
Only event journal.
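
The conference events above can be replayed to answer "how many seats are left?" without storing that number anywhere. The following is a minimal sketch, not the speaker's implementation: the class names `CapacitySet`, `ReservationAdded`, `ReservationRemoved` and the function `seats_available` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


# Hypothetical event types mirroring the events listed on the slide.
@dataclass(frozen=True)
class CapacitySet:
    capacity: int


@dataclass(frozen=True)
class ReservationAdded:
    seats: int


@dataclass(frozen=True)
class ReservationRemoved:
    seats: int


def seats_available(journal):
    """Replay the journal from the start and return the number of free seats."""
    capacity = 0
    reserved = 0
    for event in journal:
        if isinstance(event, CapacitySet):
            capacity = event.capacity
        elif isinstance(event, ReservationAdded):
            reserved += event.seats
        elif isinstance(event, ReservationRemoved):
            reserved -= event.seats
    return capacity - reserved


# The journal is the only thing that is stored; it is append-only.
journal = [
    CapacitySet(100),
    ReservationAdded(1),
    ReservationAdded(5),
    ReservationAdded(2),
    ReservationRemoved(1),
]

print(seats_available(journal))  # prints 93
```

Replaying only a prefix of the journal, for example `seats_available(journal[:2])`, gives the state as it was after the first reservation.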
Create cart
Add item #1
Set delivery time
Add item #2
Change Quantity of item #1
Change delivery time
No Persisted State.
Only event journal.
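
The cart example works the same way: the current cart is whatever you get by replaying its events in order. This sketch assumes a simple `(kind, payload)` tuple per event; the event names, the quantity `3`, and the delivery times are made up for illustration and do not come from the talk.

```python
def replay_cart(journal):
    """Rebuild the cart state by applying each event in order."""
    cart = None
    for kind, payload in journal:
        if kind == "cart_created":
            cart = {"items": {}, "delivery_time": None}
        elif kind == "item_added":
            # Assumption: a newly added item starts with quantity 1.
            cart["items"][payload["item"]] = payload.get("quantity", 1)
        elif kind == "quantity_changed":
            cart["items"][payload["item"]] = payload["quantity"]
        elif kind == "delivery_time_set":
            # Setting and changing the delivery time are the same event here.
            cart["delivery_time"] = payload["time"]
    return cart


# Illustrative journal following the six steps on the slide.
journal = [
    ("cart_created", {}),
    ("item_added", {"item": "#1"}),
    ("delivery_time_set", {"time": "10:00"}),
    ("item_added", {"item": "#2"}),
    ("quantity_changed", {"item": "#1", "quantity": 3}),
    ("delivery_time_set", {"time": "14:00"}),
]

print(replay_cart(journal))      # state after all events
print(replay_cart(journal[:3]))  # state as it was after the third event
```

Because nothing is overwritten, the earlier delivery time and the original quantity of item #1 remain recoverable from the journal.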
Display audit log: who did what and when
Store the history of changes in the same DB
Track changes for all business entities
Reports that show time distribution between events
Event-driven vs. scheduled dependencies
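
Two of the items above, the audit log and the time distribution between events, fall out of the journal almost for free. The sketch below assumes each event records a timestamp, an actor, a field, and a new value; the record layout, the sample data, and the function names are all hypothetical.

```python
from datetime import datetime

# Illustrative journal of field changes for one business entity.
journal = [
    {"at": datetime(2016, 8, 1, 9, 0), "who": "alice", "field": "name", "value": "Jon Smith"},
    {"at": datetime(2016, 8, 1, 9, 30), "who": "alice", "field": "birthday", "value": "1990-01-02"},
    {"at": datetime(2016, 8, 3, 9, 30), "who": "bob", "field": "name", "value": "John Smith"},
]


def audit_log(journal):
    """Who did what and when, one line per event."""
    return [
        f"{e['at']:%Y-%m-%d %H:%M} {e['who']} set {e['field']} to {e['value']!r}"
        for e in journal
    ]


def last_change(journal, field):
    """Previous value, current value, and who made the latest change to a field."""
    changes = [e for e in journal if e["field"] == field]
    if not changes:
        return None
    previous = changes[-2]["value"] if len(changes) > 1 else None
    return {
        "previous": previous,
        "current": changes[-1]["value"],
        "updated_by": changes[-1]["who"],
    }


def gaps_between_events(journal):
    """Time elapsed between each pair of consecutive events."""
    times = [e["at"] for e in journal]
    return [later - earlier for earlier, later in zip(times, times[1:])]


for line in audit_log(journal):
    print(line)
print(last_change(journal, "name"))
print(gaps_between_events(journal))
```

`last_change(journal, "name")` answers both earlier questions at once: what the previous name was and who updated the field.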