Stop Deleting the Data

Designing services that won't forget anything

ProductTank Meetup, Singapore

June 9, 2016

 

Pavel Kudinov, RedMart

Data

Behaviour

Reports

Predictive Analytics

Data Science

79% of the time is spent on data munging

In 2009, data scientist Mike Driscoll popularized the term “data munging,” describing the “painful process of cleaning, parsing, and proofing one’s data”

Data in Business Applications

Transactions

 - create

Objects

 - update

Problem #1: We delete data

  • “Delete” buttons should not delete anything; they should only mark records as deleted

  • Always preserve all data

  • Keep backups and archive old data
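
A minimal sketch of the "mark as deleted" idea, using Python's built-in sqlite3 module. The table name `customers`, the columns `deleted_at` and `deleted_by`, and the helper functions are illustrative assumptions, not something prescribed by the talk.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        deleted_at TEXT,   -- NULL means the row is still live
        deleted_by TEXT    -- who pressed "Delete"
    )
""")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Alice",), ("Bob",)])


def soft_delete(conn, customer_id, user):
    """Mark a customer as deleted instead of removing the row."""
    conn.execute(
        "UPDATE customers SET deleted_at = ?, deleted_by = ? "
        "WHERE id = ? AND deleted_at IS NULL",
        (datetime.now(timezone.utc).isoformat(), user, customer_id),
    )


def live_customers(conn):
    """Return the names the application should still show."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


soft_delete(conn, 1, "admin")
total_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The application only ever reads through `live_customers`, so the "Delete" button behaves as expected for users, while the row itself (and who deleted it, and when) stays available for later analysis.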

Problem #2: We update data

Update customer name

What was the previous name? Who updated this field?

 

Update birthday

What was the previous birthday? Who updated this field?

Previous data values are lost forever
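
One possible way to keep previous values is to write a history row in the same transaction as the update. The sketch below uses Python's built-in sqlite3 module; the tables `customers` and `customer_changes`, the function `update_field`, and the sample data are hypothetical names chosen for illustration.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, birthday TEXT);
    CREATE TABLE customer_changes (
        customer_id INTEGER, field TEXT,
        old_value TEXT, new_value TEXT,
        changed_by TEXT, changed_at TEXT
    );
    INSERT INTO customers (id, name, birthday) VALUES (1, 'Ann Lee', '1990-01-01');
""")

# Column names cannot be bound as SQL parameters, so they are whitelisted.
ALLOWED_FIELDS = {"name", "birthday"}


def update_field(conn, customer_id, field, new_value, user):
    """Update one field and record the old value, the new value and the author."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    with conn:  # one transaction: both writes commit, or neither does
        old_value = conn.execute(
            f"SELECT {field} FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        conn.execute(
            f"UPDATE customers SET {field} = ? WHERE id = ?", (new_value, customer_id)
        )
        conn.execute(
            "INSERT INTO customer_changes VALUES (?, ?, ?, ?, ?, ?)",
            (customer_id, field, old_value, new_value, user,
             datetime.now(timezone.utc).isoformat()),
        )


update_field(conn, 1, "name", "Ann Tan", "support_agent_7")
history = conn.execute(
    "SELECT field, old_value, new_value, changed_by FROM customer_changes"
).fetchall()
```

With this in place, "What was the previous name?" and "Who updated this field?" become ordinary queries against `customer_changes`.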

Application Log is not Data

  • Not reliable. Owned by DevOps and ALWAYS considered unimportant after a couple of months

  • Depends on the engineer. After a refactoring, or because of a wrong configuration, the log might not be written

  • Unstructured format. Requires parsing and special care
    If it is structured (audit trail), what about atomicity and transactions?

Example: Cart in e-commerce grocery store

How do customers add items to the cart?

How do they remove items from the cart?

How do they change quantities?

...

Example: Driver profile

Who made the change?

Which fields are changed, and how often?

What about misuse?

...

Status Quo

Event Sourcing

Log IS the Data

Create cart

Add item #1

Set delivery time

Add item #2

Change quantity of item #1

Change delivery time

No Persisted State.

Only event journal.
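
The cart journal above can be sketched as a list of events that is replayed to rebuild the current state. This is a minimal illustration, not a production event store; the event names, the `Cart` class, and the `apply` and `replay` functions are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cart:
    items: dict = field(default_factory=dict)  # item id -> quantity
    delivery_time: Optional[str] = None


def apply(cart, event):
    """Return the cart state after applying a single event."""
    kind, data = event
    if kind == "cart_created":
        return Cart()
    if kind in ("item_added", "quantity_changed"):
        cart.items[data["item"]] = data["qty"]
    elif kind == "item_removed":
        cart.items.pop(data["item"], None)
    elif kind == "delivery_time_set":
        cart.delivery_time = data["time"]
    return cart


def replay(journal):
    """Rebuild the state from scratch; nothing but the journal is stored."""
    cart = None
    for event in journal:
        cart = apply(cart, event)
    return cart


# The same sequence of events as listed above.
journal = [
    ("cart_created", {}),
    ("item_added", {"item": 1, "qty": 1}),
    ("delivery_time_set", {"time": "2016-06-10T10:00"}),
    ("item_added", {"item": 2, "qty": 3}),
    ("quantity_changed", {"item": 1, "qty": 2}),
    ("delivery_time_set", {"time": "2016-06-10T18:00"}),
]

current = replay(journal)       # state after all six events
earlier = replay(journal[:3])   # state as it was after the third event
```

Because the journal is the only thing stored, any past state of the cart can be reconstructed by replaying a prefix of it, as `earlier` shows.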

Setting requirements for engineers

  • Display audit log: who did what and when

  • Store the history of changes in the same DB

  • Track changes for all business entities

  • Reports that show time distribution between events

  • Event-driven vs. scheduled dependencies
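
As an illustration of the "time distribution between events" requirement, the sketch below computes the gaps between consecutive events of one journal. The event names, the timestamps, and the function `gaps_seconds` are made-up sample values, assuming each event is stored with an ISO 8601 timestamp.

```python
from datetime import datetime
from statistics import median

# Hypothetical journal for one cart: (event name, ISO 8601 timestamp).
events = [
    ("cart_created", "2016-06-09T10:00:00"),
    ("item_added", "2016-06-09T10:02:00"),
    ("delivery_time_set", "2016-06-09T10:03:30"),
    ("item_added", "2016-06-09T10:10:30"),
]


def gaps_seconds(events):
    """Seconds elapsed between each pair of consecutive events."""
    times = [datetime.fromisoformat(ts) for _, ts in events]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = gaps_seconds(events)
typical_gap = median(gaps)
```

A report of this kind is only possible when every change is kept as a timestamped event; a table that holds just the latest state has no gaps to measure.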

Process Mining

Q&A

Pavel Kudinov
