Stop Deleting the Data

Designing services that won't forget anything

Product Camp vol.6 meetup, Singapore

August 27, 2016

 

Pavel Kudinov, RedMart

Data

Behaviour

Reports

Predictive Analytics

Data Science

79% of data scientists say that most of their time goes to data munging

In 2009, data scientist Mike Driscoll popularized the term “data munging,” describing the “painful process of cleaning, parsing, and proofing one’s data”

Data in Business Applications

Transactions

 - create

Objects

 - update

Problem #1: We delete data

  • “Delete” buttons should not delete anything; they should only mark records as deleted

  • Always preserve all data

  • Keep backups and archive old data
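
A minimal sketch of the "mark as deleted" idea, using Python's built-in sqlite3 module. The `customers` table, the `deleted_at` and `deleted_by` columns, and the sample rows are assumptions made for illustration; they do not come from the talk.

```python
import sqlite3
from datetime import datetime, timezone

def soft_delete(conn, customer_id, deleted_by):
    """Mark a customer as deleted instead of removing the row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE customers SET deleted_at = ?, deleted_by = ? "
            "WHERE id = ? AND deleted_at IS NULL",
            (datetime.now(timezone.utc).isoformat(), deleted_by, customer_id),
        )

def active_customers(conn):
    """What the UI shows: only rows that are not marked as deleted."""
    return conn.execute(
        "SELECT id, name FROM customers WHERE deleted_at IS NULL ORDER BY id"
    ).fetchall()

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "deleted_at TEXT, deleted_by TEXT)"
)
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob")],
)

soft_delete(conn, 2, deleted_by="support-agent-7")
```

After the call, `active_customers(conn)` no longer lists the second customer, but the row is still in the table, along with who deleted it and when.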

Problem #2: We update data

Update customer name

What was the previous name? Who updated this field?

 

Update birthday

What was the previous birthday? Who updated this field?

Previous data values are lost forever

Application Log is not Data

  • Not reliable. Owned by DevOps and ALWAYS considered unimportant after a couple of months

  • Depends on the engineer. After a refactoring or because of a misconfiguration, the log might not be written

  • Unstructured format. Requires parsing and special care
    If it is structured (audit trail), what about atomicity and transactions?
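
One possible answer to the atomicity question, sketched with Python's sqlite3 module: write the change record in the same database transaction as the update, so both are committed or neither is. The `customers` and `customer_changes` tables, the `update_field` helper, and the sample values are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Column names cannot be bound as SQL parameters, so they are whitelisted.
TRACKED_FIELDS = {"name", "birthday"}

def update_field(conn, customer_id, field, new_value, changed_by):
    """Update one field and record old value, new value, and author atomically."""
    if field not in TRACKED_FIELDS:
        raise ValueError(f"unsupported field: {field}")
    with conn:  # one transaction: both writes commit, or both roll back
        row = conn.execute(
            f"SELECT {field} FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no customer with id {customer_id}")
        conn.execute(
            f"UPDATE customers SET {field} = ? WHERE id = ?",
            (new_value, customer_id),
        )
        conn.execute(
            "INSERT INTO customer_changes "
            "(customer_id, field, old_value, new_value, changed_by, changed_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (customer_id, field, row[0], new_value, changed_by,
             datetime.now(timezone.utc).isoformat()),
        )

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, birthday TEXT)"
)
conn.execute(
    "CREATE TABLE customer_changes (customer_id INTEGER, field TEXT, "
    "old_value TEXT, new_value TEXT, changed_by TEXT, changed_at TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Alice', '1990-01-01')")

update_field(conn, 1, "name", "Alice Tan", changed_by="support-agent-7")
```

Unlike an application log, the `customer_changes` row cannot be lost independently of the update it describes, and it can answer "What was the previous name? Who updated this field?" with a plain query.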

Example: Cart in e-commerce grocery store

How do customers add items to the cart?

How do they remove items from the cart?

How do they change quantities?

...

Example: Driver profile

Who made the change?

Which fields are changed, and how often?

What about misuse?

...

Example: Shipment

Who made the change?

What was happening?

Where (GPS)?

Where (Facility)?

...

Example: RSVP for a Conference

As a conference host I want to be able to set the maximum number of participants

As a customer I want to be able to reserve a spot at the conference

Status Quo

Event Journal

Log IS the Data

Set capacity to 100

Add reservation for 1

Add reservation for 5

Add reservation for 2

Remove reservation for 1

No Persisted State.

Only event journal.
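
The conference journal above can be sketched in Python as follows. The current state is never stored; it is recomputed by replaying the events. The event names, the `ConferenceState` fields, and the `try_reserve` helper are assumptions made for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str    # "capacity_set", "reservation_added", "reservation_removed"
    amount: int

@dataclass(frozen=True)
class ConferenceState:
    capacity: int = 0
    reserved: int = 0

def apply(state, event):
    """Return the state that results from applying one event."""
    if event.kind == "capacity_set":
        return ConferenceState(event.amount, state.reserved)
    if event.kind == "reservation_added":
        return ConferenceState(state.capacity, state.reserved + event.amount)
    if event.kind == "reservation_removed":
        return ConferenceState(state.capacity, state.reserved - event.amount)
    raise ValueError(f"unknown event kind: {event.kind}")

def replay(journal):
    """Derive the current state from the journal alone."""
    state = ConferenceState()
    for event in journal:
        state = apply(state, event)
    return state

def try_reserve(journal, seats):
    """Append a reservation event only if the replayed state has room."""
    state = replay(journal)
    if state.reserved + seats > state.capacity:
        return False
    journal.append(Event("reservation_added", seats))
    return True

# The journal from the slide.
journal = [
    Event("capacity_set", 100),
    Event("reservation_added", 1),
    Event("reservation_added", 5),
    Event("reservation_added", 2),
    Event("reservation_removed", 1),
]

state = replay(journal)  # capacity 100, 7 seats reserved
```

Replaying on every request is the simplest possible design; a real service would likely cache or snapshot the derived state, with the journal remaining the source of truth.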

Log IS the Data

Create cart

Add item #1

Set delivery time

Add item #2

Change Quantity of item #1

Change delivery time

No Persisted State.

Only event journal.
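
A side effect of keeping only the journal is that the cart can be reconstructed as it looked at any earlier moment, simply by replaying fewer events. A minimal Python sketch; the event type names, item identifiers, quantities, and delivery times are made up for illustration.

```python
def replay_cart(events):
    """Rebuild the cart from its journal; nothing else is persisted."""
    cart = None
    for event in events:
        kind = event["type"]
        if kind == "cart_created":
            cart = {"items": {}, "delivery_time": None}
        elif kind in ("item_added", "quantity_changed"):
            cart["items"][event["item"]] = event["quantity"]
        elif kind in ("delivery_time_set", "delivery_time_changed"):
            cart["delivery_time"] = event["time"]
        else:
            raise ValueError(f"unknown event type: {kind}")
    return cart

# The journal from the slide, with hypothetical values filled in.
journal = [
    {"type": "cart_created"},
    {"type": "item_added", "item": "item-1", "quantity": 1},
    {"type": "delivery_time_set", "time": "10:00"},
    {"type": "item_added", "item": "item-2", "quantity": 1},
    {"type": "quantity_changed", "item": "item-1", "quantity": 3},
    {"type": "delivery_time_changed", "time": "18:00"},
]

current_cart = replay_cart(journal)       # the cart as it is now
earlier_cart = replay_cart(journal[:3])   # the cart after the first three events
```

Questions such as "how do customers change quantities?" then become queries over the journal rather than guesses from the final state.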

Setting requirements for engineers

  • Display audit log: who did what and when

  • Store the history of changes in the same DB

  • Track changes for all business entities

  • Reports that show time distribution between events

  • Event-driven VS Scheduled dependencies
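
As an illustration of a report on the time distribution between events, the sketch below groups timestamped events by entity and measures the minutes between consecutive events. The shipment identifiers, event names, and timestamps are invented sample data, and `transition_durations` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def transition_durations(events):
    """Map (from_event, to_event) to a list of durations in minutes.

    `events` is an iterable of (entity_id, event_name, iso_timestamp).
    """
    by_entity = defaultdict(list)
    for entity_id, name, timestamp in events:
        by_entity[entity_id].append((datetime.fromisoformat(timestamp), name))

    durations = defaultdict(list)
    for history in by_entity.values():
        history.sort()  # order each entity's events by time
        for (t0, before), (t1, after) in zip(history, history[1:]):
            durations[(before, after)].append((t1 - t0).total_seconds() / 60)
    return durations

# Invented sample journal for two shipments.
events = [
    ("s1", "packed", "2016-08-27T09:00:00"),
    ("s1", "dispatched", "2016-08-27T09:30:00"),
    ("s1", "delivered", "2016-08-27T11:00:00"),
    ("s2", "packed", "2016-08-27T09:10:00"),
    ("s2", "dispatched", "2016-08-27T10:00:00"),
    ("s2", "delivered", "2016-08-27T11:00:00"),
]

durations = transition_durations(events)
report = {step: median(minutes) for step, minutes in durations.items()}
```

A report like this is only possible when every change is stored as an event with a timestamp; it cannot be derived from the latest state of a record.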

Process Mining

Q&A

Pavel Kudinov
