Data Science & AI at Jusbrasil

The past, the present, and the future

Agenda

Context: Why Data Science and AI?

What the leads team has been up to

What's next

Why Data Science and AI?

Data Science is an interdisciplinary field concerned with the processes and systems that extract knowledge or insights from data in various forms, structured or unstructured. It continues earlier data analysis fields such as statistics, data mining, and predictive analytics.

"Data Science enables the creation of data products.

Whether data is search terms, voice samples or product reviews, users are in a feedback loop in which they contribute to the products they use.

That's the beginning of Data Science"

- Mike Loukides, 2010

How Data Science is done

The world

Product

System

# of cases

# of users

whatnot

Ingest Raw Data

Transactions

Web Scraping

Mobile data

Sensor data

Social feed

Crunch Data

MapReduce

ETL, ELT

Data Wrangling

Dimensionality Reduction

Data Cleansing

The Dataset

Independence?

Correlation?

Covariance?

Causality?

Dimensionality?
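The covariance and correlation questions asked of the dataset can be made concrete with a small sketch. It assumes two numeric columns of equal length; the column names and values are made up for illustration and are not Jusbrasil data.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to the range [-1, 1]."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical columns: cases read per week and replies sent per week.
cases_read = [1, 2, 3, 4, 5]
replies_sent = [2, 4, 6, 8, 10]

print(covariance(cases_read, replies_sent))   # depends on the units of both columns
print(correlation(cases_read, replies_sent))  # unit-free strength of the linear relationship
```

Covariance changes with the scale of the columns, while correlation does not, which is why correlation is usually the one compared across features. Neither of them answers the causality question.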

Learn From Data

Inference

Data & Algorithm Models

Machine Learning

Regression & Prediction

Classification & Clustering

Deliver and Visualize insight

Actionable

Predictive

Business Value

Easy to explain

Answers and new questions

Data Science: Explain like I'm 5 

A fuckton of data

Mathemagics

Insights that you couldn't have imagined

... And predictions, tons of predictions.

What the leads team has been up to

(besides surviving Vicente's philosophical ideas every 2 hours)

Vicente (for real)

Legal Issue Classifier

Automatic case classification with high accuracy

Removed one field from the form, reducing friction and generating roughly 3,500 additional cases per month

Churn study

Question: What's the pattern among subscribed users? Why do they churn?

Question: Is there any interesting correlation among our current user-behavior features?

Features

Apresentação (bio)

NotificarLeadEmail

NotificarLeadSite

ReceberDigest

TotalMeusDocumentos

DuvidasLidas

RespostasAvaliadasCasos

TotalNotificacoesNaoLidas

VisivelLista

VisualizacoesTelefone

TotalDuvidasRecebidas

RespostasLidasCasos

RespostasCasos

Features

Is there any correlation?

What are the most important features that draw a line between churned and non-churned users?

[0] apresentacao

[1] notificarLeadEmail

[2] notificarLeadSite

[3] receberDigest

[4] totalMeusDocumentos

[5] duvidasLidas

[6] respostasAvaliadasCasos

[7] respostasCasos

[8] respostasLidasCasos

[9] totalDuvidasRecebidas

[10] totalNotificacoesNaoLidas

[11] visivelLista

[12] visualizacoesTelefone

Why and how?
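One simple way to approach the question of which features separate churned from non-churned users is to compare the group means of each feature, scaled by the feature's overall spread. The sketch below is only an illustration of that idea: the deck does not say which method produced the ranking, and the rows, the `churned` key, and the `rank_features` helper are hypothetical.

```python
from statistics import mean, pstdev

def rank_features(rows, label_key="churned"):
    """Rank features by absolute standardized mean difference between groups."""
    churned = [r for r in rows if r[label_key]]
    active = [r for r in rows if not r[label_key]]
    features = [k for k in rows[0] if k != label_key]
    scores = {}
    for name in features:
        spread = pstdev([r[name] for r in rows])
        if spread == 0:
            scores[name] = 0.0  # a constant feature cannot separate the groups
            continue
        gap = mean(r[name] for r in churned) - mean(r[name] for r in active)
        scores[name] = abs(gap) / spread
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up rows for illustration only; not real user data.
rows = [
    {"notificarLeadEmail": 0, "respostasCasos": 1, "visivelLista": 1, "churned": True},
    {"notificarLeadEmail": 0, "respostasCasos": 0, "visivelLista": 1, "churned": True},
    {"notificarLeadEmail": 1, "respostasCasos": 5, "visivelLista": 1, "churned": False},
    {"notificarLeadEmail": 1, "respostasCasos": 7, "visivelLista": 1, "churned": False},
]
for name, score in rank_features(rows):
    print(f"{name}: {score:.2f}")
```

A score like this looks at one feature at a time, so it cannot capture interactions between features; a model-based importance measure would be a natural next step.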

Preliminary conclusions

In plain words

#1

Lawyers who receive email notifications tend to keep their subscription

Of the users who cancelled, 72% had opted out of email notifications, while only 16% of active users had opted out
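A comparison like the one in conclusion #1 (share of opt-outs among cancelled users versus active users) can be computed as sketched here. The `status` field and the records are invented for illustration; only the `notificarLeadEmail` name comes from the feature list.

```python
def opt_out_share(users, group):
    """Share of users in `group` who opted out of email notifications."""
    members = [u for u in users if u["status"] == group]
    opted_out = sum(1 for u in members if not u["notificarLeadEmail"])
    return opted_out / len(members)

# Toy records, invented for illustration; not the data behind the slide.
users = [
    {"status": "churned", "notificarLeadEmail": False},
    {"status": "churned", "notificarLeadEmail": False},
    {"status": "churned", "notificarLeadEmail": False},
    {"status": "churned", "notificarLeadEmail": True},
    {"status": "active", "notificarLeadEmail": True},
    {"status": "active", "notificarLeadEmail": True},
    {"status": "active", "notificarLeadEmail": True},
    {"status": "active", "notificarLeadEmail": False},
]

print(f"churned: {opt_out_share(users, 'churned'):.0%}")
print(f"active: {opt_out_share(users, 'active'):.0%}")
```

The denominator is the size of each group, not the whole user base, so the two percentages do not need to add up to 100%.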

#2

There is a positive correlation between the number of cases a lawyer reads and their retention as a subscriber

#3

Users who opt in to the digest tend not to cancel

35% of the users who cancelled do not receive the digest, while only 10% of active users opted out of the digest

#4

Churn correlates negatively with all selected features

All of them are related to user engagement

Therefore, the lower a subscriber's engagement, the higher the chance that they cancel

#5

The longer a user has been a subscriber, the lower the chance of cancellation

Churn rate for subscribers with less than 3 months of subscription: 46%

Churn rate for subscribers with between 3 and 6 months of subscription: 42%

Churn rate for subscribers with between 6 months and 1 year of subscription: 34%
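Churn rates per tenure bucket, as in conclusion #5, can be computed along these lines. This is a minimal sketch: whether each boundary is inclusive, the extra "1 year or more" bucket, and the sample subscribers are all assumptions made for illustration.

```python
from collections import defaultdict

def tenure_bucket(months):
    """Map months of subscription to a tenure bucket (boundaries are assumed)."""
    if months < 3:
        return "< 3 months"
    if months < 6:
        return "3 to 6 months"
    if months < 12:
        return "6 months to 1 year"
    return "1 year or more"

def churn_rate_by_tenure(subscribers):
    """Return {bucket: cancelled / total} for (months, has_cancelled) pairs."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for months, has_cancelled in subscribers:
        bucket = tenure_bucket(months)
        totals[bucket] += 1
        cancelled[bucket] += int(has_cancelled)
    return {bucket: cancelled[bucket] / totals[bucket] for bucket in totals}

# Invented (months subscribed, has cancelled) pairs; not real subscriber data.
subscribers = [
    (1, True), (2, False),
    (4, True), (5, False), (5, False),
    (8, False), (10, True), (11, False),
    (14, False),
]
for bucket, rate in churn_rate_by_tenure(subscribers).items():
    print(f"{bucket}: {rate:.0%}")
```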

#6

Lead notification by email, respostasAvaliadasCasos, and respostasCasos have the greatest impact on distinguishing users who cancel from those who do not.

What's next

Hot Or Cold Predictor

Given a legal case, is it a hot case or not?

Churn prediction model

Given a subscribed user, what's the likelihood that he/she cancels?

What can we do to prevent it?
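To make "likelihood that a subscriber cancels" concrete, here is a minimal sketch of one possible model: logistic regression fitted by plain gradient descent. The deck does not commit to any model, so the choice of logistic regression, the two input features, the training data, and the function names are all assumptions for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_logistic(samples, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias

def churn_probability(x, weights, bias):
    """Estimated probability that a subscriber with features x cancels."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Invented training data: [notificarLeadEmail, receberDigest], label 1 = cancelled.
samples = [[0, 0], [0, 1], [0, 0], [1, 1], [1, 1], [1, 0]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(samples, labels)
print(churn_probability([0, 0], weights, bias))  # subscriber with both options off
print(churn_probability([1, 1], weights, bias))  # subscriber with both options on
```

A probability per subscriber could then be used to rank who to reach out to first, which is one possible starting point for the prevention question.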

WTF can we do with all this new information and insight?

This is up to all of us.

Thanks!

By Rodrigo Araújo
