Business Intelligence

Mateus Chagas

@matchs


"[BI] refers to a set of tools and techniques that enable a company to transform its business data into timely and accurate information for the decisional process, to be made available to the right persons in the most suitable form."


Encyclopedia of Database Systems                       


A very wide field


Data Mining

Data Warehousing

Online Analytical Processing

Predictive analysis

(...)


A friendly interpretation of voluminous data

Decision-making process

Information and knowledge form the backbone of the decision-making process

BI & WEB


Customer behavior is hard to observe with traditional methods.

Websites can store logs for analysis of customer behavior.

The amount of stored data usually grows very fast.

Web data can be combined with traditional user research data.

COMMON PROBLEMS OBTAINING DATA


Many different systems and databases
(JusBrasil, Sentry, Kissmetrics, Google Analytics, ChartBeat ...)

Data quality is not always consistent

Some potentially important data is "volatile"
(it gets deleted or changed over time)

DATA WAREHOUSING

An environment for analysis

An organized, unified store of information

Data collected from different sources

"Raw" data is needed


ETL

Extract

Transform

Load


Data exists in many different forms

(text files, Excel spreadsheets, databases, etc.)


Up to 80% of the time is spent on ETL
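
The three ETL steps can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the CSV export, the column names, and the `page_views` table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from one source system (illustrative data only).
RAW_CSV = """user_id,page,opened_at
997445,A,2013-08-25 05:35:12
 997446 ,b,2013-08-25 05:36:40
,A,2013-08-25 05:37:02
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean values and drop rows that fail a basic quality check."""
    clean = []
    for row in rows:
        user_id = row["user_id"].strip()
        if not user_id.isdigit():
            continue  # inconsistent data: no usable user id
        clean.append((int(user_id), row["page"].strip().upper(), row["opened_at"].strip()))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views (user_id INTEGER, page TEXT, opened_at TEXT)"
    )
    conn.executemany("INSERT INTO page_views VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT user_id, page FROM page_views ORDER BY user_id").fetchall()
print(loaded)  # [(997445, 'A'), (997446, 'B')]
```

Most of the real effort tends to land in `transform`, since every source has its own formats and quality problems.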

MULTI-DIMENSIONAL MODELING


Facts

Dimensions


Fact
User 997445 opened page A at 5:35:12 on Sunday, August 25, 2013

Dimension
User {id; name; password; birth date}
Page {title; url; html; keywords}
Time {hour; minute; second; dow; month; day; year}

THE STAR SCHEMA
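
A star schema places one fact table at the center and links it to each dimension table by a key. The sketch below builds one for the page-view fact from the previous slide. Table names, key columns, the `views` measure, and the sample user name are assumptions made for illustration, and only some of the dimension attributes are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (user_key INTEGER PRIMARY KEY, name TEXT, birth_date TEXT);
CREATE TABLE dim_page (page_key INTEGER PRIMARY KEY, title TEXT, url TEXT);
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, hour INTEGER, minute INTEGER,
                       second INTEGER, dow TEXT, day INTEGER, month INTEGER, year INTEGER);
-- The fact table is the center of the star and references every dimension.
CREATE TABLE fact_page_view (
    user_key INTEGER REFERENCES dim_user(user_key),
    page_key INTEGER REFERENCES dim_page(page_key),
    time_key INTEGER REFERENCES dim_time(time_key),
    views    INTEGER
);
""")

# "User 997445 opened page A at 5:35:12 on Sunday, August 25, 2013"
# The user name and the page URL are placeholders, not real data.
conn.execute("INSERT INTO dim_user VALUES (997445, 'Example User', NULL)")
conn.execute("INSERT INTO dim_page VALUES (1, 'A', '/a')")
conn.execute("INSERT INTO dim_time VALUES (1, 5, 35, 12, 'Sunday', 25, 8, 2013)")
conn.execute("INSERT INTO fact_page_view VALUES (997445, 1, 1, 1)")

# A typical analytical query: join the fact to its dimensions, then aggregate.
row = conn.execute("""
    SELECT t.dow, p.title, SUM(f.views)
    FROM fact_page_view AS f
    JOIN dim_page AS p ON p.page_key = f.page_key
    JOIN dim_time AS t ON t.time_key = f.time_key
    GROUP BY t.dow, p.title
""").fetchone()
print(row)  # ('Sunday', 'A', 1)
```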


SIZE MATTERS


The granularity of a fact is very important

Finer granularity often means more detailed information

Often the granularity is a single business transaction
(User 99654 signed up for "PRO profile" on "Direito do Consumidor" in "Salvador" at 5:34:23 on Sunday, August 25, 2013, for R$19,90)
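
To see why the grain matters, compare what can be answered at each level. In this sketch the transaction list is made-up sample data (only the first row mirrors the example above), and amounts are stored in cents to keep the arithmetic exact.

```python
from collections import Counter
from datetime import datetime

# Fine grain: one row per business transaction (hypothetical sample data).
transactions = [
    {"user": 99654, "city": "Salvador", "at": datetime(2013, 8, 25, 5, 34, 23), "amount_cents": 1990},
    {"user": 99655, "city": "Salvador", "at": datetime(2013, 8, 25, 18, 2, 10), "amount_cents": 1990},
    {"user": 99656, "city": "Salvador", "at": datetime(2013, 8, 26, 9, 15, 0), "amount_cents": 1990},
]

# Coarse grain: one row per (day, city). The time of day is gone for good.
daily = Counter()
for t in transactions:
    daily[(t["at"].date().isoformat(), t["city"])] += t["amount_cents"]

# Only the fine-grained data can still answer "at what hour do users sign up?"
by_hour = Counter(t["at"].hour for t in transactions)

print(daily[("2013-08-25", "Salvador")])  # 3980
print(sorted(by_hour))                    # [5, 9, 18]
```

A coarse grain can always be derived from a fine one, but never the other way around.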

DESIGNING A DW

Choose a business process to model
Choose its granularity
Choose the dimensions
Choose the facts/measures

THE CUBE

Many dimensions

Cells

Selection and grouping of data

All possible combinations of aggregations across dimensions

Dimension values may be ordered
(like time)
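
"All possible combinations of aggregations" can be made concrete with a small sketch: for every subset of the dimensions, sum the measure per cell. The facts, the dimension names, and the `build_cube` helper are hypothetical, and real OLAP engines store and compute this far more efficiently.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical facts with three dimensions and one measure ("views").
facts = [
    {"page": "A", "dow": "Sun", "city": "Salvador", "views": 3},
    {"page": "A", "dow": "Mon", "city": "Salvador", "views": 2},
    {"page": "B", "dow": "Sun", "city": "Recife", "views": 5},
]
DIMENSIONS = ("page", "dow", "city")


def build_cube(facts, dimensions, measure):
    """Aggregate the measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for group in combinations(dimensions, size):
            cells = defaultdict(int)
            for fact in facts:
                cells[tuple(fact[d] for d in group)] += fact[measure]
            cube[group] = dict(cells)
    return cube


cube = build_cube(facts, DIMENSIONS, "views")
print(len(cube))        # 8 groupings for 3 dimensions (2 ** 3)
print(cube[()])         # {(): 10}  grand total
print(cube[("page",)])  # {('A',): 5, ('B',): 5}
```

The number of groupings doubles with each added dimension, which is one reason the cube is usually built ahead of query time.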


OLAP

Online Analytical Processing


Summarizes the data before querying (building the cube)

COMMON OLAP OPERATIONS



Slice

Dice

Drill up/down

Pivot

SLICE & DICE
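
A slice fixes one dimension to a single value. A dice keeps a sub-cube by restricting several dimensions to sets of values. A minimal sketch, assuming a cube stored as a dictionary keyed by (page, dow, city); the cell values and both helper functions are hypothetical.

```python
# Hypothetical cube cells keyed by (page, dow, city).
cells = {
    ("A", "Sun", "Salvador"): 3,
    ("A", "Mon", "Salvador"): 2,
    ("B", "Sun", "Recife"): 5,
    ("B", "Mon", "Salvador"): 4,
}


def slice_cube(cells, axis, value):
    """Slice: fix one dimension to a single value and drop that dimension."""
    return {k[:axis] + k[axis + 1:]: v for k, v in cells.items() if k[axis] == value}


def dice_cube(cells, allowed):
    """Dice: keep cells whose value on each listed axis is in the allowed set."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[axis] in values for axis, values in allowed.items())
    }


sunday = slice_cube(cells, 1, "Sun")
salvador_pages = dice_cube(cells, {0: {"A", "B"}, 2: {"Salvador"}})

print(sunday)               # {('A', 'Salvador'): 3, ('B', 'Recife'): 5}
print(len(salvador_pages))  # 3
```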


DRILLING
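
Drilling moves along a dimension's hierarchy: up to coarser totals, down to the details behind one total. A minimal sketch, assuming a year > month > day hierarchy on the time dimension; the daily counts and both helpers are hypothetical.

```python
from collections import Counter

# Hypothetical daily counts keyed by (year, month, day).
daily = {(2013, 8, 24): 4, (2013, 8, 25): 6, (2013, 9, 1): 5}


def drill_up(cells, depth):
    """Aggregate to the first `depth` levels of the (year, month, day) hierarchy."""
    totals = Counter()
    for key, value in cells.items():
        totals[key[:depth]] += value
    return dict(totals)


def drill_down(cells, prefix):
    """Return the finer-grained cells underneath one coarser cell."""
    return {k: v for k, v in cells.items() if k[:len(prefix)] == prefix}


monthly = drill_up(daily, 2)
yearly = drill_up(daily, 1)
august = drill_down(daily, (2013, 8))

print(monthly)  # {(2013, 8): 10, (2013, 9): 5}
print(yearly)   # {(2013,): 15}
print(august)   # {(2013, 8, 24): 4, (2013, 8, 25): 6}
```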

PIVOT
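
A pivot rotates the view: the same cells are shown with different dimensions as rows and columns. A minimal sketch, assuming cells keyed by (page, dow); the data and the `pivot` helper are hypothetical.

```python
# Hypothetical cells keyed by (page, dow).
cells = {("A", "Sun"): 3, ("A", "Mon"): 2, ("B", "Sun"): 5}


def pivot(cells, row_axis, col_axis):
    """Arrange the cells as a nested table: table[row][column] = total."""
    table = {}
    for key, value in cells.items():
        row = table.setdefault(key[row_axis], {})
        row[key[col_axis]] = row.get(key[col_axis], 0) + value
    return table


by_page = pivot(cells, 0, 1)  # pages as rows, days as columns
by_day = pivot(cells, 1, 0)   # same data, rotated

print(by_page)  # {'A': {'Sun': 3, 'Mon': 2}, 'B': {'Sun': 5}}
print(by_day)   # {'Sun': {'A': 3, 'B': 5}, 'Mon': {'A': 2}}
```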

ON TOP OF OLAP


Reporting Applications

Data Mining Applications

Business Analysis

(...)

TYPES OF ANALYSIS

Historical

Operational

Predictive

Exploratory

(...)

BI vs BIG DATA


Some say Big Data is killing traditional BI practices

Some say they are complementary

Traditional BI only answers the questions you ask

BI is very good for indicators and historical analysis

Big Data allows shorter analysis cycles


What about us?


How do errors/bugs on our site affect our sales?

What's the relation between our user base's growth and PRO leads?

What period of the year gives us the most PRO leads? And why?

Do any external events affect our sales?

What's the relation between the number of deploys of new features and our sales?

That's all, folks!

Any questions?

BI

By Mateus Chagas Sousa