Business Intelligence
Mateus Chagas
@matchs
"[BI] refers to a set of tools and techniques that enable a company to transform its business data into timely and accurate information for the decisional process, to be made available to the right persons in the most suitable form."
Encyclopedia of Database Systems
A very wide field
Data Mining
Data Warehousing
Online Analytical Processing
Predictive analysis
(...)
A friendly interpretation of voluminous data
Decision-making process
Information and knowledge form the backbone of the decision-making process
BI & WEB
Customer behavior is hard to observe with traditional methods.
Websites can store logs for analysis of customer behavior.
Amount of stored data usually grows very fast.
Can combine traditional user research data with web data.
COMMON PROBLEMS OBTAINING DATA
Many different systems and databases
(JusBrasil, Sentry, Kissmetrics, Google Analytics, ChartBeat ...)
Data quality is not always consistent
Some potentially important data is "volatile"
(Deletion or change over time)
DATA WAREHOUSING
An environment for analysis
A store of information organized and unified
Data collected from different sources
"Raw" data is needed
ETL
Extract
Transform
Load
Data exists in many different forms
(text files, excel tables, databases etc...)
Up to 80% of the time is spent on ETL
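A minimal sketch of the three ETL steps, using only Python's standard library. The CSV content, the column names and the `page_views` table are made up for illustration and do not come from any of the systems listed earlier.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; a real source could be a log file, spreadsheet or database.
RAW_CSV = """user_id,page,opened_at
997445,A,2013-08-25 05:35:12
997445,B,2013-08-25 05:36:40
,A,2013-08-25 05:37:02
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without a user id and convert types."""
    clean = []
    for row in rows:
        if not row["user_id"]:
            continue  # inconsistent record, skipped
        clean.append((int(row["user_id"]), row["page"], row["opened_at"]))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views "
        "(user_id INTEGER, page TEXT, opened_at TEXT)"
    )
    conn.executemany("INSERT INTO page_views VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, text=RAW_CSV):
    """Run the three steps and return the number of rows now in the table."""
    load(conn, transform(extract(text)))
    return conn.execute("SELECT COUNT(*) FROM page_views").fetchone()[0]
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads the two valid rows and skips the one with no user id.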
MULTI-DIMENSIONAL MODELING
Facts
Dimensions
Fact
User 997445 opened page A at 5:35:12 on Sunday, August 25, 2013
Dimension
User {id; name; password; birth date;}
Pages {title; url; html; keywords;}
Time {hour; minute; second; dow; month; day; year;}
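The split between a fact and its dimensions can be sketched with dataclasses. The class names and the `time_dimension` helper are hypothetical. The fields of the Time dimension follow the list above.

```python
from dataclasses import dataclass
from datetime import datetime

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


@dataclass(frozen=True)
class TimeDim:
    """Time dimension with the attributes listed above."""
    hour: int
    minute: int
    second: int
    dow: str
    month: int
    day: int
    year: int


@dataclass(frozen=True)
class PageViewFact:
    """Fact: one event, described by references to its dimensions."""
    user_id: int
    page_title: str
    time: TimeDim


def time_dimension(ts: datetime) -> TimeDim:
    """Break a timestamp into the attributes of the Time dimension."""
    return TimeDim(ts.hour, ts.minute, ts.second, DAY_NAMES[ts.weekday()],
                   ts.month, ts.day, ts.year)


# The example fact from above: user 997445 opened page A.
fact = PageViewFact(997445, "A", time_dimension(datetime(2013, 8, 25, 5, 35, 12)))
```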
THE STAR SCHEMA
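A possible star schema for the page-view example, sketched with SQLite from the standard library. The table and column names are assumptions made for illustration: one fact table in the middle, holding only keys, with one table per dimension around it.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_page (page_id INTEGER PRIMARY KEY, title TEXT, url TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, hour INTEGER, dow TEXT,
                       day INTEGER, month INTEGER, year INTEGER);
-- The fact table is the center of the star and only references dimensions.
CREATE TABLE fact_page_view (
    user_id INTEGER REFERENCES dim_user(user_id),
    page_id INTEGER REFERENCES dim_page(page_id),
    time_id INTEGER REFERENCES dim_time(time_id)
);
"""


def build_star():
    """Create the schema in memory and load one illustrative fact."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dim_user VALUES (997445, 'example user')")
    conn.execute("INSERT INTO dim_page VALUES (1, 'A', '/a')")
    conn.execute("INSERT INTO dim_time VALUES (1, 5, 'Sunday', 25, 8, 2013)")
    conn.execute("INSERT INTO fact_page_view VALUES (997445, 1, 1)")
    return conn


def views_by_weekday(conn):
    """Join the fact table to two dimensions and count views per page and weekday."""
    return conn.execute(
        """
        SELECT p.title, t.dow, COUNT(*)
        FROM fact_page_view f
        JOIN dim_page p ON p.page_id = f.page_id
        JOIN dim_time t ON t.time_id = f.time_id
        GROUP BY p.title, t.dow
        """
    ).fetchall()
```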

SIZE MATTERS
The granularity of a fact is very important
Finer granularity often means more detailed information
Often the granularity is a single business transaction
(User 99654 signed up for "PRO profile" on "Direito do Consumidor" in "Salvador" at 5:34:23 on Sunday, August 25, 2013, for R$19,90)
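Why the grain matters, as a small sketch: once facts are stored per day instead of per transaction, the user and the city are no longer recoverable. The rows below are invented for illustration, including the second city.

```python
from collections import defaultdict

# Hypothetical facts at transaction grain: (user_id, plan, city, date, price in R$).
SIGNUPS = [
    (99654, "PRO profile", "Salvador", "2013-08-25", 19.90),
    (99655, "PRO profile", "Salvador", "2013-08-25", 19.90),
    (99656, "PRO profile", "Recife", "2013-08-26", 19.90),
]


def revenue_per_day(facts):
    """Coarser grain: one total per day; user and city detail is lost."""
    totals = defaultdict(float)
    for _user, _plan, _city, date, price in facts:
        totals[date] += price
    return {day: round(total, 2) for day, total in totals.items()}
```

The daily totals can always be computed from the transactions, but not the other way around.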
DESIGNING A DW
Choose a model for a business process
Choose the granularity for it
Choose the dimensions
Choose the facts/measures
THE CUBE
Many dimensions
Cells
Selection and Grouping of data
All possible combinations of aggregations between dimensions
Dimension values may be ordered
(like time)
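"All possible combinations" can be read as one group-by per subset of dimensions. A minimal sketch under that assumption, with invented facts and a simple count as the only measure:

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("page", "city", "dow")

# Hypothetical page-view facts; each row counts as one view.
FACTS = [
    {"page": "A", "city": "Salvador", "dow": "Sunday"},
    {"page": "A", "city": "Salvador", "dow": "Monday"},
    {"page": "B", "city": "Salvador", "dow": "Sunday"},
]


def build_cube(facts, dimensions=DIMENSIONS):
    """Count facts for every subset of dimensions (2**n group-bys)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cells = defaultdict(int)
            for fact in facts:
                cells[tuple(fact[d] for d in dims)] += 1
            cube[dims] = dict(cells)
    return cube


cube = build_cube(FACTS)
```

With three dimensions the cube holds eight groupings, from the grand total `cube[()]` down to the full detail `cube[("page", "city", "dow")]`.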
OLAP
Online Analytical Processing
Summarizes the data before querying (building the cube)
COMMON OLAP OPERATIONS
Slice
Dice
Drill up/down
Pivot
SLICE & DICE
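A rough sketch of both operations on a tiny in-memory cube. The data and the function names are illustrative assumptions, not the API of any OLAP product.

```python
# Hypothetical facts: (page, city, dow, views).
FACTS = [
    ("A", "Salvador", "Sunday", 10),
    ("A", "Recife", "Sunday", 4),
    ("B", "Salvador", "Monday", 7),
    ("B", "Recife", "Monday", 3),
]
COLUMNS = {"page": 0, "city": 1, "dow": 2}


def slice_cube(facts, dimension, value):
    """Slice: fix one dimension to a single value."""
    i = COLUMNS[dimension]
    return [f for f in facts if f[i] == value]


def dice_cube(facts, **allowed):
    """Dice: keep a sub-cube by restricting several dimensions to sets of values."""
    return [
        f for f in facts
        if all(f[COLUMNS[dim]] in values for dim, values in allowed.items())
    ]
```

For example, `slice_cube(FACTS, "dow", "Sunday")` keeps only the Sunday cells, while `dice_cube(FACTS, page={"A", "B"}, city={"Salvador"})` keeps a smaller cube across two dimensions.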

DRILLING
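Drilling moves along a hierarchy inside one dimension. A minimal sketch, assuming a year > month > day hierarchy and invented daily counts:

```python
from collections import defaultdict

# Hypothetical daily sign-up counts keyed by (year, month, day).
DAILY = {
    (2013, 8, 24): 3,
    (2013, 8, 25): 5,
    (2013, 9, 1): 2,
}


def drill_up(cells, levels):
    """Roll up to the first `levels` parts of the (year, month, day) key."""
    totals = defaultdict(int)
    for key, count in cells.items():
        totals[key[:levels]] += count
    return dict(totals)


def drill_down(cells, parent):
    """Return the finer-grained cells that sit under one coarser cell."""
    return {k: v for k, v in cells.items() if k[:len(parent)] == parent}
```

`drill_up(DAILY, 2)` gives monthly totals, and `drill_down(DAILY, (2013, 8))` goes back to the days of August 2013.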

PIVOT
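Pivoting rotates the view: the same cells, with a different dimension on the rows and on the columns. A small sketch with invented numbers and a hypothetical `pivot` helper:

```python
# Hypothetical view counts as (page, dow, views) rows.
ROWS = [
    ("A", "Sunday", 10),
    ("A", "Monday", 6),
    ("B", "Sunday", 4),
]


def pivot(rows, row_index, column_index, value_index=2):
    """Build a nested table: one dimension on the rows, another on the columns."""
    table = {}
    for row in rows:
        table.setdefault(row[row_index], {})[row[column_index]] = row[value_index]
    return table


by_page = pivot(ROWS, row_index=0, column_index=1)  # pages as rows, days as columns
by_day = pivot(ROWS, row_index=1, column_index=0)   # days as rows, pages as columns
```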

ON TOP OF OLAP
Report Applications
Data Mining Applications
Business Analysis
(...)
TYPES OF ANALYSIS
Historical
Operational
Predictive
Exploratory
(...)
BI vs BIG DATA
Some say Big Data is killing traditional BI practices
Some say they are complementary
Traditional BI only answers what you're asking
BI is very good for indicators and historical analysis
Big Data allows shorter analysis cycles
What about us?
How do errors/bugs on our site affect our sales?
What's the relation between our user base's growth and PRO leads?
What period of the year gives us the most PRO leads? And why?
Do any external events affect our sales?
What's the relation between the number of deploys of new features and our sales?
That's all, folks!
Any questions?
BI
By Mateus Chagas Sousa

