ASHESH, SHUBHAM, ASHER
Software environment for statistical computing and graphics
What is R?
R Software Environment = R Language + R-specific tools and packages
Think of R as having a programming language rather than being a programming language.
Designed by statisticians, for statisticians.
It is:
- Multi-Paradigm
- Cross-platform
- Free
- Open source (GNU GPL Licence)
- Community based
- In development and use for 22+ years
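As a rough illustration of the "multi-paradigm" point, the sketch below uses only base R to show a functional style next to a simple object-oriented (S3) style. The names `new_account` and `describe` are hypothetical, invented for this example.

```r
# Functional style: apply an anonymous function, then fold the results.
squares <- sapply(1:5, function(x) x^2)
total <- Reduce(`+`, squares)

# Object-oriented style (S3): a constructor, a generic, and a method.
# `new_account` and `describe` are hypothetical names for illustration.
new_account <- function(owner, balance) {
  structure(list(owner = owner, balance = balance), class = "account")
}

describe <- function(x, ...) UseMethod("describe")

describe.account <- function(x, ...) {
  paste(x$owner, "has balance", x$balance)
}

acct <- new_account("Ann", 100)
print(total)
print(describe(acct))
```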
Business Intelligence Technology Framework
R for BI
Having been developed for decades, R offers a broad range of statistical tools (multiple repositories with thousands of packages). I will skip this enormous feature of R and focus on the simple BI case of extraction, transformation, loading and presentation.
The packages listed below directly address the steps of a basic BI process.
Extraction
- DBI - common database interface; native drivers for multiple vendors (e.g. RSQLite, RPostgres, RMariaDB) plug into it, top performance.
- RODBC - ODBC database driver connection.
- RJDBC - JDBC database driver connection.
- data.table's fread - very fast csv files reader.
- Many other packages supporting other data formats (e.g. xlsx, xml, json, sas, spss, stata).
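The extraction step could look roughly like the sketch below. It assumes the DBI, RSQLite and data.table packages are installed; the in-memory SQLite database and the `sales` data are stand-ins invented for illustration, not a real source system.

```r
# Assumes DBI, RSQLite and data.table are installed.
library(DBI)
library(data.table)

# Hypothetical sales data, invented for this example.
sales <- data.frame(
  region = c("north", "south", "north", "east"),
  amount = c(120.5, 80.0, 45.25, 200.0)
)

# Extract from a database: an in-memory SQLite database stands in
# for a real vendor database reached through a DBI driver.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "sales", sales)
from_db <- dbGetQuery(con, "SELECT region, amount FROM sales WHERE amount > 50")
dbDisconnect(con)

# Extract from a CSV file with data.table's fread.
csv_path <- tempfile(fileext = ".csv")
fwrite(sales, csv_path)
from_csv <- fread(csv_path)
unlink(csv_path)

print(from_db)
print(from_csv)
```

Swapping `RSQLite::SQLite()` for another DBI-compliant driver should leave the rest of the code unchanged, which is the main reason to go through DBI.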
Transformation
- data.table - powerful data transformation tool, uses from[where, select|update, group by][...] syntax.
- dplyr - also a powerful data transformation tool, but less scalable; uses from %>% where %>% group by %>% select %>% ... syntax. Pivot and unpivot functions are located in the tidyr package.
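To make the two syntaxes concrete, here is a minimal sketch of the same aggregation written both ways. It assumes data.table and dplyr are installed; the `sales` table is invented for illustration.

```r
# Assumes data.table and dplyr are installed.
library(data.table)

# Hypothetical data, invented for this example.
sales <- data.table(
  region = c("north", "south", "north", "east", "south"),
  amount = c(120.5, 80.0, 45.25, 200.0, 20.0)
)
sales_df <- as.data.frame(sales)  # plain copy for the dplyr version

# data.table: from[where, select, group by], then chain with [...]
totals_dt <- sales[amount > 40, .(total = sum(amount)), by = region][order(-total)]

# data.table "update": add a column by reference with :=
sales[, share := amount / sum(amount)]

# dplyr: from %>% where %>% group by %>% select
library(dplyr)
totals_dplyr <- sales_df %>%
  filter(amount > 40) %>%
  group_by(region) %>%
  summarise(total = sum(amount), .groups = "drop") %>%
  arrange(desc(total))

print(totals_dt)
print(totals_dplyr)
```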
Presentation
Presenting useful information is a totally different task from the ETL process. It can easily be outsourced to any BI dashboard tool by simply populating the data structure that tool expects.
Yet when using R you don't even need to push the prepared data to an external presentation tool.
You can produce a web application dashboard directly from R.
- shiny - Web Application Framework for R.
- opencpu - HTTP API to R.
- httpuv - HTTP and WebSocket server library, also the core of shiny package.
- Rook - web server interface for R.
Using the packages mentioned above you can host interactive web applications. These can generate interactive plots and query the data interactively.
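A dashboard of this kind can be sketched in a few lines of shiny. The example assumes the shiny package is installed; the `sales` data and the input and output IDs (`region`, `amounts`) are invented for illustration.

```r
# Assumes the shiny package is installed.
library(shiny)

# Hypothetical data, invented for this example.
sales <- data.frame(
  region = c("north", "south", "north", "east"),
  amount = c(120.5, 80.0, 45.25, 200.0)
)

# UI: a region selector and a plot area.
ui <- fluidPage(
  titlePanel("Sales by region"),
  selectInput("region", "Region", choices = unique(sales$region)),
  plotOutput("amounts")
)

# Server: redraw the plot whenever the selected region changes.
server <- function(input, output, session) {
  output$amounts <- renderPlot({
    rows <- sales[sales$region == input$region, ]
    barplot(rows$amount, main = input$region, ylab = "Amount")
  })
}

# Build the app object; call runApp(app) in an interactive session to serve it.
app <- shinyApp(ui = ui, server = server)
```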
The Good
- Designed by Statisticians
- Can be extremely flexible
- Comprehensive extension library
- Fantastic reporting capabilities
- Incredibly popular
- Open source
- Free
- Cross platform (OS and Architecture)
The Bad
- Designed by Statisticians
- Has a steep learning curve
- Inherently single-threaded
- Suffers from the pitfalls of being open source
- No one to complain to if something doesn't work
- Performance issues
R for Business Intelligence
By Ashesh Kumar