Modern Analytics for Web and Mobile

No Code Required
AWS-enabled scalability

Overview of Architecture




Choice of HID Event Collectors

  • Traditional on-demand HID event collector (free)
You write code in your web or mobile application to collect HID events on demand for analysis.

  • Advanced no-code HID event collector (reasonable fee per megabyte or gigabyte)
You do not need to write any code in your web or mobile application; the collector automatically collects all HID events for analysis.

Protocols for HID Event Transfer

[ HTTP ]

  Method: POST
  Compression: Gzip
  Format: JSON or MsgPack
  Cost: Free
  Shared: Global
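
As a rough illustration of the HTTP option, the sketch below packs a batch of events as JSON, compresses it with gzip, and wraps it in a POST request. The collector URL and the event field names are assumptions made for the example; the slides do not specify either.

```python
import gzip
import json
import urllib.request

# Hypothetical endpoint; the slides do not name a collector URL.
COLLECTOR_URL = "https://collector.example.com/events"


def build_event_request(events, url=COLLECTOR_URL):
    """Return a POST request carrying the events as gzip-compressed JSON."""
    body = gzip.compress(json.dumps(events).encode("utf-8"))
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",
        },
    )


# Example batch; the field names are illustrative only.
sample_events = [
    {"type": "click", "x": 120, "y": 48, "ts": 1700000000},
    {"type": "keydown", "key": "Enter", "ts": 1700000001},
]
request = build_event_request(sample_events)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Batching several events into one request is a design choice of this sketch: it lets gzip work on a larger payload than a single event would give it.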

[ WebSockets ]

  Method: Long-lived socket connection
  Compression: LZ4
  Format: JSON or MsgPack
  Cost: Reasonable fee per megabyte or gigabyte
  Shared: Exclusive

Collector Load Balancer

[ Front Layer ]

Type: Amazon Route 53

Consequence: Avoids the slow ramp-up of Amazon Elastic Load Balancing [REF]


[ Application Layer ]

Cost: Free
    • Nginx Plus (HTTP, HTTPS, WebSockets, clusterable)
    • HAProxy + Keepalived (HTTP, HTTPS, WebSockets, failover)
Cost: Reasonable fee per megabyte or gigabyte
    • Elastic IP + VPC + Nginx Plus
    • Elastic IP + VPC + HAProxy + Keepalived

Collected Event Processing

[ Slow ]

  Instance storage log + Flume log collector

[ Fast ]

  No local log + push to Kafka cluster on receipt
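
The fast path can be pictured as a handler that forwards each event the moment it arrives, without writing a local log first. The sketch below is a minimal illustration under assumptions: the `send(topic, value)` callable, the topic name `hid-events`, and the event fields are all hypothetical. A real deployment would pass in the send method of a Kafka producer client; here an in-memory list stands in so the example runs without a broker.

```python
import json


def make_fast_path_handler(send, topic="hid-events"):
    """Build a handler that forwards each received event straight to Kafka.

    `send(topic, value)` is an assumption: any callable with that shape,
    for example the send method of a Kafka producer client.
    """
    def handle(event):
        # No local log: serialize and hand off immediately.
        send(topic, json.dumps(event).encode("utf-8"))
    return handle


# In-memory stand-in for a producer so the sketch runs without a broker.
sent = []
handle = make_fast_path_handler(lambda topic, value: sent.append((topic, value)))
handle({"type": "click", "x": 10, "y": 20})
```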

Event Parsing

  • Storm enables large-scale in-memory filtering and transformation, and persists the results to an HDFS cluster.

  • Failed processing is retried, since the event data persists in Kafka for later retries.

  • You can choose the number of worker threads to balance scalability and latency. (Note: a thread count above #limit incurs a reasonable fee.)
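
The retry and worker-thread ideas above can be sketched in plain Python. This is not Storm code and not the product's implementation: it is a simplified stand-in that retries in-process, whereas the slides describe replaying events retained in Kafka. The function names, the retry limit, and the flaky parser are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def process_with_retry(event, parse, max_attempts=3):
    """Try to parse one event, retrying on failure up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return parse(event)
        except Exception:
            # Re-raise only once the last allowed attempt has failed.
            if attempt == max_attempts:
                raise


def run_workers(events, parse, workers=4):
    """Parse events on a fixed-size thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda e: process_with_retry(e, parse), events))


# Illustrative parser that fails once for one event, then succeeds.
failures = {"remaining": 1}


def flaky_parse(event):
    if event == "b" and failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise RuntimeError("transient failure")
    return event.upper()


results = run_workers(["a", "b", "c"], flaky_parse, workers=2)
```

The `workers` argument plays the role of the configurable thread count: raising it allows more events to be parsed concurrently.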

Mining System

  • Spark
  • Spark SQL
  • MLlib
  • HBase

You can define the volume on demand.

Data Warehouse

  • Druid.io

  • Crate.io

We provide both real-time and offline data warehouses.

Reference:   Big Data

4 ACCUMULATIONS


Experience

Data

Procedure

Marketing

The End, but Not Final

  • All analysis of the app or website is delivered promptly and incrementally.

  • Predictions and early warnings are provided based on user settings.

  • Can be extended to the next level to build a big data ecosystem for the app or website, reducing cost and letting teams focus on their core business.

By Andy Song
