Language Cloud



JAWS DAYS 2014



#jawsdays #ijaws

THE SPEAKER

 

Software Architect 

@ LanguageCloud.co 


Security, Performance, Pragmatism

From Barcelona (Spain)

LANGUAGE CLOUD

DATA DRIVEN COMPANY


White: No users
Black: Some users
Red: Most users

User vs Class

MAIN REASONS TO CHOOSE AWS

10 regions with 26 zones

SaaS business model


Small team

2 Business, 3 Frontend, 1 Backend


Cannot predict our evolution 

High risk associated

WARNING !!!

Never forget your code

PAST VERSION

DynamoDB


Manage segments (max. 64 KB)

Calculate/predict R/W Capacity 

Debug, test and edit values 

Pattern Aggregator used
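
The capacity estimate mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, assuming the DynamoDB capacity unit definitions (one read unit covers one strongly consistent read per second of an item up to 4 KB, one write unit covers one write per second of an item up to 1 KB). The function names and traffic figures are hypothetical and do not come from the talk.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read unit: a strongly consistent read of up to 4 KB
WRITE_UNIT_BYTES = 1024      # one write unit: a write of up to 1 KB


def read_capacity(item_bytes, reads_per_sec, eventually_consistent=False):
    """Estimate read capacity units for a given item size and read rate."""
    units_per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    total = units_per_read * reads_per_sec
    if eventually_consistent:
        # Eventually consistent reads cost half as much.
        total = total / 2
    return math.ceil(total)


def write_capacity(item_bytes, writes_per_sec):
    """Estimate write capacity units for a given item size and write rate."""
    return math.ceil(item_bytes / WRITE_UNIT_BYTES) * writes_per_sec
```

With segments at the 64 KB maximum, every read costs 16 units and every write 64, which suggests why predicting capacity mattered for this design.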

DynamoDB


  1. ./dyn_dumpTable.js <table>

  2. grep <pattern> table.json

  3. ./dyn_dumpRow.js <table> <id> <range>

  4. edit row.json

  5. ./dyn_uploadRow.js <file>
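
The dump, grep and edit steps above can be sketched on local JSON data. This is only an illustration: the real dyn_*.js scripts are not shown in the deck, so the dump layout (a list of rows with `id` and `range` keys) and the helper names `grep_rows` and `edit_row` are assumptions.

```python
import json


def grep_rows(table_dump, needle):
    """Return the (id, range) keys of rows whose JSON text contains needle."""
    return [(row["id"], row["range"]) for row in table_dump
            if needle in json.dumps(row, sort_keys=True)]


def edit_row(table_dump, row_id, row_range, changes):
    """Return an edited copy of one row; the dump itself is left untouched."""
    for row in table_dump:
        if row["id"] == row_id and row["range"] == row_range:
            edited = dict(row)
            edited.update(changes)
            return edited
    raise KeyError((row_id, row_range))


# Hypothetical dump, standing in for the contents of table.json.
dump = json.loads('[{"id": "u1", "range": 1, "lang": "es"},'
                  ' {"id": "u2", "range": 1, "lang": "ja"}]')
```

Calling `edit_row(dump, "u2", 1, {"lang": "en"})` yields the row that would then be written to row.json and uploaded.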

S3 + CloudFront


Files and data easily become disorganized



Only 3 distribution price classes

OpsWorks


= Chef + Dashboard (Agent)

Fast degradation (CPU & connection limits reached)

Multiple responsibilities (SSL, S3, ...) 

Not stateless (local caches)

CURRENT VERSION

Route53

Machine Jenkins (Oregon)


  • Pipeline APP (AngularJS)
  • Pipeline API (NodeJS)
  • Pipeline Worker (NodeJS)

Cheapest region (low latency not required)

Master is Reserved (c3.large)
Workers will be Spot Requests

CI Manual Triggers


  1. Push to branch Develop

  2. Accept pull-request in Master

  3. XMPP Deploy command (W.I.P.)


Machine DevServices


Jabber/XMPP server (Prosody IM)

Channel



Supervisor system


Monit: Auto restart services - 1min

MultiAgent (XMPP): Machines online + Agent

HipChat: Register all warnings (@all)

AWS (CloudWatch): Email alarms - 3x1min

AWS (AutoHealth): Elastic Load Balancers, OpsWorks, R53, ...
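
A rough sketch of the "3x1min" alarm rule, read here as "alarm after three consecutive one-minute datapoints over the threshold". That reading is an assumption, and `alarm_fires` is a hypothetical helper rather than a CloudWatch API.

```python
def alarm_fires(datapoints, threshold, periods=3):
    """Return True once `periods` consecutive datapoints exceed threshold."""
    streak = 0
    for value in datapoints:
        # A datapoint at or below the threshold resets the streak.
        streak = streak + 1 if value > threshold else 0
        if streak >= periods:
            return True
    return False
```

For example, CPU readings of 95, 96 and 97 in a row against a threshold of 90 would trigger the email, while 95, 96, 40, 97 would not.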

Machine RedisCache

Machine GraphStats

Using cookbook StatsD
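
For context, StatsD accepts plain-text metrics over UDP in the form `bucket:value|type`. The sketch below formats and sends such lines. The bucket names, host and helper names are hypothetical; 8125 is the usual StatsD port, but the deck does not say which one is used here.

```python
import socket

METRIC_TYPES = ("c", "ms", "g")  # counter, timer, gauge


def statsd_line(bucket, value, metric_type):
    """Format one metric in the StatsD text protocol."""
    if metric_type not in METRIC_TYPES:
        raise ValueError("unknown metric type: %r" % (metric_type,))
    return "%s:%s|%s" % (bucket, value, metric_type)


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line as a UDP datagram (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()
```

A request handler could call `send_metric(statsd_line("api.requests", 1, "c"))` on every hit and let Graphite aggregate the result.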


Machine TextLogs



80% of requests < 10 ms
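
A figure like this can be derived from logged response times. The sketch below is a minimal example with made-up latencies; `share_under` is a hypothetical helper, not part of the logging stack.

```python
def share_under(latencies_ms, threshold_ms):
    """Return the fraction of requests faster than threshold_ms."""
    if not latencies_ms:
        raise ValueError("no latencies given")
    fast = sum(1 for t in latencies_ms if t < threshold_ms)
    return fast / len(latencies_ms)


# Made-up sample: eight fast requests and two slow ones.
sample = [3, 5, 7, 8, 9, 9, 4, 6, 25, 120]
```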


Machine WorkerStats


Dashboard (MySQL)
Analytics (MySQL)

MailQueue (Mandrill)
Webhooks (Mandrill)
Reply-To (Mandrill)

Patterns (ElasticSearch)

Export data (Excel, CSV, ...)

Single db connection

S3 + Redactor


SCALABILITY

CPU

Just add more machines

(stateless)


1 machine - 1 core - 1 app

vs

1 machine - N cores - N-1 apps


( Cluster & Domain libraries )
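
The "N cores, N-1 apps" rule can be sketched as below. The slide refers to the Node.js Cluster and Domain modules, so this Python version only illustrates the sizing rule. The helper names are hypothetical, and the slide does not say why one core is left free.

```python
import os


def app_instances(cores):
    """Run one app per core, keeping one core free; never fewer than one app."""
    return max(1, cores - 1)


def plan_for_this_machine():
    """Apply the rule to the machine this script runs on."""
    cores = os.cpu_count() or 1
    return {"cores": cores, "apps": app_instances(cores)}
```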

I/O (network bandwidth)


SlaveOf in Redis
(We use 2 connections: Read & Write)

Replication or Sharding in MySQL
(Our ORM allows connecting to multiple DBs)
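
The "two connections" idea can be sketched as a small router: writes go to the master, reads go to a replica. `FakeRedis` is an in-memory stand-in invented for this example so the sketch runs without a server, and replication is simulated by hand.

```python
class FakeRedis:
    """In-memory stand-in for a Redis connection (hypothetical)."""

    def __init__(self, store):
        self.store = store

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


class ReadWriteRouter:
    """Send writes to the master connection and reads to the replica."""

    def __init__(self, writer, reader):
        self.writer = writer
        self.reader = reader

    def set(self, key, value):
        self.writer.set(key, value)

    def get(self, key):
        return self.reader.get(key)


master_data = {}
replica_data = {}
cache = ReadWriteRouter(FakeRedis(master_data), FakeRedis(replica_data))
cache.set("session:1", "alice")
replica_data.update(master_data)  # replication (SlaveOf) simulated by hand
```

Keeping reads and writes on separate connections means more read replicas can be added later without touching the calling code.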

Hard drive


  • Data files are allocated in S3 buckets

  • MySQL uses 100 MB of 5 GB

    • Student submissions (30%)

  • ElasticSearch uses 1 GB of 40 GB

  • Graphite (Whisper) uses 300 MB of 40 GB



Memory


Scale the machine vertically (max 244 GB)

Flush Redis in shorter intervals

Migrate to Riak, Cassandra, ... 
(Trade latency for availability)

EXTERNAL SERVICES

HipChat

MandrillApp

MixPanel

inVISION

Transifex

Asana

GitHub

WORKSPACE

Test-Driven Development



statusMachine.sh

Services:
 1)  Apache2 = ON
 2)    mod_xdebug = OFF
 3)    mod_apc = OFF
 4)  MySQL = ON
 5)    /rammysql = OFF
 6)    -log is OFF
Launch Jmeter:
 ja) API
 jw) WorkerStats
Launch APP:
 app) http://localhost:9000/
Update repos:
 repo) app+api+worker+devtools+cookbook+webpage
>>>

awsConnect.sh

MySQL:
 0) RAW @ RDS-MYSQL (v3)
EC2:
 1) [EIP] SSH @ GraphStats		=> http://localhost:2180/ (Graphite)
 2) [EIP] SSH @ TextLogs		    	=> http://localhost:5601/ (Kibana)
 3) [EIP] SSH @ TextLogs		    	=> http://localhost:9200/ (API)
 4) [DYN] SSH @ RedisCache		=> tcp://localhost:16379/ (Console)
 5) [DYN] SSH @ DevServices
 6) [DOM] SSH @ WorkerStats		=> http://localhost:2812/ (Monit)
OPWS:
 7) SSH @ dev-api3-2		    		=> http://localhost:2812/ (Monit)
 8) SSH @ prod-api3-1	    			=> http://localhost:2812/ (Monit)
CI:
 o) [EIP] SSH @ Jenkins-Oregon		=> http://localhost:18081/ (Jenkins)
Backups:
 b) Backup W+E+R
 ba) Backup AWS-S3
 be) Backup Elasticsearch
 bg) Backup Graphite
 bm) Backup MySQL
Redis Flush:
 rd) ADM @ dev-api3-2
 rp) ADM @ prod-api3-1
List AWS - instances:
 li) ap+us+eu
Update & Upgrade:
 uu) ap+us+eu
>>>
aws cli & ./jq

Backups


Server > Live | Backup
Local > Live | Backup

  • ElasticSearch + LogStash = Incremental
  • Graphite = Full
  • RDS = mysqldump + Snapshots
  • S3 bucket = s3cmd sync

dbConsistency.sh


ANY QUESTIONS?

  

https://slid.es/sergioarcos/

sergio@languagecloud.co

Language Cloud (Use Case in AWS) v2

By Sergio Arcos
