Language Cloud

JAWS DAYS 2014

#jawsdays #ijaws
THE SPEAKER

Software Architect
From Barcelona (Spain)
LANGUAGE CLOUD
DATA DRIVEN COMPANY
- Teacher creates content
- Teacher improves methodology through statistics and metrics
- Student completes exercises
- Student receives personalized recommendations based on needs

White: No users
Black: Some users
Red: Most users
User vs Class
MAIN REASONS TO CHOOSE AWS
10 regions with 26 zones
SaaS business model
Small team
2 Business, 3 Frontend, 1 Backend
Cannot predict our evolution
High risk associated
WARNING !!!
Never forget your code
PAST VERSION
DynamoDB
Calculate/predict R/W Capacity
Debug, test and edit values
Aggregator pattern used
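As a rough illustration of the capacity calculation, the sketch below applies the DynamoDB provisioned-throughput rules: one read unit covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), and one write unit covers one write per second of an item up to 1 KB. The function names and the sample sizes are hypothetical, not taken from the talk.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate read capacity units for a steady read rate."""
    # Each read consumes one unit per started 4 KB block of the item.
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        total = total / 2
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate write capacity units for a steady write rate."""
    # Each write consumes one unit per started 1 KB block of the item.
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return math.ceil(units_per_write * writes_per_second)
```

For example, 6,000-byte items read 50 times per second need 100 read units with strong consistency and 50 with eventual consistency.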
DynamoDB
- ./dyn_dumpTable.js <table>
- grep table.json
- ./dyn_dumpRow.js <table> <id> <range>
- edit row.json
- ./dyn_uploadRow.js <file>
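The internals of the dyn_*.js scripts are not shown in the slides. The sketch below is a hypothetical Python analogue of the dump-row, edit, upload-row loop. It assumes a table object that exposes `get_item(Key=...)` and `put_item(Item=...)` the way a boto3 DynamoDB `Table` resource does; the key attribute names are whatever the table defines.

```python
import json
from decimal import Decimal


def _encode(value):
    # boto3 returns DynamoDB numbers as Decimal, which json cannot serialize.
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def dump_row(table, key, path):
    """Fetch one item by its key and write it to a JSON file for editing."""
    item = table.get_item(Key=key).get("Item")
    if item is None:
        raise KeyError(f"No item with key {key}")
    with open(path, "w") as fh:
        json.dump(item, fh, indent=2, sort_keys=True, default=_encode)
    return item


def upload_row(table, path):
    """Read the edited JSON file and write the item back to the table."""
    with open(path) as fh:
        # Floats are parsed as Decimal because boto3 rejects Python floats.
        item = json.load(fh, parse_float=Decimal)
    table.put_item(Item=item)
    return item
```

With boto3 installed and credentials configured, `table` could be `boto3.resource("dynamodb").Table("<table>")`. Set-typed attributes are not handled by this sketch.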
S3 + CloudFront
Only 3 price class distributions
OpsWorks
= Chef + Dashboard (Agent)
CURRENT VERSION
Route53
Machine Jenkins (Oregon)
- Pipeline APP (AngularJS)
- Pipeline API (NodeJS)
- Pipeline Worker (NodeJS)
Cheapest Region (no low-latency required)
Master is Reserved (c3.large)
Workers will be Spot Requests
CI Manual Triggers
- Push to branch Develop
- Accept pull-request in Master
- XMPP Deploy command (W.I.P.)
- How to: Using Chef Deployment Hooks
- How to: Automatically Running Recipes
- How to: AWS OpsWorks Lifecycle Events
Machine DevServices
Jabber/XMPP server (Prosody IM)

Channel

Supervisor system
Monit: Auto restart services - 1min
MultiAgent (XMPP): Machines online + Agent
HipChat: Register all warnings (@all)
AWS (CloudWatch): Email alarms - 3x1min
AWS (AutoHealth): Elastic Load Balancers, OpsWorks, R53, ...
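One reading of "3x1min" for the CloudWatch alarms is three consecutive 1-minute periods above a threshold before the email goes out; that interpretation is an assumption. The sketch below shows only that evaluation rule on a list of per-minute datapoints. The function name and threshold are hypothetical, and the state names mirror CloudWatch's OK, ALARM and INSUFFICIENT_DATA.

```python
def alarm_state(datapoints, threshold, evaluation_periods=3):
    """Evaluate per-minute datapoints, oldest first, against a threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # Only the most recent periods count toward the alarm decision.
    recent = datapoints[-evaluation_periods:]
    # Every one of them must breach; a single healthy minute resets the alarm.
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"
```

Requiring several consecutive breaches keeps a single noisy minute from sending an email.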
Machine RedisCache
Machine GraphStats
Using cookbook StatsD
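For context, StatsD accepts plain-text metrics over UDP in the form `name:value|type`, listening on port 8125 by default. The sketch below is a minimal sender; the metric names used with it are hypothetical and the host defaults to localhost as an assumption.

```python
import socket


def format_metric(name, value, metric_type):
    """Build a StatsD line: 'c' is a counter, 'ms' a timer, 'g' a gauge."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type, host="127.0.0.1", port=8125):
    """Send one metric as a single UDP datagram (fire and forget)."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

A call such as `send_metric("api.requests", 1, "c")` would count one request. Because UDP is connectionless, the application is not blocked if the stats machine is down.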
Machine TextLogs

80% of requests in under 10 ms
Machine WorkerStats
S3 + Redactor

SCALABILITY
CPU
Just add more machines
(stateless)
I/O (network bandwidth)
(We use 2 connections: Read & Write)
Replication or Sharding in MySQL
(Our ORM allows connecting to multiple DBs)
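The two-connection idea (read and write) can be sketched as a small router that sends read-only statements to one connection and everything else to the other. This is a hypothetical illustration over generic DB-API connections, not the ORM used in the talk; the class name and the list of read prefixes are assumptions.

```python
class ReadWriteRouter:
    """Route SQL to a read connection (replica) or a write connection (master)."""

    READ_PREFIXES = ("select", "show", "describe", "explain")

    def __init__(self, read_conn, write_conn):
        self.read_conn = read_conn
        self.write_conn = write_conn

    def connection_for(self, sql):
        # Classify by the first keyword; anything not clearly a read goes to the master.
        statement = sql.lstrip().lower()
        if statement.startswith(self.READ_PREFIXES):
            return self.read_conn
        return self.write_conn

    def execute(self, sql, params=()):
        cursor = self.connection_for(sql).cursor()
        cursor.execute(sql, params)
        return cursor
```

With replication, a read issued right after a write may still see old data, so reads that must be fresh would need to go to the write connection.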
Hard drive
- Data files are allocated in S3 buckets
- MySQL uses 100 MB of 5 GB
- Student submissions (30%)
- ElasticSearch uses 1 GB of 40 GB
- Graphite (Whisper) uses 300 MB of 40 GB
Memory
Flush Redis in shorter intervals
Migrate to Riak, Cassandra, ... (trade latency for availability)
EXTERNAL SERVICES
HipChat
MandrillApp
MixPanel
inVISION
Transifex
Asana
GitHub
WORKSPACE
Test-Driven Development
statusMachine.sh
Services:
1) Apache2 = ON
2) mod_xdebug = OFF
3) mod_apc = OFF
4) MySQL = ON
5) /rammysql = OFF
6) -log is OFF
Launch Jmeter:
ja) API
jw) WorkerStats
Launch APP:
app) http://localhost:9000/
Update repos:
repo) app+api+worker+devtools+cookbook+webpage
>>>
awsConnect.sh
MySQL:
0) RAW @ RDS-MYSQL (v3)
EC2:
1) [EIP] SSH @ GraphStats => http://localhost:2180/ (Graphite)
2) [EIP] SSH @ TextLogs => http://localhost:5601/ (Kibana)
3) [EIP] SSH @ TextLogs => http://localhost:9200/ (API)
4) [DYN] SSH @ RedisCache => tcp://localhost:16379/ (Console)
5) [DYN] SSH @ DevServices
6) [DOM] SSH @ WorkerStats => http://localhost:2812/ (Monit)
OPWS:
7) SSH @ dev-api3-2 => http://localhost:2812/ (Monit)
8) SSH @ prod-api3-1 => http://localhost:2812/ (Monit)
CI:
o) [EIP] SSH @ Jenkins-Oregon => http://localhost:18081/ (Jenkins)
Backups:
b) Backup W+E+R
ba) Backup AWS-S3
be) Backup Elasticsearch
bg) Backup Graphite
bm) Backup MySQL
Redis Flush:
rd) ADM @ dev-api3-2
rp) ADM @ prod-api3-1
List AWS - instances:
li) ap+us+eu
Update & Upgrade:
uu) ap+us+eu
>>>
Backups
Server > Live | Backup
Local > Live | Backup
- ElasticSearch + LogStash = Incremental
- Graphite = Full
- RDS = mysqldump + Snapshots
- S3 bucket = s3cmd sync
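To make the incremental versus full distinction concrete, the sketch below copies only files that are missing from the backup or whose size or MD5 differs, which is the general idea behind a sync-style backup. It works on local directories and is not how s3cmd is implemented; the function names and paths are hypothetical.

```python
import hashlib
import os
import shutil


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def incremental_sync(source_dir, backup_dir):
    """Copy new or changed files to backup_dir; return their relative paths."""
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(backup_dir, rel)
            unchanged = (
                os.path.exists(dst)
                and os.path.getsize(src) == os.path.getsize(dst)
                and file_md5(src) == file_md5(dst)
            )
            if unchanged:
                continue
            os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel)
    return sorted(copied)
```

A full backup, by contrast, would copy every file on every run regardless of what the backup already holds.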
dbConsistency.sh

ANY QUESTIONS?

sergio@languagecloud.co
Language Cloud (Use Case in AWS) v2
By Sergio Arcos