Using SOA to Scale Your App and Your Team
London Web Performance Group meetup, 2015-04-09
Laurie Voss
CTO of npm Inc
@seldo
What are we talking about?
Service Oriented Architecture (SOA)
- What SOA is
- Why it's good
- Why SOA is getting popular (again)
- How npm uses SOA
- Best practices for using SOA
Impostor Syndrome Disclaimer
Speed up your website with this 1
weird trick discovered by a mom!
What is npm?
npm is not one thing
- npm the CLI
- npm the public registry
- npm Enterprise
- www.npmjs.com
What is SOA?
Over-simplified example time!
This is not a new idea
Monolithic apps
They're not that bad!
MVC: the ubiquitous monolith
As seen in Ruby on Rails, etc.
"Oh, it's like Rails. Okay."
ORM is an antipattern:
http://sel.do/orm
Mo' lithic, mo' problems
Monolithic apps scale, but expensively
Scaling monolithic architecture
Monoliths get expensive and slow
Scaling is easy
If you have infinite money
Scaling this way is dumb
You're paying for
the wrong stuff
You can be dumb
for a long time
Everybody is doing it
"The app is down"
means
"The company is down"
Scaling monolithic teams
https://www.flickr.com/photos/marypmadigan/4173943493/
Big teams move slower
SOA to the rescue
A picture of the Rescue Rangers here
would be a copyright violation, but
you can imagine them for free.
SOA is a distributed system
Features of distributed systems:
- Concurrency
- No global clock
- Independent failure of components
Race conditions galore!
Architectural advantages
1. Distributed systems fail more gracefully
(if you design them properly)
Better use of resources
Do more by doing less
Better resource allocation = do more
Distributed systems
are easier to scale
If you have infinite hardware
(and you do!)
Scale the busy parts
Operational advantages
Pick the appropriate level of redundancy
Faster debugging
There's only one thing that can possibly go wrong.
Team advantages
Our monkey brains are small.
Don't tax them unnecessarily.
Smaller systems = smaller teams
Calm down, math(s) nerds
Stronger ownership
means
better code
Development advantages
Smaller systems = less stuff to break
Better isolation = less chance of breaking stuff
Move fast
and don't break things
These advantages are nontrivial
(If you design it right)
Cost advantages
SOAs run cheaper
But it's not all roses
Mo' pieces, mo' problems
Probability math:
- If any machine will fail = 0.1
- 1-box system will fail = 0.1
Math!
Same 0.1 hardware, 10-box system:
- partially fail = 1 - 0.9^10 ≈ 0.65
- totally fail = 0.1^10 = 0.0000000001
Always slightly down.
Never totally down.
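The arithmetic behind these two slides, as a minimal sketch. It assumes every box fails independently with the same probability, which real hardware only approximates. The function names are made up for illustration.

```javascript
// Chance that at least one of n boxes is down, assuming independent failures.
function anyFailure(p, n) {
  return 1 - Math.pow(1 - p, n);
}

// Chance that all n boxes are down at the same time.
function totalFailure(p, n) {
  return Math.pow(p, n);
}

console.log('1 box, down:', anyFailure(0.1, 1));
console.log('10 boxes, something down:', anyFailure(0.1, 10));
console.log('10 boxes, everything down:', totalFailure(0.1, 10));
```

Adding boxes pushes the first number up and the second one down, which is the trade the slides describe.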
You need redundancy
You needed it anyway, though.
Redundancy is easier now
(If you design it right)
Deployment automation
is essential
We use Ansible. It's nice!
Interface definition
Versioned APIs will help.
Keep your contracts
Or you will have lots of boring meetings.
Decoupling
You have to get all your data over the network.
Everything's an API
Defining data sources strictly means
fewer unexpected side-effects.
Requires forethought
Or a crystal ball.
Shared logic
We already re-invented this wheel!
You need shared libraries
Obvious product plug incoming
Hey, that's that thing we do!
(What a coincidence!)
npm install all-the-things
Private modules are neat
npm install @mycompany/secret-thing
You need versioned APIs
And you need to support old behaviour for longer.
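One way to keep old behaviour alive next to new behaviour is to put the version in the path and route on it. A rough sketch only: the /packages path, the response shapes and the handler names are hypothetical, not npm's actual API.

```javascript
// Hypothetical handlers: v1 keeps the old response shape, v2 adds a field.
const handlers = {
  v1: (name) => ({ name }),
  v2: (name) => ({ name, links: { self: `/v2/packages/${name}` } }),
};

// Route "/<version>/packages/<name>" to the handler for that version.
function route(path) {
  const match = /^\/(v\d+)\/packages\/([^/]+)$/.exec(path);
  if (!match || !handlers[match[1]]) {
    return { status: 404, body: { error: 'unknown version or path' } };
  }
  return { status: 200, body: handlers[match[1]](match[2]) };
}

console.log(route('/v1/packages/express'));
console.log(route('/v2/packages/express'));
```

Old clients keep calling /v1 and see no change; removing v1 later is a deliberate step, not a side effect of shipping v2.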
Distributed systems are hard
"Backpressure" is a fancy word for
"this thing is too slow".
Race conditions
Anything that can happen out of order,
will happen out of order.
Plan for failure
Why SOA now?
We like changing things for no reason.
Technical change 1:
Virtualization
The phrase "the cloud"
can always be replaced with
"some computers in eastern Virginia"
Real hardware
involves carrying
actual heavy things
Real hardware is awful
Virtualization is awesome
Virtual hardware
=
disposable hardware
You won't always get to pick when it's disposed of.
Deploy to new hardware
Why roll back software when you can roll back hardware?
Technical change 2:
distribution by default
We had all this HTTP lying around...
The Internet is one big SOA
Big change 3: cost
Smaller, cheaper, faster.
Hardware is cheap
Developers are expensive
Enough Why. Time for How.
npm registry architecture
npm registry architecture 2
npm registry architecture 3
Reality is harder
Not shown:
- Multiple datacenters
- Redundant hardware
- Balancer proxies
- Nginx for SSL termination
- Legacy shims
- Backup systems
- External replication
Metadata store
CouchDB
That's the registry
npm Enterprise
Scaling down
License API is external
Policy follower
What npm Enterprise is for
Control of binaries
What npm Enterprise is for
Control of access
What npm Enterprise is for
Legal requirements
Enter the Policy Follower
Filters incoming replication
Policies are code
npm install @mycompany/our-policy
Licensing filters
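"Policies are code" could look roughly like this: a policy is a function that says yes or no to each incoming package. This is a sketch under assumptions; the package shape, the approved list and the function names are hypothetical and not the policy follower's real interface.

```javascript
// Hypothetical policy: only let packages with an approved license through.
const APPROVED = new Set(['MIT', 'ISC', 'BSD-3-Clause', 'Apache-2.0']);

function licensePolicy(pkg) {
  return APPROVED.has(pkg.license);
}

// Apply every policy to each incoming package; keep only those that pass all.
function filterReplication(changes, policies) {
  return changes.filter((pkg) => policies.every((policy) => policy(pkg)));
}

const incoming = [
  { name: 'left', license: 'MIT' },
  { name: 'right', license: 'GPL-3.0' },
  { name: 'mystery' }, // no license field at all
];

const allowed = filterReplication(incoming, [licensePolicy]).map((p) => p.name);
console.log(allowed);
```

Because a policy is just a function, it can be published and installed like any other module.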
This is SOA being great
SOA scaling saves us dev time
Fix bugs once only.
But wait, there's more!
www is lightweight
Registry API
A relational registry
Me loving SQL:
http://sel.do/sql
Putting it all together
Lesson 1: deployment automation
Lots of boxes = lots of failures
Take crappy jobs away from humans
SOAs are more complicated
Many boxes = many configs
Service management
Cross-OS services are hard
Enter ndm
service.json
more info:
http://npm.im/ndm
How we do
deployment automation
Ansible playbooks
Services by ndm
One-click deploys
Deploy by git push
Deployment automation
gives you time back
Configuration management
Too much to remember
The configurator
Pulls everything from etcd
Standard config is better
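A sketch of the idea, not the configurator itself: shared settings and per-service settings live in one key-value store, and each service gets the merged result. A Map stands in for etcd here, and the key layout and function name are assumptions.

```javascript
// Stand-in for a key-value store such as etcd; a real version would query it over the network.
const store = new Map([
  ['/config/shared/log_level', 'info'],
  ['/config/www/port', '8080'],
  ['/config/registry/port', '8081'],
]);

// Merge shared keys with the keys for one service; service keys win on conflict.
function configFor(service, kv) {
  const config = {};
  for (const prefix of ['/config/shared/', `/config/${service}/`]) {
    for (const [key, value] of kv) {
      if (key.startsWith(prefix)) config[key.slice(prefix.length)] = value;
    }
  }
  return config;
}

console.log(configFor('www', store));
```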
Service monitoring
Don't fail silently
Monitor everything
Service-specific checks
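A service-specific check asks a question only that service can answer, beyond "is the process up". A minimal sketch; the check names, thresholds and stats fields are invented for illustration.

```javascript
// Hypothetical checks: each returns true when its service-specific condition holds.
const checks = {
  'queue depth under limit': (stats) => stats.queueDepth < 1000,
  'last replication under 60s ago': (stats) => stats.secondsSinceReplication < 60,
};

// Run every check and collect the names of the ones that fail.
function failingChecks(stats) {
  return Object.keys(checks).filter((name) => !checks[name](stats));
}

const failures = failingChecks({ queueDepth: 12, secondsSinceReplication: 300 });
console.log(failures.length ? `FAILING: ${failures.join(', ')}` : 'ok');
```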
Using AWS simply
Pretty ranty old blog post:
http://sel.do/aws
We use AWS for everything
We don't use
everything from AWS
Reason 1: because we can
Not everybody can
Reason 2: control
Control is a luxury
We are willing to pay for it.
Reason 3: community
Reason 4: portability
Not ready to commit
It's not you, AWS. It's us.
Metrics
Why metrics are important
Metrics are a debugging tool
"Dashboard" is not a dirty word
Dashboards saved us $60,000
Scaling Node.js services
SSL termination sucks
Use balancers for multiple cores
Beware maxSockets
http://sel.do/maxsockets
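In Node's http module, each Agent caps concurrent sockets per host through its maxSockets option. Node 0.10 defaulted to 5; later releases default to Infinity. A minimal sketch of lifting the cap on a dedicated agent. The host and path are placeholders, and no request is sent.

```javascript
const http = require('http');

// A dedicated agent with no cap on concurrent sockets per host.
const agent = new http.Agent({ keepAlive: true, maxSockets: Infinity });

// Pass the agent explicitly when calling another service.
// Placeholder host and path; nothing is requested in this sketch.
const options = { host: 'internal-service.example', path: '/ping', agent };

console.log(agent.maxSockets);
```

Whether an unbounded pool is right depends on what the downstream service can absorb; a finite number is a deliberate throttle, a forgotten default is not.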
What about io.js?
In Summary...
1. Monolithic apps are bad
Sort of. Sometimes.
2. SOA is good
Sort of. Sometimes. Mostly.
3. SOA is easier than ever
Thanks, Amazon!
4. npm is SOA all the way down
5. Our best practices
- deployment automation
- service management
- configuration management
- service monitoring
- metrics
6. Bonus: node.js best practices
- use balancers to terminate SSL
- use balancers to maximize cores
- set maxSockets to Infinity
- probably don't use CouchDB for anything
Thanks!
@seldo
laurie@npmjs.com