DCTCP
Data-Center TCP
Microsoft Research & Stanford
-------
Maxime DIDIER
Paul CHAROUSSET
Yannick PÉROUX
The Problem
Real-world problem: how to avoid congestion in data centers?
Make use of existing commodity hardware
We can't reinvent the wheel!
Tweaking TCP in order to avoid congestion
The Problem
Their setup:
6000 servers
Commodity hardware
Two kinds of flows:
  Short-lived
    Distributed tasks
    Low latency
    Tasks dropped after a timeout
  Long-lived
    Background tasks
    Large volume
    Less critical
The Problem
Background tasks -> ~90% of the traffic
They fill up the switches' buffers
Common solutions:
Use round-trip time as a congestion signal
Active Queue Management
The Solution
Keep buffer usage low
Avoid queue buildup
Avoid buffer pressure
Plenty of room to handle incast bursts
The Solution
Make better use of the ECN mechanism
Measure buffer occupancy
React early, with a measured response
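As a rough illustration of how ECN marks can reflect buffer occupancy, here is a minimal sketch of a switch queue that sets the congestion flag on packets arriving while the queue is above a threshold. The class name `MarkingQueue`, the `threshold` and `capacity` parameters, and the dictionary packet format are all hypothetical and chosen only for this example; they do not describe any real switch.

```python
from collections import deque


class MarkingQueue:
    """Toy switch queue (hypothetical): marks packets instead of waiting to drop."""

    def __init__(self, capacity: int, threshold: int):
        self.capacity = capacity    # total buffer size, in packets (assumed unit)
        self.threshold = threshold  # marking threshold, in packets (assumed value)
        self.packets = deque()

    def enqueue(self, packet: dict) -> bool:
        """Return False if the packet is dropped, True if it is queued."""
        occupancy = len(self.packets)
        if occupancy >= self.capacity:
            return False  # buffer full: the packet is lost
        if occupancy > self.threshold:
            packet["ce"] = True  # congestion experienced: set the ECN flag
        self.packets.append(packet)
        return True


queue = MarkingQueue(capacity=5, threshold=2)
arrivals = [{"id": i, "ce": False} for i in range(6)]
accepted = [queue.enqueue(p) for p in arrivals]
marked_ids = [p["id"] for p in queue.packets if p["ce"]]
```

In this run the first three packets are queued unmarked, the next two are queued with the flag set, and the last one is dropped. The sender sees marks well before the buffer overflows, which is what lets it react early.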
The Solution
For each congestion window:
Count the occurrences of the ECN flag
Update the congestion average
Adjust the congestion window as needed
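The three steps above can be sketched on the sender side as follows. This is a minimal sketch, not the actual patch: it assumes the update rules given in the DCTCP paper, alpha <- (1 - g) * alpha + g * F and cwnd <- cwnd * (1 - alpha / 2), where F is the fraction of ECN-marked packets in the last window. The class `DctcpSender`, its field names, and the weight g = 1/16 are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class DctcpSender:
    """Toy sender state (hypothetical names), updated once per window of data."""

    cwnd: float = 10.0   # congestion window, in packets (illustrative start value)
    alpha: float = 0.0   # running average of the fraction of marked packets
    g: float = 1.0 / 16  # averaging weight (assumed value)

    def on_window_acked(self, acked: int, marked: int) -> None:
        if acked <= 0:
            return
        # Step 1: count the occurrences of the ECN flag in this window.
        fraction = marked / acked
        # Step 2: update the congestion average.
        self.alpha = (1 - self.g) * self.alpha + self.g * fraction
        # Step 3: adjust the congestion window as needed.
        if marked > 0:
            # Cut in proportion to the congestion estimate, never below 1 packet.
            self.cwnd = max(1.0, self.cwnd * (1 - self.alpha / 2))
        else:
            # No marks: regular additive increase, as in standard TCP.
            self.cwnd += 1.0


light = DctcpSender(cwnd=100.0, alpha=0.0)
light.on_window_acked(acked=100, marked=100)  # first sign of congestion

heavy = DctcpSender(cwnd=100.0, alpha=1.0)
heavy.on_window_acked(acked=100, marked=100)  # sustained congestion
```

With a low congestion average the window shrinks only slightly (100 to about 96.9 here), while with alpha at 1 it is halved like standard TCP. That proportional cut is the "measured response".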
The Solution
The rest of TCP is untouched:
Slow start
Additive increase in window size
Packet loss recovery
Very small patch, only a few dozen lines
The Results
Two types of benchmarks:
Micro-benchmarks
Benchmarks on a production cluster
The Results
Micro-Benchmarks - Incast
The Results
Micro-Benchmarks - Queue buildup
The Results
Micro-Benchmarks - Buffer Pressure
         Without background traffic   With background traffic
TCP      9.87                         46.94
DCTCP    9.17                         9.09
The Results
Production Benchmarks
Questions?