DCTCP

Data Center TCP

Microsoft Research & Stanford

-------

Maxime DIDIER
Paul CHAROUSSET
Yannick PÉROUX

The Problem


  • Real-world problem:

    How to avoid congestion in data centers?

  • Make use of existing commodity hardware
  • We can't reinvent the wheel!
  • Tweak TCP in order to avoid congestion

The Problem


  • Their setup:
    • 6000 servers
    • Commodity hardware
    • 2 kinds of flow:
      • Short-lived
        • Distributed tasks
        • Low latency
        • Tasks dropped after a timeout
      • Long-lived
        • Background tasks
        • Large volume
        • Less critical

The Problem


  • Background tasks account for ~90% of the traffic
  • They fill up the switches' buffers

Common solutions:

  • Delay-based congestion control (use round-trip time as the congestion signal)
  • Active Queue Management

The Solution


  • Keep buffer usage low
    • Avoid queue buildup
    • Avoid buffer pressure
    • Plenty of room to handle incast bursts

The Solution


  • Make better use of the ECN (Explicit Congestion Notification) mechanism
  • Measure buffer occupancy
  • React early, have a measured response
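The switch side of this idea can be sketched as follows: a packet is marked with the ECN "congestion experienced" bit when the queue it joins is already longer than a threshold K. This is a minimal illustration, not the switch implementation; the class name `EcnQueue`, the dict-based packet, and the values of `capacity` and `k` are all assumptions made for the example.

```python
from collections import deque


class EcnQueue:
    """Toy switch queue (hypothetical): marks packets instead of waiting to drop."""

    def __init__(self, capacity: int, k: int) -> None:
        self.capacity = capacity  # total buffer size, in packets (assumed)
        self.k = k                # marking threshold K, in packets (assumed)
        self.queue = deque()

    def enqueue(self, packet: dict) -> bool:
        """Return False if the packet is dropped, True if it is queued."""
        if len(self.queue) >= self.capacity:
            return False          # buffer full: tail drop
        if len(self.queue) > self.k:
            packet["ce"] = True   # queue already above K: set the ECN mark
        self.queue.append(packet)
        return True


switch = EcnQueue(capacity=5, k=2)
packets = [{"id": i, "ce": False} for i in range(6)]
accepted = [switch.enqueue(p) for p in packets]
marked_ids = [p["id"] for p in packets if p["ce"]]
```

With these illustrative numbers, the first three packets pass unmarked, the next two are marked, and the sixth is dropped because the buffer is full. Marking starts well before the buffer overflows, which is what lets senders react early.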

The Solution


  • For each congestion window:
    • Count the occurrences of the ECN flag
    • Update the congestion average
    • Adjust the congestion window as needed
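The three steps above can be sketched on the sender side. The update rule is the one published for DCTCP: the congestion estimate alpha is a moving average of the fraction F of marked packets, alpha = (1 - g) * alpha + g * F, and the window is cut in proportion to it, cwnd = cwnd * (1 - alpha / 2). The class name `DctcpSender`, the initial window, and the simplified one-packet additive increase are assumptions for illustration; g = 1/16 is a commonly used weight.

```python
from dataclasses import dataclass


@dataclass
class DctcpSender:
    """Toy model of the DCTCP sender reaction (hypothetical names)."""

    cwnd: float = 10.0   # congestion window, in packets (assumed start value)
    alpha: float = 0.0   # running estimate of the fraction of marked packets
    g: float = 1.0 / 16  # weight given to the newest window of data

    def on_window_acked(self, acked: int, marked: int) -> None:
        """Call once per window of data with the ACK and ECN-mark counts."""
        if acked <= 0:
            return
        fraction = marked / acked                       # step 1: count marks
        self.alpha = (1 - self.g) * self.alpha + self.g * fraction  # step 2
        if marked > 0:
            # step 3: reduce in proportion to the extent of congestion
            self.cwnd = max(1.0, self.cwnd * (1 - self.alpha / 2))
        else:
            self.cwnd += 1.0                            # usual additive increase


heavy = DctcpSender(cwnd=100.0, alpha=1.0)
heavy.on_window_acked(acked=100, marked=100)  # every packet marked

light = DctcpSender(cwnd=100.0, alpha=0.0)
light.on_window_acked(acked=100, marked=10)   # 10% of packets marked
```

When every packet is marked and alpha is already 1, the window is halved exactly as in standard TCP. When only a few packets are marked, the window shrinks by a fraction of a percent, which is the "measured response" the previous slide refers to.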

The Solution


  • The rest of TCP is untouched:
    • Slow start
    • Additive increase in window size
    • Packet loss recovery


Very small patch, only a few dozen lines

The Results


  • Two types of benchmarks:
    • Micro-benchmarks
    • Benchmarks on a production cluster

The Results

Micro-Benchmarks - Incast


The Results

Micro-Benchmarks - Queue buildup


The Results

Micro-Benchmarks - Buffer Pressure




          Without background traffic   With background traffic
TCP       9.87                         46.94
DCTCP     9.17                         9.09
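As a quick reading of these four numbers, the relative effect of background traffic on each protocol can be computed directly. The slide does not state the unit or the exact metric, so the sketch below only compares ratios; the variable names are illustrative.

```python
# Values copied from the table: (without, with) background traffic.
results = {
    "TCP": (9.87, 46.94),
    "DCTCP": (9.17, 9.09),
}

# Ratio > 1 means the protocol got slower once background traffic was added.
slowdown = {name: with_bg / without_bg
            for name, (without_bg, with_bg) in results.items()}

for name, ratio in slowdown.items():
    print(f"{name}: x{ratio:.2f} with background traffic")
```

TCP's figure grows by a factor of almost five under background traffic, while DCTCP's stays essentially unchanged.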

The Results

Production Benchmarks








Questions?

DCTCP - Data Center TCP

By Yannick Péroux
