Data Center TCP

or DCTCP [1]

Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, Murari Sridharan

 Microsoft Research & Stanford University

SIGCOMM’10, August 30–September 3, 2010, New Delhi, India.

Ming YIN

21 - 11 - 2013

Data Center

A Hot Topic



TCP - Transmission Control Protocol

An Old Topic




A New Topic

For Internet Applications

To solve the impairments of TCP
caused by the characteristics of data centers


Extract the application patterns (characteristics) of Data Center.

Find the performance impairments of TCP used in Data Center.
  • Incast
  • Queue Buildup
  • Buffer Pressure

Propose and Implement DCTCP to solve these impairments.
  • High Burst Tolerance
  • Low Latency
  • High Throughput

Data Center

What is a Data Center?

"A data center is a facility used to house computer systems and associated components, such as telecommunications and storage systems."

- Wikipedia [2]

Many companies are using Data Centers

LINK [4]


High Availability
Infrastructure Scalability
Cost of Connection


Data Center Architecture


Cluster - Rack - Server

Commodity switches

Communications in Data Centers

Storage Query
Short flows

Big Data Computation
Short flows & Long flows

Data Update
Long flows

Data Center Workflow Characterization

A common application structure for Soft real-time applications:

Partition/Aggregate Pattern - Latency Sensitive - Bursty

Figure 1: A sample partition/aggregate pattern

Query Traffic

Partition/Aggregate Pattern - Delay Sensitive

A high-level aggregator
partitions queries to a large number of
mid-level aggregators, which in turn
partition each query over worker nodes.


Background Traffic

Update Flows (Large flows) & Short Messages (Small flows)

Large flows: throughput-sensitive

Small flows: delay-sensitive

Flow Concurrency

Large flows, short flows and bursty query traffic 
co-exist in a data center network.

Figure 2: Distribution of number of concurrent connections. [1]

Performance Impairments

Some background:

  • TCP dominates data center traffic: 99.91% of it [1].

  • Shared memory switches (shallow or deep buffered).

  • Packets are queued for an outgoing interface.

  • A packet is dropped if the buffer of a port is full.

Performance Impairments of TCP

Figure 3: Three main impairments. 
(a) Incast (b) Queue buildup (c) Buffer pressure

Queue - TCP

  • Packets may be dropped when the queue is full.
  • Packets are delayed when the queue builds up.
  • The buffer available to a port is limited, since long flows take up shared memory.

  • TCP is greedy in traffic consumption
  • TCP reacts to congestion too late
    • Explicit Congestion Notification
  • TCP cuts the window in half when congestion occurs


  1. Small queue occupancies
  2. High throughput

    1. Congestion Experienced (CE) codepoint marking
    2. TCP congestion window size limitation
    3. Multi-bit feedback derived from the single-bit sequence of marks
    4. Reuse the ECN machinery

Simple Marking at the Switch

A single threshold parameter: K
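
The marking rule can be sketched as a toy queue model. This is a minimal illustration, not switch firmware: the class name `MarkingQueue`, the packet dictionaries, and counting `k` and `capacity` in packets rather than bytes are all assumptions made for the example. An arriving packet gets the CE mark when the instantaneous queue occupancy it sees exceeds K; a packet arriving at a full buffer is dropped.

```python
from collections import deque


class MarkingQueue:
    """Toy model of one switch port queue with threshold marking.

    k and capacity are counted in packets for simplicity (an assumption;
    a real switch would typically account in bytes or cells).
    """

    def __init__(self, k, capacity):
        self.k = k
        self.capacity = capacity
        self.packets = deque()

    def enqueue(self, packet):
        """Enqueue a packet dict; return False if it is tail-dropped."""
        if len(self.packets) >= self.capacity:
            return False  # buffer full: drop the arriving packet
        # Mark CE if the occupancy seen on arrival is greater than K.
        if len(self.packets) > self.k:
            packet["ce"] = True
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the packet at the head of the queue, or None."""
        return self.packets.popleft() if self.packets else None
```

With `k=2` and `capacity=4`, the first three arrivals into an empty queue pass unmarked, the fourth is marked, and a fifth arrival with nothing dequeued is dropped.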


ECN-Echo at the Receiver

Receiver sets the ECN-Echo flag on the ACK
of every packet that arrives with CE marked.
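
A minimal sketch of this echo rule follows, assuming the receiver sends one ACK per data packet. The function `build_ack`, the dictionary fields, and the packet-count ACK numbering are hypothetical names chosen for illustration; delayed ACKs would need extra handling that is not shown here.

```python
def build_ack(data_packet):
    """Return an ACK dict that echoes the CE mark of one data packet.

    Assumes one ACK per data packet (no delayed ACKs).
    """
    return {
        # Hypothetical numbering: acknowledge the next expected packet.
        "ack": data_packet["seq"] + 1,
        # ECN-Echo is set exactly when the data packet carried CE.
        "ece": bool(data_packet.get("ce", False)),
    }
```

Because each ACK reflects exactly one packet's mark, the sender can reconstruct the sequence of marks seen at the switch.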

Controller at the Sender

An estimation of the fraction of packets that are marked: 𝜶

F is the fraction of packets marked in the last window of data.
g is the weight given to new samples

𝜶 ← ( 1 - g ) × 𝜶 + ( g × F )
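
The update above is an exponentially weighted moving average and can be written in a few lines. This is a sketch only: the function name `update_alpha`, its argument names, and the default `g = 1/16` are assumptions made for the example, not values fixed by these slides.

```python
def update_alpha(alpha, marked, total, g=1 / 16):
    """Apply alpha <- (1 - g) * alpha + g * F once per window of data.

    marked / total is F, the fraction of packets marked in the last window.
    The default g is an illustrative choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= marked <= total:
        raise ValueError("marked must be between 0 and total")
    f = marked / total
    return (1 - g) * alpha + g * f
```

A small g makes 𝜶 change slowly, so a single heavily marked window only nudges the estimate; persistent marking drives 𝜶 toward 1, and a run of unmarked windows decays it toward 0.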

Controller at the Sender (cont)

𝜶 estimates the probability that the queue size is greater than K.

Congestion window size: cwnd 
cwnd ← cwnd × ( 1 - 𝜶 / 2 )


React in proportion to the extent of congestion.
Not its presence.
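
The window reduction can be sketched as below. The function name `reduce_cwnd` and the `min_cwnd` floor are assumptions added for the example; only the formula cwnd ← cwnd × (1 - 𝜶 / 2) comes from the slide.

```python
def reduce_cwnd(cwnd, alpha, min_cwnd=1.0):
    """Apply cwnd <- cwnd * (1 - alpha / 2), with an illustrative lower bound.

    alpha is the sender's estimate of the marked fraction, in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return max(min_cwnd, cwnd * (1 - alpha / 2))
```

With 𝜶 = 1 (every packet marked) the window is halved, which matches the standard TCP reaction. With 𝜶 = 0.5 it shrinks by a quarter, and with 𝜶 = 0 it is left unchanged.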

Table 1: How congestion window is updated with different ECN marks. [4]


1. Incast
  • Large buffer headroom → Burst fits
  • Aggressive marking → Earlier reaction to the congestion

2. Queue buildup
  • Small buffer occupancies → Smaller queueing delay

3. Buffer pressure
  • A port's queue does not grow exceedingly large


Figure 4: Queue size of TCP and DCTCP in action [1].

Setup: Win 7, Broadcom 1Gbps Switch 

Scenario: 2 long-lived flows, K = 30KB 

Cluster Benchmark Traffic

Figure 5: Measured results of DCTCP and TCP under
emulated traffic within one rack of a Bing cluster [5].

Implications and recent work

A fluid analysis model of DCTCP
Stability, convergence and fairness [6]

HULL (High-bandwidth Ultra-Low Latency) architecture 
Head room for latency sensitive traffic [7]


DCTCP is shown to provide
  1. High Burst Tolerance
  2. Low Latency
  3. High Throughput

  1. Small changes to TCP → Easy to understand and test
  2. Reuse existing mechanisms → Applicable


[1] Alizadeh, Mohammad, et al. "Data center tcp (dctcp)." ACM SIGCOMM Computer Communication Review 40.4 (2010): 63-74.
[2] "Data center." Wikipedia, The Free Encyclopedia. Wikimedia Foundation, Inc. 22 July 2004. Web. 10 Aug. 2004.
[3] "Map of Uptime Institute Tier Certified Data Centers." Uptime Institute, the Data Center Authority. 15 April 2011. Retrieved from
[4] "Companies." Data Center Knowledge. Retrieved from
[5] Mohammad Alizadeh, "Data Center TCP (DCTCP) ." IETF Talk. Retrieved from
[6] Alizadeh, Mohammad, Adel Javanmard, and Balaji Prabhakar. "Analysis of DCTCP: stability, convergence, and fairness." Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems. ACM, 2011.
[7] Alizadeh, Mohammad, et al. "Less is more: Trading a little bandwidth for ultra-low latency in the data center." Proc. of NSDI. 2012.

