瞿旭民(kevinbird61)
Institute of Computer Science and Information Engineering, NCKU
Apr 26, 2018
@NCKU-IMSLab
Why do we need "Heavy-Hitter" detection?
Background on "Heavy-Hitter" detection
Problem formulation
Existing solutions
The new way - HashPipe Algorithm
Inspired by the "Space Saving Algorithm" and "Count-Min Sketch"
Evaluation of HashPipe
Tuning, accuracy, comparison
Conclusion
What is Heavy-Hitter (hereinafter "H.H.") detection?
Why do we need it?
Heavy hitters: the "high-volume" flows that carry a large share of traffic in a large network.
Identifying H.H. flows in the data plane is important for several applications:
Currently, measurement in the data plane is constrained by the line rate (10~100 Gb/s) of the target switching hardware and by its limited memory.
Existing solutions for monitoring heavy items struggle to reach a reasonable detection accuracy under acceptable resource consumption.
So this paper provides another solution, "HashPipe", which solves the problem entirely in the data plane (P4 + emerging programmable data planes).
Background of H.H. detection
Using "per-packet" operation, like hashing the packet headers, increase counter of hashed locations, and find out the minimal or median value among the small group of hashed counters.
But these sketching algorithm do not track the "flow ID" of packets; And hash collisions make it challenging to "invert" (e.g. 轉換) the sketch into the constituent flows and counters.
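To make the sketching idea concrete, here is a minimal Count-Min-Sketch-style example in Python (my own illustration, not code from the paper; the class and method names are mine). Note that it can only estimate the count of a key you already know — it stores no flow IDs, so it cannot list the heavy flows by itself.
# python
import random

class CountMinSketch:
    """Minimal Count-Min Sketch: `rows` rows of `width` counters, one hash per row."""
    def __init__(self, rows=4, width=1024, seed=0):
        rng = random.Random(seed)
        self.seeds = [rng.getrandbits(32) for _ in range(rows)]
        self.width = width
        self.counters = [[0] * width for _ in range(rows)]

    def _index(self, row_seed, key):
        return hash((row_seed, key)) % self.width

    def update(self, key, count=1):
        # Per-packet operation: hash the flow key once per row and
        # increment the counter at each hashed location.
        for row_seed, row in zip(self.seeds, self.counters):
            row[self._index(row_seed, key)] += count

    def estimate(self, key):
        # Report the minimum among the hashed counters; collisions can
        # only inflate the estimate, never deflate it.
        return min(row[self._index(row_seed, key)]
                   for row_seed, row in zip(self.seeds, self.counters))

cms = CountMinSketch()
cms.update("10.0.0.1")
print(cms.estimate("10.0.0.1"))   # >= 1 (exact unless a collision occurred)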
So we need a solution that keeps flow keys decodable (unlike hash-based sketches) and can still run on the fast packet-processing path: "counter-based algorithms".
These use only O(k) counters to track the k heaviest flows, achieving the best memory utilization among heavy-hitter algorithms.
How does it work:
[Figure: Space Saving worked example — three successive packets with Flow ID = k arrive; the table snapshots show k's counter growing with each packet alongside existing entries such as e: 3, j: 1, and i: 4.]
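A minimal Python sketch of the Space Saving update rule described above (my own illustration; the function name and the toy table values are mine, chosen to resemble the figure rather than copied from it):
# python
def space_saving_update(table, key, m):
    """Space Saving: keep at most m (key, count) entries in `table` (a dict)."""
    if key in table:
        table[key] += 1                 # key already tracked: just increment
    elif len(table) < m:
        table[key] = 1                  # free slot: insert with count 1
    else:
        # Table full: evict the minimum-count entry and give the new key
        # that minimum count + 1.
        min_key = min(table, key=table.get)
        table[key] = table.pop(min_key) + 1

# Toy example: with a full 3-slot table, an arriving flow "k" evicts the
# minimum entry ("j": 1) and enters with count 2; two more "k" packets
# raise it to 4.
table = {"e": 3, "j": 1, "i": 4}
for _ in range(3):
    space_saving_update(table, "k", m=3)
print(table)   # {'e': 3, 'i': 4, 'k': 4}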
HashPipe Algorithm
The first thing to know: HashPipe is derived from the "Space Saving algorithm", with some modifications to fit the constraints of switch hardware implementations.
To track the k heaviest items, it maintains a table of m slots, each holding a key and its counter, starting from an empty state.
Now let's see how the algorithm is derived!
With the knowledge of Space Saving, we can start with some simple modifications.
First, we choose a number d as the number of slots sampled per packet, so the number of memory accesses is bounded by d.
This gives a modified version, "HashParallel", with the following features:
In HashParallel, instead of scanning the entire table, each packet is hashed to d slots (by d hash functions), and the Space Saving update is applied only among those d slots, reducing the number of memory accesses.
However, that is still d reads and 1 write per packet, and emerging switch hardware cannot perform several reads/writes on the same table for one packet.
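A rough Python approximation of the HashParallel update just described (my own sketch, not the paper's code; the helper names are mine). It shows where the d reads and the single write happen:
# python
def hash_parallel_update(slots, hash_seeds, key):
    """HashParallel-style update: sample d = len(hash_seeds) slots and apply
    the Space Saving rule only among those d slots.
    `slots` is one table, a list of [key, count] entries or None (empty)."""
    d_locations = [hash((seed, key)) % len(slots) for seed in hash_seeds]

    # d reads: if the key already sits in one of its d slots, increment it.
    for idx in d_locations:
        if slots[idx] is not None and slots[idx][0] == key:
            slots[idx][1] += 1                       # the single write
            return

    # Otherwise replace the minimum-count entry among the d sampled slots
    # (an empty slot counts as 0); the new key takes that minimum + 1.
    def count_at(idx):
        return 0 if slots[idx] is None else slots[idx][1]

    victim = min(d_locations, key=count_at)
    slots[victim] = [key, count_at(victim) + 1]      # the single write

slots = [None] * 1024
hash_parallel_update(slots, hash_seeds=[11, 22, 33], key="10.0.0.1")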
How do we solve this? Can we reduce the access cost to 1 read and 1 write?
We need "multiple stages" of hash tables!
Having reduced the number of slots examined per packet, we now split the d sampled slots across d pipeline stages to eliminate multiple reads/writes on any one table: each stage performs exactly one read and one write.
In each stage, the slot to access is determined by that stage's own hash function.
So the scheme divides into 2 steps:
A pipeline with d stages, each performing one read and one write per packet.
In each stage, the (carried) key is hashed to a location and compared against the entry there (in the first stage, the incoming packet's key is always inserted).
The entry that gets evicted carries its key and count together on to the next stage; a software sketch of this pipeline follows.
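Here is a software approximation of that pipeline in Python (my own sketch; the real data-plane program, in P4_14, appears later in this post). Each stage does exactly one read and at most one write, and the evicted entry is carried to the next stage:
# python
def hashpipe_update(stages, hash_seeds, key):
    """HashPipe-style update over d = len(stages) pipeline stages.
    Each stage is a list of [key, count] entries or None (empty slot),
    with its own hash seed."""
    carried_key, carried_count = key, 1

    for stage_no, (table, seed) in enumerate(zip(stages, hash_seeds)):
        idx = hash((seed, carried_key)) % len(table)
        entry = table[idx]                            # the single read

        if entry is None:                             # empty slot: insert, done
            table[idx] = [carried_key, carried_count]
            return
        if entry[0] == carried_key:                   # same key: merge counts, done
            entry[1] += carried_count
            return

        if stage_no == 0 or carried_count > entry[1]:
            # The first stage always inserts the incoming key; later stages
            # keep whichever entry has the larger count.
            table[idx] = [carried_key, carried_count]     # the single write
            carried_key, carried_count = entry            # evicted entry moves on
        # else: the carried (smaller) entry goes on to the next stage unchanged.
    # An entry still carried after the last stage is simply dropped.

stages = [[None] * 256 for _ in range(6)]             # d = 6 stages
hashpipe_update(stages, hash_seeds=[1, 2, 3, 4, 5, 6], key="10.0.0.1")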
Evaluation
This paper uses two real packet traces:
| | ISP backbone | Data center |
|---|---|---|
| record date | 2016 | 2010 |
| duration | 17 min | 2.5 hr |
| flow key for detection | 5-tuple | src, dst IP |
| total packets | 400 million | 100 million |
Both traces are replayed at a high packet rate into HashPipe, and the accuracy is examined over a 1-second reporting duration.
Packet size = 850 bytes
Network utilization = 30%
Switch = 1 GHz, 48 ports of 10 Gb/s => about 20 million packets per second for the entire switch and about 410 K packets per second per link; we use one second as the reporting duration.
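A quick back-of-the-envelope check of these rates (my own arithmetic, assuming the 850-byte packets and 30% utilization stated above; the results land close to the ~410 K and ~20 M figures after rounding):
# python
packet_size_bits = 850 * 8      # 850-byte packets
utilization = 0.30              # 30% network utilization
port_rate_bps = 10e9            # 10 Gb/s per port
ports = 48

per_link_pps = port_rate_bps * utilization / packet_size_bits
switch_pps = per_link_pps * ports

print(f"per link: {per_link_pps:,.0f} packets/s")   # ~441,000 per second
print(f"switch:   {switch_pps:,.0f} packets/s")     # ~21 million per second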
With the experiment set up, we can define some useful metrics to observe.
3 useful metrics: the false negative rate, the false positive rate, and the estimation error of the reported counts.
Then we can start the evaluation along the different aspects mentioned before.
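As a sketch of how these metrics can be computed from the algorithm's report versus the exact counts of a trace (one reasonable set of definitions; the paper's exact definitions may differ in detail, and the helper below is mine):
# python
def evaluate(reported, true_counts, k):
    """reported / true_counts: dicts mapping flow ID -> (estimated / exact) count."""
    true_topk = set(sorted(true_counts, key=true_counts.get, reverse=True)[:k])
    reported_topk = set(sorted(reported, key=reported.get, reverse=True)[:k])

    false_negative_rate = len(true_topk - reported_topk) / k
    false_positive_rate = len(reported_topk - true_topk) / k

    # Average relative error on the true heavy hitters that were reported.
    hits = true_topk & reported_topk
    estimation_error = (sum(abs(reported[f] - true_counts[f]) / true_counts[f]
                            for f in hits) / len(hits)) if hits else None
    return false_negative_rate, false_positive_rate, estimation_error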
Evaluation - Tuning HashPipe
(Adjusting the number of table stages under a fixed memory budget)
In this case, the parameter being tuned is the number of table stages, d.
Consider a fixed, limited memory m:
If d increases, the number of the heaviest keys retained increases, too.
But m is fixed, so as d increases the number of slots in each stage decreases. That makes hash collisions more frequent and also increases the amount of key duplication across stages. The per-stage logic, as written in P4_14 in the paper, is shown below.
# p4_14
action doStage1(){
mKeyCarried = ipv4.srcAddr;
mCountCarried = 0;
// Get the location from hash table, and store into mIndex
modify_field_with_hash_based_offset(mIndex,0,stage1Hash,32);
// read the key and value at that location
mKeyTable = flowTracker[mIndex];
mCountTable = packetCount[mIndex];
mValid = validBit[mIndex];
// check for empty location or different key
mKeyTable = (mValid == 0) ? mKeyCarried : mKeyTable;
mDif = (mValid == 0) ? 0 : mKeyTable - mKeyCarried;
// update hash table - write back
flowTracker[mIndex] = ipv4.srcAddr;
packetCount[mIndex] = (mDif == 0) ? mCountTable+1 : 1;
validBit[mIndex] = 1;
// update metadata carried to the next table stage
mKeyCarried = (mDif == 0) ? 0 : mKeyTable;
mCountCarried = (mDif == 0) ? 0 : mCountTable;
}
# p4_14
action doStage2(){
...
mKeyToWrite = (mCountTable < mCountCarried) ? mKeyCarried : mKeyTable;
flowTracker[mIndex] = (mDif == 0) ? mKeyTable : mKeyToWrite;
mCountToWrite = (mCountTable < mCountCarried) ? mCountCarried : mCountTable;
packetCount[mIndex] = (mDif == 0)? (mCountTable + mCountCarried): mCountToWrite;
mBitToWrite = (mKeyCarried == 0) ? 0 : 1;
validBit[mIndex] = (mValid == 0) ? mBitToWrite : 1;
...
}
Evaluation - Accuracy of HashPipe
Based on the tuning results in Part I, d = 6 is picked for the accuracy tests.
Memory usage is varied across different numbers of reported heavy hitters k.
We can observe a few phenomena:
1. Beyond about 80 KB of memory (about 4,500 counters), the improvement between the different cases on the ISP backbone trace becomes small.
2. A similar effect appears at about 9 KB of memory (about 520 counters) on the data-center trace.
Analysis in terms of false positives:
Evaluation - Comparing HashPipe with existing solutions
Compared with sampling and sketching methods (which can also run on switch hardware to maintain counters).
Experiment method:
(The original sketching algorithm does not provide the flow IDs of packets for tracking.)
All methods are run with the same memory usage: 26 KB.
The estimation error made by Sample and Hold:
The estimation error made by the Count-Min Sketch:
Evaluation - Comparing HashPipe with idealized schemes
Compare with "Space saving algorithm" and "HashParallel".
There are some reasons why "Space Saving" can outperform "HashPipe":
Two scenarios are used, k = 60 and k = 150, with different memory sizes, to compare their false negative rates.
Across these two scenarios, we can see two features:
Why can Space Saving be worse than HashPipe at catching heavy items?
For k = 150, the memory size used (m = 1200) is smaller than the threshold (m = 2500), and because Space Saving increments a counter on every packet (the evicted minimum's count plus one), the counters of small flows can catch up to those of heavy flows; after many distinct small flows have each replaced the minimum entry, the minimum counter keeps rising even though no single small flow is heavy.
This leads to significant false positives, and truly heavy flows may be evicted from the table.
We use the parameters k = 300, m = 2400 as the experiment environment.
Here, duplicate keys occupying table slots cause HashPipe to have a higher false negative rate.
Conclusion