Lock-Free Programming in a DB Connection Pool
whoami
- Shay Rojansky
- Long-time Linux/open source developer/enthusiast (installed Slackware somewhere in 1994/5)
- Lead dev of Npgsql
- .NET driver for PostgreSQL (took over in 2012)
- Provider for Entity Framework Core
- Joined Microsoft's data access team in January
whois Npgsql
- Open-source .NET driver for PostgreSQL
- Implements ADO.NET (System.Data.Common)
- Supports fully async (but also sync)
- Fully binary driver
- Fully-featured (PostgreSQL special types, data import/export API, notification support...)
Focus on high performance...
"Database drivers are boring"
- Everyone, including me in 2012
Perf in .NET Core 2.1 and Npgsql 4.0
TechEmpower Fortunes results, round 17
Lock-Free Database Connection Pool
What is a DB connection pool?
- Physical DB connections are very expensive to create - especially in PostgreSQL (process spawned in the backend).
- If we open/close all the time, the app grinds to a halt.
- So the driver includes a pool.
- Pooled Open() and Close() must be very efficient (connection sometimes held for very short, e.g. to service one web request)
- It's also the one place where synchronization happens: once a connection is taken out there's no thread-safety.
Identifying Contention Bottlenecks
- Low local CPU usage (was only 91%)
- Network bandwidth far below saturation (10Gbps Ethernet, very little iowait)
- No resource starvation on the PostgreSQL side
- Profiler will show Monitor.Enter()
- Try reducing lock scopes if possible (already optimized)
Lock-Free Programming
- Lock-free programming means, well, no locking.
- All operations are interleaved, all the time, everywhere.
- "Interlocked" CPU instructions are used to perform complex operations atomically.
- It's really, really hard to get right.
Simple Rent
Connection[] _idle = new Connection[PoolSize];
public Connection Rent()
{
for (var i = 0; _idle.Length; i++)
{
// If the slot contains null, move to the next slot.
if (_idle[i] == null)
continue;
// If we saw a connection in this slot, atomically exchange it with a null.
// Either we get a connection out which we can use, or we get null because
// someone has taken it in the meantime. Either way put a null in its place.
var conn = Interlocked.Exchange(ref _idle[i], null);
if (conn != null)
return conn;
}
// No idle connection was found, create a new one
return new Connection(...);
}
Simple Return
public void Return(Connection conn)
{
for (var i = 0; i < _idle.Length; i++)
if (Interlocked.CompareExchange(ref _idle[i], conn, null) == null)
return;
// No free slot was found, close the connection...
conn.Close();
}
This simple implementation was used as a proof-of-concept.
Let's take it up a notch
- Since physical connections are very expensive, we need to enforce an upper limit (think multiple app instances, one DB backend).
- Once the max limit has been reached, wait until a connection is released.
- Release waiters in FIFO, otherwise some get starved.
- So add a waiting queue next to our idle list: if no idle conn was found, enqueue a TaskCompletionSource.
- When releasing, first check the waiting queue to release any pending attempts. Otherwise return to the idle list.
Two data structures, no locks
- One idle list, one waiting queue. By the time we finish updating one, the other may have changed.
- Example: rent attempt enters waiting queue, at the same time as a release puts a connection back in the idle list. Rent attempt waits forever.
- Need a way to have a single, atomically-modified state (how many busy, idle, waiting)
Atomically manipulate state
[StructLayout(LayoutKind.Explicit)]
internal struct PoolState
{
[FieldOffset(0)]
internal short Idle;
[FieldOffset(2)]
internal short Busy;
[FieldOffset(4)]
internal short Waiting;
[FieldOffset(0)]
internal long All;
}
// Create a local copy of the pool state, in the stack
PoolState newState;
newState.All = _state.All;
// Modify it
newState.Idle--;
newState.Busy++;
// Commit the changes
_state.All = newState.All;
Problem: we have to have multiple counters (idle, busy, waiting), and manipulate them atomically.
But two operations may be racing, and we lose a change...
Handling race conditions
var sw = new SpinWait();
while (true)
{
// Create a local copy of the pool state, in the stack
PoolState oldState;
oldState.All = Volatile.Read(ref _state.All); // Volatile prevents unwanted "optimization"
var newState = oldState;
// Modify it
newState.Idle--;
newState.Busy++;
// Try to commit the changes
if (Interlocked.CompareAndExchange(ref _state.All, newState.All, oldState.All) == oldState.All)
break;
// Our attempt to increment the busy count failed, Loop again and retry.
sw.SpinOnce();
}
- This is how a lot of lock-free code looks like: try until success.
- SpinWait is a nice tool for spinning, and will automatically context switch after some time.
- Beware of the ABA problem!
Very subtle issues
- An open attempt is creating a new connection
(idle == 0, total < max). It does busy++. - But a network error occurred during open. State has already been updated (busy++), need to decrement it.
- But since we did busy++, maybe busy was max; someone may have gone into the waiting queue - so we need to un-wait them.
Be careful
- Always keep perf in mind, but resist the temptation to over- (or pre-) optimize.
- Reducing roundtrips and preparing commands is much more important than reducing an allocation or a doing fancy low-level stuff.
- (even if it is much less sexy)
Conclusion
https://medium.com/@tylerneely/fear-and-loathing-in-lock-free-programming-7158b1cdd50c
8% perf increase
But probably the most dangerous 8% I've ever done.
Some Final Remarks
The ADO.NET API
- The API hasn't received significant attention in a long time.
- Batching API
- Generic parameters to avoid boxing
- Better support for prepared statements
- However, it's still possible to do remarkably well on ADO.NET (no need to throw it out entirely).
- The TechEmpower fortunes benchmark is one of the more realistic ones (e.g. involves DB)
- But still, how many webapps out there send identical simple queries over a 10Gbps link to PostgreSQL, for a small dataset that is cached in memory?
- Reducing contention is important as number of cores increases
- Performance is usually about tradeoffs: code is more brittle/less safe (e.g. lock-free)
Is this all useful in the real world?
- No good answer, especially difficult to decide when you're a library...
- Your spouse & family may hate you, but...
- You learn a ton of stuff and have lots of fun in the process.
So, is this all useful in the real world??
Thank you!
https://slides.com/shayrojansky/netherlands-dncse-2019-6-22
http://npgsql.org, https://github.com/npgsql/npgsql
Shay Rojansky, roji@roji.org, @shayrojansky
Netherlands .NET Core summer event presentation
By Shay Rojansky
Netherlands .NET Core summer event presentation
- 1,555