Shay Rojansky
Long-time Linux/open source developer/enthusiast (installed Slackware somewhere in 1994/5)
Owner of Npgsql
PostgreSQL driver
Entity Framework (O/RM) providers
Microsoft MVP (for open source .NET PostgreSQL work)
.NET Core: cross-platform .NET.
MIT license, developed on github.
Microsoft has the most open source contributors on GitHub (Octoverse 2016)
ASP.NET Core in top 10 TechEmpower plaintext benchmark, on Linux.
Introduction
Npgsql Driver Features
Entity Framework Core – O/RM Features for PostgreSQL
Driver/Protocol Internals
Closing Words
Features and Performance
using (var conn = new NpgsqlConnection("Host=localhost;Database=mydatabase;..."))
{
conn.Open();
using (var cmd = new NpgsqlCommand("SELECT 'hello world'", conn))
Console.WriteLine(cmd.ExecuteScalar());
using (var cmd = new NpgsqlCommand("SELECT * FROM foo", conn))
using (var reader = cmd.ExecuteReader())
while (reader.Read())
Console.WriteLine(reader.GetString(0));
}
Layers (e.g. Dapper) make this nicer and more idiomatic
PostgreSQL supports a rich set of non-standard data types:
Domain-specific simple types: inet, macaddr, circle…
Array
Range
Enum
Composite
New types can be added via extensions
var mac = reader.GetFieldValue<PhysicalAddress>(0);
var rng = reader.GetFieldValue<NpgsqlRange<int>(0);
var arr = reader.GetFieldValue<int[]>(0);
// PG: CREATE TYPE mood AS ENUM ('happy', 'sad', 'ok')
enum Mood { Happy, Sad, OK }
NpgsqlConnection.MapEnumGlobally<Mood>();
var mood = reader.GetFieldValue<Mood>(0);
// PG: CREATE TYPE my_composite AS (x int, some_text text)
class MyComposite {
public int x { get; set; }
public string SomeText { get; set; }
}
NpgsqlConnection.MapCompositeGlobally<MyComposite>();
var comp = reader.GetFieldValue<MyComposite>(0);
var cmd = new NpgsqlCommand("UPDATE x SET y=8 WHERE z=9", conn);
cmd.ExecuteNonQuery(); // The thread is stuck
Console.WriteLine("Done");
Unfortunately, async programming usually sucks (NodeJS):
client.connect(function (err) {
if (err) throw err;
// execute a query on our database
client.query('SELECT $1::text as name', ['brianc'], function (err, result) {
if (err) throw err;
// just print the result to the console
console.log(result.rows[0]); // outputs: { name: 'brianc' }
// disconnect the client
client.end(function (err) {
if (err) throw err;
});
});
});
C# supports async operations in the language:
async Task FooAsync() {
var cmd = new NpgsqlCommand("UPDATE x SET y=8 WHERE z=9", conn);
await cmd.ExecuteNonQueryAsync();
Console.WriteLine("Done");
}
In PostgreSQL, preparing statements is important:
SELECT table0.data+table1.data+table2.data+table3.data+table4.data
FROM table0
JOIN table1 ON table1.id=table0.id
JOIN table2 ON table2.id=table1.id
JOIN table3 ON table3.id=table2.id
JOIN table4 ON table4.id=table3.id
Unprepared:
TPS: 953
Prepared:
TPS: 8722
Prepared statements + short-lived connections =
In Npgsql 3.2, prepared statements are persisted:
using (var conn = new NpgsqlConnection(...))
using (var cmd = new NpgsqlCommand("<some_sql>", conn) {
conn.Open();
conn.Prepare();
// The first time on this physical connection, preparation takes place.
// After that, the same prepared statement is reused.
conn.ExecuteNonQuery();
}
Prepared statements + O/RMs =
Npgsql 3.2 includes automatic preparation: all the benefits, no need to call Prepare().
Statement is prepared after being executed 5 times.
We eject least-recently used statements to make room for new ones.
NPGSQL
POSTGRESQL
INSERT INTO...
OK
INSERT INTO...
OK
NPGSQL
POSTGRESQL
INSERT INTO...; INSERT INTO ...
OK; OK
The performance improvement is dramatic.
The further away the server is, the more the speedup (think cloud).
Supported for all CRUD operations, not just insert.
For bulk insert, binary COPY is an even better option.
Entity Framework Core takes advantage of this.
var cmd = new NpgsqlCommand("INSERT INTO ...; INSERT INTO ...", conn);
cmd.ExecuteNonQuery();
Buffer resultsets? Rows? Columns??
JDBC/psycopg2 buffer resultsets by default
Npgsql has fixed send/receive buffers
.NET has API for sequential column access
Read/write should never copy needlessly
Fully binary data transfer
Binary COPY
Waiting for notifications
Kerberos login (Windows/Linux), SSL/TLS
Distributed transactions
Exposing PostgreSQL-Specific Features in a Generic O/RM
Entity Framework (currently 6.x) is Microsoft’s O/RM, first released in 2008.
EF Core is a total rewrite, focusing on cross-platform/cross-database support, a lighter, more modular design and NoSQL support.
Released with .NET Core, MIT-licensed, works on Linux/MacOS…
Microsoft maintains providers for SQL Server and Sqlite, the community does others.
Disadvantages:
No compile-time syntax checking.
If some column name changes, boom.
Not such a great advantage over hand-coding SQL.
Some string language (e.g. Hibernate HQL):
String hql = "FROM Employee E WHERE E.id > 10 ORDER BY E.salary DESC";
Query query = session.createQuery(hql);
List results = query.list();
Notice the ‘and_’ – very important (can’t use Python and).
Using an API (Python’s SQLAlchemy):
for name, in session.query(User.name).\
... filter(and_(User.name.like('Shay%'), User.Age > 18)).\
... order_by(User.id)
Hibernate's Criteria API:
List cats = sess.createCriteria(Cat.class)
.add( Restrictions.like("name", "Fritz%") )
.add( Restrictions.or(
Restrictions.eq( "age", new Integer(0) ),
Restrictions.isNull("age")
) )
.list();
C# has Language Integrated Query built into the language:
List<int> list = ...;
var nums = list
.Where(i => i % 2 == 0)
.OrderBy(i => i)
.Take(10)
.Select(i => i * 2)
.ToList();
var myBlogs =
from b in blogs
where b.Name.StartsWith("B")
orderby b.LastUpdate
select b;
The same building blocks work on database tables:
var names = ctx.Users
.Where(u => u.Name.Contains("Shay") && u.Age > 18)
.OrderBy(u => u.Id)
.ToList();
There’s an Npgsql provider for Entity Framework Core.
A provider is needed to support PostgreSQL-specific things: SQL dialect, types, etc.
Since O/RMs are a generic abstraction layer, they hide database-specific features.
But EF Core makes it possible to expose PostgreSQL-specific functionality.
Mapping totally PostgreSQL-specific types:
public class Blog
{
public int Id { get; set; }
public string Name { get; set; }
public IPAddress Address { get; set; }
public string[] Tags { get; set; }
}
Simple
PostgreSQL-specific
type
Not so simple
PostgreSQL-specific
type
PostgreSQL has great built-in support for regular expressions:
SELECT * FROM blogs WHERE name ~ '^\+?[\d\s]+\(?[\d\s]{10,}\)?$'
Wouldn’t it be nice to expose this in an O/RM?
var myBlogs =
from b in blogs
where Regex.IsMatch(b.Name, "a+b")
select b;
This code will run as-is against another database, but will be client-evaluated.
This transfers all blog details over the network.
modelBuilder
.Entity<Blog>()
.HasIndex(e => e.SomeField)
.ForNpgsqlHasMethod("gin");
EF Core allows arbitrary "annotations" on your model, which translate into DDL.
Creating your model:
EF Core development happens on github.
Work on the PostgreSQL provider started very early, when everything was in early beta.
Input from PostgreSQL helped shape EF Core – the Microsoft team was very accessible and extremely enthusiastic/receptive.
In addition to discussions, many PRs have made into the core.
NPGSQL
POSTGRESQL
BEGIN
OK
INSERT INTO...
OK
var tx = conn.BeginTransaction(); // Sends BEGIN
// do stuff in the transaction
BeginTransaction() simply writes the BEGIN into the write buffer, but doesn’t send. It will be sent with the next query.
Another example: when closing a (pooled) connection, we need to reset its state.
This is also prepended, and will be sent when the connection is “reopened” and a query is sent.
NPGSQL
Parse
Bind
Describe
Execute
Sync
Parse
Complete
Bind
Complete
Row
Description
Data
Row
Data
Row
Ready For
Query
POSTGRESQL
NPGSQL
Bind1
POSTGRESQL
Exec1
Bind1
Exec1
Sync
BC1
Data1
Data1
BC2
Data2
Data2
RFQ
PG starts sending result here
If these are big...
One option is to use non-blocking I/O. We send protocol messages until a send “would block”. We switch to reading, and send the rest later.
Unfortunately, .NET I/O doesn’t support non-blocking I/O very well.
Instead, later protocol messages are sent asynchronously – sending and receiving happens at the same time.
Microsoft isn’t the bad guy anymore – full commitment to open-source and cross-platform.
.NET Core is a real option, whatever OS you’re using.
C# (and other .NET languages) is a very modern language that continues to evolve very quickly.
PostgreSQL is fully supported in .NET, Npgsql provides high-performance and feature-rich access.
@shayrojansky
roji@roji.org
https://slides.com/shayrojansky/npgsql-2017-01-tlv