Npgsql -
PostgreSQL for .NET

whoami

  • Shay Rojansky

  • Long-time Linux/open source developer/enthusiast (installed Slackware somewhere in 1994/5)

  • Owner of Npgsql

    • PostgreSQL driver

    • Entity Framework (O/RM) providers

  • Microsoft MVP (for open source .NET PostgreSQL work)

whois microsoft

Open-source, not just Windows anymore

  • .NET Core: cross-platform .NET.

  • MIT license, developed on github.

  • Microsoft has the most open source contributors on GitHub (Octoverse 2016)

  • ASP.NET Core in top 10 TechEmpower plaintext benchmark, on Linux.

Contents

  • Introduction

  • Npgsql Driver Features

  • Entity Framework Core – O/RM Features for PostgreSQL

  • Driver/Protocol Internals

  • Closing Words

PostgreSQL Driver for .NET

Features and Performance

Hello World

using (var conn = new NpgsqlConnection("Host=localhost;Database=mydatabase;..."))
{
    conn.Open();
    using (var cmd = new NpgsqlCommand("SELECT 'hello world'", conn))
        Console.WriteLine(cmd.ExecuteScalar());

    using (var cmd = new NpgsqlCommand("SELECT * FROM foo", conn))
    using (var reader = cmd.ExecuteReader())
        while (reader.Read())
            Console.WriteLine(reader.GetString(0));
}

Layers (e.g. Dapper) make this nicer and more idiomatic

Extended Type Support

  • PostgreSQL supports a rich set of non-standard data types:

    • Domain-specific simple types: inet, macaddr, circle…

    • Array

    • Range

    • Enum

    • Composite

  • New types can be added via extensions

Extended Type Support

var mac = reader.GetFieldValue<PhysicalAddress>(0); var rng = reader.GetFieldValue<NpgsqlRange<int>(0); var arr = reader.GetFieldValue<int[]>(0);
// PG: CREATE TYPE mood AS ENUM ('happy', 'sad', 'ok') enum Mood { Happy, Sad, OK } NpgsqlConnection.MapEnumGlobally<Mood>(); var mood = reader.GetFieldValue<Mood>(0);
// PG: CREATE TYPE my_composite AS (x int, some_text text) class MyComposite { public int x { get; set; } public string SomeText { get; set; } } NpgsqlConnection.MapCompositeGlobally<MyComposite>(); var comp = reader.GetFieldValue<MyComposite>(0);

Asynchronous Programming

  • In sync I/O, your thread blocks while you wait for your DB query to execute:
var cmd = new NpgsqlCommand("UPDATE x SET y=8 WHERE z=9", conn);
cmd.ExecuteNonQuery();  // The thread is stuck
Console.WriteLine("Done");
  • Bad for scalability – lots of waiting threads.
  • UI thread blocked -> freeze.
  • Async releases your thread to do other things.

Callback-Based Async

  • Unfortunately, async programming usually sucks (NodeJS):

client.connect(function (err) {
  if (err) throw err;

  // execute a query on our database
  client.query('SELECT $1::text as name', ['brianc'], function (err, result) {
    if (err) throw err;

    // just print the result to the console
    console.log(result.rows[0]); // outputs: { name: 'brianc' }

    // disconnect the client
    client.end(function (err) {
      if (err) throw err;
    });
  });
});

Language Support for Async

  • C# supports async operations in the language:

async Task FooAsync() {
	var cmd = new NpgsqlCommand("UPDATE x SET y=8 WHERE z=9", conn);
	await cmd.ExecuteNonQueryAsync();
	Console.WriteLine("Done");
}
  • The C# compiler will rewrite this function with callbacks under the hood.
  • This gives the benefits of async, while programming like it’s sync.

Prepared Statements

  • In PostgreSQL, preparing statements is important:

SELECT table0.data+table1.data+table2.data+table3.data+table4.data
FROM table0
JOIN table1 ON table1.id=table0.id
JOIN table2 ON table2.id=table1.id
JOIN table3 ON table3.id=table2.id
JOIN table4 ON table4.id=table3.id

Unprepared:

TPS: 953

Prepared:

TPS: 8722

Persisting Prepared Statements

  • Prepared statements + short-lived connections =

  • In Npgsql 3.2, prepared statements are persisted:

using (var conn = new NpgsqlConnection(...))
using (var cmd = new NpgsqlCommand("<some_sql>", conn) {
    conn.Open();
    conn.Prepare();
    // The first time on this physical connection, preparation takes place.
    // After that, the same prepared statement is reused.
    conn.ExecuteNonQuery();
}

Automatic Preparation

  • Prepared statements + O/RMs =

  • Npgsql 3.2 includes automatic preparation: all the benefits, no need to call Prepare().

  • Statement is prepared after being executed 5 times.

  • We eject least-recently used statements to make room for new ones.

Not Preparing Your Statements

Roundtrip per Statement

NPGSQL

POSTGRESQL

INSERT INTO...

OK

INSERT INTO...

OK

Batching/Pipelining

NPGSQL

POSTGRESQL

INSERT INTO...; INSERT INTO ...

OK; OK

Batching / Pipelining

  • The performance improvement is dramatic.

  • The further away the server is, the more the speedup (think cloud).

  • Supported for all CRUD operations, not just insert.

  • For bulk insert, binary COPY is an even better option.

  • Entity Framework Core takes advantage of this.

var cmd = new NpgsqlCommand("INSERT INTO ...; INSERT INTO ...", conn);
cmd.ExecuteNonQuery();

Buffering

  • Buffer resultsets? Rows? Columns??

  • JDBC/psycopg2 buffer resultsets by default

  • Npgsql has fixed send/receive buffers

  • .NET has API for sequential column access

  • Read/write should never copy needlessly

Other Features

  • Fully binary data transfer

  • Binary COPY

  • Waiting for notifications

  • Kerberos login (Windows/Linux), SSL/TLS

  • Distributed transactions

Entity Framework Core PostgreSQL Support

Exposing PostgreSQL-Specific Features in a Generic O/RM

What is EF Core?

  • Entity Framework (currently 6.x) is Microsoft’s O/RM, first released in 2008.

  • EF Core is a total rewrite, focusing on cross-platform/cross-database support, a lighter, more modular design and NoSQL support.

  • Released with .NET Core, MIT-licensed, works on Linux/MacOS…

  • Microsoft maintains providers for SQL Server and Sqlite, the community does others.

Querying in O/RMs - Not Fun

  • Disadvantages:

    • No compile-time syntax checking.

    • If some column name changes, boom.

    • Not such a great advantage over hand-coding SQL.

Some string language (e.g. Hibernate HQL):

String hql = "FROM Employee E WHERE E.id > 10 ORDER BY E.salary DESC";
Query query = session.createQuery(hql);
List results = query.list();

Better But Not Amazing

Notice the ‘and_’ – very important (can’t use Python and).

Using an API (Python’s SQLAlchemy):

for name, in session.query(User.name).\
...     filter(and_(User.name.like('Shay%'), User.Age > 18)).\
...     order_by(User.id)

?!?!

Hibernate's Criteria API:

List cats = sess.createCriteria(Cat.class)
    .add( Restrictions.like("name", "Fritz%") )
    .add( Restrictions.or(
        Restrictions.eq( "age", new Integer(0) ),
        Restrictions.isNull("age")
    ) )
    .list();

C# and LINQ

C# has Language Integrated Query built into the language:

List<int> list = ...;
var nums = list
   .Where(i => i % 2 == 0)
   .OrderBy(i => i)
   .Take(10)
   .Select(i => i * 2)
   .ToList();
  • The language already has the basic building blocks.
  • Functional programming
var myBlogs =
    from b in blogs
    where b.Name.StartsWith("B")
    orderby b.LastUpdate
    select b;

LINQ and Entity Framework

The same building blocks work on database tables:

var names = ctx.Users
    .Where(u => u.Name.Contains("Shay") && u.Age > 18)
    .OrderBy(u => u.Id)
    .ToList();
  • Entity Framework receives an expression tree, which it then translates to SQL.
  • Query database tables using the same code we use with objects.

... back to PostgreSQL

  • There’s an Npgsql provider for Entity Framework Core.

  • A provider is needed to support PostgreSQL-specific things: SQL dialect, types, etc.

  • Since O/RMs are a generic abstraction layer, they hide database-specific features.

  • But EF Core makes it possible to expose PostgreSQL-specific functionality.

Type Mapping

Mapping totally PostgreSQL-specific types:

public class Blog
{
	public int Id { get; set; }
	public string Name { get; set; }
	public IPAddress Address { get; set; }
	public string[] Tags { get; set; }
}
            

Simple

PostgreSQL-specific

type

Not so simple

PostgreSQL-specific

type

Translating Method Calls

PostgreSQL has great built-in support for regular expressions:

SELECT * FROM blogs WHERE name ~ '^\+?[\d\s]+\(?[\d\s]{10,}\)?$'

Wouldn’t it be nice to expose this in an O/RM?

var myBlogs =
    from b in blogs
    where Regex.IsMatch(b.Name, "a+b")
    select b;

This code will run as-is against another database, but will be client-evaluated.

This transfers all blog details over the network.

PG Customizations of DB Objects

modelBuilder
    .Entity<Blog>()
    .HasIndex(e => e.SomeField)
    .ForNpgsqlHasMethod("gin");

EF Core allows arbitrary "annotations" on your model, which translate into DDL.

Creating your model:

A Few Words on Collaboration

  • EF Core development happens on github.

  • Work on the PostgreSQL provider started very early, when everything was in early beta.

  • Input from PostgreSQL helped shape EF Core – the Microsoft team was very accessible and extremely enthusiastic/receptive.

  • In addition to discussions, many PRs have made into the core.

Adventures in Database Driver Development

Minimizing Roundtrips

NPGSQL

POSTGRESQL

BEGIN

OK

INSERT INTO...

OK

var tx = conn.BeginTransaction();   // Sends BEGIN
// do stuff in the transaction

Prepend, prepend

  • BeginTransaction() simply writes the BEGIN into the write buffer, but doesn’t send. It will be sent with the next query.

  • Another example: when closing a (pooled) connection, we need to reset its state.

  • This is also prepended, and will be sent when the connection is “reopened” and a query is sent.

The PostgreSQL Protocol

NPGSQL

Parse

Bind

Describe

Execute

Sync

Parse

Complete

Bind

Complete

Row

Description

Data

Row

Data

Row

Ready For

Query

POSTGRESQL

Batching

NPGSQL

Bind1

POSTGRESQL

Exec1

Bind1

Exec1

Sync

BC1

Data1

Data1

BC2

Data2

Data2

RFQ

PG starts sending result here

If these are big...

Deadlocks are Fun

  • One option is to use non-blocking I/O. We send protocol messages until a send “would block”. We switch to reading, and send the rest later.

  • Unfortunately, .NET I/O doesn’t support non-blocking I/O very well.

  • Instead, later protocol messages are sent asynchronously – sending and receiving happens at the same time.

Summary

  • Microsoft isn’t the bad guy anymore – full commitment to open-source and cross-platform.

  • .NET Core is a real option, whatever OS you’re using.

  • C# (and other .NET languages) is a very modern language that continues to evolve very quickly.

  • PostgreSQL is fully supported in .NET, Npgsql provides high-performance and feature-rich access.

@shayrojansky

roji@roji.org

https://slides.com/shayrojansky/npgsql-2017-01-tlv

Made with Slides.com