Managing PostgreSQL with Ansible

Gülçin Yıldırım

PGConf EU 2015, Vienna

select * from me;

Postgres DBA @ 2ndQuadrant

Studying MSc Comp. & Systems Eng. @ Tallinn University of Technology

Studied BSc Maths Eng. @ Yildiz Technical University

Writes blog on 2ndQuadrant blog

Does some childish paintings

Loves independent films



Skype: gulcin2ndq

Github: gulcin


What is Ansible?

Simple, agentless and powerful open source IT automation tool

  • Provisioning
  • Configuration Management
  • Application Deployment
  • Continuous Delivery
  • Security & Compliance
  • Orchestration

Why so popular?

Why now?

The technology: APIs, Cloud Management

Nearly every major service has an API these days.

The need: More and more products are now composed of distributed components. More moving parts. They need to be managed.

The practise: It is now common to automate builds, tests, deployments, integration. Infrastructure is the next thing to automate.

What are the options?

What are the options?

Why Ansible?

  • Agent-less architecture (no agent is required, everything is done by using SSH, ssh for communication)
  • No centralised server, no client-side agents
  • SSH based, no PKI
  • Configuration as data, not code (YAML files)
  • Batteries included (community, playbooks, modules, roles, tasks, documentation)
  • Full configuration management, deployment and orchestration
  • Custom modules can be written in any programming language.
  • JSON input/output is sufficient to integrate a module into Ansible.

Life before Ansible

Multiple ssh, control panels, editing config files manually.

Let's install Ansible

Let's install Ansible

Debian or Ubuntu:

apt-get install python-dev python-setuptools
easy_install pip
pip install ansible boto

Mac OS X:

sudo easy_install pip
sudo pip install ansible boto

Code of this talk

Before we start discussing Ansible in more detail,
please check the repo that I created for this talk:


In this example, we will:

  • Provision Amazon VPC and EC2 instances
  • Install latest PostgreSQL packages
  • Setup a streaming replication with 1 master and 2 standbys
  • Add a new standby server

Configuration management

Why do we need a better & modern approach?

  • Manual tasks are error prone
  • Very difficult to setup test environments before changing/implementing something big
  • Too many servers require too many tasks to be repeated on each of them
  • Since it is a manual process, some servers may go out of sync, they may become outdated or even misconfigured
  • Leads to heterogenous environment instead of a homogenous & more managable one


  • Automated:
    Simplifies system administrators job by freeing them from most of manual interventions.
  • Idempotent:
    Ensures that the system is in the state we expect it to be.
  • Agentless:
    Every resource is managed by SSH or APIs (like the Amazon API).
  • Feature-rich:
    Provides lots of modules from file handling to virtual machine provisioning.

Application deployment

  • Time & Bandwidth Efficient:
    Synchronizes only new or changed files.
  • Full Process Handling:
    Ansible can remove servers from load-balancers and stop services before transfers. Then it can restore them all back.
  • Zero-Downtime Deployment:
    It can deploy on a subset of servers and then move on to the next set to ensure that the system is always up and running.
  • Same solution can be run on developer machines, production environment, staging environment, testing environment, etc...

Continuous delivery

Continuous delivery means adopting a fast and simple approach for releases.


Ansible helps us with two main aspects:

  1. Enables frequent releases without downtimes
  2. Requires as little human intervention as possible


  • It can automate any task from provisioning servers to configuring services to become production ready.
  • Works very well in-front of a continuous integration system.

Security and compliance

You need to be sure about security of your:

  • Systems / Services / Software
  • Data
  • Customer's data

Once the security policy has been set, Ansible can be used to turn the system back into the compliance mode, instantly.

Ansible is easy-to-use whether it’s about:

  • Setting up firewall rules
  • Locking down users and groups
  • Applying custom security policies


Ansible ensures that all tasks are executed in the proper order.

This makes orchestration of complex multi-tier deployments easier.


For example, considering the deployment of a software stack, Ansible makes sure that:

  • The networking is setup before resources are communicating
  • PostgreSQL servers are up and running before app deployment
  • Containers are launched before the services are taken online
  • Application is deployed before the load balancers are activated


We will see this in action with our demo.

Building Blocks

An Ansible solution is composed of one or more items listed on the left side.


Typically, our solutions executes tasks for an inventoryutilizing some modules, using or populating some variables, processing some file templates, in a playbook, which can be organized in roles.


Let's see each of them in detail.

Building Blocks - Inventory

  • Tells Ansible about hosts it should manage
    • Hostnames, IPs, ports, SSH parameters
    • Server specific variables
  • 2 common ways to provide Ansible an inventory:
    • Static inventory: A flat INI file (e.g., hosts.ini)
    • Dynamic inventory: An application returning a JSON data (e.g.,: for Amazon EC2)
  • Hosts are grouped. They can belong to multiple groups
  • Groups can also belong to multiple groups

Building Blocks - Inventory

Below inventory file in INI format contains 3 hosts under 2 groups.

There is also a 3rd group which contains other groups and all of the hosts.

# "master" group with 1 host
postgresql-master        ansible_ssh_host=     ansible_ssh_user=ubuntu

# "standby" group with 2 hosts
postgresql-standby-01    ansible_ssh_host=    ansible_ssh_user=ubuntu
postgresql-standby-02    ansible_ssh_host=    ansible_ssh_user=ubuntu

# the "replication" group contains both "master" and "standbys" groups

Building Blocks - Module

  • Modules provide Ansible means to control or manage resources on local or remote servers.
  • They perform a variety of functions. For example a module may be responsible for rebooting a machine or it can simply display a message on the screen.
  • Ansible allows users to write their own modules and also provides out-of-the-box core or extras modules.
    • Core: Maintained by the core Ansible team and always shipped with Ansible.
    • Extras: Maintained by the community. Might be shipped separately in the future.

Building Blocks - Module

Some of the most commonly used modules are:


  • File handling: file, stat, copy, template
  • Remote execution: command, shell
  • Service management: service
  • Package management: apt, yum, bsd, ports
  • Source control systems: git, subversion

Building Blocks - Task

Tasks are responsible for calling a module with a specific set of parameters.


Each Ansible task contains:

  • a descriptive name [optional]
  • a module to be called
  • module parameters
  • pre/post-conditions [optional]
  • processing directives [optional]


They allow us to call Ansible modules and pass information to consecutive tasks.

Building Blocks - Task

Below task invokes the "file" module by providing 4 parameters. It ensures 3 conditions are true:


  • /var/lib/postgresql exists as a directory
  • owner of /var/lib/postgresql is "postgres"
  • group of /var/lib/postgresql is "postgres"

If it doesn't exist, Ansible creates the directory and assigns owner & group. If only the owner is different, Ansible makes it "postgres".

- name: Ensure the data folder has right ownership
  file: path="/var/lib/postgresql" state=directory owner=postgres group=postgres

Building Blocks - Task

Following example shows relationships between tasks. The first task checks if a device exists and the second task mounts the device depending on the result from the first task.

Please note "register" and "when" keywords.

- name: Check if the data volume partition exists
  stat: path=/dev/sdc1
  register: partition

- name: Ensure the PostgreSQL data volume is mounted
  mount: src=/dev/sdc1 name="/var/lib/postgresql/9.4" fstype=ext4 state=mounted
  when: partition.stat.exists is defined and partition.stat.exists

Building Blocks - Variable

Variables in Ansible are very useful for reusing information. Sources for variables are:

  • Inventory: We can assign variables to hosts or groups (group vars, host vars).
  • YAML files: We can include files containing variables.
  • Task results: Result of a task can be assigned to a variable using the register keyword as shown in the previous slide.
  • Playbooks: We can define variables in Ansible playbooks (more on that later).
  • Command line: (-e means extra variable // -e  "uservar=gulcin")

Building Blocks - Variable

There is also discovered variables (facts) and they can be found by using setup module:


  • All of the output in json: bios_version, architecture, default_ipv4_address, ansible_os_family, etc.

You can use variables in tasks, templates themselves and you can iterate over using with_type functions.


ansible -i hosts.ini -m setup hostname
- name: Ensure PostgreSQL users are present
    state: present
    name: "{{ }}"
    password: "{{ item.password }}"
    role_attr_flags: "{{ item.roles }}"
  with_items: postgresql_users

Building Blocks - Template

We can think templates as our configuration files. Ansible lets us use the Jinja2 template engine for reforming and parameterising our files. 

The Jinja2 templating engine offers a wide range of control structures, functions and filters..

Some of useful capabilities of Jinja2:

  • for-loops
  • join(), default(), range(), format()
  • union(), intersect(), difference()
  • | to_json, | to_nice_yaml, | from_json, | from_yml
  • | min, | max, | unique, | version_compare, | random


Building Blocks - Template

*:*:*:{{ }}:{{ postgresql_replication_user.password }}

Let's check our pgpass.j2 template:

Let's have a look at pg_hba.conf.j2 file, too:

# Access to user from local without password, from subnet with password
{% for user in postgresql_users %}
local  all         {{ }}                             trust
host   all         {{ }}    {{ ec2_subnet }}         md5
{% endfor %}

# Replication user for standby servers access
{% for standby_ip in postgresql_standby_ips %}
host   replication {{ }}  {{ standby_ip }}/32  md5
{% endfor %}

Building Blocks - Playbook

  • Playbooks contains Plays
    • Plays contain Tasks
      • Tasks call Modules and may (optionally) trigger handlers (run once, run at the end)

Building Blocks - Playbook

  • Ansible playbooks are written using the YAML syntax.
  • Playbooks may contain more than one plays
  • Each play contains:
    • name of host groups to connect to
    • tasks it needs to perform.
  • A play may also contain variables/roles/handlers, if defined.


Strict dependency ordering: everything in file performs in a sequential order.

Building Blocks - Playbook

Let's look at our playbook example (main.yml):

- name: Ensure all virtual machines are ready
  connection: local
  vars_files: # load default variables from YAML files below
    - 'defaults/postgresql.yml'
    - 'defaults/aws.yml'
    - include: 'tasks/provision.yml' # load infrastructure setup tasks

- name: Ensure all required PostgreSQL dependencies ready
  hosts: postgresql-all # manage all PostgreSQL servers
  sudo: yes
  sudo_user: root
    - 'defaults/postgresql.yml'
    - 'defaults/aws.yml'
    - include: 'tasks/postgresql.yml' # load PostgreSQL setup tasks


Building Blocks - Role

You absolutely should be using roles. Roles are great. Use roles. Roles! Did we say that enough? Roles are great.


In Ansible,

  • playbooks organize tasks
  • roles organize playbooks


Imagine that we have lots of independent resources to manage (e.g., web servers, PostgreSQL servers, logging, monitoring, AWS, ...).


Putting everything in a single playbook may result in an unmaintainable solution.

Building Blocks - Role

To reduce such complexity, roles help us with:


  • Splitting tasks into much smaller playbooks
    This allows us to focus on resources, independently. That makes it simpler to maintain and debug. Also it will be much easier to understand the structure.

  • Reusing configs, files, templates, tasks
    This way we can easily share those components between playbooks, without rewriting over and over.

  • Handling playbook dependencies
    When we execute a role, we can be sure that all of preconditions
    are satisfied for that role.

Building Blocks - Role

Here you can see a dependency graph and the corresponding role directory structure:


How to Invoke Ansible?


To work with Ansible, we have 2 main alternatives;



  1. Running ad-hoc commands
  2. Running playbooks



Let's check them out one by one.

Ad-hoc Commands

We can call any Ansible module from the command line, anytime.

The ansible CLI tool works like a single task. It requires an inventory, a module name, and module parameters.


For example, given an inventory file like:




Now we can call any module.

Ad-hoc Commands

We can check uptimes of all hosts in dbservers using:


ansible dbservers -i hosts.ini -m command -a "uptime"


Here we can see the Ansible output:

gulcin@apathetic ~ # ansible dbservers -i hosts.ini -m command -a "uptime" | success | rc=0 >>
21:16:24 up 93 days,  9:17,  4 users,  load average: 0.08, 0.03, 0.05

How to Run Playbooks?

For more complex scenarios, we can create playbooks or roles and ask Ansible to run them.


When we run a playbook or a role, Ansible first gathers a lot of useful facts about remote systems it manages. These facts can later be used in playbooks, templates, config files, etc.


We can use the ansible-playbook CLI tool to run playbooks.

How to Run Playbooks?

Given an inventory file like this:




We may have a playbook that connects to hosts in dbservers group, executes the uptime command, and then displays that command's output.


Now let's create a simple playbook to see how it can be ran. 

How to Run Playbooks?

Here is the main.yml file for the playbook we just described:


- hosts: dbservers

  - name: retrieve the uptime
    command: uptime
    register: command_result # Store this command's result in this variable

  - name: Display the uptime
    debug: msg="{{ command_result.stdout }}" # Display command output here

How to Run Playbooks?

Now we can run the playbook and see it's output here:

gulcin@apathetic ~ $ ansible-playbook -i hosts.ini main.yml

PLAY [dbservers] **************************************************************

GATHERING FACTS ***************************************************************
ok: []

TASK: [retrieve the uptime] ***************************************************
changed: []

TASK: [Display the uptime] ****************************************************
ok: [] => {
    "msg": " 15:54:47 up 3 days, 14:32,  2 users,  load average: 0.00, 0.01, 0.05"

PLAY RECAP ********************************************************************             : ok=3    changed=1    unreachable=0    failed=0

Playbook control

  • --tags / --skip-tags
    Runs or skips tasks with specified tags
  • --limit
    Manages only specified hosts or groups
  • --start-at-task
    Start execution from a specific task
  • --step
    Executes step-by-step and asks for confirmation to continue
  • --check / --diff / --syntax-check
    Runs a playbook without actually executing anything 

Playbook loops 

N.B! register assigns result of a task to a variable.


Ansible supports iterating over facts with loop statements:

  • with_items: runs the task using the provided array variable
  • with_indexed_items: runs the task using the provided array variable and adds item index
  • with_flattened: runs the task using merged variables
  • with_file: runs the task using given file's contents


Some other with_type functions: 


- vars:
     - { name: gulcin, password: apathetic, roles: "SUPERUSER,LOGIN" }
     - { name: foo, password: foobarbaz, roles: "CREATEDB,LOGIN" }

- name: Ensure PostgreSQL users are present
    state: present
    name: "{{ }}"
    password: "{{ item.password }}"
    role_attr_flags: "{{ item.roles }}"
  with_items: postgresql_users
  • Executes the task for each element in given array
  • Current element value can be accessed using {{ item }}


- name: Ensure standby hosts have updated IP addresses
    name: "postgresql-standby-{{ item.0 + 1 }}" # e.g., postgresq-standby-1
    groups: "postgresql-standbys"
    ansible_ssh_host: "{{ item.1.tagged_instances[0].public_ip }}"
    ansible_ssh_user: "{{ ec2_ssh_user }}"
  with_indexed_items: ec2_standbys.results
  • Executes the task for each element in given array
  • Current element index can be accessed using {{ item.0 }}
  • Current element value can be accessed using {{ item.1 }}


- name: Wait for SSH to become ready
  wait_for: host="{{ item.tagged_instances[0].public_ip }}" port=22 timeout=320 state=started
    - [ ec2_master ] # ec2_master is a single item. 
                     # By putting [ and ], we convert it into an array
    - ec2_standbys.results
  • Merges given arrays into a single one
  • Executes the task for each element in merged array
  • Current element value can be accessed using {{ item }}


  • Executes the task for the given file
  • Contents of the file can be accessed using {{ item }}

- name: Ensure the SSH key is present
    state: present
    region: "eu-central-1"      # Amazon's Frankfurt Datacenter
    name: "postgresql-key"      # we will use this name later
    key_material: "{{ item }}"
  with_file: ""       # reads contents of this file to use in "key_material"

Advanced playbook features

For some advanced operations, we can use following in our tasks:


  • failed_when (condition) changed_when (condition):
    Lets tasks to know when they're going to fail or change based on the given condition. It might be stdout or stderr, or text to log.
  • until (condition):
    Repeats the task until the given condition is satisfied.
  • ignore_errors:
    Doesn't fail on errors
  • run_always:
    Runs even in dry-run / check / diff modes

Advanced playbook features

  • serial:
    Runs the task on a single host at a single time
  • pre_taskspost_tasks:
    Defines names of tasks that are executed before or after the current task. These referenced tasks are not either main tasks or roles, but they setup the environment to be ready for that.
  • delegate_to (host):
    This tells Ansible to execute the current task on the given host instead of the current host that is defined by the inventory.

    For example, if we want to start a virtual machine, we can't do that on that machine, because it is not running. We can simply delegate this task to localhost (

Postgres modules

My blog post covers Ansible PostgreSQL modules: Ansible Loves PostgreSQL

You can find the related Postgres modules' examples on my github repo.

  • postgresql_db: Creates/removes a given db.
  • postgresql_ext: Adds/removes extensions from a db.
  • postgresql_user: Adds/removes users and roles from a db.
  • postgresql_privs: Grants/revokes privileges on db objects (table, sequence, function, db, schema, language, tablespace, group).
  • postgresql_lang: Adds, removes or changes procedural languages with a db.

Creates/removes a given db.  In Ansible terminology, it ensures that a given db is present or absent. 

  • Required param: name
  • Connection params: login_hostport,login_userlogin_password
  • Important param: state (present or absent)

Create db module_test

Remove db module_test

Adds/removes extensions from a db. PostgreSQL offers a wide range of extensions which are extremely helpful. 

  • Mandatory params: db and name (for extension)
  • Other params: state and connection params (login_host, port) as in postgres_db 

Adds/removes users and roles from a db. 

Parameters: name (mandatory), state, connection params, db, password, priv, role_attr_flags ([NO]SUPERUSER, [NO]CREATEDB..)

Creates user

Alters role and gives login and createdb

Grants/revokes privileges on db objects (table, sequence, function, db, schema, language, tablespace, group).

  • Required params: databaseroles
  • Important opt. params: type (i.e. table,sequence,function), objs, privs (i.e. ALL; SELECT, UPDATE, INSERT) 


 Adds, removes or changes procedural languages with a db. 

One of the very powerful features of PostgreSQL is its support for virtually any language to be used as a procedural language.

Params: lang (mandatory), db, connection params, state, cascade, trust 


AWS modules

Ansible provides more than 50 modules for provisioning and managing Amazon Web Services resources.


We have modules for:

  • EC2 (Virtual machine instances)
  • AMI (Virtual machine images)
  • ELB (Load balancers)
  • VPC (Networking infrastructure)
  • EBS (Storage infrastructure)
  • ...

AWS modules

Ansible AWS modules are developed using Boto.

Boto is a Python package that provides interfaces to Amazon Web Services.


For more complex workflows, we can use the script to discover dynamic inventories for our playbooks:

Manages SSH key pairs that we use to connect to our instances.


  • state: present or absent
  • name: name of the SSH key
  • key_material: contents of the public SSH key
  • region: AWS region to deploy key to (Ireland, Frankfurt, etc)
  • wait: wait for the state 'available' before returning
- name: Ensure the SSH key is present
    state: present
    region: "eu-central-1"      # Amazon's Frankfurt Datacenter
    name: "postgresql-key"      # we will use this name later
    key_material: "{{ item }}"
  with_file: ""       # reads contents of this file to use in "key_material"

Manages virtual private clouds & networking.

  • state, region, wait: ... same as previous module ...
  • cidr_block: CIDR representing this cloud (e.g.,
  • internet_gateway: is internet access enabled?
  • resource_tags: A list of tags in form:
    { tag1: value1, tag2: value2, ... }
  • subnets: An array of subnets in form:
    [{ cidr: BLOCK, az: ZONE, resource_tags: [ TAGS ] }]
  • route_tables: An array of routes in form:
    [{ subnets: [ BLOCKS ], routes: [ ROUTES ], resource_tags[ TAGS ] }]

- name: Ensure VPC is present
    state: present           # Create a VPC
    region: "eu-central-1"   # Amazon's Frankfurt Datacenter
    cidr_block: ""
      - cidr: ""
      - subnets: [ "" ]
          - dest:  # Everywhere
            gw: igw          # Name of Amazon's default gateway
    internet_gateway: yes    # We want our instances to connect to internet
    wait: yes                # Wait until the VPC is ready
    resource_tags: { environment: "production", tier: "DB" }
  register: vpc              # Store task results in this variable to use in later tasks

Manages firewalls.

  • state, region, name: ... same as previous module ...
  • vpc_id: identifier of the VPC
  • description: Text for purpose of this security group
  • rules: An array of security rules in form:
    [{ proto: protocol, from_port: portto_portportcidr_ipblock }]

    protocol can be tcp, udp, icmp, any, ...
    port can be a port number or -1 (any port)
    block should be a CIDR block (e.g.,

- name: Ensure the PostgreSQL security group is present
    state: present
    vpc_id: "{{ vpc.vpc_id }}" # Obtain recently provisioned VPC's ID
    region: "eu-central-1"     # Amazon's Frankfurt Datacenter
    name: "PostgreSQL"
    description: "Security group for PostgreSQL database servers"
      - proto: tcp             # Allow SSH access from anywhere
        from_port: 22
        to_port: 22
      - proto: all             # Allow everything from machines in the subnet
        from_port: -1          # Any port
        to_port: -1            # Any port
        cidr_ip: ""
  register: security_group     # Use results of this task later


Creates, deletes, starts, stops, restarts virtual machine instances.

  • state, region, name, wait: ... same as previous modules ...
  • instance_tags: same as resource_tags
  • group_ididentifier of the security group
  • vpc_subnet_ididentifier of the VPC subnet
  • key_name: SSH key name
  • instance_type: type of the instance (e.g., c3.large, t2.micro, etc.)
  • private_ip: IP address in the subnet we provided
  • assign_public_ip: should the instance have a public IP address?
  • volumesan array of disk volumes in form:

    {[ device_name: NAME, volume_size: SIZE, delete_on_termination: FLAG ]}


- name: Ensure master EC2 instances & volumes are present
    assign_public_ip: yes # our machines should access internet
    instance_tags: { Name: "pg-master {{ postgresql_master_ip }}", environment: "production" }
    exact_count: 1
      Name: "pg-master {{ postgresql_master_ip }}"
    image: "ami-accff2b1"                     # Ubuntu Server 14.04
    instance_type: "t2.micro"
    group_id: "{{ security_group.group_id }}"
    key_name: "postgresql-key"
    private_ip: "{{ postgresql_master_ip }}"  # e.g.,
    region: "eu-central-1"                    # Amazon's Frankfurt Datacenter
      - device_name: /dev/sdc                                                                                                       
        volume_size: "50"                     # 50GB disk volume
        delete_on_termination: false          # Volume should remain after instance is deleted
    vpc_subnet_id: "{{ vpc.subnets[0].id }}"
    wait: yes                                 # Wait until the instance becomes running
  register: ec2_master                        # Register results here so we can use them later


Demo playbook structure

Playbook AWS architecture

First run

Verify setup

New standby server


Ansible loves PostgreSQL and Ansible has a very active community.


That's why more PostgreSQL modules would be helpful for everyone.

Making contributions to Ansible will be appreciated :)




Huge Thanks!

Managing PostgreSQL with Ansible

By Gülçin Yıldırım Jelínek

Managing PostgreSQL with Ansible

This presentation is prepared for PGConf Europe 2015 in Vienna.

  • 13,699