hiera 101


A basic introduction





Francisco Martínez
Puppet Camp Barcelona 2013




CODE AND DATA SEPARATION





When writing software, it is useful to keep data and code separated.

This allows for reusability
Code (Behaviour) 
Data (Configuration)

Usually, the different environments of a project contain copies of your machines with very similar behavior

BUT

different data, as machines will refer to each other

environments


example: environments


In the example, the backend in development will have a connection string referring to db-dev

mongodb://db-dev:2500/?replicaSet=test

while the other backends point to their respective db servers

mongodb://db-prod:2500/?replicaSet=real

How to configure the backend is code.

The reference to the db server is data.

(To be more specific, environment-specific data)




there's more than one way to do it


Worthwhile read IMHO:

https://puppetlabs.com/blog/the-problem-with-separating-data-from-puppet-code/

Proposes different solutions to the problem

  • Node inheritance
  • Parametrized Classes
  • External Node Classifier
  • Extlookup
  • Hiera




hiera to the rescue!


Hiera is a key/value lookup tool for configuration data
$dnsserver = hiera('dnsserver')

It lets you fetch data from different data sources and establish a hierarchy among them
:hierarchy:
    - location
    - global
Look in "location" first; if the key is not found there, search for it in "global"
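
The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Hiera's actual code; the keys, values and the in-memory DATA dictionary are made up for the example.

```python
# Minimal sketch of a priority lookup. NOT Hiera's real implementation.
# The hierarchy mirrors the slide; the keys and values are hypothetical.
HIERARCHY = ["location", "global"]

DATA = {
    "location": {"ntpserver": "ntp.bcn.example.com"},
    "global": {"ntpserver": "ntp.example.com", "dnsserver": "10.0.0.1"},
}


def lookup(key, hierarchy=HIERARCHY, data=DATA):
    """Return the value from the first source in the hierarchy that defines key."""
    for source in hierarchy:
        values = data.get(source, {})
        if key in values:
            return values[key]
    raise KeyError(key)


# "ntpserver" is defined in both sources, so "location" wins.
ntpserver = lookup("ntpserver")
# "dnsserver" is only defined in "global", so the search falls through.
dnsserver = lookup("dnsserver")
```

The first source that defines the key wins, which is why the order of the hierarchy matters.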

Dynamic data sources


Let's dive into the configuration file (hiera.yaml):
---
:backends: 
   - yaml
:hierarchy: 
   - "%{::clientcert}"
   - "%{::environment}"
   - "virtual_%{::is_virtual}"
   - common
:yaml:
   :datadir: /etc/puppet/hieradata
In :hierarchy: we put data sources
  • static (common)
  • dynamic (reference facts)
Order matters!
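
To make the difference between static and dynamic entries concrete, here is a rough Python sketch of how the hierarchy entries could be turned into data source names for one node. It is an approximation of the interpolation step, not Hiera's code; the fact values are hypothetical, and treating a missing fact as an empty string is an assumption of this sketch.

```python
import re

# Hierarchy entries as in the hiera.yaml shown above.
HIERARCHY = [
    "%{::clientcert}",
    "%{::environment}",
    "virtual_%{::is_virtual}",
    "common",
]

# Matches %{name} and %{::name}; the leading "::" is dropped.
TOKEN = re.compile(r"%\{(?:::)?([^}]+)\}")


def resolve_sources(hierarchy, facts):
    """Replace each %{fact} token with the node's fact value, keeping the order."""
    def substitute(match):
        # Assumption of this sketch: an unknown fact becomes an empty string.
        return str(facts.get(match.group(1), ""))

    return [TOKEN.sub(substitute, entry) for entry in hierarchy]


# Hypothetical facts for one node.
facts = {
    "clientcert": "web01.example.com",
    "environment": "production",
    "is_virtual": "true",
}
sources = resolve_sources(HIERARCHY, facts)
```

For this node the search order would be web01.example.com, production, virtual_true, common; with the YAML backend each name maps to a .yaml file in the datadir.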

Data sources

development.yaml
db_connect_string: 'mongodb://db-dev:2500/?replicaSet=devel'
backend_base_url: 'http://backend-devel.example.com/apirest'
puppet_interval: '5'
production.yaml
db_connect_string: 'mongodb://db-prod:3500/?replicaSet=prod'
backend_base_url: 'http://backend-prod.example.com/apirest'
puppet_interval: '720'
nameserver.example.com.yaml
puppet_interval: '120'
virtual_true.yaml
puppet_interval: '60'
common.yaml
puppet_interval: '15'
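
As a worked example, the following Python sketch resolves puppet_interval against the data sources above, using the hierarchy from hiera.yaml (clientcert, environment, virtual_<is_virtual>, common). It only models the lookup order; the node name web01.example.com and the "staging" environment are hypothetical.

```python
# puppet_interval values copied from the data sources on the slide.
DATA = {
    "development": {"puppet_interval": "5"},
    "production": {"puppet_interval": "720"},
    "nameserver.example.com": {"puppet_interval": "120"},
    "virtual_true": {"puppet_interval": "60"},
    "common": {"puppet_interval": "15"},
}


def lookup(key, clientcert, environment, is_virtual, data=DATA):
    """Search the sources in hierarchy order and return the first match."""
    sources = [clientcert, environment, f"virtual_{is_virtual}", "common"]
    for source in sources:
        # Sources without a data file (or without the key) are skipped.
        if key in data.get(source, {}):
            return data[source][key]
    return None


# The node-specific file has the highest priority.
on_nameserver = lookup("puppet_interval", "nameserver.example.com", "production", "true")
# No node file: the environment comes before the virtual_* source.
on_prod_vm = lookup("puppet_interval", "web01.example.com", "production", "true")
# Hypothetical "staging" environment with no data file: virtual_true answers.
on_staging_vm = lookup("puppet_interval", "web01.example.com", "staging", "true")
# Physical machine in "staging": only common.yaml is left.
on_staging_metal = lookup("puppet_interval", "web01.example.com", "staging", "false")
```

Swapping two entries in the hierarchy would change which of these values wins.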

dynamically look up parameters based on facts

Dynamic data sources are based on facts, so we 
can make fact-dependent hiera files.

  • Specify different TCP tuning values depending on the kernelversion fact
  • Use different DNS servers depending on the is_virtual fact
  • Connect to one server or another depending on whether the machine has SELinux enabled
  • And all this without using if or case statements :)

Backends

A backend is how hiera implements the actual storage of the data

Ships with support for:
  • JSON
  • YAML

But there are other backends around
  • redis
  • mysql
  • gpg
  • puppet






implications on puppet code



Using hiera means that configuration (data that changes in the different instantiations of your class) can be managed either
  • by parameters
  • by hiera

So, instead of:
class web_server ($port = 80) {...
You might want to use
class web_server ( ) {
   $port = hiera('web_server::port')
}




Tricks


non scalar data

The value of a query can be another structure (hash of hashes, for example):
openstack::all: 
   admin_email: devnull@example.com
   keystone_admin_token: G6943LMReKj_kqdAVrAiPbpRloAfE1fqp0eVAJ
   rabbitmq: 
      user: rabbitmq
      password: secret   
Can then be queried via 
$data = hiera_hash('openstack::all') 
$rabbitmq_user = $data['rabbitmq']['user']

merging data


  • hiera(): performs a priority lookup and stops searching as soon as it finds a value
  • hiera_array(): searches the whole hierarchy and returns a flattened array with all the matching values
  • hiera_hash(): searches the whole hierarchy and returns one hash merged from all the hashes found
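
The three behaviours can be compared with a small Python sketch. It imitates the semantics as I understand them (priority lookup, flattened array without duplicates, shallow hash merge where the higher-priority source wins on conflicting keys) and is not Hiera's code; the data and function names are hypothetical.

```python
# Sources are listed from highest to lowest priority. Hypothetical data.
NODE = {"ntp_servers": ["ntp1.example.com"], "limits": {"nofile": 65536}}
COMMON = {
    "ntp_servers": ["ntp1.example.com", "ntp2.example.com"],
    "limits": {"nofile": 1024, "nproc": 512},
}
SOURCES = [NODE, COMMON]


def flatten(value):
    """Yield the scalar items of arbitrarily nested lists."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield value


def lookup_first(key, sources):
    """Like hiera(): return the first value found."""
    for source in sources:
        if key in source:
            return source[key]
    return None


def lookup_array(key, sources):
    """Like hiera_array(): collect every match, flattened, without duplicates."""
    result = []
    for source in sources:
        if key in source:
            for item in flatten(source[key]):
                if item not in result:
                    result.append(item)
    return result


def lookup_hash(key, sources):
    """Like hiera_hash(): shallow merge; earlier (higher-priority) sources win."""
    merged = {}
    for source in sources:
        if key in source:
            merged = {**source[key], **merged}
    return merged


first = lookup_first("ntp_servers", SOURCES)
everything = lookup_array("ntp_servers", SOURCES)
limits = lookup_hash("limits", SOURCES)
```

Here the priority lookup only sees the node's own list, the array lookup sees both servers, and the merged hash keeps the node's nofile while inheriting nproc from the common data.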

include

hiera_include(): looks up as in hiera_array() and then includes each classname returned 
web01.example.com.yaml
---
classes:
   - apache
   - redis
   - wordpress
common.yaml
---
classes:
   - base::linux
site.pp
hiera_include('classes')
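
For the node web01.example.com, the effect of the files above can be sketched in Python as follows. This models only which class names would be collected; it does not include anything, and the function name is hypothetical.

```python
# Data from the two files above, highest priority first.
WEB01 = {"classes": ["apache", "redis", "wordpress"]}
COMMON = {"classes": ["base::linux"]}


def classes_to_include(sources, key="classes"):
    """Collect class names from every source, keeping order and dropping duplicates."""
    classes = []
    for source in sources:
        for name in source.get(key, []):
            if name not in classes:
                classes.append(name)
    return classes


# web01 gets its own classes plus the ones every node receives from common.yaml.
included = classes_to_include([WEB01, COMMON])
```

A node without its own file would only receive base::linux from common.yaml.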




puppet 3

Include

Starting in Puppet 3, the include function relies on external data for class parameters

When a class is declared, Puppet will try the following:
  • Request a value from hiera, using the key <class name>::<parameter name>
    Example: apache::version
  • Use the default value
  • Fail compilation with an error if no value can be found
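
The three steps above can be written down as a short Python sketch. It models the decision order only and is not Puppet's code; the function, the exception class and the error message are hypothetical.

```python
class CompilationError(Exception):
    """Stands in for a failed catalog compilation (hypothetical)."""


def resolve_parameter(class_name, parameter, hiera_data, defaults):
    """Resolve one class parameter: hiera first, then the default, else fail."""
    key = f"{class_name}::{parameter}"
    # 1. Ask hiera for <class name>::<parameter name>.
    if key in hiera_data:
        return hiera_data[key]
    # 2. Fall back to the default declared in the class definition.
    if parameter in defaults:
        return defaults[parameter]
    # 3. No value anywhere: compilation fails.
    raise CompilationError(f"no value found for {key}")


# Hypothetical data: hiera knows apache::version, the class defaults to 'installed'.
from_hiera = resolve_parameter(
    "apache", "version", {"apache::version": "2.2"}, {"version": "installed"}
)
from_default = resolve_parameter("apache", "version", {}, {"version": "installed"})
```

With this order, a value in hiera silently overrides the default written in the class.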

ENC-like behavior

You can use this feature in Puppet 3 and the hiera_include() function to set up an ENC-like behavior with hiera
site.pp
node default {
   hiera_include("classes")
}
datacenter-mad.yaml
---
classes:
   - ntp
ntp::server: ntp-mad.example.com
datacenter-bcn.yaml
---
classes:
   - ntp
ntp::server: ntp-bcn.example.com





Q & A




BACKUP SLIDES

EXTERNAL NODE CLASSIFIER

Binding of nodes based on the certname:
node "devel-*" { 
The ENC is a script that puppet can use to decide, dynamically and based on external data sources, which configuration each node gets.

So you can decide, based on an external LDAP server for example, that a node receives the manifests that configure it as a webserver.
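
As an illustration, an ENC can be as small as the Python script below: Puppet runs the script with the node name as its first argument and reads a YAML document with the node's classes (and optionally its environment) from standard output. The "devel-" naming rule, the class name and the environments are hypothetical; a real ENC would query an external source such as LDAP or a database.

```python
import sys


def classify(certname):
    """Decide classes and environment for a node (hypothetical rules)."""
    if certname.startswith("devel-"):
        return {"environment": "development", "classes": ["web_server"]}
    return {"environment": "production", "classes": ["web_server"]}


def to_yaml(classification):
    """Render the classification as the small YAML document an ENC prints."""
    lines = ["---", "classes:"]
    lines += [f"  - {name}" for name in classification["classes"]]
    lines.append(f"environment: {classification['environment']}")
    return "\n".join(lines) + "\n"


# Run as: enc.py <certname>
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(to_yaml(classify(sys.argv[1])))
```

The YAML is built by hand here to keep the sketch free of third-party libraries.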

YAML

YAML is a language to serialize data.
Similar to XML, but less verbose and more readable.
It lets you model structured data:
--- # Favorite movies, Array/List
- Matrix
- Sucker Punch
- Scott Pilgrim vs. the World
- American Psycho
--- # domains and countries, Hash/Dictionary
de: 'Germany'
sk: 'Slovakia'
hu: 'Hungary'
us: 'United States'
no: 'Norway'

environments

Puppet supports the concept of "environment".
Environments are supplied by the host or the ENC and decide which "root" of configuration to use.

So you can keep development code completely separated from staging or production code, and play with your development environment day-to-day, and decide when to put that code into production.

caveats


Establishing a hierarchy for hiera can be terribly complicated, and you can easily end up making a mess

Hiera as a source of external data is flexible and has many advantages. Nonetheless, it is easily overkill in setups where you don't need that degree of flexibility: think KISS

By franmrl