hiera 101
A basic introduction
Francisco Martínez
Puppet Camp Barcelona 2013
CODE AND DATA SEPARATION
When writing software,
it is useful to keep data and code separated
This allows for reusability
Code (Behaviour)
Data (Configuration)
Usually, in the different environments of a project, you have copies of your machines with very similar behavior
BUT
different data, as machines will refer to each other
environments
example: environments
In the example, the backend in development will have a connection string referring to db-dev
mongodb://db-dev:2500/?replicaSet=test
while the other backends point to their respective db servers
mongodb://db-prod:2500/?replicaSet=real
How to configure the backend is code.
The reference to the db server is data.
(To be more specific, environment-specific data)
there's more than one way to do it
Worthwhile read IMHO:
https://puppetlabs.com/blog/the-problem-with-separating-data-from-puppet-code/
Proposes different solutions to the problem
- Node inheritance
- Parameterized Classes
- External Node Classifier
- Extlookup
- Hiera
hiera to the rescue!
Hiera is a key/value lookup tool for configuration data
$dnsserver = hiera('dnsserver')
It lets you get data from different data sources and establish a hierarchy among them
:hierarchy:
- location
- global
Look into "location" and, if I don't find my key, go search for it in "global"
Dynamic data sources
Let's dive into the configuration file (hiera.yaml):
---
:backends:
- yaml
:hierarchy:
- "%{::clientcert}"
- "%{::environment}"
- "virtual_%{::is_virtual}"
- common
:yaml:
:datadir: /etc/puppet/hieradata
In :hierarchy: we put data sources
- static (common)
- dynamic (reference facts)
Order matters!
Data sources
development.yaml
db_connect_string: 'mongodb://db-dev:2500/?replicaSet=devel'
backend_base_url: 'http://backend-devel.example.com/apirest'
puppet_interval: '5'
production.yaml
db_connect_string: 'mongodb://db-prod:3500/?replicaSet=prod'
backend_base_url: 'http://backend-prod.example.com/apirest'
puppet_interval: '720'
nameserver.example.com.yaml
puppet_interval: '120'
virtual_true.yaml
puppet_interval: '60'
common.yaml
puppet_interval: '15'
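To see how the hierarchy and these files play together, here is a minimal Python sketch of the lookup. It is only a model of the behaviour described on these slides, not Hiera's actual implementation: the YAML files are replaced by an in-memory dict, and the names interpolate and lookup are made up for illustration.

```python
import re

# Hierarchy from the hiera.yaml slide, highest priority first.
HIERARCHY = [
    "%{::clientcert}",
    "%{::environment}",
    "virtual_%{::is_virtual}",
    "common",
]

# In-memory stand-ins for the YAML files shown above (assumption: a real
# setup would read them from the configured datadir instead).
DATA = {
    "development": {
        "db_connect_string": "mongodb://db-dev:2500/?replicaSet=devel",
        "puppet_interval": "5",
    },
    "production": {
        "db_connect_string": "mongodb://db-prod:3500/?replicaSet=prod",
        "puppet_interval": "720",
    },
    "nameserver.example.com": {"puppet_interval": "120"},
    "virtual_true": {"puppet_interval": "60"},
    "common": {"puppet_interval": "15"},
}


def interpolate(source, facts):
    """Replace each %{::fact} token with the fact's value ('' if unknown)."""
    return re.sub(r"%\{::(\w+)\}", lambda m: facts.get(m.group(1), ""), source)


def lookup(key, facts):
    """Return the value from the first data source that defines the key."""
    for source in HIERARCHY:
        values = DATA.get(interpolate(source, facts), {})
        if key in values:
            return values[key]
    raise KeyError(key)
```

With facts clientcert=nameserver.example.com, environment=production, is_virtual=true, lookup("puppet_interval", facts) gives '120' (the node file wins), while db_connect_string falls through to production.yaml. A node with no matching file at all ends up with the '15' from common.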
dynamically look up parameters based on facts
Dynamic data sources are based on facts, so we
can make fact-dependent hiera files.
- Specify different TCP tuning values depending on the kernelversion fact
- Use different DNS servers depending on the is_virtual fact
- Connect to one server or another depending on whether the machine has SELinux enabled
- And all this without using if or case clauses :)
Backends
A backend is how hiera implements the actual storage of the data
Ships with support for:
- JSON
- YAML
But there are other backends around:
- redis
- mysql
- gpg
- puppet
implications on puppet code
Using hiera means that configuration (data that changes in the different instantiations of your class) can be managed either
- by parameters
- by hiera
So, instead of:
class web_server ($port = 80) {...
You might want to use
class web_server ( ) {
  # Keep the looked-up value; a bare hiera() call would discard it
  $port = hiera('web_server::port')
}
Tricks
non scalar data
The value of a query can be another structure (hash of hashes, for example):
openstack::all:
  admin_email: devnull@example.com
  keystone_admin_token: G6943LMReKj_kqdAVrAiPbpRloAfE1fqp0eVAJ
  rabbitmq:
    user: rabbitmq
    password: secret
Can then be queried via
$data = hiera_hash('openstack::all')
$rabbitmq_user = $data['rabbitmq']['user']
merging data
- hiera(): performs a simple lookup, stops searching after it finds a value
- hiera_array(): returns a flattened array with all the matching values
- hiera_hash(): returns one merged hash with all the values returned
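The difference between the three lookups is easier to see side by side. The Python sketch below models them over two data sources; the data and the function names are invented for illustration, and the duplicate removal and "higher priority wins" rule reflect my understanding of Hiera's default merge behaviour rather than its actual code.

```python
# Data sources, highest priority first. Contents are made up for illustration.
SOURCES = [
    {"classes": ["apache", "redis"], "limits": {"nofile": 4096}},
    {"classes": ["base::linux", "apache"], "limits": {"nofile": 1024, "nproc": 512}},
]


def priority_lookup(key):
    """Like hiera(): stop at the first source that has the key."""
    for source in SOURCES:
        if key in source:
            return source[key]
    raise KeyError(key)


def array_lookup(key):
    """Like hiera_array(): collect values from every source into one list.

    Flattens one level and drops duplicates, keeping first-seen order.
    """
    merged = []
    for source in SOURCES:
        value = source.get(key, [])
        for item in (value if isinstance(value, list) else [value]):
            if item not in merged:
                merged.append(item)
    return merged


def hash_lookup(key):
    """Like hiera_hash(): merge top-level keys from every source.

    Sources are applied from lowest to highest priority, so the
    higher-priority value wins when a key appears more than once.
    """
    merged = {}
    for source in reversed(SOURCES):
        merged.update(source.get(key, {}))
    return merged
```

Here priority_lookup("classes") returns only the first list, array_lookup("classes") returns apache, redis and base::linux, and hash_lookup("limits") keeps nofile from the first source and picks up nproc from the second.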
include
hiera_include(): looks up as in hiera_array() and then includes each class name returned
web01.example.com.yaml
---
classes:
- apache
- redis
- wordpress
common.yaml
---
classes:
- base::linux
site.pp
hiera_include('classes')
puppet 3
Include
Starting in Puppet 3, the include function will rely on external data for parameters.
When a class is declared, Puppet will try the following:
- Request a value from hiera, using the key <class name>::<parameter name> (example: apache::version)
- Use the default value
- Fail compilation with an error if no value can be found
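The three steps above can be sketched in a few lines of Python. This is a model of the order described on the slide, not Puppet's code: resolve_parameter, CompilationError and the dict arguments are hypothetical names, and the error text is invented.

```python
class CompilationError(Exception):
    """Stand-in for Puppet failing to compile the catalog."""


def resolve_parameter(class_name, param, hiera_data, defaults):
    """Resolve one class parameter in the order listed on the slide."""
    # 1. Ask hiera, using the key <class name>::<parameter name>.
    key = f"{class_name}::{param}"
    if key in hiera_data:
        return hiera_data[key]
    # 2. Fall back to the default declared in the class definition.
    if param in defaults:
        return defaults[param]
    # 3. No value anywhere: compilation fails.
    raise CompilationError(f"no value for {key}")
```

So with hiera data {"apache::version": "2.4"} and a class default of "2.2", the result is "2.4"; with empty hiera data it is "2.2"; with neither, the function raises.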
ENC-like behavior
You can use this feature in Puppet 3 and the hiera_include() function to set up an ENC-like behavior with hiera
site.pp
node default {
hiera_include("classes")
}
datacenter-mad.yaml
---
classes:
- ntp
ntp::server: ntp-mad.example.com
datacenter-bcn.yaml
---
classes:
- ntp
ntp::server: ntp-bcn.example.com
Q & A
BACKUP SLIDES
EXTERNAL NODE CLASSIFIER
Binding of nodes based on the certname:
node /^devel-/ {
The ENC is a script that puppet can use to decide, dynamically and based on external data sources, which configuration each node gets.
So you can decide, on an external LDAP server for example, that a node receives the manifests that configure it as a web server.
YAML
YAML is a language to serialize data.
Similar to XML, but less verbose and more readable.
It lets you model structured data:
--- # Favorite movies, Array/List
- Matrix
- Sucker Punch
- Scott Pilgrim vs. the World
- American Psycho
--- # domains and countries, Hash/Dictionary
de: 'Germany'
sk: 'Slovakia'
hu: 'Hungary'
us: 'United States'
no: 'Norway'
environments
Puppet supports the concept of "environment".
Environments are supplied by the host or the ENC and decide which "root" of configuration to use.
So you can keep development code completely separated from staging or production code, and play with your development environment day-to-day, and decide when to put that code into production.
caveats
Establishing a hierarchy for hiera is terribly complicated, and you could easily end up making a mess
Hiera as a source of external data is flexible, and has many advantages. Nonetheless, it is easily overkill in many setups where you don't need that degree of flexibility: Think KISS