Hold my beer and watch this!
Upgrading OpenStack from Havana to Juno
Jesse Keating
- @iamjkeating
- Lover of complicated problems (challenges)
- Upgrading OpenStack since Grizzly
- Developed an appreciation for dark heavy beer
- .... since Grizzly
OpenStack Upgrades
+5 Sword of Ansible
COW PARADE – BRAVEHEART
Image by Richard Cross
Upgrade Styles
- Micro upgrades
- Macro upgrades
Orchestration
- Eventual consistency need not apply
- Ordered set of actions to accomplish
But first,
Database upgrade!
Percona XtraDB Cluster(****)
- Two node cluster
- One arbiter
- Based on MySQL 5.5
- MySQL 5.5 can't handle Neutron migration
- Upgrade to 5.6
DB Hosts (2)
- Stop DB
- Remove packages
- Put updated config in
- Modify compat settings
- Turn off replication
- Install new packages
- Run upgrade migration
- Restore replication
- Restart DB (again)
- Repeat on other host
- Remove compat settings
- Restart DB (again again)
Arbiter
- Purge old package/config
- Fix filesystem perms
- Run role as if new
- name: upgrade percona cluster in compat mode
  hosts: db
  max_fail_percentage: 1
  tags: dbupgrade
  serial: 1

  tasks:
    - name: check db version
      command: mysql -V
      changed_when: False
      register: mysqlver

    - include: upgrade-db-cluster.yml
      when: not mysqlver.stdout|search('Distrib 5\.6')

- name: upgrade percona arbiter
  hosts: db_arbiter
  max_fail_percentage: 1
  tags: dbupgrade

  pre_tasks:
    - name: purge old garbd and configs
      apt: name=percona-xtradb-cluster-garbd-2.x state=absent
           purge=true

    - name: remove old garb config
      file: path=/etc/default/garb state=absent

    - name: garbd.log permissions
      file: path=/var/log/garbd.log owner=nobody state=touch

  roles:
    - role: percona-common
    - role: percona-arbiter

- name: remove percona compat settings
  hosts: db
  max_fail_percentage: 1
  tags: dbupgrade
  serial: 1

  tasks:
    - name: remove compat settings
      lineinfile: regexp="{{ item }}" state=absent
                  dest=/etc/mysql/conf.d/replication.cnf
      with_items:
        - '^wsrep_provider_options\s*='
        - '^log_bin_use_v1_row_events\s*='
        - '^gtid_mode\s*='
        - '^binlog_checksum\s*='
        - '^read_only\s*='
      notify: restart mysql

  handlers:
    - name: restart mysql
      service: name=mysql state=restarted
# upgrade-db-cluster.yml (included by the first play)
# stop all the dbs to prevent writes
- name: stop databases
  service: name=mysql state=stopped

- name: remove old packages
  apt: name={{ item }} state=absent
  with_items:
    - percona-xtradb-cluster-server-5.5
    - percona-xtradb-cluster-galera-2.x
    - percona-xtradb-cluster-common-5.5
    - percona-xtradb-cluster-client-5.5

- name: configure my.cnf
  template: src=roles/percona-server/templates/etc/my.cnf
            dest=/etc/my.cnf mode=0644
  when: ansible_distribution_version == "12.04"
  notify:
    - restart mysql server

- name: configure my.cnf
  template: src=roles/percona-server/templates/etc/my.cnf
            dest=/etc/mysql/my.cnf mode=0644
  when: ansible_distribution_version != "12.04"
  notify:
    - restart mysql server

- name: install mysql config files
  template: src=roles/percona-server/templates/etc/mysql/conf.d/{{ item }}
            dest=/etc/mysql/conf.d/{{ item }}
            mode=0644
  with_items:
    - bind-inaddr-any.cnf
    - tuning.cnf
    - utf8.cnf

- name: adjust replication for compatibility and new features
  lineinfile: regexp="{{ item.value.regexp }}"
              line="{{ item.value.line }}"
              dest=/etc/mysql/conf.d/replication.cnf state=present
  with_dict:
    provider:
      regexp: '^wsrep_provider\s*='
      line: "wsrep_provider = none"
    provider_options:
      regexp: '^wsrep_provider_options\s*='
      line: 'wsrep_provider_options="socket.checksum=1"'
    log_bin_v1:
      regexp: '^log_bin_use_v1_row_events\s*='
      line: 'log_bin_use_v1_row_events=1'
    gtid:
      regexp: '^gtid_mode\s*='
      line: 'gtid_mode=0'
    binlog:
      regexp: '^binlog_checksum\s*='
      line: 'binlog_checksum=None'
    wsrep_method:
      regexp: '^wsrep_sst_method\s*='
      line: "wsrep_sst_method = xtrabackup-v2"
    read_only:
      regexp: '^read_only\s*='
      line: "read_only = ON"

- name: install new packages
  apt: name=percona-xtradb-cluster-56

- name: run mysql_upgrade
  command: mysql_upgrade

- name: restore galera wsrep provider
  lineinfile: regexp='^wsrep_provider\s*='
              line="wsrep_provider = /usr/lib/libgalera_smm.so"
              dest=/etc/mysql/conf.d/replication.cnf

- name: restart mysql to rejoin the cluster
  service: name=mysql state=restarted
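After that final restart it can help to confirm the node has actually rejoined before `serial: 1` moves on to the next host. The following is a minimal sketch, not part of the original task file. It assumes the mysql client can authenticate without arguments (for example through a root ~/.my.cnf); the task name, the `wsrep_state` variable, and the retry counts are illustrative.

```yaml
# Hypothetical follow-up task; not from the original upgrade-db-cluster.yml.
# Polls the Galera status variable until the node reports "Synced".
- name: wait for node to report Synced
  command: mysql -N -B -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"
  register: wsrep_state
  until: "'Synced' in wsrep_state.stdout"
  retries: 30        # assumed: up to 30 attempts
  delay: 10          # assumed: 10 seconds between attempts
  changed_when: False
```

If the retries run out the task fails, which together with `max_fail_percentage: 1` stops the run before the second database host is touched.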
Upgrade Rabbit?
On to OpenStack!
Repeating Pattern
- New code + config
- Stop old code
- Migrate database
- Start new code
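The four steps above can be sketched as a single play. This is a generic illustration for a hypothetical service called "foo", not one of the plays from the talk; the package, file, service, and command names are all placeholders.

```yaml
# Generic sketch of the repeating pattern for a hypothetical service "foo".
# Every package, path, and command name here is a placeholder.
- name: upgrade foo
  hosts: controller
  max_fail_percentage: 1

  tasks:
    # 1. New code + config
    - name: install new code
      apt: name=foo state=latest

    - name: install new config
      template: src=foo.conf dest=/etc/foo/foo.conf mode=0644

    # 2. Stop old code
    - name: stop old code
      service: name=foo-api state=stopped

    # 3. Migrate database (once, not once per controller)
    - name: migrate database
      command: foo-manage db_sync
      run_once: true

    # 4. Start new code
    - name: start new code
      service: name=foo-api state=started
```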
Order
- Matters if you care
- Seems to work in our order
- Minimizes disruption
- Avoid inter-project version deps
Our Order
- glance
- cinder
- nova
- neutron
- swift
- keystone
- horizon
Shortcuts
- Neutron
- ml2
- Linux bridge
- Newer kernel
Strategy
- Reuse deployment code
- Delay restarts
- Fail immediately
- Non-destructive re-runs
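Reusing deployment roles while delaying restarts implies the roles have to honor variables such as `restart` and `force_sync`. One way a role could do that is sketched below for a hypothetical service "foo". The `force_sync` condition mirrors the filter chain used in the glance tasks later in the deck, while the `restart` check on the handler is an assumption about how such a switch might be wired, not the confirmed Ursula implementation.

```yaml
# Hypothetical role fragments for a service "foo"; all names are placeholders.

# roles/foo/tasks/main.yml
- name: foo config
  template: src=foo.conf dest=/etc/foo/foo.conf mode=0644
  notify: restart foo

- name: sync foo database
  command: foo-manage db_sync
  # skipped on a normal deploy, forced by the upgrade play
  when: force_sync|default('false')|bool
  run_once: true

# roles/foo/handlers/main.yml
- name: restart foo
  service: name=foo-api state=restarted
  # assumed switch: the upgrade play passes restart: False to hold restarts back
  when: restart|default('True')|bool
```

With defaults like these, a plain deployment run behaves as before, and only the upgrade play changes the behavior by passing `force_sync: true` and `restart: False`.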
Glance
No surprises
- name: upgrade glance
  hosts: controller
  max_fail_percentage: 1
  tags: glance

  roles:
    - role: glance
      force_sync: true
      restart: False
      database_create:
        changed: false
# glance role tasks
- name: glance config
  template: src={{ item }} dest=/etc/glance mode=0644
  with_fileglob: ../templates/etc/glance/*
  notify:
    - restart glance services

- name: stop glance services before db sync
  service: name={{ item }} state=stopped
  with_items:
    - glance-api
    - glance-registry
  when: database_create.changed or force_sync|default('false')|bool

- name: sync glance database
  command: glance-manage db_sync
  when: database_create.changed or force_sync|default('false')|bool
  run_once: true
  changed_when: true
  notify:
    - restart glance services
  # we want this to always be changed so that it can notify the service restart

- meta: flush_handlers

- name: start glance services
  service: name={{ item }} state=started
  with_items:
    - glance-api
    - glance-registry
Cinder
More complicated
# Cinder block
- name: stage cinder data software
  hosts: cinder_volume
  max_fail_percentage: 1
  tags:
    - cinder
    - cinder-volume

  roles:
    - role: cinder-data
      restart: False

    - role: stop-services
      services:
        - cinder-volume

- name: stage cinder control software and stop services
  hosts: controller
  max_fail_percentage: 1
  tags:
    - cinder
    - cinder-control

  roles:
    - role: cinder-control
      force_sync: true
      restart: False
      database_create:
        changed: false

- name: start cinder data services
  hosts: cinder_volume
  max_fail_percentage: 1
  tags:
    - cinder
    - cinder-volume

  tasks:
    - name: start cinder data services
      service: name=cinder-volume state=started

- name: ensure cinder v2 endpoint
  hosts: controller[0]
  max_fail_percentage: 1
  tags:
    - cinder
    - cinder-endpoint

  tasks:
    - name: cinder v2 endpoint
      keystone_service: name={{ item.name }}
                        type={{ item.type }}
                        description='{{ item.description }}'
                        public_url={{ item.public_url }}
                        internal_url={{ item.internal_url }}
                        admin_url={{ item.admin_url }}
                        region=RegionOne
                        auth_url={{ endpoints.auth_uri }}
                        tenant_name=admin
                        login_user=provider_admin
                        login_password={{ secrets.provider_admin_password }}
      with_items: keystone.services
      when: endpoints[item.name] is defined and endpoints[item.name]
            and item.name == 'cinderv2'
# stop-services role tasks
- name: stop services
  service: name={{ item }} state=stopped
  with_items: services
Nova
Pretty straightforward
# Nova block
- name: stage nova compute
  hosts: compute
  max_fail_percentage: 1
  tags:
    - nova
    - nova-data

  roles:
    - role: nova-data
      restart: False
      when: ironic.enabled == False

    - role: stop-services
      services:
        - nova-compute
      when: ironic.enabled == False

- name: stage nova control and stop services
  hosts: controller
  max_fail_percentage: 1
  tags:
    - nova
    - nova-control

  roles:
    - role: nova-control
      force_sync: true
      restart: False
      database_create:
        changed: false

- name: start nova compute
  hosts: compute
  max_fail_percentage: 1
  tags:
    - nova
    - nova-data

  tasks:
    - name: start nova compute
      service: name=nova-compute state=started
      when: ironic.enabled == False
Neutron
DB Stamp
# Neutron block
- name: stage neutron core data
  hosts: compute:network
  max_fail_percentage: 1
  tags:
    - neutron
    - neutron-data

  roles:
    - role: neutron-data
      restart: False

- name: stage neutron network
  hosts: network
  max_fail_percentage: 1
  tags:
    - neutron
    - neutron-network

  roles:
    - role: neutron-data-network
      restart: False
- name: upgrade neutron control plane
  hosts: controller
  max_fail_percentage: 1
  tags:
    - neutron
    - neutron-control

  pre_tasks:
    - name: check db version
      command: neutron-db-manage --config-file /etc/neutron/neutron.conf
               --config-file /etc/neutron/plugins/ml2/ml2_plugin.ini
               current
      register: neutron_db_ver
      run_once: True

    - name: stamp neutron to havana
      command: neutron-db-manage --config-file /etc/neutron/neutron.conf
               --config-file /etc/neutron/plugins/ml2/ml2_plugin.ini
               stamp havana
      when: not neutron_db_ver.stdout|search('juno')
      run_once: True

  roles:
    - role: neutron-control
      force_sync: true
      restart: False
      database_create:
        changed: false
- name: restart neutron data service
  hosts: compute:network
  max_fail_percentage: 1
  tags:
    - neutron
    - neutron-data

  tasks:
    - name: restart neutron data service
      service: name=neutron-linuxbridge-agent state=restarted

- name: restart neutron data network services
  hosts: network
  max_fail_percentage: 1
  tags:
    - neutron
    - neutron-network

  tasks:
    - name: restart neutron data network agent services
      service: name={{ item }} state=restarted
      with_items:
        - neutron-l3-agent
        - neutron-dhcp-agent
        - neutron-metadata-agent
Swift
Even easier!
- name: upgrade swift
  hosts: swiftnode
  any_errors_fatal: true
  tags: swift

  roles:
    - role: haproxy
      haproxy_type: swift
      tags: ['openstack', 'swift', 'control']

    - role: swift-object
      tags: ['openstack', 'swift', 'data']

    - role: swift-account
      tags: ['openstack', 'swift', 'data']

    - role: swift-container
      tags: ['openstack', 'swift', 'data']

    - role: swift-proxy
      tags: ['openstack', 'swift', 'control']
Keystone
No sweat
- name: upgrade keystone
  hosts: controller
  max_fail_percentage: 1
  tags: keystone

  roles:
    - role: keystone
      force_sync: true
      restart: False
      database_create:
        changed: False
Horizon
It's just a webapp!
- name: upgrade horizon
  hosts: controller
  max_fail_percentage: 1
  tags: horizon

  roles:
    - role: horizon
Gotchas
Keystone PKI tokens
- Not actually faster
- Break services until restart
Neutron / Nova vif_plugging_is_fatal
- Version dependency
- Breaks builds until both upgraded
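One possible way to bridge the window between the Nova and Neutron upgrades is to relax VIF plugging on the compute nodes until both sides are on the new release. The sketch below is an assumption, not something shown in the talk. It uses Nova's `vif_plugging_is_fatal` and `vif_plugging_timeout` options in the `[DEFAULT]` section of nova.conf; the play, task, and handler names are hypothetical.

```yaml
# Hypothetical interim play; not from the original upgrade playbook.
# Lets instance builds proceed without waiting for Neutron's vif-plugged event.
- name: relax vif plugging until neutron is upgraded
  hosts: compute

  tasks:
    - name: do not fail builds on missing vif-plugged events
      ini_file: dest=/etc/nova/nova.conf section=DEFAULT
                option={{ item.option }} value={{ item.value }}
      with_items:
        - { option: vif_plugging_is_fatal, value: 'False' }
        - { option: vif_plugging_timeout, value: '0' }
      notify: restart nova compute

  handlers:
    - name: restart nova compute
      service: name=nova-compute state=restarted
```

Once Neutron is upgraded as well, the two options would be put back to the values the deployment normally uses.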
Bloated Nova
Deleted Instances
- Data still there
- Migrations longer
- No supported tool to trim
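Before running the Nova migrations it may be worth measuring how much soft-deleted data is in the way. Nova marks a deleted instance by setting the row's `deleted` column to a non-zero value instead of removing the row, so a read-only count gives a rough idea. The sketch below assumes the database is named `nova` and that the mysql client can authenticate without arguments; the play, task, and variable names are hypothetical.

```yaml
# Hypothetical read-only check; not from the original playbooks.
# Assumes a database named "nova" and passwordless mysql client access.
- name: size up soft-deleted nova instances
  hosts: db

  tasks:
    - name: count soft-deleted instance rows
      command: mysql -N -B -e "SELECT COUNT(*) FROM instances WHERE deleted != 0" nova
      register: deleted_instances
      changed_when: False
      run_once: true

    - name: report the count
      debug: msg="{{ deleted_instances.stdout }} soft-deleted rows in nova.instances"
      run_once: true
```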
Resources
- Ursula https://github.com/blueboxgroup/ursula/
- Twitter @iamjkeating
- IRC #openstack-operators
Questions?
Blue Box Booth #T5!
Come by our booth, get bling, see our schedule of sessions, chat with awesome people!