ceph-ansible/rolling_update.yml

---
# This playbook does a rolling update for all the Ceph services
# Change the value of serial: to adjust the number of servers to be updated.
#
# The roles that apply to the Ceph hosts will be applied: ceph-common,
# ceph-mon, ceph-osd, ceph-mds and ceph-rgw. So any changes to configuration, package updates, etc.,
# will be applied as part of the rolling update process.
#
# /!\ DO NOT FORGET TO CHANGE THE RELEASE VERSION FIRST! /!\
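#
# A typical invocation might look like the following (the inventory file name
# is hypothetical, adjust it to your setup):
#
#   ansible-playbook -i hosts rolling_update.yml
#
# The target release is expected to be set beforehand, e.g. via the
# ceph_stable_release variable in your group_vars (variable name assumed from
# the ceph-common role defaults; double-check it for your version).

# The first play below only gathers facts on every host, so that the
# serialized plays that follow can reference facts from hosts outside the
# current batch (e.g. when delegating to groups.mons[0]).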
- hosts: all
  tasks:
    - debug: msg="gather facts on all hosts for later reference"
- hosts: mons
  serial: 1
  sudo: True
  vars:
    upgrade_ceph_packages: True
  pre_tasks:
    - name: Compress the store as much as possible
      command: ceph tell mon.{{ ansible_hostname }} compact
  roles:
    - ceph-common
    - ceph-mon
  post_tasks:
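    # The mon data directory does not reliably contain a "sysvinit" marker
    # file on sysvinit hosts, so sysvinit is detected the same way Ansible
    # 2.0's service module does it: by globbing for the /etc/rc?.d/S??ceph
    # start symlink and following it.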
    - name: Check if sysvinit
      stat: >
        path=/etc/rc?.d/S??ceph
        follow=yes
      register: monsysvinit
    - name: Check if upstart
      stat: >
        path=/var/lib/ceph/mon/ceph-{{ ansible_hostname }}/upstart
      register: monupstart
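    # The store was compacted above, so the daemon is restarted afterwards.
    # On sysvinit the init script does not reliably accept a "mon" argument
    # (it failed on RHEL 7), hence the plain service restart below.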
    - name: Restart the monitor after compaction (Upstart)
      service: >
        name=ceph-mon
        state=restarted
        args=id={{ ansible_hostname }}
      when: monupstart.stat.exists == True
    - name: Restart the monitor after compaction (Sysvinit)
      service: >
        name=ceph
        state=restarted
      when: monsysvinit.stat.exists == True
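    # The monitor is restarted a second time below regardless of the init
    # system; on Red Hat family hosts the "mon" argument is skipped because
    # the init script there does not accept it.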
    - name: restart monitor(s)
      service: >
        name=ceph
        state=restarted
        args=mon
      when: ansible_os_family != "RedHat"
    - name: restart monitor(s)
      service: >
        name=ceph
        state=restarted
      when: ansible_os_family == "RedHat"
    - name: select a running monitor
      set_fact: mon_host={{ item }}
      with_items: groups.mons
      when: item != inventory_hostname
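    # Ask another, still-running monitor whether the freshly restarted one
    # has rejoined the quorum: grep the quorum line of "ceph -s" for this
    # host's name, retrying a few times before giving up.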
    - name: Waiting for the monitor to join the quorum...
      shell: >
        ceph -s | grep monmap | sed 's/.*quorum//' | egrep -sq {{ ansible_hostname }}
      register: result
      until: result.rc == 0
      retries: 5
      delay: 10
      delegate_to: "{{ mon_host }}"
- hosts: osds
  serial: 1
  sudo: True
  vars:
    upgrade_ceph_packages: True
  pre_tasks:
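    # noout keeps the restarted OSDs from being marked out (and thus avoids
    # needless rebalancing while they are down); noscrub and nodeep-scrub
    # pause scrubbing during the upgrade. The flags are removed again in
    # post_tasks once the PGs are clean.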
    - name: Set OSD flags
      command: ceph osd set {{ item }}
      with_items:
        - noout
        - noscrub
        - nodeep-scrub
      delegate_to: "{{ groups.mons[0] }}"
  roles:
    - ceph-common
    - ceph-osd
  post_tasks:
    - name: Check if sysvinit
      shell: stat /var/lib/ceph/osd/ceph-*/sysvinit
      register: osdsysvinit
      failed_when: false
    - name: Check if upstart
      shell: stat /var/lib/ceph/osd/ceph-*/upstart
      register: osdupstart
      failed_when: false
    - name: Restart the OSDs (Upstart)
      service: >
        name=ceph-osd-all
        state=restarted
      when: osdupstart.rc == 0
    - name: Restart the OSDs (Sysvinit)
      service: >
        name=ceph
        state=restarted
      when: osdsysvinit.rc == 0
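    # Compare the total PG count reported by "ceph pg stat" with the number
    # of PGs in active+clean, and additionally require overall health to be
    # HEALTH_OK or HEALTH_WARN before moving on to the next OSD host.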
    - name: Waiting for clean PGs...
      shell: >
        test "$(ceph pg stat | sed 's/^.*pgs://;s/active+clean.*//;s/ //')" -eq "$(ceph pg stat | sed 's/pgs.*//;s/^.*://;s/ //')" && ceph health | egrep -sq "HEALTH_OK|HEALTH_WARN"
      register: result
      until: result.rc == 0
      retries: 10
      delay: 10
      delegate_to: "{{ groups.mons[0] }}"
    - name: Unset OSD flags
      command: ceph osd unset {{ item }}
      with_items:
        - noout
        - noscrub
        - nodeep-scrub
      delegate_to: "{{ groups.mons[0] }}"
- hosts: mdss
  serial: 1
  sudo: True
  vars:
    upgrade_ceph_packages: True
  roles:
    - ceph-common
    - ceph-mds
  post_tasks:
    - name: Check if sysvinit
      stat: >
        path=/var/lib/ceph/mds/ceph-{{ ansible_hostname }}/sysvinit
      register: mdssysvinit
    - name: Check if upstart
      stat: >
        path=/var/lib/ceph/mds/ceph-{{ ansible_hostname }}/upstart
      register: mdsupstart
    - name: Restart the metadata server (Upstart)
      service: >
        name=ceph-mds
        state=restarted
        args=id={{ ansible_hostname }}
      when: mdsupstart.stat.exists == True
    - name: Restart the metadata server (Sysvinit)
      service: >
        name=ceph
        state=restarted
        args=mds
      when: mdssysvinit.stat.exists == True
- hosts: rgws
  serial: 1
  sudo: True
  vars:
    upgrade_ceph_packages: True
  roles:
    - ceph-common
    - ceph-rgw
  post_tasks:
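    # Which services need a restart depends on the configured frontend:
    # civetweb runs embedded in radosgw, while the apache frontend also
    # requires apache2 to be restarted.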
    - name: restart rados gateway server(s)
      service: >
        name={{ item }}
        state=restarted
      with_items:
        - radosgw
      when: radosgw_frontend == 'civetweb'
    - name: restart rados gateway server(s)
      service: >
        name={{ item }}
        state=restarted
      with_items:
        - apache2
        - radosgw
      when: radosgw_frontend == 'apache'