ceph-ansible/infrastructure-playbooks/shrink-mon.yml

---
# This playbook removes a Ceph monitor from your cluster.
# It removes the monitor from the cluster along with ALL ITS DATA.
#
# Use it like this:
# ansible-playbook shrink-mon.yml -e mon_to_kill=ceph-mon01
#     Prompts for confirmation before shrinking. The default answer
#     is 'no', which leaves the cluster untouched; 'yes' removes the monitor.
#
# ansible-playbook -e ireallymeanit=yes|no shrink-mon.yml
#     Overrides the prompt using -e option. Can be used in
#     automation scripts to avoid interactive prompt.
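#
# Sketch of a fully non-interactive run. It assumes both variables are
# passed together, since the pre_tasks below fail when mon_to_kill is
# undefined or when ireallymeanit is not 'yes' (ceph-mon01 is a placeholder):
# ansible-playbook shrink-mon.yml -e ireallymeanit=yes -e mon_to_kill=ceph-mon01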


- name: gather facts and check the init system

  hosts:
    - "{{ mon_group_name|default('mons') }}"

  become: true

  tasks:
    - debug: msg="gather facts on all Ceph hosts for later reference"

- name: confirm whether user really meant to remove monitor from the ceph cluster

  hosts:
    - localhost

  become: true

  vars_prompt:
    - name: ireallymeanit
      prompt: Are you sure you want to shrink the cluster?
      default: 'no'
      private: no

  vars:
    mon_group_name: mons

  pre_tasks:
    - name: exit playbook, if only one monitor is present in cluster
      fail:
        msg: "You are about to shrink the only monitor present in the cluster.
              If you really want to do that, please use the purge-cluster playbook."
      when:
        - groups[mon_group_name] | length | int == 1

    - name: exit playbook, if no monitor was given
      fail:
        msg: "mon_to_kill must be declared.
           Exiting shrink-mon playbook, no monitor was removed.
           On the command line when invoking the playbook, you can use
           the -e mon_to_kill=ceph-mon01 argument. You can only remove a single monitor each time the playbook runs."
      when:
        - mon_to_kill is not defined

    - name: exit playbook, if the monitor is not part of the inventory
      fail:
        msg: "It seems that the host given is not part of your inventory, please make sure it is."
      when:
        - mon_to_kill not in groups[mon_group_name]
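      # Note: the check above compares mon_to_kill with the inventory names in
      # the monitor group, so the value presumably has to match an entry exactly.
      # Illustrative INI inventory (hostnames are placeholders, not taken
      # from a real cluster):
      #   [mons]
      #   ceph-mon01
      #   ceph-mon02
      #   ceph-mon03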

    - name: exit playbook, if user did not mean to shrink cluster
      fail:
        msg: "Exiting shrink-mon playbook, no monitor was removed.
           To shrink the cluster, either say 'yes' at the prompt
           or use `-e ireallymeanit=yes` on the command line when
           invoking the playbook."
      when:
        - ireallymeanit != 'yes'

  roles:
    - ceph-defaults
    - ceph-facts

  post_tasks:
    - name: pick a monitor different than the one we want to remove
      set_fact:
        mon_host: "{{ item }}"
      with_items: "{{ groups[mon_group_name] }}"
      when:
        - item != mon_to_kill

    - name: set_fact docker_exec_cmd build docker exec command (containerized)
      set_fact:
        docker_exec_cmd: "docker exec ceph-mon-{{ hostvars[mon_host]['ansible_hostname'] }}"
      when: containerized_deployment

    - name: exit playbook, if we cannot connect to the cluster
      command: "{{ docker_exec_cmd }} timeout 5 ceph --cluster {{ cluster }} health"
      register: ceph_health
      until: ceph_health.stdout.find("HEALTH") > -1
      delegate_to: "{{ mon_host }}"
      retries: 5
      delay: 2

    - name: set_fact mon_to_kill_hostname
      set_fact:
        mon_to_kill_hostname: "{{ hostvars[mon_to_kill]['ansible_hostname'] }}"

    - name: stop monitor service(s)
      service:
        name: ceph-mon@{{ mon_to_kill_hostname }}
        state: stopped
        enabled: no
      delegate_to: "{{ mon_to_kill }}"
      failed_when: false

    - name: purge monitor store
      file:
        path: /var/lib/ceph/mon/{{ cluster }}-{{ mon_to_kill_hostname }}
        state: absent
      delegate_to: "{{ mon_to_kill }}"

    - name: remove monitor from the quorum
      command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} mon remove {{ mon_to_kill_hostname }}"
      failed_when: false
      delegate_to: "{{ mon_host }}"

    # NOTE (leseb): it takes a couple of seconds for the other monitors
    # to notice that one member has left, so the check below is retried
    # with a 10 second delay instead of failing on the first attempt.
    - name: verify the monitor is out of the cluster
      shell: |
        {{ docker_exec_cmd }} ceph --cluster {{ cluster }} -s -f json | python -c 'import sys, json; print(json.load(sys.stdin)["quorum_names"])'
      delegate_to: "{{ mon_host }}"
      failed_when: false
      register: result
      until: mon_to_kill_hostname not in result.stdout
      retries: 2
      delay: 10

    - name: please remove the monitor from your ceph configuration file
      debug:
        msg: "The monitor has been successfully removed from the cluster.
          Please remove the monitor entry from the rest of your ceph configuration files, cluster wide."
      run_once: true
      when:
        - mon_to_kill_hostname not in result.stdout

    - name: fail if monitor is still part of the cluster
      fail:
        msg: "Monitor appears to still be part of the cluster, please check what happened."
      run_once: true
      when:
        - mon_to_kill_hostname in result.stdout

    - name: show ceph health
      command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} -s"
      delegate_to: "{{ mon_host }}"

    - name: show ceph mon status
      command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} mon stat"
      delegate_to: "{{ mon_host }}"