ceph-ansible/roles/ceph-osd/tasks/start_osds.yml

---
- name: get osd id
  shell: |
    ls /var/lib/ceph/osd/ | sed 's/.*-//'
  changed_when: false
  failed_when: false
  check_mode: no
  register: osd_id
  until: osd_id.stdout_lines|length == devices|unique|length
  retries: 10
  when:
    - osd_scenario != 'lvm'

Commit message for the 'when' guard above: "Do not search osd ids if ceph-volume" (2018-01-29 21:28:23 +08:00)

Description of problem: The 'get osd id' task runs through all 10 retries (and their respective timeouts) to make sure that the number of OSDs in the osd directory matches the number of devices. This always happens, regardless of whether the setup and deployment are correct.

Version-Release number of selected component (if applicable): surely the latest, but any ceph-ansible version that contains ceph-volume support is affected.

How reproducible: 100%

Steps to Reproduce:
1. Use ceph-volume (LVM) to deploy OSDs
2. Avoid defining anything in the 'devices' section
3. Deploy the cluster

Actual results:

TASK [ceph-osd : get osd id _uses_shell=True, _raw_params=ls /var/lib/ceph/osd/ | sed 's/.*-//'] ****************************************
task path: /Users/alfredo/python/upstream/ceph/src/ceph-volume/ceph_volume/tests/functional/lvm/.tox/xenial-filestore-dmcrypt/tmp/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:6
FAILED - RETRYING: get osd id (10 retries left).
FAILED - RETRYING: get osd id (9 retries left).
FAILED - RETRYING: get osd id (8 retries left).
FAILED - RETRYING: get osd id (7 retries left).
FAILED - RETRYING: get osd id (6 retries left).
FAILED - RETRYING: get osd id (5 retries left).
FAILED - RETRYING: get osd id (4 retries left).
FAILED - RETRYING: get osd id (3 retries left).
FAILED - RETRYING: get osd id (2 retries left).
FAILED - RETRYING: get osd id (1 retries left).
ok: [osd0] => {
    "attempts": 10,
    "changed": false,
    "cmd": "ls /var/lib/ceph/osd/ | sed 's/.*-//'",
    "delta": "0:00:00.002717",
    "end": "2018-01-21 18:10:31.237933",
    "failed": true,
    "failed_when_result": false,
    "rc": 0,
    "start": "2018-01-21 18:10:31.235216"
}

STDOUT:

0
1
2

Expected results: there are no (or only a few) retries before the OSDs are found.

Additional info: this happens because the check maps the number of "devices" defined for ceph-disk (0 in this case) against the number of OSDs found. Basically this line:

    until: osd_id.stdout_lines|length == devices|unique|length

means that in this case it is trying to ensure the following incorrect condition:

    until: 2 == 0

which can never hold, so every retry is exhausted. The fix is to skip the task entirely for ceph-volume (lvm) deployments, via the 'when: osd_scenario != lvm' guard above.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1537103
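
For reference, the registered osd_id result on the three-OSD host from the log above would look roughly like this (a sketch; only the keys the later tasks consume are shown). OSD data directories follow the conventional <cluster>-<id> naming, e.g. ceph-0, which the sed expression reduces to the bare id:

# Sketch of the registered variable on a hypothetical three-OSD host,
# where /var/lib/ceph/osd/ contains ceph-0, ceph-1 and ceph-2:
osd_id:
  rc: 0
  stdout_lines:
    - "0"
    - "1"
    - "2"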

- name: ensure systemd service override directory exists
  file:
    state: directory
    path: "/etc/systemd/system/ceph-osd@.service.d/"
  when:
    - ceph_osd_systemd_overrides is defined
    - ansible_service_mgr == 'systemd'

- name: add ceph-osd systemd service overrides
  config_template:
    src: "ceph-osd.service.d-overrides.j2"
    dest: "/etc/systemd/system/ceph-osd@.service.d/ceph-osd-systemd-overrides.conf"
    config_overrides: "{{ ceph_osd_systemd_overrides | default({}) }}"
    config_type: "ini"
  when:
    - ceph_osd_systemd_overrides is defined
    - ansible_service_mgr == 'systemd'
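
The two tasks above only fire when ceph_osd_systemd_overrides is defined. As an illustration of how they fit together (the variable name comes from the tasks above; the [Service] options themselves are invented for this sketch), a definition like the following would be rendered by config_template into an INI drop-in, with each top-level key becoming a section:

# Hypothetical group_vars entry; Restart/RestartSec are example values only:
ceph_osd_systemd_overrides:
  Service:
    Restart: always
    RestartSec: 10s
# With config_type "ini" this would render
# /etc/systemd/system/ceph-osd@.service.d/ceph-osd-systemd-overrides.conf as:
#   [Service]
#   Restart=always
#   RestartSec=10s
# which systemd layers on top of the stock ceph-osd@.service unit.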

- name: ensure osd daemons are started
  service:
    name: ceph-osd@{{ item }}
    state: started
    enabled: true
  with_items: "{{ (osd_id|default({})).stdout_lines|default([]) }}"
  changed_when: false
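
A closing note on the defensive with_items expression, sketched under the assumptions used earlier:

# When osd_scenario == 'lvm', the 'get osd id' task is skipped, so the
# registered osd_id dict has no stdout_lines key. The chained defaults
# then collapse the loop list to empty:
#   (osd_id|default({})).stdout_lines|default([])  ->  []
# and this task simply iterates zero times. On a ceph-disk host where
# stdout_lines is ["0", "1", "2"], it runs once per id, enabling and
# starting ceph-osd@0, ceph-osd@1 and ceph-osd@2.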