ceph-ansible/roles/ceph-osd/tasks/start_osds.yml

---
- name: get osd id
  shell: |
    ls /var/lib/ceph/osd/ | sed 's/.*-//'
  changed_when: false
  failed_when: false
  check_mode: no
  register: osd_id
  until: osd_id.stdout_lines|length == devices|unique|length
  retries: 10
  when:
    - osd_scenario != 'lvm'

Commit message for the 'when' guard above: "Do not search osd ids if ceph-volume" (2018-01-29 21:28:23 +08:00)

Description of problem: The 'get osd id' task runs through all 10 retries (and their respective timeouts) to make sure that the number of OSDs in the osd directory matches the number of devices. This always happens, regardless of whether the setup and deployment are correct.

Version-Release number of selected component (if applicable): surely the latest, but any ceph-ansible version that contains ceph-volume support is affected.

How reproducible: 100%

Steps to Reproduce:
1. Use ceph-volume (LVM) to deploy OSDs
2. Avoid defining anything in the 'devices' section
3. Deploy the cluster

Actual results:

TASK [ceph-osd : get osd id _uses_shell=True, _raw_params=ls /var/lib/ceph/osd/ | sed 's/.*-//'] ****************************************
task path: /Users/alfredo/python/upstream/ceph/src/ceph-volume/ceph_volume/tests/functional/lvm/.tox/xenial-filestore-dmcrypt/tmp/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:6
FAILED - RETRYING: get osd id (10 retries left).
FAILED - RETRYING: get osd id (9 retries left).
FAILED - RETRYING: get osd id (8 retries left).
FAILED - RETRYING: get osd id (7 retries left).
FAILED - RETRYING: get osd id (6 retries left).
FAILED - RETRYING: get osd id (5 retries left).
FAILED - RETRYING: get osd id (4 retries left).
FAILED - RETRYING: get osd id (3 retries left).
FAILED - RETRYING: get osd id (2 retries left).
FAILED - RETRYING: get osd id (1 retries left).
ok: [osd0] => {
    "attempts": 10,
    "changed": false,
    "cmd": "ls /var/lib/ceph/osd/ | sed 's/.*-//'",
    "delta": "0:00:00.002717",
    "end": "2018-01-21 18:10:31.237933",
    "failed": true,
    "failed_when_result": false,
    "rc": 0,
    "start": "2018-01-21 18:10:31.235216"
}

STDOUT:

0
1
2

Expected results: there are no (or only a few) retries before the OSDs are found.

Additional info: this happens because the check maps the number of "devices" defined for ceph-disk (0 in this case) against the number of OSDs found. Basically this line:

    until: osd_id.stdout_lines|length == devices|unique|length

means that in this case it is trying to ensure the following incorrect condition:

    until: 2 == 0

which can never hold, so every retry is exhausted. The fix is to skip the task entirely for ceph-volume (lvm) deployments, via the 'when: osd_scenario != lvm' guard above.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1537103
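
For reference, the registered osd_id result on the three-OSD host from the log above would look roughly like this (a sketch; only the keys the later tasks consume are shown). OSD data directories follow the conventional <cluster>-<id> naming, e.g. ceph-0, which the sed expression reduces to the bare id:

# Sketch of the registered variable on a hypothetical three-OSD host,
# where /var/lib/ceph/osd/ contains ceph-0, ceph-1 and ceph-2:
osd_id:
  rc: 0
  stdout_lines:
    - "0"
    - "1"
    - "2"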

- name: ensure systemd service override directory exists
  file:
    state: directory
    path: "/etc/systemd/system/ceph-osd@.service.d/"
  when:
    - ceph_osd_systemd_overrides is defined
    - ansible_service_mgr == 'systemd'

- name: add ceph-osd systemd service overrides
  config_template:
    src: "ceph-osd.service.d-overrides.j2"
    dest: "/etc/systemd/system/ceph-osd@.service.d/ceph-osd-systemd-overrides.conf"
    config_overrides: "{{ ceph_osd_systemd_overrides | default({}) }}"
    config_type: "ini"
  when:
    - ceph_osd_systemd_overrides is defined
    - ansible_service_mgr == 'systemd'
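
The two tasks above only fire when ceph_osd_systemd_overrides is defined. As an illustration of how they fit together (the variable name comes from the tasks above; the [Service] options themselves are invented for this sketch), a definition like the following would be rendered by config_template into an INI drop-in, with each top-level key becoming a section:

# Hypothetical group_vars entry; Restart/RestartSec are example values only:
ceph_osd_systemd_overrides:
  Service:
    Restart: always
    RestartSec: 10s
# With config_type "ini" this would render
# /etc/systemd/system/ceph-osd@.service.d/ceph-osd-systemd-overrides.conf as:
#   [Service]
#   Restart=always
#   RestartSec=10s
# which systemd layers on top of the stock ceph-osd@.service unit.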

- name: ensure osd daemons are started
  service:
    name: ceph-osd@{{ item }}
    state: started
    enabled: true
  with_items: "{{ (osd_id|default({})).stdout_lines|default([]) }}"
  changed_when: false
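
A closing note on the defensive with_items expression, sketched under the assumptions used earlier:

# When osd_scenario == 'lvm', the 'get osd id' task is skipped, so the
# registered osd_id dict has no stdout_lines key. The chained defaults
# then collapse the loop list to empty:
#   (osd_id|default({})).stdout_lines|default([])  ->  []
# and this task simply iterates zero times. On a ceph-disk host where
# stdout_lines is ["0", "1", "2"], it runs once per id, enabling and
# starting ceph-osd@0, ceph-osd@1 and ceph-osd@2.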