Ansible playbooks to deploy Ceph, the distributed filesystem.
Guillaume Abrioux 52ff9ce5d1 facts: add a retry on get current fsid task
Sometimes the following task fails:

```
TASK [ceph-facts : get current fsid] *******************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-centos-container-update/roles/ceph-facts/tasks/facts.yml:78
Wednesday 19 June 2019  18:12:49 +0000 (0:00:00.203)       0:02:39.995 ********
fatal: [mon2 -> mon1]: FAILED! => changed=true
  cmd:
  - timeout
  - --foreground
  - -s
  - KILL
  - 600s
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - daemon
  - mon.mon1
  - config
  - get
  - fsid
  delta: '0:00:00.239339'
  end: '2019-06-19 18:12:49.812099'
  msg: non-zero return code
  rc: 22
  start: '2019-06-19 18:12:49.572760'
  stderr: 'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

It is not clear exactly why, since just before this task mon1 appears to
be up; otherwise it would not have passed the task `waiting for the
containerized monitor to join the quorum`.

As a quick fix/workaround, let's add a retry which allows us to get
around this situation:

```
TASK [ceph-facts : get current fsid] *******************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-scenario/roles/ceph-facts/tasks/facts.yml:78
Thursday 20 June 2019  15:35:07 +0000 (0:00:00.201)       0:03:47.288 *********
FAILED - RETRYING: get current fsid (3 retries left).
changed: [mon2 -> mon1] => changed=true
  attempts: 2
  cmd:
  - timeout
  - --foreground
  - -s
  - KILL
  - 600s
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - daemon
  - mon.mon1
  - config
  - get
  - fsid
  delta: '0:00:00.290252'
  end: '2019-06-20 15:35:13.960188'
  rc: 0
  start: '2019-06-20 15:35:13.669936'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    {
        "fsid": "153e159d-7ade-42a7-842c-4d04348b901e"
    }
  stdout_lines: <omitted>
```
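
For illustration, a retry of this kind can be expressed with Ansible's
`retries`/`delay`/`until` task keywords. The snippet below is only a sketch,
not the actual task from `roles/ceph-facts/tasks/facts.yml`: the command is
copied literally from the log output above, while the registered variable
name `current_fsid` and the `retries` and `delay` values are assumptions.

```yaml
# Illustrative sketch only; the real task in ceph-facts may differ.
- name: get current fsid
  command: >
    timeout --foreground -s KILL 600s
    docker exec ceph-mon-mon1
    ceph --cluster ceph daemon mon.mon1 config get fsid
  register: current_fsid        # hypothetical variable name
  delegate_to: mon1             # run on mon1, as in "[mon2 -> mon1]" above
  retries: 3                    # assumed value
  delay: 5                      # assumed value: seconds to wait between attempts
  until: current_fsid.rc == 0   # retry until the command exits successfully
```

With `until` set, Ansible re-runs the command after each failure until the
condition holds or the retries are exhausted, and reports the number of tries
in the `attempts` field, as seen in the output above.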

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 46a2683944)
2019-06-20 14:01:33 -04:00
.github/ISSUE_TEMPLATE Update issue templates 2018-07-12 14:10:15 +02:00
contrib remove ceph-agent role and references 2019-06-17 15:56:00 -04:00
docs switch to ansible 2.8 2019-05-21 09:17:46 +02:00
group_vars align cephfs pool creation 2019-06-18 09:17:13 +02:00
infrastructure-playbooks rolling_update: fail early if cluster state is not OK 2019-06-19 08:41:25 +00:00
library ansible: use 'bool' filter on boolean conditionals 2019-06-07 16:05:51 +02:00
plugins Add installer phase for dashboard roles 2019-06-18 13:14:04 +00:00
profiles Common: Add profiles 2017-07-19 11:50:03 +02:00
roles facts: add a retry on get current fsid task 2019-06-20 14:01:33 -04:00
tests align cephfs pool creation 2019-06-18 09:17:13 +02:00
.gitignore remove ceph-agent role and references 2019-06-17 15:56:00 -04:00
.mergify.yml mergify: need 2 approvals to merge a 'skip ci' PR 2019-02-28 13:07:51 +01:00
.travis.yml travis: Remove galaxy lint rules repository 2019-03-26 11:08:38 +00:00
CONTRIBUTING.md remove ceph-agent role and references 2019-06-17 15:56:00 -04:00
LICENSE Add Ceph Playbook 2014-03-03 19:08:51 +01:00
Makefile makefile: change distro to el8 2019-02-20 08:10:30 +00:00
README-MULTISITE.md rgw multisite: add more than 1 rgw to the master or secondary zone 2019-04-07 10:00:18 +00:00
README.rst Update Documentation example link to 3.0 2018-02-07 16:34:45 +01:00
Vagrantfile Fix units and add ability to have a dedicated instance 2019-06-12 11:48:12 +02:00
ansible.cfg tests: Update ansible ssh_args variable 2019-06-17 16:45:38 +02:00
ceph-ansible.spec.in spec: bring back possibility to install ceph with custom repo 2019-06-10 08:10:26 +02:00
dummy-ansible-hosts Fix Travis 2015-01-21 16:33:26 +01:00
generate_group_vars_sample.sh remove ceph-agent role and references 2019-06-17 15:56:00 -04:00
raw_install_python.yml improve coding style 2019-05-06 15:09:06 +00:00
requirements.txt switch to ansible 2.8 2019-05-21 09:17:46 +02:00
rhcs_edits.txt Update RHCS version with Nautilus 2019-05-13 16:23:24 +02:00
site-container.yml.sample Add installer phase for dashboard roles 2019-06-18 13:14:04 +00:00
site-docker.yml.sample site: symlink site-docker to site-container 2018-11-27 16:47:40 +00:00
site.yml.sample Add installer phase for dashboard roles 2019-06-18 13:14:04 +00:00
test.yml Remove spurious ceph. prefix for roles path in test.yml 2019-01-11 11:10:52 +01:00
tox-dashboard.ini tests: Update ansible ssh_args variable 2019-06-17 16:45:38 +02:00
tox-podman.ini tests: Update ansible ssh_args variable 2019-06-17 16:45:38 +02:00
tox-update.ini tests: Update ansible ssh_args variable 2019-06-17 16:45:38 +02:00
tox.ini tests: Update ansible ssh_args variable 2019-06-17 16:45:38 +02:00
vagrant_variables.yml.sample Fix units and add ability to have a dedicated instance 2019-06-12 11:48:12 +02:00

README.rst

ceph-ansible
============
Ansible playbooks for Ceph, the distributed filesystem.

Please refer to our hosted documentation here: http://docs.ceph.com/ceph-ansible/master/

You can view documentation for our ``stable-*`` branches by replacing ``master`` in the link
above with the name of the branch. For example: http://docs.ceph.com/ceph-ansible/stable-3.0/