The restart_osd_daemon.sh generated from the j2 template
contains a python call which uses 'print x' instead of
'print(x)'. Add the missing parentheses to make this call
compatible with both 2 and 3.
Also add parentheses to other python print calls found
in roles/ceph-client/defaults/main.yml and
infrastructure-playbooks/cluster-os-migration.yml.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1671721
Signed-off-by: John Fulton <fulton@redhat.com>
1ac94c048f introduced the support of
multiple rgw instances on a single host but somehow has missed to
implement this feature in rolling_update.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
With this, we could have multiple rgw instances on a single host
with a single run, don't have to use rgw-standalone.yml which does not
seems able to bind ports separately.
If you want to have multiple rgw instances, just change 'radosgw_instances'
to the number you want, which defaults to 1.
Not compatible with Multi-Site yet.
Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
There is no need to enforce `serial: 1` on client nodes.
Let's make it parameterizable by introducing a new *extra* variable
`client_update_batch`, if not filled this will default to `{{
ansible_forks }}`.
NOTE: this is only usable as an extra variable passed with
`-e client_update_batch=<num>`
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650184
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This task has to be called after the role `ceph-defaults` has been
played, otherwise, `mon_group_name` will never be known.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speedup
the playbook.
Closes: #3282
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
the OSD part of the purge delegates commands on monitor node, we need to
gather monitors facts to know the `ansible_hostname` fact that is used
in the `docker_exec_cmd` fact.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Recently we introduced the default collocation of mon/mgr without the
need of a dedicated mgrs section. This means we have to stop the mgr
process on that machine too.
Signed-off-by: Sébastien Han <seb@redhat.com>
Recently we introduced the collocation of mon and mgr by default, so we
don't need to have an explicit mgrs section for this. This means we have
to remove the mgr container on the mon machines too.
Signed-off-by: Sébastien Han <seb@redhat.com>
It's useful when running on CI to see what might remain on the machines.
So we list all the containers and images. We expect the list to be
empty.
We fail if we see containers running.
Signed-off-by: Sébastien Han <seb@redhat.com>
This commits adds the support for purging cluster that were deployed
with ceph-volume. It also separates nicely with a block intruction the
work to do when lvm is used or not.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since sharing variables amongst roles has been made default since
Ansible 2.6, private option has been deprecated; so stop using it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Json is a type structure which is always typed as a string, where before
this we were declaring a dict, which is not a json valid structure.
Signed-off-by: Sébastien Han <seb@redhat.com>
add iscsi support for both non containerized and containerized
deployment in purge playbooks.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This will speed up the deployment and also deploy mon and mgr collocated
just as recommended.
This won't prevent you of adding more and dedicaded machines for mgr if
needed.
Signed-off-by: Sébastien Han <seb@redhat.com>
This failure condition was only valid at the time where clusters didn't
have ceph-mgr activated. Now since we collocate the ceph-mgr with the
mon by default, if the daemon wasn't present it will be created during
the upgrade.
Signed-off-by: Sébastien Han <seb@redhat.com>
try to create the potentially missing keys only on monitors that are
actually running.
The current node being played is stopped before this task.
By the way, delegating the command on all nodes but the current node
being played ensures that the generated keys will be present on all
monitors.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
So we can avoid the following failure:
The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"]
' failed. The error was: No JSON object could be decoded
We just need to set a default, the next iteration will have a more
complete json since the command won't fail.
Signed-off-by: Sébastien Han <seb@redhat.com>
when upgrading from RHCS 2.5 to 3.2, it fails because the task `create
ceph mgr keyring(s) when mon is containerized` has a when condition
`inventory_hostname == groups[mon_group_name]|last`.
First, this is incorrect because `inventory_hostname` is referring to a
mgr node, it means this condition would have never been satisfied.
Then, this condition + `serial: 1` makes the mgr keyring creating skipped on
the first node. Further, the `ceph-mgr` role tries to copy the mgr
keyring (it's not aware we are running `serial: 1`) this leads to a
failure like the following:
```
TASK [ceph-mgr : copy ceph keyring(s) if needed] ***************************************************************************************************************************************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml:10
Tuesday 27 November 2018 12:03:34 +0000 (0:00:00.296) 0:11:01.290 ******
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'
failed: [magna021] (item={u'dest': u'/var/lib/ceph/mgr/local-magna021/keyring', u'name': u'/etc/ceph/local.mgr.magna021.keyring', u'copy_key': True}) => {"changed": false, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/local-magna021/keyring", "name": "/etc/ceph/local.mgr.magna021.keyring"}, "msg": "Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'"}
```
The ceph_key module is idempotent, so there is no need to have such a
condition.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1649957
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Use the new way to create keys on containerized env as introduced by: 1098b71bda90db3dad19ac179f0ba900ccb0f953
Signed-off-by: Sébastien Han <seb@redhat.com>
Use podman or docker wether they are available or not. podman will be
prioritized over docker if present.
Signed-off-by: Sébastien Han <seb@redhat.com>
`hostvars[groups[mon_host]]['ansible_hostname']` seems to be a typo.
That should be `hostvars[mon_host]['ansible_hostname']`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
each monitor node should select another monitor which isn't itself.
Otherwise, one node in the monitor group won't set this fact and causes
failure.
Typical error:
```
TASK [create potentially missing keys (rbd and rbd-mirror) when mon is containerized] ***
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-update_docker_cluster/rolling_update.yml:200
Thursday 22 November 2018 14:02:30 +0000 (0:00:07.493) 0:02:50.005 *****
fatal: [mon1]: FAILED! => {}
MSG:
The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'mon2'
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
During an upgrade ceph won't create keys that were not existing on the
previous version. So after the upgrade of let's Jewel to Luminous, once
all the monitors have the new version they should get or create the
keys. It's ok to have the task fails, especially for the rbd-mirror
key, which only appears in Nautilus.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572
Signed-off-by: Sébastien Han <seb@redhat.com>
It's easier lookup a directoriy instead of the block devices,
especially because of ceph-volume and ceph-disk have a different way to
handle devices.
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this commit we were only disabling ceph-osd units, but forgot
the ceph.target which is controlling everything and will restart the
ceph-osd units at each reboot.
Now that everything gets disabled there won't be any conflicts between
the old non-container and the new container units.
Signed-off-by: Sébastien Han <seb@redhat.com>
If we mask it we won't be able to start the OSD container since now the
osd container use the osd ID as a name such as: ceph-osd@0
Fixes the error: Failed to execute operation: Cannot send after transport endpoint shutdown
Signed-off-by: Sébastien Han <seb@redhat.com>
Add quotes around package names added in the commit
da6f384223 so that the difference between
the Ansible variables and package names is clear.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Instead of looping over a list of packages or repeating the task
separately for different packages, pass the list of packages to the
task performing package management.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
This is needed for Nautilus since the ceph-create-keys script goes away.
(https://github.com/ceph/ceph/pull/21305)
Now the module if called with 'state: fetch_initial_keys' will lookup
keys generated by the monitor and write them down on the filesystem to
the right location (/etc/ceph and /var/lib/ceph/boostrap*).
This is not applicable to container since keys are generated by the
container only.
Signed-off-by: Sébastien Han <seb@redhat.com>
When the iscsi purge playbook is run we stop the gw and api daemons but
not tcmu-runner which I forgot on the previous PR.
Fixes Red Hat BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1621255
Signed-off-by: Mike Christie <mchristi@redhat.com>