add iscsi support for both non containerized and containerized
deployment in purge playbooks.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The code is now able (again) to start osds that where configured with
ceph-disk on a non-container scenario.
Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 452069cb3a)
Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.
Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activate in this case detects the osd objecstore which
prevents failures like the one described above.
Signed-off-by: Sébastien Han <seb@redhat.com>
If an existing cluster runs this config, and has ceph-disk OSD, the
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
This is useful in situations where you fetch the key from the mon store
and want to write the file with a different name to a dedicated
directory. This is important when fetching the mgr key, they are created
as mgr.ceph-mon2 but we want them in /var/lib/ceph/mgr/ceph-ceph-mon0/keyring
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit unifies the container and non-container code, which in the
meantime gives use the ability to deploy N mon container at the same
time without having to serialized the deployment. This will drastically
reduces the time needed to bootstrap the cluster.
Note, this is only possible since Nautilus because the monitors are
bootstrap the initial keys on their own once they reach quorum. In the
Nautilus version of the ceph-container mon, we stopped generating the
keys 'manually' from inside the container, for more detail see: https://github.com/ceph/ceph-container/pull/1238
Signed-off-by: Sébastien Han <seb@redhat.com>
When collocating mon and mgr, the mgr container will attempt to create
its own key since it has the admin key at its disposal. Also at this
point there is nothing to fetch since the key is not created by the
mons, as mentionned above the mgr creates the key on its own.
Signed-off-by: Sébastien Han <seb@redhat.com>
This will speed up the deployment and also deploy mon and mgr collocated
just as recommended.
This won't prevent you of adding more and dedicaded machines for mgr if
needed.
Signed-off-by: Sébastien Han <seb@redhat.com>
During the first iteration, the command won't return anything, or can
simply fail and might not return a valid json structure. Ansible will
fail parsing it in the filter `from_json` so let's default that variable
to empty dictionary.
Signed-off-by: Sébastien Han <seb@redhat.com>
This removes a bit of unnecessary code, the check was always wrong
because of the condition 'not ceph_current_status.get('rc', 1) == 0'
It will never match since `Not` is used for bool and we are checking for
an rc.
Also, even though the check would work, this will be a major blocker for
a complete meltdown. If the whole platform is shutdown then nothing will
be up but files will be present, so this check is definitely wrong.
Signed-off-by: Sébastien Han <seb@redhat.com>
This failure condition was only valid at the time where clusters didn't
have ceph-mgr activated. Now since we collocate the ceph-mgr with the
mon by default, if the daemon wasn't present it will be created during
the upgrade.
Signed-off-by: Sébastien Han <seb@redhat.com>
Instead of applying file permissions from our code, let's rely on the
ansible code 'file' module for this. This is now handled at the task
declaration level instead of inside the module.
Signed-off-by: Sébastien Han <seb@redhat.com>
We need to apply any_errors_fatal: true to every play so it can take
effect, not only on the initial pass. With this flag, any error in the
playbook will cause the playbook to stop.
Signed-off-by: Sébastien Han <seb@redhat.com>
This is false, `./defaults/main.yml` is not supposed to be modified
directly. groups_vars a/o host_vars should always be preferred.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
try to create the potentially missing keys only on monitors that are
actually running.
The current node being played is stopped before this task.
By the way, delegating the command on all nodes but the current node
being played ensures that the generated keys will be present on all
monitors.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This will fix the following yamllint warning:
Variables should have spaces after {{ and before }}
Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
So we can avoid the following failure:
The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"]
' failed. The error was: No JSON object could be decoded
We just need to set a default, the next iteration will have a more
complete json since the command won't fail.
Signed-off-by: Sébastien Han <seb@redhat.com>
change default value of `radosgw_address` to keep consistency with
`monitor_address`.
Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine
if it has to run `check_eth_rgw.yml`.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
bring the recent refact about `osd_pool_default_pg_num` and
`osd_pool_default_size` into podman scenario as well.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
the first cluster is using `latest-master` while the second is using
`latest` which is not the right version to be used here.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
we must apply this playbook before deploying the secondary cluster.
Otherwise, there will be a mismatch between the two deployed cluster.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
when upgrading from RHCS 2.5 to 3.2, it fails because the task `create
ceph mgr keyring(s) when mon is containerized` has a when condition
`inventory_hostname == groups[mon_group_name]|last`.
First, this is incorrect because `inventory_hostname` is referring to a
mgr node, it means this condition would have never been satisfied.
Then, this condition + `serial: 1` makes the mgr keyring creating skipped on
the first node. Further, the `ceph-mgr` role tries to copy the mgr
keyring (it's not aware we are running `serial: 1`) this leads to a
failure like the following:
```
TASK [ceph-mgr : copy ceph keyring(s) if needed] ***************************************************************************************************************************************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml:10
Tuesday 27 November 2018 12:03:34 +0000 (0:00:00.296) 0:11:01.290 ******
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'
failed: [magna021] (item={u'dest': u'/var/lib/ceph/mgr/local-magna021/keyring', u'name': u'/etc/ceph/local.mgr.magna021.keyring', u'copy_key': True}) => {"changed": false, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/local-magna021/keyring", "name": "/etc/ceph/local.mgr.magna021.keyring"}, "msg": "Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'"}
```
The ceph_key module is idempotent, so there is no need to have such a
condition.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1649957
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This is not needed to play these tasks on nodes that are not in rgw
group.
Always playing this code makes `shrink_mon.yml` failing.
Typical error:
```
TASK [ceph-defaults : set_fact _radosgw_address to radosgw_interface - ipv4] ***
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-shrink_mon/roles/ceph-defaults/tasks/set_radosgw_address.yml:21
Thursday 22 November 2018 12:34:51 +0000 (0:00:00.154) 0:00:12.371 *****
fatal: [localhost]: FAILED! => {}
MSG:
The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1'
```
Indeed, `radosgw_interface` is the network interface on rgw only. It is
expected that this same interface doesn't exist on `localhost`, so, when
running `shrink_mon.yml`, the role `ceph-defaults` is called in
`hosts: localhost` and causes the playbook to fail.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
It seems Atomic 7.5 has podman already, however this is an old version
(0.4). The podman integration is targetting RHEL 8, so Fedora is
currently the closest to that.
Signed-off-by: Sébastien Han <seb@redhat.com>
Use the new way to create keys on containerized env as introduced by: 1098b71bda90db3dad19ac179f0ba900ccb0f953
Signed-off-by: Sébastien Han <seb@redhat.com>
Use podman or docker wether they are available or not. podman will be
prioritized over docker if present.
Signed-off-by: Sébastien Han <seb@redhat.com>