Commit Graph

4529 Commits (6b5487d1e56f48dbba7305079465c32c8231663e)
 

Author SHA1 Message Date
Noah Watkins 3cf5fd2c3e start_osds: use list instead of keys (re-introduce)
the python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was reintroduced a few days later by:

  82a6b5adec

and this patch fixes it again :)

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-12-05 23:25:35 +00:00
Rishabh Dave 2fb12ae554 use pre_tasks and post_tasks when necessary
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-05 08:17:10 +00:00
Rishabh Dave e4f0af2b78 don't use private option for import_role
Since sharing variables amongst roles has been made default since
Ansible 2.6, private option has been deprecated; so stop using it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 23:45:59 +00:00
Sébastien Han efefcc3fb2 tox: add missing ceph_docker_image_tag in shrink
When calling shrink on containerized deployment, we were first doing the
setup with `latest-master` and then when calling the playbook we were
using the default value for `ceph_docker_image_tag` that comes from
ceph-defaults. Now we pass
`ceph_docker_image_tag={env:CEPH_DOCKER_IMAGE_TAG:latest-master}` so the
play will run the right container image.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 19:01:25 +01:00
Rishabh Dave 3d51e1065d validate: expand all jinja2 templates before validating them
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 17:29:18 +00:00
Christian Berendt 6f994195d1 travis: remove sudo: required
The sudo keyword will be fully deprecated.

https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration
Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
2018-12-04 17:10:58 +01:00
Guillaume Abrioux 1cff1f9806 revert infra: don't restart firewalld if unit is masked
If firewalld unit is masked, setting `configure_firewall: false` is
enough

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-04 15:52:08 +00:00
Ramana Raja cb784c601d rolling_update: fail if less than 3 MONs
... for non-containerized deployments as well.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655470

Signed-off-by: Ramana Raja <rraja@redhat.com>
2018-12-04 14:28:49 +00:00
Sébastien Han 14cd286e3a test: disable nfs for containers
Based on https://github.com/ceph/ceph-container/pull/1269 and given
there are no stable packages and reliable repository, we disable nfs
ganesha temporarly.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Sébastien Han 896676ee80 fix json data type
Json is a type structure which is always typed as a string, where before
this we were declaring a dict, which is not a json valid structure.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Sébastien Han 1b6b275229 test: remove leftover [mgrs]
Since we now collocated mgrs and mons on the same machine we have to
remove the mgrs section, they are not needed anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Guillaume Abrioux b04fe72f35 tests: add purge_lvm_osds_container scenario
This commits adds the purge_lvm_osds_container scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 17:35:21 +01:00
Guillaume Abrioux 78116fa6db purge: add iscsi support
add iscsi support for both non containerized and containerized
deployment in purge playbooks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 17:35:21 +01:00
Sébastien Han 82a6b5adec osd: manage legacy ceph-disk non-container startup
The code is now able (again) to start osds that where configured with
ceph-disk on a non-container scenario.

Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 452069cb3a)
2018-12-03 16:01:57 +01:00
Sébastien Han ec2d1f502d osd: re-introduce disk_list check
This commit
4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)
accidentally removed this check.

This is a must have for ceph-disk based containerized OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 9b5a93e3a5)
2018-12-03 16:01:57 +01:00
Sébastien Han 4c51130198 osd: discover osd_objectstore on the fly
Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.

Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activate in this case detects the osd objecstore which
prevents failures like the one described above.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:11:47 +00:00
Sébastien Han bef522627e ceph-osd: change jinja condition
If an existing cluster runs this config, and has ceph-disk OSD, the
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:11:47 +00:00
Sébastien Han bf375327a0 ceph-mgr: refact role for containers
Now we simplify the invocation of start and remove some code and the
directory 'docker'.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 23f685b352 ceph_key: allow setting 'dest' to a file
This is useful in situations where you fetch the key from the mon store
and want to write the file with a different name to a dedicated
directory. This is important when fetching the mgr key, they are created
as mgr.ceph-mon2 but we want them in /var/lib/ceph/mgr/ceph-ceph-mon0/keyring

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 14fc5bad12 mon: do not serialized container bootstrap
This commit unifies the container and non-container code, which in the
meantime gives use the ability to deploy N mon container at the same
time without having to serialized the deployment. This will drastically
reduces the time needed to bootstrap the cluster.
Note, this is only possible since Nautilus because the monitors are
bootstrap the initial keys on their own once they reach quorum. In the
Nautilus version of the ceph-container mon, we stopped generating the
keys 'manually' from inside the container, for more detail see: https://github.com/ceph/ceph-container/pull/1238

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 61082b3b32 mgr: only copy keys with dedicated mgr
When collocating mon and mgr, the mgr container will attempt to create
its own key since it has the admin key at its disposal. Also at this
point there is nothing to fetch since the key is not created by the
mons, as mentionned above the mgr creates the key on its own.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 1c760904b0 site: collocated mon and mgr by default
This will speed up the deployment and also deploy mon and mgr collocated
just as recommended.
This won't prevent you of adding more and dedicaded machines for mgr if
needed.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a502327e52 disable nfs scenario
The packages are broken, so let's remove it, until this solved.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han ee1905ad31 mon: add missing include_tasks instead of import_tasks
This was probably a leftover/mistake so let's fix this and make the file
consistent.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 7cb1040440 config: add missing bootstrap mgr directory
This directory is needed so we can fetch the bootstrap mgr key in it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 8d4de44f5d mon: default ceph_health_raw to json
During the first iteration, the command won't return anything, or can
simply fail and might not return a valid json structure. Ansible will
fail parsing it in the filter `from_json` so let's default that variable
to empty dictionary.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han cfac79bec4 container-common: remove old check
This removes a bit of unnecessary code, the check was always wrong
because of the condition 'not ceph_current_status.get('rc', 1) == 0'
It will never match since `Not` is used for bool and we are checking for
an rc.
Also, even though the check would work, this will be a major blocker for
a complete meltdown. If the whole platform is shutdown then nothing will
be up but files will be present, so this check is definitely wrong.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han bb7bfca113 rolling-update: remove old condition
This failure condition was only valid at the time where clusters didn't
have ceph-mgr activated. Now since we collocate the ceph-mgr with the
mon by default, if the daemon wasn't present it will be created during
the upgrade.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a42ba03d71 ceph_volume: fix unit tests
Fix the container_binary to use by mocking the CEPH_CONTAINER_BINARY env
variable.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 3d0670b41c ceph_key: apply permissions using ansible code module
Instead of applying file permissions from our code, let's rely on the
ansible code 'file' module for this. This is now handled at the task
declaration level instead of inside the module.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 7ac73202f7 fw: update rules for mon/mgr collocation
Since we now deploy mgr on mon we need to open fw rules so the mgr can
reach out to the osds.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 5b9d8f9737 mon: remove old ubuntu login status
We don't support Ubuntu Precise, so this feature does not exists
anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han acc92626f6 sites: fail the playbook on any failure
We need to apply   any_errors_fatal: true to every play so it can take
effect, not only on the initial pass. With this flag, any error in the
playbook will cause the playbook to stop.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 5816cd4101 site-container: retry image pull
Sometimes pulling an image fails for network hickup, so let's retry 5
times at 5sec interval.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han bd98193ce4 travis: run modules unit tests
Travis now runs our modules unit tests to make sure they always pass.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a0e5ef8516 mon: secure cluster on container
Add the ability to protect pools on containerized clusters.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Guillaume Abrioux ccc0c9c24c osd: remove a leftover
this file is never included in ceph-osd, looks like a leftover let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 09:12:02 +01:00
Guillaume Abrioux 0187166926 osd: remove an incorrect information
This is false, `./defaults/main.yml` is not supposed to be modified
directly. groups_vars a/o host_vars should always be preferred.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 08:11:35 +00:00
Guillaume Abrioux fead0813b4 remove kv store support
the next stable release will drop this feature.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-30 13:45:12 +00:00
Guillaume Abrioux a952122c38 rolling_update: create missing keyring only on running mon
try to create the potentially missing keys only on monitors that are
actually running.
The current node being played is stopped before this task.
By the way, delegating the command on all nodes but the current node
being played ensures that the generated keys will be present on all
monitors.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-29 16:40:46 +00:00
Christian Berendt 1f73a9900f Add missing space before }}
This will fix the following yamllint warning:

Variables should have spaces after {{ and before }}

Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
2018-11-29 16:04:05 +01:00
Guillaume Abrioux a86c2b8526 config: write jinja comment with appropriate syntax
jinja comment should be written using the jinja syntax `{# ... #}`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654441

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-29 15:48:23 +01:00
Sébastien Han 61fb6972ec rolling_update: default ceph json output to empty dict
So we can avoid the following failure:

The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"]
' failed. The error was: No JSON object could be decoded

We just need to set a default, the next iteration will have a more
complete json since the command won't fail.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-29 10:46:15 +00:00
Guillaume Abrioux e4869ac8bd validate: change default value for `radosgw_address`
change default value of `radosgw_address` to keep consistency with
`monitor_address`.
Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine
if it has to run `check_eth_rgw.yml`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-28 23:13:38 +01:00
Guillaume Abrioux 96ce8761ba tests: rgw_multisite allow clusters to talk to each other
Adding this rule on the hypervisor will allow cluster to talk to each
other.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-28 21:53:59 +00:00
Guillaume Abrioux 5d05a09b03 tests: update default pg num and pool size for podman scenario
bring the recent refact about `osd_pool_default_pg_num` and
`osd_pool_default_size` into podman scenario as well.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-28 11:22:04 +00:00
Guillaume Abrioux 5af5ad6d61 tests: fix image tag for secondary rgw cluster (rgw_multisite)
the first cluster is using `latest-master` while the second is using
`latest` which is not the right version to be used here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-28 11:22:04 +00:00
Guillaume Abrioux 3d8f4e6304 tests: apply dev_setup on the secondary cluster for rgw_multisite
we must apply this playbook before deploying the secondary cluster.
Otherwise, there will be a mismatch between the two deployed cluster.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-28 11:22:04 +00:00
Guillaume Abrioux 73287f91bc mgr: fix mgr keyring error on rolling_update
when upgrading from RHCS 2.5 to 3.2, it fails because the task `create
ceph mgr keyring(s) when mon is containerized` has a when condition
`inventory_hostname == groups[mon_group_name]|last`.
First, this is incorrect because `inventory_hostname` is referring to a
mgr node, it means this condition would have never been satisfied.
Then, this condition + `serial: 1` makes the mgr keyring creating skipped on
the first node. Further, the `ceph-mgr` role tries to copy the mgr
keyring (it's not aware we are running `serial: 1`) this leads to a
failure like the following:

```
TASK [ceph-mgr : copy ceph keyring(s) if needed] ***************************************************************************************************************************************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml:10
Tuesday 27 November 2018  12:03:34 +0000 (0:00:00.296)       0:11:01.290 ******
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'
failed: [magna021] (item={u'dest': u'/var/lib/ceph/mgr/local-magna021/keyring', u'name': u'/etc/ceph/local.mgr.magna021.keyring', u'copy_key': True}) => {"changed": false, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/local-magna021/keyring", "name": "/etc/ceph/local.mgr.magna021.keyring"}, "msg": "Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'"}
```

The ceph_key module is idempotent, so there is no need to have such a
condition.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1649957

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-27 18:19:56 +01:00
Sébastien Han bc2daaeb71 ceph-osd fix batch with container binary
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00