Commit Graph

4593 Commits (567c6ceb43e2751dde735bccc5d5e9109ab6ad50)
 

Author SHA1 Message Date
Guillaume Abrioux ae7f3d66a6 purge-docker: do not call ceph-osd role
calling ceph-osd role in purge playbook is not needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-11 09:59:25 +01:00
Guillaume Abrioux 1b8b5e0aac meta: set the right minimum ansible version required for galaxy
ceph-ansible@master requires the latest stable ansible version.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-11 09:59:25 +01:00
Guillaume Abrioux 1a4a6ec855 purge: gather monitors facts in OSD purge
the OSD part of the purge delegates commands on monitor node, we need to
gather monitors facts to know the `ansible_hostname` fact that is used
in the `docker_exec_cmd` fact.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 62111ff53c purge-container: gather fact before calling ceph-defaults
ceph-defaults relies on facts so we must gather facts before running it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 656fbd2901 purge: tox add lvm-setup
Since we deploy > purge > deploy the LVs are gone so we much recreate
them.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han fc6ebd8ebb purge-cluster: add support for mon/mgr collocation
Recently we introduced the default collocation of mon/mgr without the
need of a dedicated mgrs section. This means we have to stop the mgr
process on that machine too.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 3a154fa0ad purge-cluster: remove support for other init system
We only support systemd and use the service module anyway.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 325a159415 purge-docker-cluster: add support for mgr/mon collocation
Recently we introduced the collocation of mon and mgr by default, so we
don't need to have an explicit mgrs section for this. This means we have
to remove the mgr container on the mon machines too.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 2bcc00896f purge-docker-cluste: add a task to check hosts
It's useful when running on CI to see what might remain on the machines.
So we list all the containers and images. We expect the list to be
empty.

We fail if we see containers running.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 1751885bc9 purge-docker-cluster: add ceph-volume support
This commits adds the support for purging cluster that were deployed
with ceph-volume. It also separates nicely with a block intruction the
work to do when lvm is used or not.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-11 09:59:25 +01:00
Noah Watkins 114fac15dc ceph_keys: pass in module for error messages
fixes: #3421

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-12-10 10:03:17 +01:00
Ali Maredia 5a12121cf7 rgw multisite: update documentation
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2018-12-10 09:25:43 +01:00
Sébastien Han 8e2585b6c7 ceph-defaults: do not use podman only on atomic
We want to test podman on f29 non-atomic, atomic is not a hard
requirement. However, if you want to get podman then you will have to
install it first before running the playbook.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-07 15:37:07 +01:00
Sébastien Han 51ca4f883b mgr: little refact
This commit removes the default module, so ceph-ansible does not enable
any manager module.
To enable a module you need to set a value to 'ceph_mgr_modules', you
can pass a list of modules like this:

ceph_mgr_modules:
  - status
  - dashboard

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-06 14:55:56 +00:00
Noah Watkins 3cf5fd2c3e start_osds: use list instead of keys (re-introduce)
the python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was reintroduced a few days later by:

  82a6b5adec

and this patch fixes it again :)

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-12-05 23:25:35 +00:00
Rishabh Dave 2fb12ae554 use pre_tasks and post_tasks when necessary
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-05 08:17:10 +00:00
Rishabh Dave e4f0af2b78 don't use private option for import_role
Since sharing variables amongst roles has been made default since
Ansible 2.6, private option has been deprecated; so stop using it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 23:45:59 +00:00
Sébastien Han efefcc3fb2 tox: add missing ceph_docker_image_tag in shrink
When calling shrink on containerized deployment, we were first doing the
setup with `latest-master` and then when calling the playbook we were
using the default value for `ceph_docker_image_tag` that comes from
ceph-defaults. Now we pass
`ceph_docker_image_tag={env:CEPH_DOCKER_IMAGE_TAG:latest-master}` so the
play will run the right container image.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 19:01:25 +01:00
Rishabh Dave 3d51e1065d validate: expand all jinja2 templates before validating them
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 17:29:18 +00:00
Christian Berendt 6f994195d1 travis: remove sudo: required
The sudo keyword will be fully deprecated.

https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration
Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
2018-12-04 17:10:58 +01:00
Guillaume Abrioux 1cff1f9806 revert infra: don't restart firewalld if unit is masked
If firewalld unit is masked, setting `configure_firewall: false` is
enough

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-04 15:52:08 +00:00
Ramana Raja cb784c601d rolling_update: fail if less than 3 MONs
... for non-containerized deployments as well.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655470

Signed-off-by: Ramana Raja <rraja@redhat.com>
2018-12-04 14:28:49 +00:00
Sébastien Han 14cd286e3a test: disable nfs for containers
Based on https://github.com/ceph/ceph-container/pull/1269 and given
there are no stable packages and reliable repository, we disable nfs
ganesha temporarly.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Sébastien Han 896676ee80 fix json data type
Json is a type structure which is always typed as a string, where before
this we were declaring a dict, which is not a json valid structure.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Sébastien Han 1b6b275229 test: remove leftover [mgrs]
Since we now collocated mgrs and mons on the same machine we have to
remove the mgrs section, they are not needed anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Guillaume Abrioux b04fe72f35 tests: add purge_lvm_osds_container scenario
This commits adds the purge_lvm_osds_container scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 17:35:21 +01:00
Guillaume Abrioux 78116fa6db purge: add iscsi support
add iscsi support for both non containerized and containerized
deployment in purge playbooks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 17:35:21 +01:00
Sébastien Han 82a6b5adec osd: manage legacy ceph-disk non-container startup
The code is now able (again) to start osds that where configured with
ceph-disk on a non-container scenario.

Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 452069cb3a)
2018-12-03 16:01:57 +01:00
Sébastien Han ec2d1f502d osd: re-introduce disk_list check
This commit
4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)
accidentally removed this check.

This is a must have for ceph-disk based containerized OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 9b5a93e3a5)
2018-12-03 16:01:57 +01:00
Sébastien Han 4c51130198 osd: discover osd_objectstore on the fly
Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.

Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activate in this case detects the osd objecstore which
prevents failures like the one described above.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:11:47 +00:00
Sébastien Han bef522627e ceph-osd: change jinja condition
If an existing cluster runs this config, and has ceph-disk OSD, the
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:11:47 +00:00
Sébastien Han bf375327a0 ceph-mgr: refact role for containers
Now we simplify the invocation of start and remove some code and the
directory 'docker'.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 23f685b352 ceph_key: allow setting 'dest' to a file
This is useful in situations where you fetch the key from the mon store
and want to write the file with a different name to a dedicated
directory. This is important when fetching the mgr key, they are created
as mgr.ceph-mon2 but we want them in /var/lib/ceph/mgr/ceph-ceph-mon0/keyring

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 14fc5bad12 mon: do not serialized container bootstrap
This commit unifies the container and non-container code, which in the
meantime gives use the ability to deploy N mon container at the same
time without having to serialized the deployment. This will drastically
reduces the time needed to bootstrap the cluster.
Note, this is only possible since Nautilus because the monitors are
bootstrap the initial keys on their own once they reach quorum. In the
Nautilus version of the ceph-container mon, we stopped generating the
keys 'manually' from inside the container, for more detail see: https://github.com/ceph/ceph-container/pull/1238

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 61082b3b32 mgr: only copy keys with dedicated mgr
When collocating mon and mgr, the mgr container will attempt to create
its own key since it has the admin key at its disposal. Also at this
point there is nothing to fetch since the key is not created by the
mons, as mentionned above the mgr creates the key on its own.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 1c760904b0 site: collocated mon and mgr by default
This will speed up the deployment and also deploy mon and mgr collocated
just as recommended.
This won't prevent you of adding more and dedicaded machines for mgr if
needed.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a502327e52 disable nfs scenario
The packages are broken, so let's remove it, until this solved.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han ee1905ad31 mon: add missing include_tasks instead of import_tasks
This was probably a leftover/mistake so let's fix this and make the file
consistent.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 7cb1040440 config: add missing bootstrap mgr directory
This directory is needed so we can fetch the bootstrap mgr key in it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 8d4de44f5d mon: default ceph_health_raw to json
During the first iteration, the command won't return anything, or can
simply fail and might not return a valid json structure. Ansible will
fail parsing it in the filter `from_json` so let's default that variable
to empty dictionary.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han cfac79bec4 container-common: remove old check
This removes a bit of unnecessary code, the check was always wrong
because of the condition 'not ceph_current_status.get('rc', 1) == 0'
It will never match since `Not` is used for bool and we are checking for
an rc.
Also, even though the check would work, this will be a major blocker for
a complete meltdown. If the whole platform is shutdown then nothing will
be up but files will be present, so this check is definitely wrong.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han bb7bfca113 rolling-update: remove old condition
This failure condition was only valid at the time where clusters didn't
have ceph-mgr activated. Now since we collocate the ceph-mgr with the
mon by default, if the daemon wasn't present it will be created during
the upgrade.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a42ba03d71 ceph_volume: fix unit tests
Fix the container_binary to use by mocking the CEPH_CONTAINER_BINARY env
variable.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 3d0670b41c ceph_key: apply permissions using ansible code module
Instead of applying file permissions from our code, let's rely on the
ansible code 'file' module for this. This is now handled at the task
declaration level instead of inside the module.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 7ac73202f7 fw: update rules for mon/mgr collocation
Since we now deploy mgr on mon we need to open fw rules so the mgr can
reach out to the osds.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 5b9d8f9737 mon: remove old ubuntu login status
We don't support Ubuntu Precise, so this feature does not exists
anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han acc92626f6 sites: fail the playbook on any failure
We need to apply   any_errors_fatal: true to every play so it can take
effect, not only on the initial pass. With this flag, any error in the
playbook will cause the playbook to stop.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 5816cd4101 site-container: retry image pull
Sometimes pulling an image fails for network hickup, so let's retry 5
times at 5sec interval.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han bd98193ce4 travis: run modules unit tests
Travis now runs our modules unit tests to make sure they always pass.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a0e5ef8516 mon: secure cluster on container
Add the ability to protect pools on containerized clusters.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00