Commit Graph

114 Commits (debug-8)

Author SHA1 Message Date
Dimitri Savineau 150acba8c5 switch-from-non-containerized: stop all osds
e6bfb84 introduced a regression in the switch from non containerized
to container deployment.
We need to stop all previous OSDs services. We just don't need the
ceph-disk pattern in the regex.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-11 16:26:53 -04:00
Guillaume Abrioux e6bfb843f4 switch_to_containers: remove ceph-disk references
as of stable-4.0, ceph-disk is no longer supported.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Guillaume Abrioux 69310a5cd6 switch_to_containers: support multiple rgw instances per host
add multiple rgw instances per host in switch_to_containers playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 70f1eea9b2 switch_to_containers: remove non-containerized systemd unit files
remove old systemd unit files (non-containerized) during the
switch_to_containers transition.

We have seen sometimes the unit started is the old one instead of the
new systemd unit generated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 4064035a54 switch_to_containers: use ceph binary from container
use the ceph binary from the container instead of the host.
If the ceph CLI version isn't compatible between host and container
image, it can cause the CLI to hang.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 7e0a70f7a8 switch_to_containers: do not try to redeploy monitors
`ceph-mon` tries to redeploy monitors because it assumes it was not yet
deployed since `mon_socket_stat` and `ceph_mon_container_stat` are
undefined (indeed, we stop the daemon before calling `ceph-mon` in the
switch_to_containers playbook).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 0eb56e36f8 introduce new role ceph-facts
sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speedup
the playbook.

Closes: #3282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-12 11:18:01 +01:00
Rishabh Dave 2fb12ae554 use pre_tasks and post_tasks when necessary
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-05 08:17:10 +00:00
Rishabh Dave e4f0af2b78 don't use private option for import_role
Since sharing variables amongst roles has been made default since
Ansible 2.6, private option has been deprecated; so stop using it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 23:45:59 +00:00
Sébastien Han 2814d36c93 infra playbooks: use the right container binary
Use podman or docker wether they are available or not. podman will be
prioritized over docker if present.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han c14f9b78ff switch: do not look for devices anymore
It's easier lookup a directoriy instead of the block devices,
especially because of ceph-volume and ceph-disk have a different way to
handle devices.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-23 07:56:23 +00:00
Sébastien Han cd56dad9fa switch: disable all ceph units
Prior to this commit we were only disabling ceph-osd units, but forgot
the ceph.target which is controlling everything and will restart the
ceph-osd units at each reboot.
Now that everything gets disabled there won't be any conflicts between
the old non-container and the new container units.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-23 07:56:23 +00:00
Sébastien Han fe1d09925a switch: do not mask systemd unit
If we mask it we won't be able to start the OSD container since now the
osd container use the osd ID as a name such as: ceph-osd@0

Fixes the error:  Failed to execute operation: Cannot send after transport endpoint shutdown

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-23 07:56:23 +00:00
Guillaume Abrioux c783bc70da docker-common: rename role
rename `ceph-docker-common` role to `ceph-container-common`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-12 10:51:48 +01:00
Rishabh Dave 3f62fc585f don't use "role" or "roles" to include roles
Since import_role and include_role are more readable, explicit (about
the nature of inclusion) and flexible (allows placibf inclusion
anywhere) amongst the tasks, use them instead of using roles or role
keyword. Besides, these keywords also allow more arguments.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-10-31 09:38:59 +01:00
Guillaume Abrioux d8d3e55006 remove restapi role
As of `mimic`, restapi is no longer available because of manager daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-30 14:19:13 +01:00
Sébastien Han 9fccffa1ca switch: allow switch big clusters (more than 99 osds)
The current regex had a limitation of 99 OSDs, now this limit has been
removed and regardless the number of OSDs they will all be collected.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630430
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-10 16:35:30 -04:00
Noah Watkins 8dcc8d1434 Stringify ceph_docker_image_tag
This could be a numeric input, but is treated like a string leading to
runtime errors.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1635823

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-10-10 04:26:33 +00:00
Noah Watkins 306e308f13 Avoid using tests as filter
Fixes the deprecation warning:

  [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
  using `result|search` use `result is search`.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-10-10 04:26:33 +00:00
Sébastien Han bae0f41705 switch: copy initial mon keyring
We need to copy this key into /etc/ceph so when ceph-docker-common runs
it can fetch it to the ansible server. Previously the task wasn't not
failing because `fail_on_missing` was False before 2.5, so now it's True
hence the failure.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-03 13:58:53 +00:00
Guillaume Abrioux 03e76af7b4 switch: add missing call to ceph-handler role
Add missing call the ceph-handler role, otherwise we can't have
reference to variable registered from ceph-handler from other roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-03 13:58:53 +00:00
Guillaume Abrioux 54b02fe187 switch: support migration when cluster is scrubbing
Similar to c13a3c3 we must allow scrubbing when running this playbook.

In cluster with a large number of PGs, it can be expected some of them
scrubbing, it's a normal operation.
Preventing from scrubbing operation force to set noscrub flag.

This commit allows to switch from non containerized to containerized
environment even while PGs are scrubbing.

Closes: #3182

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-03 13:58:53 +00:00
Sébastien Han 49a4712485 switch: disable ceph-disk units
During the transition from jewel non-container to container old ceph
units are disabled. ceph-disk can still remain in some cases and will
appear as 'loaded failed', this is not a problem although operators
might not like to see these units failing. That's why we remove them if
we find them.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1577846
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-17 08:48:28 +02:00
Guillaume Abrioux adeecc51f8 switch: fix ceph_uid fact for osd
In addition to b324c17 this commit fix the ceph uid for osd role in the
switch from non containerized to containerized playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
Sébastien Han 5fa92804f9 switch: resolve device path so we can umount the osd data dir
If we don't do this, umounting devices declared like this
/dev/disk/by-id/ata-QEMU_HARDDISK_QM00001

will fail like:

umount: /dev/disk/by-id/ata-QEMU_HARDDISK_QM000011: mountpoint not found

Since we append '1' (partition 1), this won't work.
So we need to resolved the link to get something like /dev/sdb and then
append 1 to /dev/sdb1

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
Sébastien Han 767abb5de0 switch: fix ceph_uid fact
Latest is now centos not ubuntu anymore so the condition was wrong.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-30 08:15:18 +02:00
Sébastien Han 641f141c0f selinux: remove chcon calls
We know bindmount with the :z option at the end of the -v command so
this will basically run the exact same command as we used to run. So to
speak:

chcon -Rt svirt_sandbox_file_t /var/lib/ceph

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-19 14:59:37 +02:00
Andrew Schoen b613321c21 switch-to-containers: do not fail when stopping the nfs-ganesha service
If we're working with a jewel cluster then this service will not exist.

This is mainly a problem with CI testing because our tests are setup to
work with both jewel and luminous, meaning that eventhough we want to
test jewel we still have a nfs-ganesha host in the test causing these
tasks to run.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-01-06 14:07:55 +01:00
Andrew Schoen 0b4b60e3c9 switch-to-containers: do not fail when stopping the ceph-mgr daemon
If we are working with a jewel cluster ceph mgr does not exist
and this makes the playbook fail.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-01-06 14:07:55 +01:00
Sébastien Han 39bf102b64 switch: nicer way to check mon quorum
re-use the same syntax as rolling_udate.yml

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-17 10:54:36 +02:00
Sébastien Han 774697ebd8 infra: use the pg check in the right place
Use the pg check before doing the pg check, not on the quorum check.
Also never quote int when doing comparaison.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-09 17:25:41 +02:00
Sébastien Han 33a3aa0dda switch: check pgs only when num_pgs > 0
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 03:42:09 +02:00
Sébastien Han c3c63ae539 switch: rework and fix clean pg wait
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 03:39:09 +02:00
Sébastien Han 477f86e305 switch to container: fix ceph nfs
The service is nfs-ganesha where ceph-nfs@{{ ansible_hostname }} will be
the name of the container.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-08 22:43:50 +02:00
Sébastien Han fdacac9fa0 switch: make osd collection idempotent
This commits allows us to run
switch-from-non-containerized-to-containerized-ceph-daemons.yml multiple
times.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489353
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-08 11:31:47 +02:00
Sébastien Han e46440e19c switch-from-non-containerized-to-containerized: fix devices
If devices is passed through an extra var this register won't work so
let's only register the var is devices is not defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489099
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-07 23:18:14 +02:00
Guillaume Abrioux d987d26719 tests: force docker variable for switch-to-containers scenario
we need to force the value of `docker` variable which is initially set
to `false` since it's a migration from non-containerized to
containerized cluster.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-06 18:03:52 +02:00
Sébastien Han b7db600caa switch-from-non-containerized-to-containerized: mask unit files
We must mask the image so we are sure that even if the system reboots
then the OSDs won't start.

Also remove Ceph udev rules if found on the system prior to deploy
containers. If we don't do this we are exposed to conflicts between udev
rules and sytemd unit files.

Also add the CI will now test the migration from a non-containerized cluster to a
containerized cluster.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-05 15:20:31 +02:00
Sébastien Han e0a264c7e9 osd: allow multi dedicated journals for containers
Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-30 12:34:06 +02:00
Sébastien Han 4f0ecb7f30 switch-from-non-containerized-to-containerized: simplify
This commit eases the use of the
infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml
playbook. We basically run it with a couple of pre-tasks and then we let
the playbook run the docker roles.

It obviously expect to have proper variables configured in order to
work.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-23 18:39:45 +02:00
Giulio Fidente 2c01de4350 Default cluster to ceph in switch to containers 2017-08-22 13:13:36 +02:00
Giulio Fidente f0423b1804 Parse ceph_docker_registry in switch to containers
Defaults it to docker.io as it was for backward compatibility.
2017-08-22 13:11:27 +02:00
Giulio Fidente a59b84d5c9 Assume mon_docker_privileged false in switch to containers 2017-08-22 13:01:25 +02:00
Giulio Fidente 0106fa6835 Consume public_network vs ceph_mon_docker_subnet
In the switch to containers migration there were broken references
to ceph_mon_docker_subnet variable, replaced with public_network.

Also fixes references to ceph_mon_docker_extra_env setting for it
a default as it could be undefined.
2017-08-21 18:34:24 +02:00
Giulio Fidente 386303d42e Extend set_uid fact to support RH Ceph images 2017-08-21 18:32:08 +02:00
Guillaume Abrioux 896d62d78b Refact: remove ceph_mon_docker_interface variable
remove `ceph_mon_docker_interface` and use `monitor_interface` instead
for both containerized and non-containerized deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-04 18:08:59 +02:00
Guillaume Abrioux 73141118d0 Make the new check PGs working with /bin/sh
The new test in the checks PGs are no longer working on distributions
where /bin/sh isn't linked to /bin/bash.

Fix: #1619
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-06-22 17:59:38 +02:00
Guillaume Abrioux 5af9bb432c rewrite check pgs clean tasks
Avoid screen scrapping by rewriting `waiting for clean pgs` tasks like it is
done in 304de48.

Use the json output returned by `ceph -s` instead

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-06-13 09:48:56 +02:00
Sébastien Han c37aaa41f4 playbook: homogenize the way list osd ids
Problem: too many different commands to do the same thing. The 'cut'
command on infrastructure-playbooks/purge-cluster.yml was also wrong.
This sed command from osixia in ceph-docker
https://github.com/ceph/ceph-docker/pull/580/ addresses all the
scenarios.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-03-30 11:51:38 +02:00
Konstantin Shalygin 1662976fc0
Resolve issues when groups names not in default value. 2017-03-27 21:44:30 +07:00
Sébastien Han 8320c14191 Merge pull request #1317 from ibotty/harmonize-docker-names
harmonize docker names
2017-03-14 18:20:20 +01:00
Andrew Schoen e81d690aa0 switch-to-containers: do not include group vars or role defaults
Doing so will override any values set for these in the group_vars
directory relative to the users inventory.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-08 08:57:09 -06:00
Andrew Schoen aef54d89d9 switch-to-containers: do not set group name vars at playbook level
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-08 08:57:09 -06:00
Tobias Florek 931027e6f7 harmonize docker names
Created containers now are named more or less in the form of

    <ansible role>-<ansible_hostname>
2017-02-23 09:15:05 +01:00
Sébastien Han c2f1dca823 docker: use a better method to pull images
We changed the way we declare image.
Prior to this patch we must have a "user/image:tag"
format, which is incompatible with non docker-hub registry where you
usually don't have a "user". On the docker hub a "user" is also
identified as a namespace, so for Ceph the user was "ceph".

Variables have been simplified with only:

* ceph_docker_image
* ceph_docker_image_tag

1. For docker hub images: ceph_docker_name: "ceph/daemon" will give
you the 'daemon' image of the 'ceph' user.

2. For non docker hub images: ceph_docker_name: "daemon" will simply
give you the "daemon" image.

Infrastructure playbooks have been modified as well.
The file group_vars/all.docker.yml.sample has been removed as well.
It is hard to maintain since we have to generate it manually. If
you want to configure specific variables for a specific daemon simply
edit group_vars/$DAEMON.yml

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1420207
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-09 17:57:18 +01:00
Ivan Font 0298354137 Update to use consistent docker extra env vars
This playbook was still referencing the old version of the
ceph_*_docker_extra_env but only for Ceph MONs and Ceph NFS. This
playbook was not kept up-to-date when updating the
ceph_*_docker_extra_env variables to add the '-e' option to docker.
That's because the addition of '-e' breaks this playbook as it requires
a comma separated list of variables for the 'env:' docker module
parameter. Therefore this change just makes the playbook consistently
broken by referencing the same variable throughout.
2017-01-26 15:57:34 -08:00
Guillaume Abrioux a680707f6f All `include_vars` need to have `*.yml`, `*.yaml` or `*.json` extension.
As introduced in the following PR:
- https://github.com/ansible/ansible/pull/17207
we need to refactor our code.
2016-11-24 14:03:49 +01:00
Sébastien Han a2fcd222d2 moving to ansible v2.2 compatibility
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Julien Francoz julien@francoz.net
2016-11-04 10:09:38 +01:00
Eduard Egorov e5473ee565 Fix typos
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
2016-11-01 12:29:21 +00:00
Eduard Egorov 3652bb708b Fix rbd-mirrors group name
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
2016-11-01 12:21:47 +00:00
Eduard Egorov 645b5efebf Fix hard-coded host group names in include tasks for group variables' file paths.
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
2016-11-01 12:21:40 +00:00
Sébastien Han b0989c700f rolling_update: fix wrong indent
Fixing: https://bugzilla.redhat.com/show_bug.cgi?id=1388295
Also add some notes in the README on how to run infrastructure
playbooks.

Signed-off-by: Sébastien Han <seb@redhat.com>
2016-10-26 12:51:08 -05:00
Sébastien Han b8158a6554 ability to switch from bare metal to containerized daemons
Signed-off-by: Sébastien Han <seb@redhat.com>
2016-09-21 18:07:50 +02:00
Sébastien Han 5bfa1b0d24 ability to switch from bare metal to containerized daemons
Signed-off-by: Sébastien Han <seb@redhat.com>
2016-09-21 14:46:57 +02:00