ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	5db0b239f6	purge: use sysfs to unmap rbd devices in containerized context, using the binary provided in atomic os won't work because it's an old version provided by ceph-common based on 10.2.5. Using a container could be an idea but for large cluster with hundreds of client nodes, that would require to pull the image of each of them just to unmap the rbd devices. Let's use the sysfs method in order to avoid any issue related to ceph version that is shipped on the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3cfcc7a105`)	2020-01-13 14:50:29 -05:00
Guillaume Abrioux	8b91905dff	purge: ensure no ceph kernel thread is present This tries to first unmount any cephfs/nfs-ganesha mount point on client nodes, then unmap any mapped rbd devices and finally it tries to remove ceph kernel modules. If it fails it means some resources are still busy and should be cleaned manually before continuing to purge the cluster. This is done early in the playbook so the cluster stays untouched until everything is ready for that operation, otherwise if you try to redeploy a cluster it could end up by getting confused by leftover from previous deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1337915 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `20e4852888`)	2019-06-24 15:36:21 +02:00
Guillaume Abrioux	7136f1734e	purge: fix lvm-batch purge osd `lvm_volumes` and/or `devices` variable(s) can be undefined depending on the scenario chosen. These tasks should be run only if these variables are defined, otherwise it ends up with undefined variable errors. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1653307 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0180738313`)	2019-04-03 08:48:39 +02:00
Dimitri Savineau	8e2cfd9d24	purge-docker-cluster: Remove ceph-osd service The systemd ceph-osd@.service file used for starting the ceph osd containers is used in all osd_scenarios. Currently purging a containerized deployment using the lvm scenario doesn't remove the ceph-osd systemd service. If the next deployment is a non-containerized deployment, the OSDs won't be online because the file is still present and overrides the one from the package. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7cc626b72d`)	2019-04-01 09:10:29 +00:00
Guillaume Abrioux	416b503476	introduce new role ceph-facts sometimes we play the whole role `ceph-defaults` just to access the default value of some variables. It means we play the `facts.yml` part in this role while it's not desired. Splitting this role will speedup the playbook. Closes: #3282 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0eb56e36f8`)	2019-01-07 09:14:10 +01:00
Guillaume Abrioux	c3bb76b8e9	purge-container: move facts gathering after ceph-defaults role import This task has to be called after the role `ceph-defaults` has been played, otherwise, `mon_group_name` will never be known. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a12de3e048`)	2019-01-07 09:14:10 +01:00
Guillaume Abrioux	b9bf7c6703	purge-container: fix wrong syntax we want a default value for `mon_group_name`, not for `groups[mon_group_name]`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d0b3cb7f85`)	2019-01-07 09:14:10 +01:00
Guillaume Abrioux	0ff1260fc1	purge-docker: do not call ceph-osd role calling ceph-osd role in purge playbook is not needed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ae7f3d66a6`)	2019-01-07 09:14:10 +01:00
Guillaume Abrioux	c405fd1140	purge: gather monitors facts in OSD purge the OSD part of the purge delegates commands on monitor node, we need to gather monitors facts to know the `ansible_hostname` fact that is used in the `docker_exec_cmd` fact. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1a4a6ec855`)	2019-01-07 09:14:10 +01:00
Sébastien Han	37ba313d76	purge-container: gather fact before calling ceph-defaults ceph-defaults relies on facts so we must gather facts before running it. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `62111ff53c`)	2019-01-07 09:14:10 +01:00
Sébastien Han	782959f094	purge-docker-cluster: add support for mgr/mon collocation Recently we introduced the collocation of mon and mgr by default, so we don't need to have an explicit mgrs section for this. This means we have to remove the mgr container on the mon machines too. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `325a159415`) # Conflicts: # infrastructure-playbooks/purge-docker-cluster.yml	2019-01-07 09:14:10 +01:00
Sébastien Han	8ce8d580a4	purge-docker-cluster: add a task to check hosts It's useful when running on CI to see what might remain on the machines. So we list all the containers and images. We expect the list to be empty. We fail if we see containers running. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `2bcc00896f`)	2019-01-07 09:14:10 +01:00
Sébastien Han	f37c21a9d0	purge-docker-cluster: add ceph-volume support This commit adds support for purging clusters that were deployed with ceph-volume. It also separates nicely, with a block instruction, the work to do when lvm is used or not. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `1751885bc9`)	2019-01-07 09:14:10 +01:00
Guillaume Abrioux	e37a90b5ec	purge: add iscsi support add iscsi support for both non containerized and containerized deployment in purge playbooks. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `78116fa6db`)	2018-12-04 18:04:13 +01:00
Guillaume Abrioux	144c92b21f	purge: actually remove of /var/lib/ceph/* `38dc20e74b` introduced a bug in the purge playbooks because using `*` in `command` module doesn't work. `/var/lib/ceph/` files are not purged, which means there is a leftover. When trying to redeploy a cluster, it failed because monitor daemon was detecting existing keyring, therefore, it assumed a cluster already existed. Typical error (from container output): ``` Sep 26 13:18:16 mon0 docker[31316]: 2018-09-26 13:18:16 /entrypoint.sh: Existing mon, trying to rejoin cluster... Sep 26 13:18:16 mon0 docker[31316]: 2018-09-26 13:18:16.932393 7f15b0d74700 -1 auth: unable to find a keyring on /etc/ceph/test.client.admin.keyring,/etc/ceph/test.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,:(2) No such file or directory Sep 26 13:18:23 mon0 docker[31316]: 2018-09-26 13:18:23 /entrypoint.sh: SUCCESS ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1633563 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-27 17:45:21 +02:00
Sébastien Han	38dc20e74b	purge: only purge /var/lib/ceph content Sometime /var/lib/ceph is mounted on a device so we won't be able to remove it (device busy) so let's remove its content only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1615872 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-09-03 10:51:24 +02:00
Guillaume Abrioux	d0746e0858	common: switch from docker module to docker_container As of ansible 2.4, `docker` module has been removed (was deprecated since ansible 2.1). We must switch to `docker_container` instead. See: https://docs.ansible.com/ansible/latest/modules/docker_module.html#docker-module Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-10 20:08:07 +00:00
Randy J. Martinez	d1f2d64b15	purge-docker: added conditionals needed to successfully re-run purge Added 'ignore_errors: true' to multiple lines which run docker commands; even in cases where docker is no longer installed. Because of this, certain tasks in the purge-docker-cluster.yml will cause the playbook to fail if re-run and stop the purge. This leaves behind a dirty environment, and a playbook which can no longer be run. Fix Regex line 275: Sometimes 'list-units' will output 4 spaces between loaded+active. The update will account for both scenarios. purge fetch_directory: in other roles fetch_directory is hard linked ex.: "{{ fetch_directory }}"/"{{ somedir }}". That being said, fetch_directory will never have a trailing slash in the all.yml so this task was never being run(causing failures when trying to re-deploy). Signed-off-by: Randy J. Martinez <ramartin@redhat.com>	2018-04-10 13:39:14 +02:00
Guillaume Abrioux	e32a177af8	purge-docker: remove redundant task The `remove_packages` prompt is redundant to the `ireallymeanit` prompt since it does exactly the same thing. I guess the only goal of this task was to make a break to warn user about `--skip-tags=with_pkg` feature. This warning should be part of the first prompt. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-03 11:54:42 +02:00
jtudelag	691f7c5146	Adds handy ceph aliases for containerized installations. Same approach as openshift-ansible etcdctl: * https://github.com/openshift/openshift-ansible/blob/release-3.7/roles/etcd/tasks/auxiliary/drop_etcdctl.yml * https://github.com/openshift/openshift-ansible/blob/release-3.7/roles/etcd/etcdctl.sh	2018-03-08 13:56:39 +01:00
Guillaume Abrioux	3b2f6c34e4	purge-docker: fix ceph-osd-zap name container the `zap ceph osd disks` task should iterate over `resolved_parent_device`, which contains only the base device name, instead of `combined_devices_list` (which contains the full path name). this fixes the issue where docker complains about the container name because of illegal characters such as `/` : ``` "/usr/bin/docker-current: Error response from daemon: Invalid container name (ceph-osd-zap-magna074-/dev/sdb1), only [a-zA-Z0-9][a-zA-Z0-9_.-] are allowed.","See '/usr/bin/docker-current run --help'." "" ``` having the basename of the device path is enough for the container name. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1540137 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-02-02 22:09:11 +01:00
Guillaume Abrioux	f372a4232e	purge: fix resolve parent device task This is a typo caused by leftover. It was previously written like this : `shell: echo /dev/$(lsblk -no pkname "{{ item }}") }}")` and has been rewritten to : `shell: $(lsblk --nodeps -no pkname "{{ item }}") }}")` because we are appending later the '/dev/' in the next task. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1540137 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-30 17:40:10 +01:00
Guillaume Abrioux	55298fa80c	purge-container: use lsblk to resolve parent device Using `lsblk` to resolve the parent device is better than just removing the last char when passing it to the zap container. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-17 15:54:20 +01:00
Guillaume Abrioux	58eb045d2f	purge-container: remove awk usage in favor of blkid Avoid using `awk` to get the different devices from the partlabel. Using `blkid` is more readable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-17 15:54:20 +01:00
Guillaume Abrioux	d9c1b61092	purge-docker: remove osd disk prepare logs `with_fileglob` loops over files on the machine that runs the playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-16 14:27:36 +01:00
Guillaume Abrioux	fa675f2ead	purge-docker-cluster: ensure old logs are removed purge-docker-cluster must remove all osd_disk_prepare logs in `{{ ceph_osd_docker_run_script_path }}`, otherwise if you purge your cluster and try to redeploy it, osds will fail to start because they will try to retrieve a partition uuid which doesn't exist. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1510470 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-09 17:49:20 +01:00
Guillaume Abrioux	f90f2f3a04	purge: containers are not stopped During purge osd, the containers are not stopped because of a typo, as a result, all the devices can't be unmounted later. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-25 07:58:00 +02:00
Major Hayden	33b200d43a	Suppress yum/dnf/rpm command warnings Ansible throws warnings when using yum/dnf/rpm with the command module: [WARNING]: Consider using yum module rather than running yum This patch adds the `warn: no` argument to suppress the warnings in the Ansible output.	2017-10-12 08:38:05 -05:00
Sébastien Han	c693e95cbf	purge-docker: rework device detection we don't need "devices" and other device variable anymore, the playbook detects that for us. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:39:04 +02:00
Sébastien Han	3bd341f6c0	osd: container use id instead of dev name Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1494127 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 14:44:00 +02:00
Guillaume Abrioux	fcb6454e04	rbd-mirror: fix systemd unit in purge-docker rbd-mirror containers are not stopped in purge-docker-cluster playbook because of the wrong name used. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 21:18:50 +02:00
Guillaume Abrioux	c80ba7a307	purge: implement mgr purge until now, mgr nodes are not managed by purge-cluster.yml, therefore it breaks scenarios like purge_cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 21:18:50 +02:00
Guillaume Abrioux	7195b08718	update: update rgw systemd unit name The old name is used in `rolling_update.yml` and `purge-docker-cluster.yml`, it breaks the `test_rgw_service_is_running()` test. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 14:58:55 +02:00
Sébastien Han	e0a264c7e9	osd: allow multi dedicated journals for containers Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 12:34:06 +02:00
Sébastien Han	30991b1c0a	osd: simplify scenarios There are only two main scenarios now: * collocated: everything remains on the same device: - data, db, wal for bluestore - data and journal for filestore * non-collocated: dedicated device for some of the components Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-03 10:20:39 +02:00
Guillaume Abrioux	7a333d05ce	Add handlers for containerized deployment Until now, there are no handlers for containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00
Andrew Schoen	59992c54cc	purge-docker-cluster: include ceph_docker_registry We need to include ceph_docker_registry when removing containers/images because if we don't it will assume docker.io which is not always where the image originated from, causing the playbook to fail. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-06-02 09:49:17 -05:00
Andrew Schoen	f7677e4393	purge-docker-cluster: pip is only used on Debian We only need to purge packages installed by pip on Debian systems. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-05-31 09:03:44 -05:00
Andrew Schoen	8e322d4825	purge-docker-cluster: default raw_journal_devices to [] If we're purging a containerized cluster that did not use the raw_multi_journal OSD scenario then raw_journal_devices will not be defined which causes the playbook to fail. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1455187 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-05-25 07:30:25 -05:00
Sébastien Han	8320c14191	Merge pull request #1317 from ibotty/harmonize-docker-names harmonize docker names	2017-03-14 18:20:20 +01:00
Andrew Schoen	cf702b05cf	purge-docker-cluster: do not include role defaults or group vars Doing so at playbook level overrides whatever values might be set for these in the user's group_vars directory that's relative to their inventory. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	7289acb6b3	purge-docker-cluster: do not set group names vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
Tobias Florek	931027e6f7	harmonize docker names Created containers now are named more or less in the form of <ansible role>-<ansible_hostname>	2017-02-23 09:15:05 +01:00
Sébastien Han	3b633d5ddc	purge-docker: re-implement zap devices We now run the container and wait until it dies. Prior to this we were stopping it before completion so not all the devices were zapped. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-21 15:56:09 -05:00
Sébastien Han	a002508a91	purge-docker: also purge journal devices Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-21 15:54:36 -05:00
Sébastien Han	c2f1dca823	docker: use a better method to pull images We changed the way we declare image. Prior to this patch we must have a "user/image:tag" format, which is incompatible with non docker-hub registry where you usually don't have a "user". On the docker hub a "user" is also identified as a namespace, so for Ceph the user was "ceph". Variables have been simplified with only: * ceph_docker_image * ceph_docker_image_tag 1. For docker hub images: ceph_docker_name: "ceph/daemon" will give you the 'daemon' image of the 'ceph' user. 2. For non docker hub images: ceph_docker_name: "daemon" will simply give you the "daemon" image. Infrastructure playbooks have been modified as well. The file group_vars/all.docker.yml.sample has been removed as well. It is hard to maintain since we have to generate it manually. If you want to configure specific variables for a specific daemon simply edit group_vars/$DAEMON.yml Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1420207 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-09 17:57:18 +01:00
Sébastien Han	c34d0a9d28	purge-docker: force image deletion even if non-running containers are using this image as a reference. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-07 22:14:21 +01:00
Guillaume Abrioux	a680707f6f	All `include_vars` need to have `.yml`, `.yaml` or `*.json` extension. As introduced in the following PR: - https://github.com/ansible/ansible/pull/17207 we need to refactor our code.	2016-11-24 14:03:49 +01:00
Eduard Egorov	3652bb708b	Fix rbd-mirrors group name Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>	2016-11-01 12:21:47 +00:00
Eduard Egorov	645b5efebf	Fix hard-coded host group names in include tasks for group variables' file paths. Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>	2016-11-01 12:21:40 +00:00

55 Commits (b107dcf80beb345ae2afae3f66423e93e2cf17d1)