ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	eefe11d90c	defaults: change default grafana-server name This changes the default value of the grafana-server group name. It also adds some tasks to ceph-defaults in order to keep backward compatibility. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-29 07:42:26 +02:00
Guillaume Abrioux	5e91e0f3e2	purge: remove potential socket leftover This commit ensures we remove any socket left by ceph and the `ceph-osd-run.sh` script. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1861755 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-14 15:34:49 -04:00
Guillaume Abrioux	f77fa6e2a4	purge-cluster: use sysfs method for unmapping rbd devices This way we keep consistency with purge-container-cluster.yml playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-17 09:28:12 +02:00
Guillaume Abrioux	33a544644a	purge: import ceph-defaults in purge osd play Otherwise, `ceph_volume_debug` variable is undefined Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
Guillaume Abrioux	448cc280b7	common: don't enable debug log on ceph-volume calls by default ceph-volume can generate large logs at some point. debug logs by definition should be enabled only when debugging. Let's make it customizable with a variable which is set to `False` by default. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
Benoît Knecht	a57fd7a090	purge-cluster: check if rbdmap exists When running `infrastructure-playbooks/purge-cluster.yml` twice, it fails the second time on the `ensure rbd devices are unmapped` task, because `rbdmap` isn't installed anymore at that point. This commit adds a check that ensures `rbdmap` is available, and skips the `ensure rbd devices are unmapped` task if it isn't. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-08-06 09:35:03 +02:00
Guillaume Abrioux	c2e507b42d	purge-cluster: replace shell by command in a task There is no need to use `shell` here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-05 09:37:41 +02:00
Guillaume Abrioux	9d2f2108e1	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 20:22:12 +02:00
Dimitri Savineau	c95adc564b	facts: explicitly disable facter and ohai By default, ansible gathers facts from facter and ohai if installed on the remote nodes, given we don't need them, let's exclude these facts from our facts gathering Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-02 17:46:12 +02:00
Dimitri Savineau	9d3b49293d	purge: stop rgw instances by iteration It looks like that the service module doesn't support wildcard anymore for stopping/disabling multiple services. fatal: [rgw0]: FAILED! => changed=false msg: 'This module does not currently support using glob patterns, found ''*'' in service name: ceph-radosgw@*' ...ignoring Instead we should iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Dimitri Savineau	100e3a044e	purge-cluster: update package list to remove We only support python3 so renaming all ceph python packages. Some ceph packages were missing from the list (ceph-mon, ceph-osd or rbd-mirror) or didn't exist anymore (ceph-fs-common, libcephfs1). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 11:33:15 +01:00
wujie1993	d8b0b3cbd9	purge: fix purge cluster failed Fix purge cluster failed when local container images does not exist. Purge node-exporter and grafana-server only when dashboard_enabled is set to True. Signed-off-by: wujie1993 qq594jj@gmail.com	2020-01-31 12:09:46 -05:00
Dimitri Savineau	89f6cc54a2	purge-cluster: add podman support The podman support was added to the purge-container-cluster playbook but containers are always used for the dashboard even on non-containerized deployments. This commit adds podman support for purging the dashboard resources in the purge-cluster playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-04 14:15:12 -05:00
Guillaume Abrioux	0441812959	purge/update: remove backward compatibility legacy This was introduced in 3.1 and marked as deprecation We can definitely drop it in stable-4.0 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-27 10:27:43 -05:00
Harald Jensås	e695efcaf7	Replace ipaddr() with ips_in_ranges() This change implements a filter_plugin that is used in the ceph-facts, ceph-validate roles and infrastructure-playbooks. The new filter plugin will return a list of all IP addresses that reside in any one of the given IP ranges. The new filter replaces the use of the ipaddr filter. ceph.conf already supports a comma separated list of CIDRs for the public_network and cluster_network options. Changes: [1] and [2] introduced a regression in ceph-ansible where public_network can no longer be a comma separated list of CIDRs. With this change a comma separated list of subnet CIDRs can also be used for monitor_address_block and radosgw_address_block. [1] commit: `d67230b2a2` [2] commit: `20e4852888` Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030 Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283 Closes: #4333 Please backport to stable-4.0 Signed-off-by: Harald Jensås <hjensas@redhat.com>	2019-09-27 10:11:53 +02:00
Dimitri Savineau	07c6695d16	Remove NBSP characters Some NBSP are still present in the yaml files. Adding a test in travis CI. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-26 16:09:23 -04:00
Guillaume Abrioux	20e4852888	purge: ensure no ceph kernel thread is present This tries to first unmount any cephfs/nfs-ganesha mount point on client nodes, then unmap any mapped rbd devices and finally it tries to remove ceph kernel modules. If it fails it means some resources are still busy and should be cleaned manually before continuing to purge the cluster. This is done early in the playbook so the cluster stays untouched until everything is ready for that operation, otherwise if you try to redeploy a cluster it could end up by getting confused by leftover from previous deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1337915 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-24 10:05:11 +02:00
Dimitri Savineau	44c63903ca	purge-cluster: clean all ceph repo files We currently only purge rh_storage yum repository file but depending on the ceph_repository value we are using, the ceph repository file could have a different name. Resolves: #4056 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-07 09:28:14 +02:00
guihecheng	59e702ec39	Add section for purging rgw loadbalancer in purge-cluster.yml Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>	2019-06-06 17:12:04 +02:00
L3D	ab54fe20ec	ansible: use 'bool' filter on boolean conditionals By running ceph-ansible there are a lot of ``[DEPRECATION WARNING]`` messages like these: ``` [DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. ``` Now appended ``| bool`` on a lot of the affected variables. Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` (with spaces at the pipe). Closes: #4022 Signed-off-by: L3D <l3d@c3woc.de>	2019-06-06 10:21:17 +02:00
Dimitri Savineau	7503098ca0	remove ceph-agent role and references The ceph-agent role was used only for RHCS 2 (jewel) so it's not useful anymore. The current code will fail on CentOS distribution because the rhscon package is only available on Red Hat with the RHCS 2 repository and this ceph release is supported on stable-3.0 branch. Resolves: #4020 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-03 13:35:50 +02:00
Boris Ranto	2f141a6e80	Merge cephmetrics/dashboard-ansible repo This commit will merge dashboard-ansible installation scripts with ceph-ansible. This includes several new roles to setup ceph-dashboard and the underlying technologies like prometheus and grafana server. Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com> Co-authored-by: Zack Cerza <zcerza@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-16 16:39:13 +02:00
Rishabh Dave	739a662c80	improve coding style Keywords requiring only one item shouldn't express it by creating a list with single item. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-23 15:37:07 +02:00
Dimitri Savineau	eb658b3af6	purge-cluster: remove python-ceph-argparse package When using purge-cluster playbook with nautilus, there's still the python-ceph-argparse package installed on the host preventing to reinstall a ceph cluster with a different version (like luminous or mimic) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-15 09:15:08 +02:00
Guillaume Abrioux	a1254d767c	purge: remove references to ceph-disk as of stable-4.0, ceph-disk is no longer supported. These tasks aren't needed anymore. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-11 11:57:02 -04:00
Guillaume Abrioux	4d35e9eeed	osd: remove variable osd_scenario As of stable-4.0, the only valid scenario is `lvm`. Thus, this makes this variable useless. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-11 11:57:02 -04:00
Guillaume Abrioux	0180738313	purge: fix lvm-batch purge osd `lvm_volumes` and/or `devices` variable(s) can be undefined depending on the scenario chosen. These tasks should be run only if these variables are defined, otherwise it ends up with undefined variable errors. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1653307 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-02 14:56:24 +02:00
Guillaume Abrioux	f55e2b08be	remove all NBSPs on master branch Similar to #3658 Since there's too many changes between master and stable branches let's commit directly in each branches instead of trying to backport this commit. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-28 11:57:55 +00:00
Guillaume Abrioux	47ebef374f	purge: fix rbd-mirror group name the default is rbdmirrors in ceph-defaults Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-01 20:31:14 +00:00
Guillaume Abrioux	a915308477	purge: fix rbd mirror purge as of `b70d54ac80` the service launched isn't ceph-rbd-mirror@admin.service. it's now `ceph-rbd-mirror@rbd-mirror.{{ ansible_hostname }}` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-01 20:31:14 +00:00
Guillaume Abrioux	3849f30f58	purge: do not remove /var/lib/apt/lists/* removing the content of this directory seems a bit aggressive and causes a redeployment to fail after a purge on debian based distributions. Typical error: ``` fatal: [mon0]: FAILED! => changed=false attempts: 3 msg: No package matching 'ceph' is available ``` The following task will consider the cache is still valid, so apt doesn't refresh it: ``` - name: update apt cache if cache_valid_time has expired apt: update_cache: yes cache_valid_time: 3600 register: result until: result is succeeded ``` since the task installing ceph packages has a `update_cache: no` it fails: ``` - name: install ceph for debian apt: name: "{{ debian_ceph_pkgs | unique }}" update_cache: no state: "{{ (upgrade_ceph_packages|bool) | ternary('latest','present') }}" default_release: "{{ ceph_stable_release_uca | default('') }}{{ ansible_distribution_release ~ '-backports' if ceph_origin == 'distro' and ceph_use_distro_backports else '' }}" register: result until: result is succeeded ``` /tmp/* isn't specific to ceph as well, so we shouldn't remove everything in this directory. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-01 20:31:14 +00:00
Guillaume Abrioux	89f77589fa	purge: fix purge of lvm devices using `shell` module seems to be the only way to make this task working on rhel based distribution AND debian based distributions. on ubuntu, using `command` ansible module fails like following (not due to `sudo` usage or not): ``` ok: [osd1] => changed=false cmd: command -v ceph-volume failed_when_result: false msg: '[Errno 2] No such file or directory: ''command'': ''command''' rc: 2 ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1653307 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-01 20:31:14 +00:00
guihecheng	1ac94c048f	rgw: add support for multiple rgw instances on a single host With this, we could have multiple rgw instances on a single host with a single run, without having to use rgw-standalone.yml, which does not seem able to bind ports separately. If you want to have multiple rgw instances, just change 'radosgw_instances' to the number you want, which defaults to 1. Not compatible with Multi-Site yet. Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>	2019-01-18 11:12:28 +01:00
Andrew Schoen	ffd56177e7	purge-cluster: skip tasks that use ceph-volume if it's not installed This will allow the playbook to be idempotent. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1656935 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-12-13 11:27:27 +01:00
Sébastien Han	fc6ebd8ebb	purge-cluster: add support for mon/mgr collocation Recently we introduced the default collocation of mon/mgr without the need of a dedicated mgrs section. This means we have to stop the mgr process on that machine too. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-11 09:59:25 +01:00
Sébastien Han	3a154fa0ad	purge-cluster: remove support for other init system We only support systemd and use the service module anyway. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-11 09:59:25 +01:00
Rishabh Dave	da6f384223	don't loop over a task using package management modules For tasks using (Ansible) modules for package management utilities, pass the list of packages to be installed instead of repeating the task for each package. Using the latter manner of installing a list of packages leads to a deprecation warning by ansible-playbook command. Fixes: https://github.com/ceph/ceph-ansible/issues/3293 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-11-08 08:38:10 +00:00
Rishabh Dave	640cad3fd8	remove configuration files for ceph packages on ubuntu clusters For apt-get, purge command needs to be used, instead of remove command, to remove related configuration files. Otherwise, packages might be shown as installed while running dpkg command even after removing them. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1640061 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-11-07 15:52:53 +01:00
Guillaume Abrioux	f7d4651186	playbook: remove jinja syntax in when statement this syntax is deprecated Closes: #3281 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-31 13:45:41 +01:00
Guillaume Abrioux	d8d3e55006	remove restapi role As of `mimic`, restapi is no longer available because of manager daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:19:13 +01:00
Andrew Schoen	9747f3dbd5	purge-cluster: zap devices used with the lvm scenario Fixes: https://github.com/ceph/ceph-ansible/issues/3156 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-09-28 14:49:56 +02:00
wumingqiao	5da71e1ca1	purge-cluster: recursively remove ceph-related files, symlinks and directories under /etc/systemd/system. fix: https://github.com/ceph/ceph-ansible/issues/3166 Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>	2018-09-28 14:49:22 +02:00
Guillaume Abrioux	144c92b21f	purge: actually remove of /var/lib/ceph/* `38dc20e74b` introduced a bug in the purge playbooks because using `*` in the `command` module doesn't work. The `/var/lib/ceph/` files are not purged, which means there are leftovers. When trying to redeploy a cluster, it failed because the monitor daemon detected an existing keyring and therefore assumed a cluster already existed. Typical error (from container output): ``` Sep 26 13:18:16 mon0 docker[31316]: 2018-09-26 13:18:16 /entrypoint.sh: Existing mon, trying to rejoin cluster... Sep 26 13:18:16 mon0 docker[31316]: 2018-09-26 13:18:16.932393 7f15b0d74700 -1 auth: unable to find a keyring on /etc/ceph/test.client.admin.keyring,/etc/ceph/test.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,:(2) No such file or directory Sep 26 13:18:23 mon0 docker[31316]: 2018-09-26 13:18:23 /entrypoint.sh: SUCCESS ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1633563 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-27 17:45:21 +02:00
Sébastien Han	38dc20e74b	purge: only purge /var/lib/ceph content Sometimes /var/lib/ceph is mounted on a device so we won't be able to remove it (device busy) so let's remove its content only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1615872 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-09-03 10:51:24 +02:00
Jeffrey Zhang	85cc61a6d9	Use /var/lib/ceph/osd folder to filter osd mount point In some cases, a user may mount a partition on /var/lib/ceph; unmounting it would fail, and there is no need to do so anyway. Signed-off-by: Jeffrey Zhang <zhang.lei.fly@gmail.com>	2018-08-14 13:00:24 +00:00
Guillaume Abrioux	9801bde4d4	purge_cluster: fix dmcrypt purge dmcrypt devices aren't closed properly, therefore, it may fail when trying to redeploy after a purge. Typical errors: ``` ceph-disk: Cannot discover filesystem type: device /dev/sdb1: Command '/sbin/blkid' returned non-zero exit status 2 ``` ``` ceph-disk: Error: unable to read dm-crypt key: /var/lib/ceph/osd-lockbox/c6e01af1-ed8c-4d40-8be7-7fc0b4e104cf: /etc/ceph/dmcrypt-keys/c6e01af1-ed8c-4d40-8be7-7fc0b4e104cf.luks.key ``` Closing properly dmcrypt devices allows to redeploy without error. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-21 08:23:10 +02:00
Guillaume Abrioux	a9247c4de7	purge_cluster: wipe all partitions In order to ensure there is no leftover after having purged a cluster, we must wipe all partitions properly. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-17 08:37:17 +02:00
Guillaume Abrioux	9cad113e2f	purge_cluster: fix bug when building device list there is some leftover on devices when purging osds because of an invalid device list construction. typical error: ``` changed: [osd3] => (item=/dev/sda sda1) => { "changed": true, "cmd": "# if the disk passed is a raw device AND the boot system disk\n if parted -s \"/dev/sda sda1\" print | grep -sq boot; then\n echo \"Looks like /dev/sda sda1 has a boot partition,\"\n echo \"if you want to delete specific partitions point to the partition instead of the raw device\"\n echo \"Do not use your system disk!\"\n exit 1\n fi\n echo sgdisk -Z \"/dev/sda sda1\"\n echo dd if=/dev/zero of=\"/dev/sda sda1\" bs=1M count=200\n echo udevadm settle --timeout=600", "delta": "0:00:00.015188", "end": "2018-05-16 12:41:40.408597", "item": "/dev/sda sda1", "rc": 0, "start": "2018-05-16 12:41:40.393409" } STDOUT: sgdisk -Z /dev/sda sda1 dd if=/dev/zero of=/dev/sda sda1 bs=1M count=200 udevadm settle --timeout=600 STDERR: Error: Could not stat device /dev/sda sda1 - No such file or directory. ``` the devices list in the task `resolve parent device` isn't built properly because the command used to resolve the parent device doesn't return the expected output eg: ``` changed: [osd3] => (item=/dev/sda1) => { "changed": true, "cmd": "echo /dev/$(lsblk -no pkname \"/dev/sda1\")", "delta": "0:00:00.013634", "end": "2018-05-16 12:41:09.068166", "item": "/dev/sda1", "rc": 0, "start": "2018-05-16 12:41:09.054532" } STDOUT: /dev/sda sda1 ``` For instance, it will result with a devices list like: `['/dev/sda sda1', '/dev/sdb', '/dev/sdc sdc1']` where we expect to have: `['/dev/sda', '/dev/sdb', '/dev/sdc']` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-17 08:37:17 +02:00
Andrew Schoen	08f4875533	ceph_volume: refactor to not run ceph osd destroy This changes state to action and gives the options 'create' or 'zap'. The zap parameter is also removed. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-04-10 14:19:21 +02:00
Andrew Schoen	c6e8f8fb11	purge-cluster: no need to use objectstore for ceph_volume module When zapping objectstore is not required. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-04-10 14:19:21 +02:00


120 Commits (8f5db079ae416becdc327c5e80a040997be731fa)
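
Commit e695efcaf7 above describes a filter that returns every IP address falling inside any of a list of CIDR ranges, so that a comma separated `public_network` value works again. The following is only a rough sketch of that idea using Python's standard `ipaddress` module. It is not the plugin shipped in ceph-ansible: the function signature, argument names and sample addresses are assumptions made for illustration.

```python
import ipaddress


def ips_in_ranges(ip_addresses, ip_ranges):
    """Return the addresses that fall inside at least one CIDR range.

    Hypothetical signature: the real ceph-ansible filter may differ.
    """
    # Parse each CIDR once; strict=False tolerates host bits being set.
    networks = [ipaddress.ip_network(r.strip(), strict=False) for r in ip_ranges]
    matches = []
    for ip in ip_addresses:
        addr = ipaddress.ip_address(ip)
        # An address of a different IP version is never "in" a network.
        if any(addr in net for net in networks):
            matches.append(ip)
    return matches


# A comma separated value such as the one accepted by public_network
# (sample data, not taken from the commit).
public_network = "192.168.1.0/24, 10.0.0.0/8"
host_ips = ["192.168.1.10", "172.16.0.5", "10.1.2.3"]
selected = ips_in_ranges(host_ips, public_network.split(","))
```

Splitting the option on commas before filtering is what lets one option carry several subnets. Input order is preserved so that a caller could take the first match as the address to bind.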