ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	95a073cb3b	ceph-prometheus: update pool stat counter Since [1] The bytes_used pool counter in prometheus has been renamed to stored. Closes: #5781 [1] https://github.com/ceph/ceph/commit/71fe9149 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e54b924eaf`)	2020-09-16 10:08:46 -04:00
Dimitri Savineau	aaf1139242	switch2container: chown symlink for devices If the OSD directory is using symlinks for referencing devices (like block, db, wal for bluestore and journal for filestore) then the chown command could fail to change the owner:group on some system. $ ls -hl /var/lib/ceph/osd/ceph-0/ total 28K lrwxrwxrwx 1 ceph ceph 92 Sep 15 01:53 block -> /dev/ceph-45113532-95ca-471b-bd75-51de46f1339c/osd-data-570a1aee-60c0-44c9-8036-ffed7d67a4e6 -rw------- 1 ceph ceph 37 Sep 15 01:53 ceph_fsid -rw------- 1 ceph ceph 37 Sep 15 01:53 fsid -rw------- 1 ceph ceph 55 Sep 15 01:53 keyring -rw------- 1 ceph ceph 6 Sep 15 01:53 ready -rw------- 1 ceph ceph 3 Sep 15 02:00 require_osd_release -rw------- 1 ceph ceph 10 Sep 15 01:53 type -rw------- 1 ceph ceph 2 Sep 15 01:53 whoami $ find /var/lib/ceph/osd/ceph-0 -not -user 167 -execdir chown 167:167 {} + chown: cannot dereference './block': Permission denied $ find /var/lib/ceph/osd/ceph-0 -not -user 167 /var/lib/ceph/osd/ceph-0/block Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `da4280e243`)	2020-09-15 15:30:12 -04:00
Dimitri Savineau	8757fdfb4a	switch2container: remove deb systemd units When running the switch2container playbook on a Debian based system, the systemd unit path isn't the same as on a Red Hat based system. Because the systemd unit files aren't removed, the new container systemd unit isn't taken into account. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c1af69a7e7`)	2020-09-15 15:30:12 -04:00
Guillaume Abrioux	edcdbe5601	purge: remove potential socket leftover This commit ensures we remove any socket left by ceph and the `ceph-osd-run.sh` script. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1861755 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5e91e0f3e2`)	2020-09-14 16:50:49 -04:00
Guillaume Abrioux	4b55dcdc0d	tests: do not run node_exporter test on clients We need to skip these tests on client nodes since we don't deploy node_exporter on them anymore Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5650a6d7d0`)	2020-09-14 16:13:11 -04:00
Dimitri Savineau	25ba7f5314	node-exporter: exclude client nodes We don't need to install node-exporter on client node because there's no ceph services running on them. This also makes sure we use the group name variables in the prometheus service template instead of hardcoding the values. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b105549ed8`)	2020-09-14 16:13:11 -04:00
Dimitri Savineau	24698e7f4b	dashboard: use run_once at block level Instead of using run_once: true on each task in a block section, we can use the run_once statement at the block level. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2c4af70abd`)	2020-09-14 15:54:22 -04:00
Dimitri Savineau	23522a11e4	ceph_key: set state as optional Most ansible modules using a state parameter default to the present value (when available) instead of using it as a mandatory option. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `abb4023d76`)	2020-09-14 15:37:56 -04:00
Dimitri Savineau	e785654632	ceph_pool: set state as optional Most ansible modules using a state parameter default to the present value (when available) instead of using it as a mandatory option. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3a05aeb6cb`)	2020-09-14 15:37:56 -04:00
Dimitri Savineau	35b488c189	tests/library: rename ceph_dashboard_user class Rename the test class with the right information. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3ba11c1434`)	2020-09-14 15:33:46 -04:00
Dimitri Savineau	6fd6b31305	library: add ceph_dashboard_user module This adds the ceph_dashboard_user ansible module for replacing the command module usage with the ceph dashboard ac-user-xxx command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ee6f0547ba`)	2020-09-11 09:08:56 -04:00
Guillaume Abrioux	828817489c	facts: refact and optimize memory consumption there's no need to run this task on all nodes. This uses too much memory for nothing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f0fe193d8e`)	2020-09-10 22:52:38 -04:00
Dimitri Savineau	593264e5f7	ceph-rgw: use ceph_pool module Since [1] we can use the ceph_pool module instead of using the command module combined with ceph osd pool commands. [1] `bddcb439ce` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8dacbce68f`)	2020-09-10 21:44:19 -04:00
Dimitri Savineau	7745fd3560	container: run engine/common roles on first client We already do this in the site-container.yml playbook because we don't need docker/podman installed on all client nodes and having the container image only on the first client node. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8ecbdc6ede`)	2020-09-10 20:57:16 +02:00
Dimitri Savineau	0c0a930374	ceph-facts: only get fsid when monitor are present When running the rolling_update playbook with an inventory without monitor nodes defined (like external scenario) then we can't retrieve the cluster fsid from the running monitor. In this scenario we have to pass this information manually (group_vars or host_vars). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f63022dfec`)	2020-09-10 20:57:16 +02:00
Dimitri Savineau	dd05d8ba90	tests: use grafana from quay.io This changes the grafana container image registry from docker.io to quay.io to avoid rate limiting. This also adds the missing container image values for docker2podman and podman scenarios. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `98c9afceb9`)	2020-09-10 17:30:37 +02:00
Guillaume Abrioux	218aedaab6	tests: migrate to quay.ceph.io registry in order to avoid docker.io rate limiting Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2cbb7de3b2`)	2020-09-10 17:30:37 +02:00
Francesco Pantano	8e3ecfd869	Add --cluster option on ceph require-osd-release command On DCN environments, or when multiple ceph clusters are configured, we need to specify the cluster name before running the command or the rolling_update playbook will fail during minor updates. Closes: https://bugzilla.redhat.com/1876447 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `cb64df30b6`)	2020-09-09 14:54:19 +02:00
Francesco Pantano	8dd8675080	Fix hosts field in rolling_update playbook when mds are processed In the OSP context, during the rolling update the playbook fails with the following error: ''' ERROR! The field 'hosts' has an invalid value, which includes an undefined variable. The error was: list object has no element 0 ''' This PR just changes the hosts field, providing a valid mons group value. Closes: https://bugzilla.redhat.com/1876803 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `e65f9a5c72`)	2020-09-09 14:53:44 +02:00
Niko Smeds	a41c572785	Enable HAProxy backend checks for Ceph RGW Add the `check` option to server definitions to enable basic HAProxy health checks for Ceph RADOS gateway backends. Currently traffic will be forwarded to unhealthy `radosgw.service` servers. These changes resolve the issue. Signed-off-by: Niko Smeds nikosmeds@gmail.com (cherry picked from commit `a951c1a3f0`)	2020-09-02 09:54:52 -04:00
Guillaume Abrioux	3a8be20699	rolling_update: remove 'ignore_errors' There's no need to use `ignore_errors: true` on these tasks. Using a loop on the task stopping mon daemons allows us to avoid duplicating this task, the `ignore_errors` isn't needed here because it won't fail the playbook if one of the ID doesn't exist (shortname vs. fqdn) Using the right condition on the task starting the mgr daemon allows us to avoid using an `ignore_errors: true` as well. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cec994b973`)	2020-08-21 16:33:15 +02:00
Guillaume Abrioux	2cc7ed21f0	ceph_key: refact the code and minor fixes This commit refactors the code to remove a duplicate condition and it makes the `state: absent` code idempotent Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13e2311cbe`)	2020-08-21 13:56:11 +02:00
Guillaume Abrioux	f22078b16d	tests: add more coverage for test_ceph_key This commit adds more coverage regarding the testing of ceph_key module Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `27ca884d99`)	2020-08-21 13:56:11 +02:00
Guillaume Abrioux	8fde5f7396	dashboard: refact admin user creation task this commit splits this task in order to avoid using a `shell` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `54d3e9650f`)	2020-08-21 13:55:54 +02:00
Dimitri Savineau	dcf915f597	tests: reenable nfs-ganesha testing This re-adds the nfs-ganesha testing in non containerized deployment. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6c11695fbe`)	2020-08-20 12:22:32 -04:00
George Shuklin	c0d98878ff	Make 'disable ssl for dashboard task' idempotent. This should reduce number of 'changed' tasks during convergence test. Signed-off-by: George Shuklin <george.shuklin@gmail.com> (cherry picked from commit `73d4bb6bd6`)	2020-08-20 17:16:45 +02:00
Rafał Wądołowski	21a37e23b3	Comment out ceph_custom_key Since there is a check if ceph_custom_key is defined, there is no reason to define it by default. Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com> (cherry picked from commit `55cd6e83e4`)	2020-08-20 13:43:44 +02:00
Guillaume Abrioux	4685b411de	tests: move erasure pool testing in lvm_osds This commit moves the erasure pool creation testing from `all_daemons` to `lvm_osds` so we can decrease the number of osd nodes we spawn so the OVH Jenkins slaves are less overwhelmed when an `all_daemons` based scenario is being tested. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8476beb5b1`)	2020-08-20 11:55:40 +02:00
John Fulton	489efd5689	Set default permission for prometheus config files Regardless of the outcome of Ansible 2.9.12 issue 71200 we can set a default permission for these files. Closes: https://github.com/ceph/ceph-ansible/issues/5677 Signed-off-by: John Fulton <fulton@redhat.com> (cherry picked from commit `95dee6f1ca`)	2020-08-18 18:04:17 -04:00
Guillaume Abrioux	81d116b0ac	shrink-mds: use mds_to_kill_hostname instead When using fqdn in inventory host file, this task will fail because the mds is registered with its shortname. It means we must use `mds_to_kill_hostname` in this task. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1869837 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `51c382677d`)	2020-08-18 15:09:57 -04:00
Guillaume Abrioux	3fad1677d6	infra: only install logrotate on right nodes For instance, there is no need to install logrotate on client nodes. This also ensures logrotate is installed only for containerized deployments since the packaging has an explicit dependency on logrotate Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8ed11ea3ee`)	2020-08-18 11:10:57 -04:00
Guillaume Abrioux	8f18411770	travis: enforce ansible-lint 4.2.0 Let's pin to 4.2.0 (because of ansible/ansible-lint/issues/966) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `04d77dcaeb`)	2020-08-18 16:37:51 +02:00
Dimitri Savineau	e9c6028eb9	ceph-rgw: allow specifying crush rule on pool We already support specifying a custom crush rule during pool creation in ceph-osd role but not in ceph-rgw role. This patch adds the missing code to implement this feature. Note this is only available for replicated pools, not erasure. The rule must also exist prior to the pool creation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cb8f0237e1`)	2020-08-17 23:00:13 +02:00
Dimitri Savineau	8ebe813428	container: don't install the engine on all clients We only need the container engine to be installed on the first client node in order to execute the pools/keys operation. We already do the same workflow with the ceph-container-common role, which pulls the ceph container image. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9805589ef9`)	2020-08-17 22:59:40 +02:00
Guillaume Abrioux	004155d407	purge-cluster: use sysfs method for unmapping rbd devices This way we keep consistency with purge-container-cluster.yml playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f77fa6e2a4`)	2020-08-17 09:50:08 -04:00
Ali Maredia	63d991dc3d	rgw: allow rgws to be concurrently with or without multisite Allows rgws in a ceph cluster to be run with multisite and without multisite at the same time. Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `5c1f4b1a1e`)	2020-08-17 13:56:45 +02:00
Guillaume Abrioux	2609da6ce7	infra: add missing tag This commit adds the missing `with_pkg` tag on the logrotate installation task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e1cb385740`)	2020-08-13 10:09:31 -04:00
Guillaume Abrioux	56d2b62e00	purge: import ceph-defaults in purge osd play Otherwise, `ceph_volume_debug` variable is undefined Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `33a544644a`)	2020-08-12 22:57:10 +02:00
Guillaume Abrioux	29d4c42f80	infra: add log rotation support (containers) This commit adds the log rotation support via logrotate in containerized deployments. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f1aa6cea21`)	2020-08-12 22:57:10 +02:00
Guillaume Abrioux	8a7e4193db	common: don't enable debug log on ceph-volume calls by default ceph-volume can generate large logs at some point. debug logs by definition should be enabled only when debugging. Let's make it customizable with a variable which is set to `False` by default. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `448cc280b7`)	2020-08-12 22:57:10 +02:00
Guillaume Abrioux	223254e8bf	nfs: do not copy rgw keyring when `nfs_obj_gw` is true This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the cluster doesn't contain a rgw node, which can be the case given we are using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the deployment will fail trying to copy a key that doesn't exist. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `dd4b5b0328`)	2020-08-12 14:57:56 -04:00
raul	5fc3af5f4d	rgw: support 1+ rgw instance in `radosgw_frontend_port` Change the radosgw_frontend_port to take into account more than 1 RGW instance. In its original form, `radosgw_frontend_port: radosgw_frontend_port | int`, it configured the 8080 port on all instances. With the following modification, `radosgw_frontend_port: radosgw_frontend_port | int + item|int`, we increase the port number by 1 for each additional instance. Co-authored-by: Daniel Parkes <dparkes@redhat.com> Signed-off-by: raul <rmahique@redhat.com> (cherry picked from commit `110eaf5f9f`)	2020-08-12 14:57:35 -04:00
Guillaume Abrioux	4369833008	tests: test iscsigw against stable build This commit makes the ci using stable build for testing iscsigw in stable-5.0 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-12 12:19:18 -04:00
Benoît Knecht	5d06c0eda9	purge-cluster: check if rbdmap exists When running `infrastructure-playbooks/purge-cluster.yml` twice, it fails the second time on the `ensure rbd devices are unmapped` task, because `rbdmap` isn't installed anymore at that point. This commit adds a check that ensures `rbdmap` is available, and skips the `ensure rbd devices are unmapped` task if it isn't. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `a57fd7a090`)	2020-08-06 12:01:50 -04:00
Kevin Coakley	92b400f433	Remove ceph-radosgw.target when switching to containerize daemons The task "remove old systemd unit file" under "switching from non-containerized to containerized ceph rgw" only removes the ceph-radosgw@.service file. The task should also remove the ceph-radosgw.target file, like the "remove old systemd unit files" tasks for the mons, mgrs, osds, etc, in order to clean up all of the unused systemd unit files. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu> (cherry picked from commit `d19e6033b2`)	2020-08-06 11:43:12 -04:00
Guillaume Abrioux	bd3439db75	shrink_osd: remove osd data directory Otherwise it leaves an empty directory. When shrinking and redeploying multiple OSDs you have no guarantee it will reuse the same osd id. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8933bfde33`)	2020-08-06 13:09:38 +02:00
Guillaume Abrioux	8632db7cb8	tests: refact shrink_osd scenario This adds more coverage on the shrink_osd scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7efea219d6`)	2020-08-06 13:09:38 +02:00
Guillaume Abrioux	d4dc674fa4	tox: split shrink_osd scenario Let's split this scenario with a dedicated tox ini file. This is for testing in two ways: 1/ shrinking OSDs one by one 2/ shrinking multiple OSDs with a single call of the playbook ceph-build related PR: ceph/ceph-build#1629 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `78e4faf077`)	2020-08-06 13:09:38 +02:00
Benoît Knecht	ccefe7da9f	shrink-osd: various fixes This handles missing /etc/ceph/osd, by ensuring we actually found files in `/etc/ceph/osd` before trying to slurp their content. This also adds a missing `| default(False)` to avoid the following error: ``` fatal: [ceph01]: FAILED! => msg: |- The conditional check 'ceph_osd_data_json[item.2]['encrypted'] | bool' failed. The error was: error while evaluating conditional (ceph_osd_data_json[item.2]['encrypted'] | bool): 'dict object' has no attribute 'encrypted' ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1862416 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `fe8fbd3ee2`)	2020-08-06 13:09:38 +02:00
Dimitri Savineau	92a2a2cf32	pytest: register ceph_crash mark Otherwise we see some pytest warning. PytestUnknownMarkWarning: Unknown pytest.mark.ceph_crash - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `03d4620269`)	2020-08-06 09:41:54 +02:00
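
The port arithmetic described in commit 5fc3af5f4d (`radosgw_frontend_port | int + item|int`) can be sketched in Python as follows. This is only an illustration, not code from the repository: the function name `frontend_ports` is hypothetical, and it assumes `item` is a zero-based instance index, which the commit message does not state explicitly. The base port 8080 is the value mentioned in the commit message.

```python
def frontend_ports(base_port, num_instances):
    """Return one port per RGW instance: base port plus the instance index.

    Hypothetical helper; assumes instance indices start at 0.
    """
    base = int(base_port)  # mirrors the `| int` cast applied to the base port
    # Each instance gets base + index, so no two instances share a port.
    return [base + int(item) for item in range(int(num_instances))]


if __name__ == "__main__":
    # Three instances on a base port of 8080
    print(frontend_ports("8080", 3))  # [8080, 8081, 8082]
```

With a single instance the result is just the base port, which matches the behaviour the commit message describes for the original expression.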

5359 Commits (95a073cb3b54022db46da662d92f48250ba50507)
