ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	d65c7b4035	config: reset num_osds When collocating OSDs with other daemons, `num_osds` is incorrectly calculated because `ceph-config` is called multiple times. Indeed, the following code: ``` num_osds: "{{ lvm_list.stdout | default('{}') | from_json | length | int + num_osds | default(0) | int }}" ``` increments `num_osds` each time `ceph-config` is called. We have to reset it in order to get the correct number of expected OSDs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `31a0f2653d`)	2021-03-17 17:35:19 +01:00
Guillaume Abrioux	732e5b10b8	update: convert legacy grafana-server groupname early If the legacy name `grafana-server` is still being used when upgrading from Nautilus to Pacific, the task that sets the fact `rolling_update` to `true` doesn't run on the node(s) included in that group. Indeed, the play where we set this fact (`rolling_update`) only runs on the group `monitoring_group_name | default('monitoring')`. As a workaround, we can run the task which converts the `grafana-server` group name to `monitoring` earlier. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935554 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6ccc8b4722`)	2021-03-16 14:33:40 +01:00
Matthew Vernon	3c8191194d	docs: Document the prepare_osd tag There are times where being able to skip OSD creation is useful to the admin (see #1777 for example), and skipping the prepare_osd tag is a way to achieve this. Document this fact. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `e66b7b7449`)	2021-03-12 09:19:55 +01:00
Matthew Vernon	6deb88d8fb	ceph-osd: add prepare_osd tag to lvm-batch scenario Sometimes it's useful to be able to skip the OSD creation step when running ceph-ansible (cf #1777). The lvm scenario has a prepare_osd tag on the relevant play. This commit adds the same tag to the lvm-batch scenario. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `88d119e95a`)	2021-03-12 09:19:55 +01:00
Matthew Vernon	6a23be19f4	Docs: fix some typos While working on the previous PR, I found a couple of typos in the docs. This fixes those. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `8b1474ab75`)	2021-03-11 22:04:53 +01:00
Matthew Vernon	1a67f59789	Fix typo and broken link for documenting RGW frontends http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace it with a working "pacific" docs link, and correct the spelling of "additional" while I'm at it. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `847611048e`)	2021-03-03 14:17:31 +01:00
Guillaume Abrioux	6832c8d7a5	tests: increase nb of rerun in pytest In order to avoid false positive in the CI that I've been unable to reproduce. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f7fd1c2298`)	2021-03-03 14:12:46 +01:00
Guillaume Abrioux	f42ed8e1e0	dashboard: add missing parameter in `ceph_cmd` The `ceph_cmd` fact is missing the `--net=host` parameter. Some tasks consuming this fact can fail as follows: ``` Error: error configuring network namespace for container b8ec913db1fb694ae683faf202680de7a59c714a004e533aba87e8503d29261f: Missing CNI default network ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1931365 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f143b1a647`)	2021-03-03 14:12:46 +01:00
Florian Haas	95949ec787	requirements.txt: Move the six dependency into the general requirements config_template.py depends on six, which isn't listed in the default requirements.txt. This previously frequently wasn't a problem, because six used to be a standard package being installed into a venv, and lots of other projects depended on it. It also does get installed for unit and integration tests via tests/requirements.txt, so any broken dependency on six wouldn't be detected by tox runs. However, as other projects and distributions have phased out Python 2.7 support the dependency on six becomes less common. Thus, as long as ceph-ansible does require it for config_template.py, add it to the base requirements. Signed-off-by: Florian Haas <florian@citynetwork.eu> (cherry picked from commit `d49ea9818b`)	2021-03-01 15:16:55 +01:00
Guillaume Abrioux	accdcf78e6	defaults: update rhcs dashboard images versions The current dashboard images deployed have a bad health index. Updating to a newer version fixes this issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925350 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a16ae693d8`)	2021-02-18 18:21:53 +01:00
Guillaume Abrioux	bb9bba685f	library: do not always add --yes in batch mode When asking `ceph-volume` to report only in `lvm batch` context, there's a bug described in bz1896803 [1] when `--yes` is passed (which by the way isn't necessary with `--report`). This commit ensures `--yes` isn't passed to `ceph-volume` when `--report` is used. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1896803 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896803 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fe6d6ba622`)	2021-02-14 06:29:16 +01:00
Guillaume Abrioux	3326b6d54f	purge: rm service-cid files This commit makes sure purge playbooks remove those files if for any reason they have been left. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1920900 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b9dd253a4f`)	2021-02-12 18:33:19 +01:00
Guillaume Abrioux	5803619a5d	switch2container: do not serialize the ceph-crash migration There's no need to slow down the playbook execution time by migrating all the `ceph-crash` instances in a serial way. Let's remove the `serial: 1` so the migration is achieved in a parallel way. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `980a5a7df4`)	2021-02-12 14:06:15 +01:00
Guillaume Abrioux	2feefdc861	tests: increase `mon_max_pg_per_osd` We aren't deploying enough OSD daemons, so it fails as follows: ``` stderr: 'Error ERANGE: pool id 10 pg_num 256 size 2 would mean 1536 total pgs, which exceeds max 1500 (mon_max_pg_per_osd 250 * num_in_osds 6)' ``` Let's increase the value of `mon_max_pg_per_osd` in order to get around this issue in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `682116023d`)	2021-02-12 09:15:24 +01:00
Guillaume Abrioux	980a0dd00e	rolling_update: update specific pacific task update the 'require-osd-release' task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-12 09:15:24 +01:00
Guillaume Abrioux	7dd4a8a059	tests: use shaman to test against ceph pacific Given there's no pacific packages available at https://download.ceph.com, let's use shaman in order to test against Ceph Pacific Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-12 09:15:24 +01:00
Guillaume Abrioux	9102d6c090	doc: add a note about "latest" tags See the change for details. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4e95180c80`)	2021-02-11 16:41:50 +01:00
Dimitri Savineau	950a6ae406	cephadm-adopt: remove prometheus workaround This was fixed by [1][2] [1] https://tracker.ceph.com/issues/45120 [2] https://github.com/ceph/ceph/commit/252d4b30 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-10 13:51:41 +01:00
Dimitri Savineau	d42d584085	doc: update containerized deployment This adds more documentation to the configuration and usage of containerized deployment. Closes: #6198 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-10 13:50:53 +01:00
Guillaume Abrioux	7e5071856c	doc: update the documentation - mention `stable-6.0` requirements. - update some patterns. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-10 13:50:10 +01:00
Dimitri Savineau	48a456dc8c	rolling_update: enforce ceph-container-engine When running the rolling_update.yml playbook and adding the dashboard component at the same time, the requirements (like container packages) aren't installed. This could lead to a failure when using authentication on the container registry because the playbook will try to log in to the registry but podman/docker aren't yet installed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1903504 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-10 08:17:11 +01:00
Dimitri Savineau	e4dd0067c6	ceph-common: enable rhcs tools repo for monitoring The monitoring node running grafana needs the rhcs tools repository enabled in non containerized deployment to be able to install the ceph-grafana-dashboards rpm package. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-10 08:17:11 +01:00
Guillaume Abrioux	2f1d287b1c	tests: pin ansible-lint version This commit pins the ansible-lint version to 4.3.7 as ceph-ansible isn't compatible with recent changes in 5.0.0 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-10 07:48:24 +01:00
Guillaume Abrioux	54bae480d2	tests: set `mon_max_pg_per_osd` in rgw_multisite Otherwise, the job fails when it tries to create a bucket with `s3cmd mb` command because we have too many PGs per OSD. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-10 07:01:21 +01:00
Guillaume Abrioux	931b87e830	rgw: fix a typo in multisite if `rgw_zonegroupmaster` is not defined at the rgw instance level in `rgw_instances` it will fallback to a wrong variable (`rgw_zonemaster`). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925247 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-10 07:01:21 +01:00
Dimitri Savineau	94af3c87d1	rolling_update: exclude clients from node-exporter Since `b105549` we don't install node-exporter on client nodes so we should also exclude the client node from the node-exporter upgrade. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-09 14:41:13 +01:00
Dimitri Savineau	58b101d9ff	docs: nautilus uses ansible 2.9 This updates the ansible release required to deploy nautilus with the stable-4.0 branch. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-09 12:46:38 +01:00
Dimitri Savineau	e7cdcfa342	dashboard: update with the new monitoring group Since `eefe11d` the grafana-server group has been renamed to monitoring but the dashboard playbook wasn't updated. This was still working due to the backward compatibility added in the ceph-facts role. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-09 12:45:02 +01:00
Dimitri Savineau	ed094ea07a	vagrant: remove centos/8 workaround The CentOS 8 vagrant box has finally been updated [1] with a recent version (the latest one 2011 which means CentOS 8.3). We don't need to download the vagrant libvirt box with a direct url anymore from the CentOS infrastructure. [1] https://app.vagrantup.com/centos/boxes/8 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-09 12:41:59 +01:00
Guillaume Abrioux	b9cdee40a2	update: update ceph release pattern in complete upgrade play Since master is now deploying quincy, we must update this. Otherwise, it will fail as follows: ``` Error EPERM: require_osd_release cannot be lowered once it has been set ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Guillaume Abrioux	39649f0ce8	common: ensure shaman returns right repo Due to recent changes in shaman, there's a chance it returns the wrong repository from architecture point of view. We can query shaman and ask for the correct architecture to get around this. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Guillaume Abrioux	44fbadb50c	rolling_update: pg check refactor There's no need to achieve this in two tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Guillaume Abrioux	c1f627c465	validate: fix a typo fixes a typo Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Guillaume Abrioux	8eda590130	tests: remove legacy remove a legacy in tox environment definition Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Guillaume Abrioux	c3eadbc31a	tests: follow up on `7c9063b` `7c9063b1d2` broke some scenarios. This commit fixes them. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Dimitri Savineau	8939dddff4	library: fix idempotency in ceph_mgr_module The ceph mgr command output is printed on stderr instead of stdout, which prevents setting the changed flag to false if the module is already enabled. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-03 08:30:44 +01:00
Dimitri Savineau	76a663245d	cephadm-adopt: use ceph_osd_flag module There's no reason to not use the ceph_osd_flag module to set/unset osd flags. Also if there's no OSD nodes in the inventory then we don't need to execute the set/unset play. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-03 08:29:31 +01:00
Dimitri Savineau	36fc04eaab	purge-cluster: use parted ansible module Instead of doing some scripting via the shell module, we can use the parted ansible module to check the boot flag on partitions. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-03 08:28:22 +01:00
Dimitri Savineau	bc6948037f	library/cephadm_bootstrap: add registry support This adds the custom registry auth support when using a registry with authentication. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-03 08:27:28 +01:00
Dimitri Savineau	b1f37c4b3d	ceph-defaults: use https for download.ceph.com There's no reason to still use http on download.ceph.com instead of https. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-03 07:14:37 +01:00
Guillaume Abrioux	7c9063b1d2	tests: use lvm batch on osd2 (all_daemons) in order to test lvm batch in purge scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-02 17:24:17 +01:00
Guillaume Abrioux	984191ac7f	purge: zap and destroy db and wal devices for lvm batch Those devices (db/wal) are never zapped in lvm batch deployment. Iterating over `dedicated_devices` and `bluestore_wal_devices` fixes this issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1922926 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-01 13:01:58 -05:00
Dimitri Savineau	7208a39e57	ceph-facts: set rgw_instances_all fact once There's no need to set the rgw_instances_all fact for each node. We can rely on run_once for that one. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-01 13:49:12 +01:00
Dimitri Savineau	195159ecef	library: retrieve realm id for zone/zonegroup When the zonegroup or the zone doesn't have a realm associated then it's not possible to modify that resource. This patch retrieves the current realm id and compares it to the realm id from the realm in parameter. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 21:07:39 +01:00
Dimitri Savineau	2734a12d44	cephadm-adopt: use radosgw modules for idempotency When rerunning the cephadm-adopt.yml playbook the radosgw realm, zonegroup and zone tasks will fail because the task isn't idempotent. Using the radosgw ansible modules solves that problem. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 21:07:39 +01:00
Dimitri Savineau	523966d45f	tox: test cephadm-adopt.yml playbook idempotency Rerun the cephadm-adopt.yml playbook a second time for idempotency purpose. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 21:07:39 +01:00
Dimitri Savineau	ff9d314305	library: make cephadm_adopt module idempotent Rerunning the cephadm_adopt module on an already adopted daemon will fail because the cephadm adopt command isn't idempotent. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918424 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 21:07:39 +01:00
Dimitri Savineau	6886700a00	cephadm-adopt: make the playbook idempotent If the cephadm-adopt.yml fails during the first execution and some daemons have already been adopted by cephadm then we can't rerun the playbook because the old container won't exist anymore. Error: no container with name or ID ceph-mon-xxx found: no such container If the daemons are adopted then the old systemd unit doesn't exist anymore so any call to that unit with systemd will fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918424 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 21:07:39 +01:00
Dimitri Savineau	3749d297c7	ceph-mon: add ExecStartPre docker stop to systemd We already do that in the other systemd templates (mgr, mds, etc.), and it would prevent having to add workarounds in other orchestration tools. This change is for containerized deployment only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1882724 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 09:03:34 +01:00
Guillaume Abrioux	8617081664	rgw: avoid useless call to ceph-rgw since `ceph-rgw` may be called from `ceph-handler` in some contexts we should avoid rerunning it unnecessarily. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-28 14:37:14 -05:00
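
The `num_osds` problem described in commit `d65c7b4035` can be illustrated with a small Python sketch. It is not ceph-ansible code: `count_osds` is a hypothetical helper that mirrors the Jinja expression quoted in the commit message (length of the parsed `lvm list` JSON plus the previous value of `num_osds`), the JSON payload is made up, and resetting the value to 0 before each computation is only an assumption about how a reset could look.

```python
import json


def count_osds(lvm_list_stdout, previous=0):
    """Hypothetical helper: parsed `lvm list` length plus the previous num_osds."""
    return len(json.loads(lvm_list_stdout or "{}")) + int(previous)


# Made-up `lvm list` JSON output describing two OSDs on the host.
lvm_list_stdout = json.dumps({"0": [], "1": []})

# Without a reset, every run of the role adds to the value left by the last run.
num_osds = 0
for _ in range(3):  # e.g. the role being pulled in three times on a collocated host
    num_osds = count_osds(lvm_list_stdout, previous=num_osds)
accumulated = num_osds

# With a reset before each computation, the count matches the real OSD number.
for _ in range(3):
    num_osds = 0  # assumed form of the reset
    num_osds = count_osds(lvm_list_stdout, previous=num_osds)
with_reset = num_osds

print(accumulated, with_reset)
```

With two OSDs and three calls, the unreset value grows to three times the real count, while the reset variant stays at two.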

5620 Commits (d65c7b40354480ca3e25fd13c8f06dcd4779e002)