ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	358ea3853a	tests: fix `test_nfs_is_up` test The data structure seems to have been modified in ceph@master (quincy). This commit updates the test accordingly. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7e1db0b599`)	2021-03-25 14:11:11 +01:00
Guillaume Abrioux	b46d2bf0a6	ceph_volume: fix bug in `is_lv()` This function makes the `ceph_volume` module non-idempotent in a containerized context because it tries to run a container and bind-mount directories that no longer exist. In that case, the `lvs` command being executed returns something other than `0`, so we can't call `json.loads(out)['report'][0]['lv']` since it might throw a Python error. The idea is to return `True` only if `rc` is equal to `0` and `len(result)` is greater than `0`, which means the command matched an LV. Fixes: #6284 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ed79bc7a4e`)	2021-03-25 14:11:11 +01:00
Guillaume Abrioux	2cd8c3637c	fix 'command -v' tasks `command -v` is a bash script which needs a shell to run. Fixes: #6325 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `14c472707c`)	2021-03-22 13:53:11 +01:00
Guillaume Abrioux	bbf8b2fdf6	facts: fix nfs/external cluster scenario These tasks shouldn't be run when at least 1 monitor isn't present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1937997 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccd1cbb732`)	2021-03-18 06:41:00 +01:00
Guillaume Abrioux	dc2a11ce3f	config: reset num_osds When collocating OSDs with other daemon, `num_osds` is incorrectly calculated because `ceph-config` is called multiple times. Indeed, the following code: ``` num_osds: "{{ lvm_list.stdout \| default('{}') \| from_json \| length \| int + num_osds \| default(0) \| int }}" ``` makes `num_osds` be incremented each time `ceph-config` is called. We have to reset it in order to get the correct number of expected OSDs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `31a0f2653d`)	2021-03-17 17:35:52 +01:00
Guillaume Abrioux	8b86b2ede3	tests: increase nb of rerun in pytest In order to avoid false positive in the CI that I've been unable to reproduce. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f7fd1c2298`)	2021-03-12 17:52:00 +01:00
Matthew Vernon	ce25fc74eb	Docs: fix some typos While working on the previous PR, I found a couple of typos in the docs. This fixes those. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `8b1474ab75`)	2021-03-12 09:36:11 +01:00
Dimitri Savineau	6921aafb2b	debian/uca: remove the handler notification The "update apt cache" in the ceph-handler role was never called and the handler trigger after adding the uca repository doesn't exist at all. Instead of using a handler for that we can just set the update_cache parameter to true like the other apt_repository tasks. Resolve merge conflict from cherry-picking this commit. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `09d6706697`)	2021-03-11 22:06:11 +01:00
Guillaume Abrioux	e6447bdc2b	library: do not always add --yes in batch mode When asking `ceph-volume` to report only in `lvm batch` context, there's a bug described in bz1896803 [1] when `--yes` is passed (which by the way isn't necessary with `--report`). This commit ensures `--yes` isn't passed to `ceph-volume` when `--report` is used. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1896803 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896803 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fe6d6ba622`)	2021-03-11 13:53:06 +01:00
Guillaume Abrioux	0d0723298f	purge: rm service-cid files This commit makes sure the purge playbooks remove those files if they have been left behind for any reason. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1920900 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b9dd253a4f`)	2021-03-11 13:52:48 +01:00
Guillaume Abrioux	932abbc8cf	switch2container: do not serialize the ceph-crash migration There's no need to slow down the playbook execution time by migrating all the `ceph-crash` instances in a serial way. Let's remove the `serial: 1` so the migration is achieved in a parallel way. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `980a5a7df4`)	2021-03-11 13:52:39 +01:00
Dimitri Savineau	8f26ffdbac	rolling_update: enforce ceph-container-engine When running the rolling_update.yml playbook and adding the dashboard component at the same time, the requirements (like container packages) aren't installed. This could lead to a failure when using authentication on the container registry because the playbook will try to log in to the registry but podman/docker aren't yet installed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1903504 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `48a456dc8c`)	2021-03-11 13:52:21 +01:00
Dimitri Savineau	735965ef9c	ceph-common: enable rhcs tools repo for monitoring The monitoring node running grafana needs the rhcs tools repository enabled in non-containerized deployments to be able to install the ceph-grafana-dashboards rpm package. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e4dd0067c6`)	2021-03-11 13:52:21 +01:00
Dimitri Savineau	3ba27c9387	rolling_update: exclude clients from node-exporter Since `b105549` we don't install node-exporter on client nodes so we should also exclude the client node from the node-exporter upgrade. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `94af3c87d1`)	2021-03-11 13:52:02 +01:00
Guillaume Abrioux	1b424ad5e9	purge: zap and destroy db and wal devices for lvm batch Those devices (db/wal) are never zapped in lvm batch deployment. Iterating over `dedicated_devices` and `bluestore_wal_devices` fixes this issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1922926 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `984191ac7f`)	2021-03-11 13:51:38 +01:00
Tyler Bishop	ba76102952	facts: support device aliases for (dedicated\|bluestore_wal)_devices Just like `devices`, this commit adds support for Linux device aliases for `dedicated_devices` and `bluestore_wal_devices`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1919084 Signed-off-by: Tyler Bishop <tbishop@liquidweb.com> (cherry picked from commit `ee4b8804ae`)	2021-03-11 13:51:19 +01:00
Guillaume Abrioux	e3165f9a07	mon: fix cephx disabled deployment Due to missing condition on `cephx` variable, cephx disabled deployments are broken. This commit fixes this. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1910151 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4af0845702`)	2021-03-11 13:51:04 +01:00
Guillaume Abrioux	bb1f66cb51	switch2container: fix mon quorum check The current check makes no sense because it checks any of other monitor than the one being played (either a previous one already converted or a next that isn't yet converted) is present on the quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1909011 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `175ffa1b88`)	2021-03-11 13:50:27 +01:00
Guillaume Abrioux	241418409d	common: ensure shaman returns right repo Due to recent changes in shaman, there's a chance it returns the wrong repository from architecture point of view. We can query shaman and ask for the correct architecture to get around this. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `39649f0ce8`)	2021-03-10 16:43:04 +01:00
Matthew Vernon	fdf437743c	Fix typo and broken link for documenting RGW frontends http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace it with a working "latest" docs link, and correct the spelling of "additional" while I'm at it. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `847611048e`)	2021-03-03 14:20:26 +01:00
Florian Haas	6fe14c6d01	requirements.txt: Move the six dependency into the general requirements config_template.py depends on six, which isn't listed in the default requirements.txt. This previously frequently wasn't a problem, because six used to be a standard package being installed into a venv, and lots of other projects depended on it. It also does get installed for unit and integration tests via tests/requirements.txt, so any broken dependency on six wouldn't be detected by tox runs. However, as other projects and distributions have phased out Python 2.7 support the dependency on six becomes less common. Thus, as long as ceph-ansible does require it for config_template.py, add it to the base requirements. Signed-off-by: Florian Haas <florian@citynetwork.eu> (cherry picked from commit `d49ea9818b`)	2021-03-03 13:22:29 +01:00
Guillaume Abrioux	c3304c213b	dashboard: add missing parameter in `ceph_cmd` the `ceph_cmd` fact is missing the `--net=host` parameter. Some tasks consuming this fact can fail like following: ``` Error: error configuring network namespace for container b8ec913db1fb694ae683faf202680de7a59c714a004e533aba87e8503d29261f: Missing CNI default network ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1931365 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f143b1a647`)	2021-03-03 12:57:08 +01:00
Guillaume Abrioux	858048560e	update: fix require-osd-release task This commit fixes two issues in rolling_update.yml: - `container_exec_cmd_update_osd` is unset in the `complete osd upgrade` play so it never runs the command in a container. - the 'require-osd-release' task is never applied because the condition looks for luminous release. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1930164 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-18 22:22:06 +01:00
Guillaume Abrioux	158224503d	defaults: update rhcs dashboard images versions The current dashboard images deployed have a bad health index. Updating to a newer version fixes this issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925350 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a16ae693d8`)	2021-02-18 18:22:28 +01:00
Guillaume Abrioux	3c0a5a0b61	doc: add a note about "latest" tags See the change for details. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4e95180c80`)	2021-02-11 16:50:43 +01:00
Dimitri Savineau	a43960790f	doc: update containerized deployment This adds more documentation on the configuration and usage of containerized deployments. Closes: #6198 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d42d584085`)	2021-02-11 16:50:43 +01:00
Guillaume Abrioux	55d0c79046	tests: install correct ansible-lint version We need to pin the ansible-lint version depending on python version being used. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-10 08:32:24 +01:00
Guillaume Abrioux	8fada83589	tests: set `mon_max_pg_per_osd` in rgw_multisite Otherwise, the job fails when it tries to create a bucket with `s3cmd mb` command because we have too many PGs per OSD. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `54bae480d2`)	2021-02-10 08:32:24 +01:00
Guillaume Abrioux	b5d082c4bc	rgw: fix a typo in multisite if `rgw_zonegroupmaster` is not defined at the rgw instance level in `rgw_instances` it will fallback to a wrong variable (`rgw_zonemaster`). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925247 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `931b87e830`)	2021-02-10 08:32:24 +01:00
Guillaume Abrioux	920f07514a	rgw: quick fix in create_zone_user.yml typical error: ``` 2021-02-01 03:11:09,809 p=93834 u=cephuser n=ansible \| TASK [ceph-rgw : check if the realm system user already exists] ************************************************************************************************************************************************* 2021-02-01 03:11:09,809 p=93834 u=cephuser n=ansible \| Monday 01 February 2021 03:11:09 -0500 (0:00:00.084) 0:14:38.607 ***** 2021-02-01 03:11:09,836 p=93834 u=cephuser n=ansible \| fatal: [ceph-kvm-ms2-1611241931591-node7-rgw]: FAILED! => msg: \|- The task includes an option with an undefined variable. The error was: 'None' has no attribute 'realm' ``` This task should be skipped when `zone_users` is undefined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1922998 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-01 11:28:57 -05:00
Dimitri Savineau	6278c5a4e3	ceph-mon: add ExecStartPre docker stop to systemd We already do that in the other systemd templates (mgr, mds, etc.) and it would prevent having to add workarounds in other orchestration tools. This change is for containerized deployment only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1882724 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3749d297c7`)	2021-01-29 12:00:14 -05:00
Guillaume Abrioux	aeee3471e3	rgw: avoid useless call to ceph-rgw since `ceph-rgw` may be called from `ceph-handler` in some contexts we should avoid rerunning it unnecessarily. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8617081664`)	2021-01-28 16:37:50 -05:00
Guillaume Abrioux	b903446fa4	containers: use --cpus instead of --cpu-quota When using docker 1.13.1, the current condition: ``` {% if (container_binary == 'docker' and ceph_docker_version.split('.')[0] is version_compare('13', '>=')) or container_binary == 'podman' -%} ``` is wrong because it compares the first digit (1) whereas it should compare the second one. It means we always use `--cpu-quota` although the documentation recommends using `--cpus` when the docker version is 1.13.1 or higher. From the doc: > --cpu-quota=<value> Impose a CPU CFS quota on the container. The number of > microseconds per --cpu-period that the container is limited to before > throttled. As such acting as the effective ceiling. > If you use Docker 1.13 or higher, use --cpus instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3e262e072b`)	2021-01-28 16:37:50 -05:00
Guillaume Abrioux	14267fe0c4	rgw: multisite refact Add the possibility to deploy rgw multisite configuration with a mix of secondary and primary zones on the same rgw node. Before that, on a given node, all instances were either primary zones OR secondary. Now you can define a rgw instance as follows: ``` rgw_instances: - instance_name: 'rgw0' rgw_zonemaster: false rgw_zonesecondary: true rgw_zonegroupmaster: false rgw_realm: 'france' rgw_zonegroup: 'zonegroup-france' rgw_zone: paris-00 radosgw_address: "{{ _radosgw_address }}" radosgw_frontend_port: 8080 rgw_zone_user: jacques.chirac rgw_zone_user_display_name: "Jacques Chirac" system_access_key: P9Eb6S8XNyo4dtZZUUMy system_secret_key: qqHCUtfdNnpHq3PZRHW5un9l0bEBM812Uhow0XfB endpoint: http://192.168.101.12:8080 ``` Basically it's now possible to define `rgw_zonemaster`, `rgw_zonesecondary` and `rgw_zonegroupmaster` at the instance level instead of the whole node level. Also, this commit adds an option `deploy_secondary_zones` (default True) which can be set to `False` in order to explicitly ask the playbook not to deploy secondary zones in case the corresponding endpoints are not deployed yet. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915478 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `71a5e666e3`)	2021-01-28 16:37:50 -05:00
Guillaume Abrioux	a36eee1852	fs2bs: skip migration when a mix of fs and bs is detected Since the default of `osd_objectstore` has changed as of 3.2, some deployments might have a mix of filestore and bluestore OSDs on a same node. In some specific cases, there's a possibility that a filestore OSD shares a journal/db device with a bluestore OSD. We shouldn't try to redeploy in this context because ceph-volume will complain. (either because in lvm batch you can't pass partition or about gpt header). The safest option is to skip the migration on the node when such a mix is detected or force all osds including those already using bluestore (option `force_filestore_to_bluestore=True` has to be passed as an extra var). If all OSDs are using filestore, then they will be migrated to bluestore. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1875777 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e66f12d138`)	2021-01-22 11:37:40 -05:00
Dimitri Savineau	07d2160421	dashboard: manage password backward compatibility The ceph dashboard changed the way passwords are provided via the CLI. This breaks backward compatibility when using a recent ceph-ansible version with a ceph release without that feature. This patch adds tasks for the legacy workflow (ceph release without that feature) in the ceph-dashboard role. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915506 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-19 18:05:02 +01:00
Guillaume Abrioux	623ca14682	dashboard: configure passwords via stdin Due to recent changes in ceph, the few dashboard passwords must be passed via `-i` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ef975ef5ea`)	2021-01-19 18:05:02 +01:00
Dimitri Savineau	4335fed787	library: remove containerized parameter from cv The ceph-volume module relies on environment variables to determine if the command should be executed within a container or not. The containerized parameter isn't used anymore and we can remove it. Fixes: #6153 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `613ab11b9b`)	2021-01-06 16:56:20 +01:00
Mike Currin	360a2d2b30	Path for ceph config missing in crash template The path where ceph.conf is located (/etc/ceph) is missing from the Docker container bind mounts, which throws errors. Signed-off-by: Mike Currin <currin@gmail.com> (cherry picked from commit `4cbc9a48c9`)	2021-01-06 16:55:39 +01:00
Guillaume Abrioux	290d3ef369	rgw: support switching from single-site to multisite When collocating rgw with either a mon, mgr or osd, switching from single site to a multisite rgw setup failed because of the handlers triggered between the ansible play of the collocated daemon and the play of the rgw. Since the multisite changes are not yet applied the handlers fail. The idea here is to ensure we run the multisite configuration from the ceph-handler role before the restart happens, this way it won't complain because of non existing multisite configuration. (Note: this is also valid when simply changing a multisite configuration) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1888630 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `513c8cfe55`)	2021-01-06 10:38:50 -05:00
Guillaume Abrioux	607ef5a7d2	common: do not use pipefail when not needed Let's discard the ansible lint error 306 and add a "# noqa 306" on tasks where we don't need `set -o pipefail` Fixes: #6090 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `86a8889ee3`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	6855feb604	ceph-osd: refact `docker_exec_start_osd` This commit drops the nested jinja construction in this set_fact task. It also renames it to `container_exec_start_osd` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ff95fa9c32`)	2020-12-16 14:05:45 +01:00
Dimitri Savineau	49522f46b1	workflow/pytest: update python matrix version On this branch we should test pytest against python 2.7 and 3.6. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	dc4523a0c1	tests: use github workflow for nbsp char check Let's use a github workflow instead of travis for this. With this commit we can get rid of Travis. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `94c37b9de8`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	ba312a5b5d	lint: ignore 302,303,505 errors ignore 302,303 and 505 errors [302] Using command rather than an argument to e.g. file [303] Using command rather than module [505] referenced files must exist they aren't relevant on these tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `195d88fcda`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	8a8a082693	lint: do not use 'local_action' Fix ansible-lint 504 error: [504] Do not use 'local_action', use 'delegate_to: localhost' Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c948b668eb`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	ace031e86e	lint: trailing whitespace Fix ansible-lint 201 error: [201] Trailing whitespace Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `dfc7e6e4bd`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	72fc8877cb	lint: all tasks should be named Fix ansible-lint 502 error: [502] All tasks should be named Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `97dd9218dd`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	ab62d27c44	lint: use shell only when shell functionality is required Fix ansible-lint 305 error: [305] Use shell only when shell functionality is required Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `11b4bf5083`)	2020-12-16 14:05:45 +01:00
Guillaume Abrioux	2a0e07cfd7	lint: don't compare to literal true/false Fix ansible lint 601 error: [601] Don't compare to literal True/False Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2011e4dbc8`)	2020-12-16 14:05:45 +01:00
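
The `is_lv()` fix in commit b46d2bf0a6 above describes a guard: only parse the `lvs` JSON report when the command exited with `0`, and report a match only when the report lists at least one LV. A minimal Python sketch of that logic follows. The function name `lv_matched` and its `(rc, out)` signature are hypothetical simplifications for illustration, not the module's actual code, which runs the command itself.

```python
import json


def lv_matched(rc, out):
    """Return True only if `lvs` succeeded and its report lists an LV.

    `rc` and `out` stand for the exit code and stdout of an `lvs`
    invocation producing a JSON report (hypothetical inputs here).
    """
    # A non-zero exit code means stdout may not be valid JSON,
    # so bail out before attempting to parse it.
    if rc != 0:
        return False
    result = json.loads(out)['report'][0]['lv']
    # An empty list means the command ran but matched no LV.
    return len(result) > 0
```

Checking `rc` first is what keeps the parse from raising when the command fails, which is the failure mode the commit message describes.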

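The version check discussed in commit b903446fa4 above can be illustrated in Python. The sketch below first reproduces the flawed comparison (first version component against 13), then compares the `(major, minor)` pair against `(1, 13)`. Both helper names are hypothetical, and the second function is only one possible way to express the intended check, not necessarily how the commit implements it.

```python
def buggy_supports_cpus(container_binary, docker_version):
    """Mirror of the flawed check: compares only the first component to 13."""
    if container_binary == 'podman':
        return True
    return int(docker_version.split('.')[0]) >= 13


def supports_cpus(container_binary, docker_version):
    """Compare (major, minor) against (1, 13), as the doc quote implies."""
    if container_binary == 'podman':
        return True
    parts = docker_version.split('.')
    # Tuple comparison orders by major first, then minor.
    return (int(parts[0]), int(parts[1])) >= (1, 13)


# With docker 1.13.1 the flawed check selects --cpu-quota (False),
# while the tuple comparison selects --cpus (True).
print(buggy_supports_cpus('docker', '1.13.1'), supports_cpus('docker', '1.13.1'))
```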

5514 Commits (2c9fc7f5172d81d4de60fd41351892b767aaf27a)
