ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Francesco Pantano	e65f9a5c72	Fix hosts field in rolling_update playbook when mds are processed In the OSP context, during the rolling update the playbook fails with the following error: ''' ERROR! The field 'hosts' has an invalid value, which includes an undefined variable. The error was: list object has no element 0 ''' This PR just changes the hosts field, providing a valid mons group value. Closes: https://bugzilla.redhat.com/1876803 Signed-off-by: Francesco Pantano <fpantano@redhat.com>	2020-09-08 11:52:08 -04:00
Francesco Pantano	cb64df30b6	Add --cluster option on ceph require-osd-release command On DCN environments, or when multiple ceph clusters are configured, we need to specify the cluster name before running the command or the rolling_update playbook will fail during minor updates. Closes: https://bugzilla.redhat.com/1876447 Signed-off-by: Francesco Pantano <fpantano@redhat.com>	2020-09-07 16:31:14 +02:00
Guillaume Abrioux	7348e9a253	tests: disable nfs-ganesha testing This commit disables nfs-ganesha testing on master for non-containerized deployment because the dev repos are broken at the moment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-07 12:54:29 +02:00
Guillaume Abrioux	2cbb7de3b2	tests: migrate to quay.ceph.io registry in order to avoid docker.io rate limiting Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-07 12:54:29 +02:00
Dai Dang Van	ae38b01d08	Fix typo shrink osd file name in day-2 docs Signed-off-by: Dai Dang Van <daikk115@gmail.com>	2020-09-03 09:20:47 -04:00
Dimitri Savineau	4f308dcf4a	tests: reenable ceph-iscsi testing This re-adds the ceph-iscsi testing for both non containerized and containerized deployment since the rados connection error on ceph dev has been fixed [1]. [1] https://tracker.ceph.com/issues/47002 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-27 11:13:36 -04:00
Niko Smeds	a951c1a3f0	Enable HAProxy backend checks for Ceph RGW Add the `check` option to server definitions to enable basic HAProxy health checks for Ceph RADOS gateway backends. Currently traffic will be forwarded to unhealthy `radosgw.service` servers. These changes resolve the issue. Signed-off-by: Niko Smeds nikosmeds@gmail.com	2020-08-27 10:57:46 -04:00
Guillaume Abrioux	cec994b973	rolling_update: remove 'ignore_errors' There's no need to use `ignore_errors: true` on these tasks. Using a loop on the task stopping mon daemons allows us to avoid duplicating this task; the `ignore_errors` isn't needed here because it won't fail the playbook if one of the IDs doesn't exist (shortname vs. fqdn). Using the right condition on the task starting the mgr daemon allows us to avoid using an `ignore_errors: true` as well. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-21 09:22:36 -04:00
Guillaume Abrioux	13e2311cbe	ceph_key: refact the code and minor fixes This commit refactors the code to remove a duplicate condition and it makes the `state: absent` code idempotent Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-21 09:44:47 +02:00
Guillaume Abrioux	27ca884d99	tests: add more coverage for test_ceph_key This commit adds more coverage regarding the testing of ceph_key module Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-21 09:44:47 +02:00
Guillaume Abrioux	54d3e9650f	dashboard: refact admin user creation task this commit splits this task in order to avoid using a `shell` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-21 09:22:11 +02:00
Guillaume Abrioux	f0fe193d8e	facts: refact and optimize memory consumption there's no need to run this task on all nodes. This uses too much memory for nothing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 11:16:26 -04:00
Dimitri Savineau	6c11695fbe	tests: reenable nfs-ganesha testing This re-adds the nfs-ganesha testing in non containerized deployment. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-20 16:58:54 +02:00
George Shuklin	73d4bb6bd6	Make 'disable ssl for dashboard task' idempotent. This should reduce number of 'changed' tasks during convergence test. Signed-off-by: George Shuklin <george.shuklin@gmail.com>	2020-08-20 16:48:32 +02:00
Rafał Wądołowski	55cd6e83e4	Comment out ceph_custom_key Since there is a check if ceph_custom_key is defined, there is no reason to define it by default. Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>	2020-08-20 13:36:24 +02:00
Guillaume Abrioux	899d317196	iscsigw: add retry/until In order to avoid failures that could be fixed by simply retrying. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 13:25:05 +02:00
Guillaume Abrioux	8476beb5b1	tests: move erasure pool testing in lvm_osds This commit moves the erasure pool creation testing from `all_daemons` to `lvm_osds` so we can decrease the number of osd nodes we spawn, so the OVH Jenkins slaves are less overwhelmed when an `all_daemons` based scenario is being tested. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 11:50:28 +02:00
John Fulton	95dee6f1ca	Set default permission for prometheus config files Regardless of the outcome of Ansible 2.9.12 issue 71200 we can set a default permission for these files. Closes: https://github.com/ceph/ceph-ansible/issues/5677 Signed-off-by: John Fulton <fulton@redhat.com>	2020-08-18 15:49:31 -04:00
Guillaume Abrioux	51c382677d	shrink-mds: use mds_to_kill_hostname instead When using fqdn in inventory host file, this task will fail because the mds is registered with its shortname. It means we must use `mds_to_kill_hostname` in this task. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1869837 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 14:56:57 -04:00
Guillaume Abrioux	8ed11ea3ee	infra: only install logrotate on right nodes For instance, there is no need to install logrotate on client nodes. This also ensures logrotate is installed only for containerized deployments since the packaging has an explicit dependency on logrotate Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 10:56:09 -04:00
Guillaume Abrioux	04d77dcaeb	travis: enforce ansible-lint 4.2.0 Let's pin to 4.2.0 (because of ansible/ansible-lint/issues/966) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 10:29:19 -04:00
Guillaume Abrioux	093e1dcb21	tests: remove hosts-ubuntu inventories Since we've dropped ubuntu testing, we don't need these inventories anymore. Let's remove this leftover. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 11:20:48 +02:00
Guillaume Abrioux	bd9e126357	tests: disable iscsigw testing (container) Temporarily disable iscsigw testing for containerized deployments because it's broken upstream on ceph@master. non-containerized deployments use stable build for iscsigw to get around this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 11:20:48 +02:00
Dimitri Savineau	cb8f0237e1	ceph-rgw: allow specifying crush rule on pool We already support specifying a custom crush rule during pool creation in ceph-osd role but not in ceph-rgw role. This patch adds the missing code to implement this feature. Note this is only available for replicated pools, not erasure. The rule must also exist prior to the pool creation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-17 22:59:06 +02:00
Dimitri Savineau	9805589ef9	container: don't install the engine on all clients We only need the container engine to be installed on the first client node in order to execute the pools/keys operation. We already use the same workflow with the ceph-container-common role, which pulls the ceph container image. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-17 22:57:28 +02:00
Ali Maredia	5c1f4b1a1e	rgw: allow rgws to be concurrently with or without multisite Allows rgws in a ceph cluster to be run with multisite and without multisite at the same time. Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-08-17 11:11:11 +02:00
Guillaume Abrioux	f77fa6e2a4	purge-cluster: use sysfs method for unmapping rbd devices This way we keep consistency with purge-container-cluster.yml playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-17 09:28:12 +02:00
Guillaume Abrioux	e1cb385740	infra: add missing tag This commit adds the missing `with_pkg` tag on the logrotate installation task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-13 10:08:18 -04:00
Guillaume Abrioux	e256d8e948	tests: test iscsigw against stable Since it is broken at the moment with dev repos, let's test against stable builds so the CI is unlocked. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-13 09:49:00 +02:00
Guillaume Abrioux	33a544644a	purge: import ceph-defaults in purge osd play Otherwise, `ceph_volume_debug` variable is undefined Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
Guillaume Abrioux	f1aa6cea21	infra: add log rotation support (containers) This commit adds the log rotation support via logrotate in containerized deployments. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
Guillaume Abrioux	448cc280b7	common: don't enable debug log on ceph-volume calls by default ceph-volume can generate large logs at some point. debug logs by definition should be enabled only when debugging. Let's make it customizable with a variable which is set to `False` by default. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
raul	110eaf5f9f	rgw: support 1+ rgw instance in `radosgw_frontend_port` Change the radosgw_frontend_port to take into account more than 1 RGW instance. In its original form, `radosgw_frontend_port: radosgw_frontend_port | int`, it configured the 8080 port for all instances; with the following modification, `radosgw_frontend_port: radosgw_frontend_port | int + item|int`, we increase the port by 1 for each instance. Co-authored-by: Daniel Parkes <dparkes@redhat.com> Signed-off-by: raul <rmahique@redhat.com>	2020-08-11 14:05:43 +02:00
Guillaume Abrioux	dd4b5b0328	nfs: do not copy rgw keyring when `nfs_obj_gw` is true This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the cluster doesn't contain a rgw node, which can be the case given we are using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the deployment will fail trying to copy a key that doesn't exist. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-07 13:21:17 +02:00
Guillaume Abrioux	83d1b33a9b	tox: only wait 30sec for right jobs There's no need to call `sleep 30` for jobs other than `all_daemons` and `all_in_one`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-06 17:22:52 +02:00
Benoît Knecht	a57fd7a090	purge-cluster: check if rbdmap exists When running `infrastructure-playbooks/purge-cluster.yml` twice, it fails the second time on the `ensure rbd devices are unmapped` task, because `rbdmap` isn't installed anymore at that point. This commit adds a check that ensures `rbdmap` is available, and skips the `ensure rbd devices are unmapped` task if it isn't. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-08-06 09:35:03 +02:00
Dimitri Savineau	03d4620269	pytest: register ceph_crash mark Otherwise we see some pytest warning. PytestUnknownMarkWarning: Unknown pytest.mark.ceph_crash - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-06 09:34:34 +02:00
Guillaume Abrioux	c2e507b42d	purge-cluster: replace shell by command in a task There is no need to use `shell` here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-05 09:37:41 +02:00
Benoît Knecht	fe8fbd3ee2	shrink-osd: various fixes This handles missing /etc/ceph/osd, by ensuring we actually found files in `/etc/ceph/osd` before trying to slurp their content. This also adds a missing `| default(False)` to avoid the following error: ``` fatal: [ceph01]: FAILED! => msg: |- The conditional check 'ceph_osd_data_json[item.2]['encrypted'] | bool' failed. The error was: error while evaluating conditional (ceph_osd_data_json[item.2]['encrypted'] | bool): 'dict object' has no attribute 'encrypted' ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1862416 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-08-05 01:30:57 +02:00
Kevin Coakley	d19e6033b2	Remove ceph-radosgw.target when switching to containerize daemons The task "remove old systemd unit file" under "switching from non-containerized to containerized ceph rgw" only removes the ceph-radosgw@.service file. The task should also remove the ceph-radosgw.target file, like the "remove old systemd unit files" tasks for the mons, mgrs, osds, etc, in order to clean up all of the unused systemd unit files. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>	2020-08-04 11:08:12 -04:00
Guillaume Abrioux	5df6225ede	tests: change subnet in lvm_osds container scenario This commit changes the subnets in container-lvm_osds scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-04 14:00:05 +02:00
Guillaume Abrioux	0a38d91b5b	Revert "tests: add more coverage for test_ceph_key" This reverts commit `1e46264bc1`.	2020-08-04 11:28:42 +02:00
Guillaume Abrioux	b15063b20e	Revert "ceph_key: refact the code and minor fixes" This reverts commit `9a950b8f0f`.	2020-08-04 11:28:42 +02:00
Guillaume Abrioux	9a950b8f0f	ceph_key: refact the code and minor fixes wip Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 18:12:45 +02:00
Guillaume Abrioux	1e46264bc1	tests: add more coverage for test_ceph_key This commit adds more coverage regarding the testing of ceph_key module Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 18:12:45 +02:00
Guillaume Abrioux	0a581a6e60	config: only add related rgw section there's no need to add each rgw section on all rgw nodes. With this commit, only related rgw section are rendered. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 14:47:27 +02:00
Guillaume Abrioux	8933bfde33	shrink_osd: remove osd data directory Otherwise it leaves an empty directory. When shrinking and redeploying multiple OSDs you have no guarantee it will reuse the same osd id. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 14:46:56 +02:00
Guillaume Abrioux	78e4faf077	tox: split shrink_osd scenario Let's split this scenario with a dedicated tox ini file. This is for testing in two ways: 1/ shrinking OSDs one by one 2/ shrinking multiple OSDs with a single call of the playbook ceph-build related PR: ceph/ceph-build#1629 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 14:46:56 +02:00
Guillaume Abrioux	7efea219d6	tests: refact shrink_osd scenario This adds more coverage on the shrink_osd scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 14:46:56 +02:00
Dimitri Savineau	0d0f1e71df	dashboard: allow remote TLS cert/key copy When using TLS on the ceph dashboard or grafana services, we can provide the TLS certificate and key. Those files should be present on the ansible controller and they will be copied to the right node(s). In some situations, the TLS certificate and key could be already present on the target node and not on the ansible controller. For this scenario, we just need to copy the files locally (on each remote host). This patch adds the dashboard_tls_external variable (defaulting to false) to allow users to achieve this scenario by setting this variable to true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860815 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-03 13:39:47 +02:00


5466 Commits (b9cb0f12e9e79600f1a974dd88ba1ed1d833211f)