ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	b01b255414	ceph-nfs: add nfs-ganesha-rados-urls package Since nfs-ganesha 2.8.3 the rados-urls library has been moved to a dedicated package. We don't have the same nfs-ganesha 2.8.x between the community and rhcs repositories. community: 2.8.1 rhcs: 2.8.3 As a workaround we will install that package only for rhcs setup. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0a3e85e8ca`)	2020-02-17 10:00:44 -05:00
Dimitri Savineau	6864d04fdf	ceph-nfs: fix ceph_nfs_ceph_user variable The ceph_nfs_ceph_user variable is a string for the ceph-nfs role but a list in ceph-client role. `6a6785b` introduced a confusion between both variable types in the ceph-nfs role for external ceph with ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1801319 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `10951eeea8`)	2020-02-17 15:27:30 +01:00
Dimitri Savineau	e4e1b386b0	dashboard: allow configuring multiple grafana host When using multiple grafana hosts, we set the grafana and prometheus URL and push the dashboard layout to a single node. grafana_server_addrs is the list of all grafana nodes and used during the ceph-dashboard role (on mgr/mon nodes). grafana_server_addr is the current grafana node used during the ceph-grafana and ceph-prometheus role (on grafana-server nodes). We don't have the grafana_server_addr fact duplication code between external vs collocated nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1784011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c6e96699f7`)	2020-02-12 19:56:31 -05:00
Guillaume Abrioux	1d2a395aaf	switch_to_containers: increase health check values This commit increases the default values for the following variables consumed in the switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook. This also moves these variables into the `ceph-defaults` role so the user can set different values if needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783223 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3700aa5385`)	2020-02-10 12:57:17 -05:00
Guillaume Abrioux	cdc3e10cf3	purge/update: remove backward compatibility legacy This was introduced in 3.1 and marked as deprecated. We can definitely drop it in stable-4.0. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0441812959`)	2020-02-03 09:33:05 -05:00
Stanley Lam	0336a1476f	Add option for HAproxy to act as an SSL frontend termination point for loadbalanced RGW instances. Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com> (cherry picked from commit `ad7a5dad3f`)	2020-02-03 09:32:43 -05:00
Guillaume Abrioux	5c3ba0787c	switch_to_containers: exclude clients nodes from facts gathering just like site.yml and rolling_update, let's exclude clients node from the fact gathering. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `332c39376b`)	2020-02-03 09:32:20 -05:00
Dimitri Savineau	0dbca448d1	ceph-handler: Use /proc/net/unix for rgw socket If for some reason, there's an old rgw socket file present in the /var/run/ceph/ directory then the test command could fail with test: xxxxxxxxx.asok: binary operator expected $ ls -hl /var/run/ceph/ total 0 srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok We can check the radosgw socket in /proc/net/unix to avoid using wildcard in the socket name. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `60cbfdc2a6`)	2020-02-03 09:31:52 -05:00
Dimitri Savineau	487be2675a	filestore-to-bluestore: skip bluestore osd nodes If the OSD node is already using bluestore OSDs then we should skip all the remaining tasks to avoid purging OSD for nothing. Instead we warn the user. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790472 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `83c5a1d7a8`)	2020-02-03 15:16:51 +01:00
Dimitri Savineau	460d3557d7	ceph-container-engine: lvm2 on OSD nodes only Since `de8f2a9` the lvm2 package installation has been moved from ceph-osd role to ceph-container-engine role. But the scope wasn't limited to the OSD nodes only. This commit fixes this behaviour. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fa8aa8c864`)	2020-02-03 15:16:32 +01:00
Guillaume Abrioux	675b6788f4	update: remove legacy tasks These tasks should have been removed with backport #4756 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1793564 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-02-03 15:16:13 +01:00
Dimitri Savineau	80f1b0feb0	ceph-common: rhcs 4 repositories for rhel 7 RHCS 4 is available for both RHEL 7 and 8 so we should also enable the cdn repositories for that distribution. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796853 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9b40a959b9`)	2020-02-03 15:15:35 +01:00
Mike Christie	76753e64f9	iscsi: Fix crashes during rolling update During a rolling update we will run the ceph iscsigw tasks that start the daemons then run the configure_iscsi.yml tasks which can create iscsi objects like targets, disks, clients, etc. The problem is that once the daemons are started they will accept configuration requests, or may want to update the system themselves. Those operations can then conflict with the configure_iscsi.yml tasks that set up objects and we can end up in crashes due to the kernel being in an unsupported state. This could also happen during creation, but is less likely due to no objects being set up yet, so there are no watchers or users accessing the gws yet. The fix in this patch works for both update and initial setup. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1795806 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `77f3b5d51b`)	2020-02-03 15:15:15 +01:00
wujie1993	dcd4b2955a	purge: fix purge cluster failed Fix purge cluster failed when local container images does not exist. Purge node-exporter and grafana-server only when dashboard_enabled is set to True. Signed-off-by: wujie1993 qq594jj@gmail.com (cherry picked from commit `d8b0b3cbd9`)	2020-02-03 15:14:56 +01:00
Guillaume Abrioux	1b33c5358e	config: fix external client scenario When no monitor group is present in the inventory, this task fails. This affects only non-containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e7bc079405`)	2020-01-31 13:37:10 +01:00
Guillaume Abrioux	e3cd719ebe	tests: add external_clients scenario This commit adds a new 'external ceph clients' scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `641729357e`)	2020-01-31 13:37:10 +01:00
Dimitri Savineau	3daea719b6	ceph-defaults: remove rgw from ceph_conf_overrides The [rgw] section in the ceph.conf file or via the ceph_conf_overrides variable doesn't exist and has no effect. To apply overrides to all radosgw instances we should use either the [global] or [client] sections. Overrides per radosgw instance should still use the [client.rgw.{instance-name}] section. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794552 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2f07b85131`)	2020-01-29 14:34:34 +01:00
Guillaume Abrioux	bc6777c6df	dashboard: add quotes when passing password to the CLI Otherwise, if the variables contains a '$' it will be interpreted as a BASH variable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c3759f8ce`)	2020-01-29 14:15:41 +01:00
Guillaume Abrioux	2e7d7b70ed	tests: set dashboard|grafana_admin_password Set these 2 variables in all test scenarios where `dashboard_enabled` is `True` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c040199c8f`)	2020-01-29 14:15:41 +01:00
Guillaume Abrioux	8a907cb1ca	validate: fail if dashboard|grafana_admin_password aren't set This commit adds a task to make sure user set a custom password for `grafana_admin_password` and `dashboard_admin_password` variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795509 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `99328545de`)	2020-01-29 14:15:41 +01:00
Dimitri Savineau	9da917501b	ceph-facts: fix _container_exec_cmd fact value When using different name between the inventory_hostname and the ansible_hostname then the _container_exec_cmd fact will get a wrong value based on the inventory_hostname instead of the ansible_hostname. This happens when the ceph cluster is already running (update/upgrade). Later the container exec commands will fail because the container name is wrong. We should always set the _container_exec_cmd based on the ansible_hostname fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795792 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1fcafffdad`)	2020-01-29 11:48:44 +01:00
Dimitri Savineau	44eadcd05e	tox: set extras vars for filestore-to-bluestore The ansible extra variables aren't set with the ansible-playbook command running the filestore-to-bluestore playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a27290bf98`)	2020-01-28 22:21:49 -05:00
Dimitri Savineau	f982a70f02	filestore-to-bluestore: fix undefine osd_fsid_list If the playbook is used on a host running bluestore OSDs then the osd_fsid_list won't be filled because the bluestore OSDs are reported with 'type: block' via ceph-volume lvm list command but we are looking for 'type: data' (filestore). TASK [zap ceph-volume prepared OSDs] ********* fatal: [xxxxx]: FAILED! => msg: '''osd_fsid_list'' is undefined Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cd76054f76`)	2020-01-28 22:21:49 -05:00
Guillaume Abrioux	d5dca5087a	tests: add 'all_in_one' scenario Add new scenario 'all_in_one' in order to catch more collocated related issues. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3e7dbb4b16`)	2020-01-27 17:54:39 -05:00
Guillaume Abrioux	0d2af6ebf3	fix calls to `container_exec_cmd` in ceph-osd role We must call `container_exec_cmd` from the right monitor node otherwise the value of the fact might mismatch between the delegated node and the node being played. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794900 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f919f8971`)	2020-01-27 17:54:39 -05:00
Dimitri Savineau	0a2927ce5e	filestore-to-bluestore: don't fail when with no PV When the PV is already removed from the devices then we should not fail to avoid errors like: stderr: No PV found on device /dev/sdb. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a9c2300545`)	2020-01-24 16:14:47 -05:00
Guillaume Abrioux	9fb69e13ed	handler: read container_exec_cmd value from first mon Given that we delegate to the first monitor, we must read the value of `container_exec_cmd` from this node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eb9112d8fb`)	2020-01-23 18:34:14 +01:00
Vytenis Sabaliauskas	4152a1a862	ceph-facts: Fix for 'running_mon is undefined' error, so that fact 'running_mon' is set once 'grep' successfully exits with 'rc == 0' Signed-off-by: Vytenis Sabaliauskas <vytenis.sabaliauskas@protonmail.com> (cherry picked from commit `ed1eaa1f38`)	2020-01-23 11:24:24 -05:00
Dimitri Savineau	add4089e30	site-container: don't skip ceph-container-common On HCI environment the OSD and Client nodes are collocated. Because we aren't running the ceph-container-common role on the client nodes except the first one (for keyring purpose) then the ceph-role execution fails due to undefined variables. Closes: #4970 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794195 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `671b1aba3c`)	2020-01-23 16:27:31 +01:00
Guillaume Abrioux	fd217d9f08	rolling_update: support upgrading 3.x + ceph-metrics on a dedicated node When upgrading from RHCS 3.x where ceph-metrics was deployed on a dedicated node to RHCS 4.0, it fails like following: ``` fatal: [magna005]: FAILED! => changed=false gid: 0 group: root mode: '0755' msg: 'chown failed: failed to look up user ceph' owner: root path: /etc/ceph secontext: unconfined_u:object_r:etc_t:s0 size: 4096 state: directory uid: 0 ``` because we are trying to run `ceph-config` on this node, it doesn't make sense so we should simply run this play on all groups except `[grafana-server]`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1793885 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5812fe45b`)	2020-01-22 18:28:54 +01:00
Dimitri Savineau	0abea70e29	filestore-to-bluestore: fix osd_auto_discovery When osd_auto_discovery is set then we need to refresh the ansible_devices fact after the filestore OSD purge otherwise the devices fact won't be populated. Also remove the gpt header on ceph_disk_osds_devices because the devices list is empty at this point for osd_auto_discovery. Adding the bool filter when needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bb3eae0c80`)	2020-01-22 10:06:17 +01:00
Dimitri Savineau	e4965e9ea9	filestore-to-bluestore: --destroy with raw devices We still need --destroy when using a raw device otherwise we won't be able to recreate the lvm stack on that device with bluestore. Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' /dev/sdd: physical volume not initialized. --> Was unable to complete a new OSD, will rollback changes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f995b079a6`)	2020-01-21 18:26:55 +01:00
Dimitri Savineau	6a51330892	ceph-osd: set container objectstore env variables Because we need to manage legacy ceph-disk based OSD with ceph-volume then we need a way to know the osd_objectstore in the container. This was done like this previously with ceph-disk so we should also do it with ceph-volume. Note that this won't have any impact for ceph-volume lvm based OSD. Rename docker_env_args fact to container_env_args and move the container condition on the include_tasks call. Remove OSD_DMCRYPT env variable from the ceph-osd template because it's now included in the container_env_args variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792122 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c9e1fe3d92`)	2020-01-20 15:36:11 -05:00
Benoît Knecht	ff2a2bb870	ceph-rgw: Fix customize pool size "when" condition In `3c31b19ab3`, I fixed the `customize pool size` task by replacing `item.size` with `item.value.size`. However, I missed the same issue in the `when` condition. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `3842aa1a30`)	2020-01-20 12:48:19 -05:00
Guillaume Abrioux	1462423059	handler: fix call to container_exec_cmd in handler_osds When unsetting the noup flag, we must call container_exec_cmd from the delegated node (first mon member) Also, adding a `run_once: true` because this task needs to be run only 1 time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `22865cde9c`)	2020-01-20 12:45:51 -05:00
Dmitriy Rabotyagov	8d311a537d	Fix undefined running_mon Since commit [1] introduced running_mon, it can be undefined, which results in fatal error [2]. This patch defines the default value which was used before patch [1] Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com> [1] `8dcbcecd71` [2] https://zuul.opendev.org/t/openstack/build/c82a73aeabd64fd583694ed04b947731/log/job-output.txt#14011 (cherry picked from commit `2478a7b948`)	2020-01-16 18:28:12 -05:00
Guillaume Abrioux	cae24dd85a	remove container_exec_cmd_mgr fact Iterating over all monitors in order to delegate a ` {{ container_binary }}` fails when collocating mgrs with mons, because ceph-facts reset `container_exec_cmd` to point to the first member of the monitor group. The idea is to force `container_exec_cmd` to be reset in ceph-mgr. This commit also removes the `container_exec_cmd_mgr` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1791282 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8dcbcecd71`)	2020-01-15 21:10:54 +01:00
Guillaume Abrioux	0db611ebf8	shrink-mds: fix condition on fs deletion the new ceph status registered in `ceph_status` will report `fsmap.up` = 0 when it's the last mds given that it's done after we shrink the mds, it means the condition is wrong. Also adding a condition so we don't try to delete the fs if a standby node is going to rejoin the cluster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787543 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3d0898aa5d`)	2020-01-15 11:28:12 +01:00
Dimitri Savineau	09a71e4a8c	ceph-iscsi: don't use bracket with trusted_ip_list The trusted_ip_list parameter for the rbd-target-api service doesn't support ipv6 address with bracket. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bd87d69183`)	2020-01-14 12:48:04 -05:00
Dimitri Savineau	ff3a3ee5e9	container: move lvm2 package installation Before this patch, the lvm2 package installation was done during the ceph-osd role. However we were running ceph-volume command in the ceph-config role before ceph-osd. If lvm2 wasn't installed then the ceph-volume command fails: error checking path "/run/lock/lvm": stat /run/lock/lvm: no such file or directory This wasn't visible before because lvm2 was automatically installed as docker dependency but it's not the same for podman on CentOS 8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `de8f2a9f83`)	2020-01-14 12:47:55 -05:00
Guillaume Abrioux	a81830ddc0	osd: use _devices fact in lvm batch scenario since `fd1718f379`, we must use `_devices` when deploying with lvm batch scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5558664f37`)	2020-01-14 15:33:15 +01:00
Guillaume Abrioux	ffdfa634ac	osd: do not run openstack_config during upgrade There is no need to run this part of the playbook when upgrading the cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `af6875706a`)	2020-01-14 09:12:34 -05:00
Guillaume Abrioux	51596e8b32	tests: use main playbook for add_osds job This commit replaces the playbook used for the add_osds job, in line with the add-osd.yml playbook removal Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fef1cd4c4b`)	2020-01-14 09:12:34 -05:00
Guillaume Abrioux	2d85fab02d	osd: support scaling up using --limit This commit leaves add-osd.yml in place but marks the playbook as deprecated. Scaling up OSDs is now possible using --limit Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3496a0efa2`)	2020-01-14 09:12:34 -05:00
Dimitri Savineau	dc797971ce	ceph-facts: move grafana fact to dedicated file We don't need to execute the grafana fact every time but only during the dashboard deployment. Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f940e695ab`)	2020-01-13 16:28:23 -05:00
Guillaume Abrioux	266c4c7763	facts: fix osp/ceph external use case `d6da508a9b` broke the osp/ceph external use case. We must skip these tasks when no monitor is present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2592a1e1e8`)	2020-01-13 21:07:01 +01:00
Guillaume Abrioux	532abbb9b2	defaults: change monitor|radosgw_address default values To avoid confusion, let's change the default value from `0.0.0.0` to `x.x.x.x`. Users might think setting `0.0.0.0` will make the daemon binding on all interfaces. Fixes: #4827 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc02fc98eb`)	2020-01-13 14:55:23 -05:00
Guillaume Abrioux	9ed540da7e	osd: ensure osd ids collected are well restarted This commit refactors the condition in the loop of that task so all potential osd ids found are well started. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `58e6bfed2d`)	2020-01-13 14:31:03 -05:00
Guillaume Abrioux	fc7212b192	tests: add time command in vagrant_up.sh monitor how long it takes to get all VMs up and running Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `16bcef4f28`)	2020-01-10 17:41:27 +01:00
Guillaume Abrioux	2c96155c32	tests: retry to fire up VMs on vagrant failure Add a script to retry several times to fire up VMs to avoid vagrant failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `1ecb3a9352`)	2020-01-10 17:41:27 +01:00
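
Note on 0dbca448d1 (ceph-handler: Use /proc/net/unix for rgw socket): the message explains that a wildcard over /var/run/ceph/ can match a stale .asok file as well as the live one, while /proc/net/unix lists only sockets that are currently bound. The following is a minimal sketch of that idea in Python, not the check the commit itself adds. The function name, the default pattern and the sample text are assumptions made for illustration.

```python
import re


def rgw_socket_listening(proc_net_unix_text,
                         pattern=r"ceph-client\.rgw\..*\.asok$"):
    """Return True if a bound unix socket path matches `pattern`.

    `proc_net_unix_text` is the content of /proc/net/unix. The default
    pattern is a hypothetical one modelled on the socket names quoted
    in the commit message.
    """
    matcher = re.compile(pattern)
    # The first line is the column header (Num RefCount ... Inode Path).
    for line in proc_net_unix_text.splitlines()[1:]:
        fields = line.split()
        # The Path column (8th field) is absent for unnamed sockets.
        if len(fields) >= 8 and matcher.search(fields[7]):
            return True
    return False


# Illustrative sample only: one live rgw socket and one unnamed socket.
# The stale file from the commit message exists on disk but is not listed.
SAMPLE = (
    "Num       RefCount Protocol Flags    Type St Inode Path\n"
    "0000000000000000: 00000002 00000000 00010000 0001 01 23456 "
    "/var/run/ceph/ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok\n"
    "0000000000000000: 00000002 00000000 00000000 0002 01 11111\n"
)

if __name__ == "__main__":
    print(rgw_socket_listening(SAMPLE))
```

Because the lookup goes through the kernel's socket table rather than a directory listing, leftover files in /var/run/ceph/ do not affect the result.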

5067 Commits (55c222d088b7ca15a20e0440bd52d1c94b035a93), All Branches