ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	1fcafffdad	ceph-facts: fix _container_exec_cmd fact value When using different name between the inventory_hostname and the ansible_hostname then the _container_exec_cmd fact will get a wrong value based on the inventory_hostname instead of the ansible_hostname. This happens when the ceph cluster is already running (update/upgrade). Later the container exec commands will fail because the container name is wrong. We should always set the _container_exec_cmd based on the ansible_hostname fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795792 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-29 08:44:59 +01:00
Dimitri Savineau	a27290bf98	tox: set extras vars for filestore-to-bluestore The ansible extra variables aren't set with the ansible-playbook command running the filestore-to-bluestore playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-28 02:42:39 +01:00
Dimitri Savineau	cd76054f76	filestore-to-bluestore: fix undefine osd_fsid_list If the playbook is used on a host running bluestore OSDs then the osd_fsid_list won't be filled because the bluestore OSDs are reported with 'type: block' via ceph-volume lvm list command but we are looking for 'type: data' (filestore). TASK [zap ceph-volume prepared OSDs] ********* fatal: [xxxxx]: FAILED! => msg: '''osd_fsid_list'' is undefined Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-28 02:42:39 +01:00
Guillaume Abrioux	3e7dbb4b16	tests: add 'all_in_one' scenario Add new scenario 'all_in_one' in order to catch more collocated related issues. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-27 15:30:45 -05:00
Guillaume Abrioux	2f919f8971	fix calls to `container_exec_cmd` in ceph-osd role We must call `container_exec_cmd` from the right monitor node otherwise the value of the fact might mismatch between the delegated node and the node being played. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794900 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-27 15:30:45 -05:00
Dimitri Savineau	83c5a1d7a8	filestore-to-bluestore: skip bluestore osd nodes If the OSD node is already using bluestore OSDs then we should skip all the remaining tasks to avoid purging OSD for nothing. Instead we warn the user. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790472 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-27 18:08:00 +01:00
Dimitri Savineau	a9c2300545	filestore-to-bluestore: don't fail when with no PV When the PV is already removed from the devices then we should not fail to avoid errors like: stderr: No PV found on device /dev/sdb. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-24 20:56:08 +01:00
Dmitriy Rabotyagov	0961ab8e60	Ensure that ganesha log directory exists Some ganesha packages do not create the ganesha log directory, although it is expected to exist when its permissions are changed. Additionally, there is not much sense in doing that as a separate task, so the directory is created with the correct permissions along with the rest of the required directories. Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com>	2020-01-24 11:10:08 -05:00
Guillaume Abrioux	eb9112d8fb	handler: read container_exec_cmd value from first mon Given that we delegate to the first monitor, we must read the value of `container_exec_cmd` from this node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-23 11:35:57 -05:00
Vytenis Sabaliauskas	ed1eaa1f38	ceph-facts: Fix for 'running_mon is undefined' error, so that fact 'running_mon' is set once 'grep' successfully exits with 'rc == 0' Signed-off-by: Vytenis Sabaliauskas <vytenis.sabaliauskas@protonmail.com>	2020-01-23 16:27:11 +01:00
Dimitri Savineau	671b1aba3c	site-container: don't skip ceph-container-common On HCI environment the OSD and Client nodes are collocated. Because we aren't running the ceph-container-common role on the client nodes except the first one (for keyring purpose) then the ceph-role execution fails due to undefined variables. Closes: #4970 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794195 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-23 15:29:44 +01:00
Guillaume Abrioux	e5812fe45b	rolling_update: support upgrading 3.x + ceph-metrics on a dedicated node When upgrading from RHCS 3.x where ceph-metrics was deployed on a dedicated node to RHCS 4.0, it fails like following: ``` fatal: [magna005]: FAILED! => changed=false gid: 0 group: root mode: '0755' msg: 'chown failed: failed to look up user ceph' owner: root path: /etc/ceph secontext: unconfined_u:object_r:etc_t:s0 size: 4096 state: directory uid: 0 ``` because we are trying to run `ceph-config` on this node, it doesn't make sense so we should simply run this play on all groups except `[grafana-server]`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1793885 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-22 11:29:36 -05:00
Dimitri Savineau	bb3eae0c80	filestore-to-bluestore: fix osd_auto_discovery When osd_auto_discovery is set then we need to refresh the ansible_devices fact after the filestore OSD purge otherwise the devices fact won't be populated. Also remove the gpt header on ceph_disk_osds_devices because the devices is empty at this point for osd_auto_discovery. Adding the bool filter when needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-22 09:36:09 +01:00
Guillaume Abrioux	483adb5d79	common: add a default value for ceph_directories_mode Since this variable makes it possible to customize the mode for ceph directories, let's make it a bit more explicit by adding a default value in ceph-defaults. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-22 09:35:35 +01:00
Dimitri Savineau	f995b079a6	filestore-to-bluestore: --destroy with raw devices We still need --destroy when using a raw device otherwise we won't be able to recreate the lvm stack on that device with bluestore. Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' /dev/sdd: physical volume not initialized. --> Was unable to complete a new OSD, will rollback changes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-21 11:37:39 -05:00
Dimitri Savineau	c9e1fe3d92	ceph-osd: set container objectstore env variables Because we need to manage legacy ceph-disk based OSD with ceph-volume then we need a way to know the osd_objectstore in the container. This was done like this previously with ceph-disk so we should also do it with ceph-volume. Note that this won't have any impact for ceph-volume lvm based OSD. Rename docker_env_args fact to container_env_args and move the container condition on the include_tasks call. Remove OSD_DMCRYPT env variable from the ceph-osd template because it's now included in the container_env_args variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792122 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-20 13:59:44 -05:00
Benoît Knecht	3842aa1a30	ceph-rgw: Fix customize pool size "when" condition In `3c31b19ab3`, I fixed the `customize pool size` task by replacing `item.size` with `item.value.size`. However, I missed the same issue in the `when` condition. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-01-20 09:26:53 -05:00
Guillaume Abrioux	22865cde9c	handler: fix call to container_exec_cmd in handler_osds When unsetting the noup flag, we must call container_exec_cmd from the delegated node (first mon member) Also, adding a `run_once: true` because this task needs to be run only 1 time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-20 09:25:56 -05:00
Dmitriy Rabotyagov	2478a7b948	Fix undefined running_mon Since commit [1] running_mon introduced, it can be not defined which results in fatal error [2]. This patch defines default value which was used before patch [1] Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com> [1] `8dcbcecd71` [2] https://zuul.opendev.org/t/openstack/build/c82a73aeabd64fd583694ed04b947731/log/job-output.txt#14011	2020-01-16 17:03:25 -05:00
Dmitriy Rabotyagov	c81a213a6d	Fix application for openstack_cephfs pools RBD is invalid application for cephfs pools, so it was change to cephfs. Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com>	2020-01-16 16:27:53 -05:00
Dimitri Savineau	7f997e623a	ceph-facts: move facts to defaults value There's no need to define a variable via a fact if we can do it via a default value. Using a fact could be interesting to override the default value on some condition. - ceph_uid could be set to 167 by default because it's only different on non containerized deployment on Debian/Ubuntu. - rbd_client_directory_{owner,group,mode} could be set to ceph,ceph,0770 by default instead of null as we are doing in the facts. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-16 13:57:11 -05:00
Dimitri Savineau	e790b0851d	group_vars: remove useless files Delete legacy files that aren't used anymore. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-16 13:53:12 -05:00
Guillaume Abrioux	3e262e072b	containers: use --cpus instead --cpu-quota When using docker 1.13.1, the current condition: ``` {% if (container_binary == 'docker' and ceph_docker_version.split('.')[0] is version_compare('13', '>=')) or container_binary == 'podman' -%} ``` is wrong because it compares the first digit (1) whereas it should compare the second one. It means we always use `--cpu-quota` although documentation recommend using `--cpus` when docker version is 1.13.1 or higher. From the doc: > --cpu-quota=<value> Impose a CPU CFS quota on the container. The number of > microseconds per --cpu-period that the container is limited to before > throttled. As such acting as the effective ceiling. > If you use Docker 1.13 or higher, use --cpus instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-16 13:51:43 -05:00
Guillaume Abrioux	8dcbcecd71	remove container_exec_cmd_mgr fact Iterating over all monitors in order to delegate a ` {{ container_binary }}` fails when collocating mgrs with mons, because ceph-facts reset `container_exec_cmd` to point to the first member of the monitor group. The idea is to force `container_exec_cmd` to be reset in ceph-mgr. This commit also removes the `container_exec_cmd_mgr` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1791282 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-15 14:03:49 -05:00
Dimitri Savineau	3cc7d5651c	tox: use vagrant_up.sh instead of vagrant up We should use the same vagrant wrapper everywhere instead of the vagrant command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-15 17:52:35 +01:00
Dimitri Savineau	a5385e1048	vagrant: temp workaround for CentOS 8 cloud image The CentOS cloud infrastructure storing the vagrant CentOS 8 image changed the directory path and removed the old 8.0 image so the vagrant box add centos/8 fails returning a 404 http error. As a workaround we can pull the image from CentOS instead of letting vagrant do the resolution. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-15 17:52:35 +01:00
Dimitri Savineau	4e7fb5d45a	drop use_fqdn variables This has been deprecated in the previous releases. Let's drop it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-15 11:32:39 +01:00
Dimitri Savineau	c61db12c09	travis: drop python2 support Since python2 is EOL we can drop it from travis CI matrix. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-15 10:42:51 +01:00
Guillaume Abrioux	3d0898aa5d	shrink-mds: fix condition on fs deletion the new ceph status registered in `ceph_status` will report `fsmap.up` = 0 when it's the last mds given that it's done after we shrink the mds, it means the condition is wrong. Also adding a condition so we don't try to delete the fs if a standby node is going to rejoin the cluster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787543 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-15 10:40:14 +01:00
Dimitri Savineau	bd87d69183	ceph-iscsi: don't use bracket with trusted_ip_list The trusted_ip_list parameter for the rbd-target-api service doesn't support ipv6 address with bracket. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-14 11:32:36 -05:00
Guillaume Abrioux	5558664f37	osd: use _devices fact in lvm batch scenario since `fd1718f379`, we must use `_devices` when deploying with lvm batch scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-14 09:12:03 -05:00
Guillaume Abrioux	d853da2a68	update: remove legacy This task is a code duplicate, probably a legacy, let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 15:18:45 -05:00
Guillaume Abrioux	2592a1e1e8	facts: fix osp/ceph external use case `d6da508a9b` broke the osp/ceph external use case. We must skip these tasks when no monitor is present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 12:06:06 -05:00
Dimitri Savineau	f940e695ab	ceph-facts: move grafana fact to dedicated file We don't need to execute the grafana fact every time but only during the dashboard deployment. Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-13 12:05:57 -05:00
Guillaume Abrioux	58e6bfed2d	osd: ensure osd ids collected are well restarted This commit refactors the condition in the loop of that task so all potential osd ids found are started properly. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 12:05:48 -05:00
Guillaume Abrioux	af6875706a	osd: do not run openstack_config during upgrade There is no need to run this part of the playbook when upgrading the cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 09:59:08 -05:00
Guillaume Abrioux	fef1cd4c4b	tests: use main playbook for add_osds job This commit replaces the playbook used for add_osds job given accordingly to the add-osd.yml playbook removal Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 09:59:08 -05:00
Guillaume Abrioux	3496a0efa2	osd: support scaling up using --limit This commit leaves add-osd.yml in place but marks the playbook as deprecated. Scaling up OSDs is now possible using --limit Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 09:59:08 -05:00
Dimitri Savineau	3900527e16	tests/setup: update mount options on EL 8 The nobarrier mount flag doesn't exist anymore on XFS in the EL 8 kernel. That's why the task wasn't working on those systems. We can still use the other options instead of skipping the task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-11 05:33:01 +01:00
Dimitri Savineau	e4ddcb812b	ceph-validate: fail on CentOS 7 The Ceph Octopus release is only supported on CentOS 8 Closes: #4918 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-10 14:06:02 -05:00
Guillaume Abrioux	dc672e86ec	tests: add a docker2podman scenario This commit adds a new scenario in order to test docker-to-podman.yml migration playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 10:21:29 -05:00
Guillaume Abrioux	b0c491800a	docker2podman: use set_fact to override variables play vars have lower precedence than role vars and `set_fact`. We must use a `set_fact` to reset these variables. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 10:21:29 -05:00
Guillaume Abrioux	1c2ec9fb40	docker2podman: force systemd to reload config This is needed after a change is made in systemd unit files. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 10:21:29 -05:00
Guillaume Abrioux	d746575fd0	docker2podman: install podman This commit adds a package installation task in order to install podman during the docker-to-podman.yml migration playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 10:21:29 -05:00
Dimitri Savineau	a09d1c38bf	purge-iscsi-gateways: don't run all ceph-facts We only need to have the container_binary fact. Because we're not gathering the facts from all nodes then the purge fails trying to get one of the grafana fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786686 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-10 15:46:15 +01:00
Guillaume Abrioux	fd1718f379	config: exclude ceph-disk prepared osds in lvm batch report We must exclude the devices already used and prepared by ceph-disk when doing the lvm batch report. Otherwise it fails because ceph-volume complains about GPT header. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786682 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 00:04:22 +01:00
Dimitri Savineau	3f344fdefe	rolling_update: run registry auth before upgrading There are some tasks using the new container image during the rolling upgrade playbook that need to execute the registry login first otherwise the nodes won't be able to pull the container image. Unable to find image 'xxx.io/foo/bar:latest' locally Trying to pull repository xxx.io/foo/bar ... /usr/bin/docker-current: Get https://xxx.io/v2/foo/bar/manifests/latest: unauthorized Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-09 16:14:33 -05:00
Dimitri Savineau	747555dfa6	shrink-rgw: refact global workflow Instead of running the ceph roles against localhost we should do it on the first mon. The ansible and inventory hostname of the rgw nodes could be different. Ensure that the rgw instance to remove is present in the cluster. Fix rgw service and directory path. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-09 19:02:17 +01:00
Guillaume Abrioux	86f3eeb717	mon: support replacing a mon We must pick up a mon which actually exists in ceph-facts in order to detect if a cluster is running. Otherwise, it will state no cluster is already running which will end up deploying a new monitor isolated in a new quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-09 12:59:12 -05:00
Guillaume Abrioux	30200802d9	handler: fix bug `411bd07d54` introduced a bug in handlers: using `handler__status` instead of `hostvars[item]['handler__status']` causes handlers to be triggered in any case, even though `handler_*_status` was set to `False` on a specific node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-08 17:11:42 -05:00
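
The version check described in commit 3e262e072b (containers: use --cpus instead --cpu-quota) can be illustrated with a minimal Python sketch. The first function mirrors the condition quoted in the commit message, which looks only at the first component of the Docker version. The second is a hypothetical correction that compares the (major, minor) pair against (1, 13). Both function names are made up for this example, and the sketch is not the change the commit actually applied, which is not shown in this log.

```python
def uses_cpus_flag_first_component(container_binary: str, docker_version: str) -> bool:
    """Mirror of the quoted condition: only the first version component is compared to 13."""
    first = int(docker_version.split(".")[0])
    return (container_binary == "docker" and first >= 13) or container_binary == "podman"


def uses_cpus_flag_major_minor(container_binary: str, docker_version: str) -> bool:
    """Hypothetical correction: compare (major, minor) against (1, 13)."""
    major_minor = tuple(int(part) for part in docker_version.split(".")[:2])
    return (container_binary == "docker" and major_minor >= (1, 13)) or container_binary == "podman"


if __name__ == "__main__":
    for version in ("1.12.6", "1.13.1", "17.06.2"):
        print(
            version,
            uses_cpus_flag_first_component("docker", version),
            uses_cpus_flag_major_minor("docker", version),
        )
```

With Docker 1.13.1 the first function returns False, so `--cpu-quota` would still be selected, which is the behaviour the commit message describes as wrong.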

5472 Commits (3d3ce263274d648f8fb376716f52b8b91b6f1313)