ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	fd759f97fa	dashboard: disable facts gathering This is already done in the main playbooks but absent in the dashboard playbook. The facts are already gathered during the first play of the main playbooks so we don't need to do it twice. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5ae7304ace`)	2019-10-14 09:45:11 +02:00
Guillaume Abrioux	ebfe7f31ed	dashboard: if no host is available, let's just skip these plays. If there is no host available, let's just skip these plays. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1759917 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0b245bd007`)	2019-10-09 14:47:36 -04:00
Dimitri Savineau	5f91be8740	switch_to_containers: umount osd lockbox partition When switching from a baremetal deployment to a containerized deployment we only umount the OSD data partition. If the OSD is encrypted (dmcrypt: true) then there's an additional partition (part number 5) used for the lockbox and mounted in the /var/lib/ceph/osd-lockbox/ directory. Because this partition isn't unmounted, the containerized OSDs aren't able to start. The partition is still mounted by the system and can't be remounted from the container. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `19edf707a5`)	2019-10-08 00:57:05 +00:00
Guillaume Abrioux	b325cc386e	switch_to_containers: do not re-set `ceph_uid` This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook to avoid duplicated code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fa9b42e98e`)	2019-10-07 10:18:17 -04:00
Guillaume Abrioux	468aa5d63b	switch_to_containers: optimize ownership change As per https://github.com/ceph/ceph-ansible/pull/4323#issuecomment-538420164 using `find` command should be faster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1757400 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-Authored-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `c5d0c90bb7`)	2019-10-07 10:18:17 -04:00
Guillaume Abrioux	37fd0b179b	update: import ceph-defaults role in first play Typical error: ``` fatal: [mon0]: FAILED! => msg: |- The conditional check 'not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])' failed. The error was: error while evaluating conditional (not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])): 'client_group_name' is undefined ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8138d4193c`)	2019-10-07 11:21:23 +02:00
Guillaume Abrioux	9a4fcfabe1	main: exclude client nodes from facts gathering when delegate_facts_host This commit excludes client nodes from facts gathering, they are not needed and can speed up this task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `865d2eac9b`)	2019-10-07 11:21:23 +02:00
Dimitri Savineau	ec1c57f690	dashboard: remove useless block section The block section were used with the dashboard_enabled condition when the code was included in the main playbooks. Because this condition isn't present in the dashboard playbook anymore we can remove the block section. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf47594b47`)	2019-10-04 13:28:37 +02:00
Guillaume Abrioux	9a79ed1bf0	rgw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e08194dd67`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	7f902994b3	rbdmirror: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c69816c6b7`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	d7a06c67db	iscsigw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4636f3f7e2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	b564c37696	upgrade: add an infra playbook to migrate systemd units to podman this commit adds a new playbook to force systemd units for containers to use podman instead of docker. This is needed in the rhel8 upgrade context so after the base OS is upgraded containers can be started using podman. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f2017dcda2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	4afe1b748c	update: reset mon_host after mons upgrade after all mon are upgraded, let's reset mon_host which is used in the rest of the playbook for setting `container_exec_cmd` so we are sure to use the right value. Typical error: ``` failed: [mds0 -> mon0] (item={u'path': u'/var/lib/ceph/bootstrap-mds/ceph.keyring', u'name': u'client.bootstrap-mds', u'copy_key': True}) => changed=true ansible_loop_var: item cmd: - docker - exec - ceph-mon-mon2 - ceph - --cluster - ceph - auth - get - client.bootstrap-mds delta: '0:00:00.016294' end: '2019-09-27 13:54:58.828835' item: copy_key: true name: client.bootstrap-mds path: /var/lib/ceph/bootstrap-mds/ceph.keyring msg: non-zero return code rc: 1 start: '2019-09-27 13:54:58.812541' stderr: 'Error response from daemon: No such container: ceph-mon-mon2' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d84160a170`)	2019-09-28 09:01:16 +02:00
Harald Jensås	5fea830414	Replace ipaddr() with ips_in_ranges() This change implements a filter_plugin that is used in the ceph-facts, ceph-validate roles and infrastructure-playbooks. The new filter plugin will return a list of all IP addresses that reside in any one of the given IP ranges. The new filter replaces the use of the ipaddr filter. ceph.conf already supports a comma separated list of CIDRs for the public_network and cluster_network options. Changes: [1] and [2] introduced a regression in ceph-ansible where public_network can no longer be a comma separated list of CIDRs. With this change a comma separated list of subnet CIDRs can also be used for monitor_address_block and radosgw_address_block. [1] commit: `d67230b2a2` [2] commit: `20e4852888` Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030 Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283 Closes: #4333 Please backport to stable-4.0 Signed-off-by: Harald Jensås <hjensas@redhat.com> (cherry picked from commit `e695efcaf7`)	2019-09-27 17:49:46 +02:00
Sam Choraria	7594bc9181	rolling_update.yml: force ceph-volume scan on osds The rolling_update.yml playbook fails when scanning ceph-disk osds while deploying nautilus. The --force flag is required to scan existing osds and rewrite their json metadata. Signed-off-by: Sam Choraria <sam.choraria@bbc.co.uk> (cherry picked from commit `7cc9f93680`)	2019-09-26 14:51:59 -04:00
Guillaume Abrioux	96dafd676c	infrastructure-playbooks: add filestore-to-bluestore.yml This playbook helps to migrate all osds on a node from filestore to bluestore backend. Note that ALL osd on the specified osd nodes will be shrinked and redeployed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3f9ccdaa8a`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	26e0f4db97	lv-create: fix a typo This commit fixes a typo. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c785ad3637`)	2019-09-26 16:21:54 +02:00
Mehdy	8c37894109	shrink-rgw.yml: fix confirmation play's name the confirmation play's name should confirm removing rgw instead of monitor Signed-off-by: Mehdy Khoshnoody <mehdy.khoshnoody@gmail.com> (cherry picked from commit `9fa98d79fd`)	2019-09-25 16:37:44 +02:00
Dimitri Savineau	a5775be7c4	shrink-mon: search mon in the quorum_names list If we're looking at the mon hostname in the ceph status output then there are some scenarios where this could be true. If we collocate some services (mons, mgrs, etc..) then the hostname of the monitor to shrink will still be present in the ceph status (like in mgrs or other). Instead we should check the hostname only in the mon part of the output. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `734c0dc310`)	2019-09-18 14:47:40 +00:00
Kevin Jones	3a8de9cc36	Set proper ownership command performance improvement By changing the set ownership command from using the file module in combination with a with_items loop to a raw chown command, we can achieve a 98% performance increase here. On a ceph cluster with a significant amount of directories and files in /var/lib/ceph, the file module has to run checks on ownership of all those directories and files to determine whether a change is needed. In this case, we just want to explicitly set the ownership of all these directories and files to the ceph_uid Added context note to all set proper ownership tasks Signed-off-by: Kevin Jones <kevinjones@redhat.com> (cherry picked from commit `47bf47c9d8`)	2019-08-22 12:59:58 +02:00
Guillaume Abrioux	236020fb2b	shrink-mon: refact 'verify the monitor is out of the cluster' task use `from_json` filter instead of a `| python` so we can get rid of the `shell` module usage here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5573f17e76`)	2019-08-19 18:47:14 +00:00
Rishabh Dave	b28ed96378	use pre_tasks and post_tasks in shrink-mon.yml too This commit should've been part of commit `2fb12ae554`. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `2034387f57`)	2019-08-19 18:47:14 +00:00
Guillaume Abrioux	2f77704591	common: use discovered_interpreter_python fact in order to use the right binary name when using python cli in command or shell module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13815ad3ca`)	2019-08-19 18:47:14 +00:00
Dimitri Savineau	f9d9ffac8f	dashboard: run dashboard role on mgr/mon nodes We don't need to execute the ceph-dashboard role on the nodes present in the grafana-server group. This one is dedicated to the grafana and prometheus stack. The ceph-dashboard needs to executed where the ceph-mgr is running. It is either on the dedicated mgr nodes or if mgr and mon are collocated implicitly on the mon nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `16939eff9e`)	2019-08-08 13:47:09 +02:00
Rishabh Dave	72a062b6fa	add a playbook the remove rgw from a given node Add a playbook named shrink-rgw.yml to infrastructure-playbooks/ that can remove a RGW from a node in an already deployed Ceph cluster. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `632a44bdf2`)	2019-07-31 15:25:15 -04:00
Rishabh Dave	8ca88b41cc	infra-playbooks: rewrite a condition for better readability Use the facility built into Ansible to check whether a command was executed successfully rather than looking at its return value. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `5aecdd3ba6`)	2019-07-29 15:52:29 +02:00
Guillaume Abrioux	d0ad1cf0f1	dashboard: use dedicated group only There's no need to add complexity and trying to fallback on other group. Let's deploy dashboard on all nodes present in grafana-server group. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d67230b2a2`)	2019-07-29 15:46:58 +02:00
Dimitri Savineau	dd87db70ca	dashboard: move code into a dedicated playbook Move dashboard, grafana/prometheus and node-exporter plays into a dedicated playbook in infrastructure-playbooks directory. To avoid using the 'dashboard_enabled | bool' condition multiple times in the main playbook we can just import the dashboard playbook or not. This patch also allows using a single dashboard playbook for both baremetal and container playbooks. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `43135840b1`)	2019-07-29 15:46:58 +02:00
Dimitri Savineau	43d625b59a	Remove NBSP characters Some NBSP are still present in the yaml files. Adding a test in travis CI. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `07c6695d16`)	2019-07-26 16:23:41 -04:00
Guillaume Abrioux	bee8a31afe	shrink-rbdmirror: check if rbdmirror is well removed from cluster This commits adds a check to ensure the daemon has been removed from the cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `916dc1f52f`)	2019-07-16 15:02:49 +02:00
Rishabh Dave	0a15d1d112	add a playbook that removes rbd-mirror from a node Add a playbook named "shrink-rbdmirror.yml" in infrastructure-playbooks/ that removes a RBD Mirror from a node in an already deployed Ceph cluster. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `c4824acb19`)	2019-07-16 15:02:49 +02:00
Rishabh Dave	6197d1c8d9	add a playbook that removes manager from a node Add a playbook, named "shrink-mgr.yml", in infrastructure-playbooks/ that removes a MGR from a node in an already deployed Ceph cluster. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `f4ea75051b`)	2019-07-09 15:00:56 +00:00
Guillaume Abrioux	85a448429d	shrink-mds: refact post tasks This commit refacts the way we check the "mds_to_kill" node is well stopped. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `7df62fde34`)	2019-07-09 12:07:47 +02:00
Rishabh Dave	38c2785e95	add a playbook that removes mds from a node Add a playbook, named "shrink-mds.yml", in infrastructure-playbooks/ that removes a MDS from a node in an already deployed Ceph cluster. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `235b1fccc6`)	2019-07-09 12:07:47 +02:00
Mike Christie	cf6050d4e6	igw: Support new ceph-iscsi package during purge The ceph-iscsi-config and ceph-iscsi-cli packages were combined into ceph-iscsi and its APIs changed. This fixes up the iscsi purge task to support the new API and old one. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `b163206db7`)	2019-07-04 00:04:04 +00:00
Guillaume Abrioux	0a0cdc0963	purge: ensure no ceph kernel thread is present This tries to first unmount any cephfs/nfs-ganesha mount point on client nodes, then unmap any mapped rbd devices and finally it tries to remove ceph kernel modules. If it fails it means some resources are still busy and should be cleaned manually before continuing to purge the cluster. This is done early in the playbook so the cluster stays untouched until everything is ready for that operation, otherwise if you try to redeploy a cluster it could end up by getting confused by leftover from previous deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1337915 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `20e4852888`)	2019-06-24 13:20:50 +02:00
Guillaume Abrioux	77d24203fa	upgrade: accept HEALTH_OK and HEALTH_WARN as valid state `3a100cfa52` introduced a check which is a bit too restrictive, let's accept HEALTH_OK and HEALTH_WARN. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6dce51183b`)	2019-06-21 15:47:33 +00:00
Dimitri Savineau	aa197f77fc	remove ceph restapi references The ceph restapi configuration was only available until Luminous release so we don't need those leftovers for nautilus+. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `da8b7ab7fb`)	2019-06-20 15:15:10 -04:00
Guillaume Abrioux	b93064c7c8	rolling_update: fail early if cluster state is not OK starting an upgrade if the cluster isn't HEALTH_OK isn't a good idea. Let's check for the cluster status before trying to upgrade. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3a100cfa52`)	2019-06-19 08:41:25 +00:00
Guillaume Abrioux	53dd58e84c	rolling_update: only mask and stop unit in mgr part Otherwise it fails like following: ``` fatal: [mon0]: FAILED! => changed=false msg: |- Unable to enable service ceph-mgr@mon0: Failed to execute operation: Cannot send after transport endpoint shutdown ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `51b2813e04`)	2019-06-19 08:41:25 +00:00
Dimitri Savineau	6e565b251d	remove ceph-agent role and references The ceph-agent role was used only for RHCS 2 (jewel) so it's not useful anymore. The current code will fail on CentOS distribution because the rhscon package is only available on Red Hat with the RHCS 2 repository and this ceph release is supported on stable-3.0 branch. Resolves: #4020 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7503098ca0`)	2019-06-17 15:56:00 -04:00
L3D	1daca1ba83	ansible: use 'bool' filter on boolean conditionals By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these: ``` [DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. ``` Now appended ``| bool`` on a lot of the affected variables. Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` (with spaces at the pipe). Closes: #4022 Signed-off-by: L3D <l3d@c3woc.de> (cherry picked from commit `ab54fe20ec`)	2019-06-07 16:05:51 +02:00
Dimitri Savineau	7a384e7ec2	purge-cluster: clean all ceph repo files We currently only purge rh_storage yum repository file but depending on the ceph_repository value we are using, the ceph repository file could have a different name. Resolves: #4056 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `44c63903ca`)	2019-06-07 12:05:40 +00:00
guihecheng	a6312ba9bc	Add section for purging rgw loadbalancer in purge-cluster.yml Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com> (cherry picked from commit `59e702ec39`)	2019-06-06 19:44:30 +00:00
Guillaume Abrioux	16c6d530c6	roles: introduce `ceph-container-engine` role This commit splits the current `ceph-container-common` role. This introduces a new role `ceph-container-engine` which handles the tasks specific to the installation of containers tools (docker/podman). This is needed for the ceph-dashboard implementation for 2 main reasons: 1/ Since the ceph-dashboard stack is only containerized, we must install everything needed to run containers even in non containerized deployments. Splitting this role allows us to not have to call the full `ceph-container-common` role which would run a bunch of unneeded tasks that would have been skipped anyway. 2/ The current implementation would have required to run `ceph-container-common` on all ceph-clients nodes which would have been conflicting with `9d3517c670` (we don't want to run ceph-container-common on all client nodes, see mentioned commit for more details) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `55420d6253`)	2019-05-22 15:24:11 -04:00
Guillaume Abrioux	d83db2c8ed	switch to ansible 2.8 - remove private attribute with import_role. - update documentation. - update rpm spec requirement. - fix MagicMock python import in unit tests. Closes: #3765 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `72d8315299`)	2019-05-21 09:17:46 +02:00
Dimitri Savineau	023cdffd95	purge-docker-cluster: don't remove data on atomic Because we don't manage the docker service on atomic (yet) via the ceph-container-common role then we can't stop docker and remove the data. For now let's do that only for non atomic hosts. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `638604929b`)	2019-05-17 10:44:52 -04:00
Guillaume Abrioux	e29fd842a6	rename docker_exec_cmd variable This commit renames the `docker_exec_cmd` variable to `container_exec_cmd` so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e74d80e72f`)	2019-05-17 16:05:58 +02:00
Zack Cerza	0496ce8e5c	purge-docker-cluster.yml: Default lvm_volumes We were failing when that variable is unset; purge-cluster.yml contains this workaround. Signed-off-by: Zack Cerza <zack@redhat.com> (cherry picked from commit `9b4339a2ba`)	2019-05-17 16:05:58 +02:00
Boris Ranto	5ac7559736	Merge cephmetrics/dashboard-ansible repo This commit will merge dashboard-ansible installation scripts with ceph-ansible. This includes several new roles to setup ceph-dashboard and the underlying technologies like prometheus and grafana server. Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com> Co-authored-by: Zack Cerza <zcerza@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f141a6e80`)	2019-05-17 16:05:58 +02:00
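
Commit 5fea830414 in the list above describes a filter that returns every IP address residing in any one of a set of given IP ranges, so that a comma separated list of CIDRs can be used. The following is only a minimal sketch of that idea using Python's standard `ipaddress` module. It is not the plugin from that commit: the function signature, the argument names, and the sample addresses are assumptions made for illustration.

```python
import ipaddress


def ips_in_ranges(ip_addresses, ip_ranges):
    """Return the addresses that fall inside at least one of the given CIDR ranges.

    Hypothetical signature: the real filter plugin may differ.
    """
    # Parse each CIDR once; strip whitespace left over from splitting "a, b".
    networks = [ipaddress.ip_network(r.strip(), strict=False) for r in ip_ranges]
    matches = []
    for addr in ip_addresses:
        ip = ipaddress.ip_address(addr)
        # An IPv4 address is never contained in an IPv6 network and vice versa.
        if any(ip in net for net in networks):
            matches.append(addr)
    return matches


# Example: a comma separated public_network value, as ceph.conf accepts.
public_network = "192.168.1.0/24, 10.0.0.0/8"
host_ips = ["192.168.1.10", "172.16.0.5", "10.1.2.3"]
print(ips_in_ranges(host_ips, public_network.split(",")))
```

Splitting the comma separated string before matching keeps the function itself independent of how the ranges are written in the configuration.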


485 Commits (3313bc5c1fc9c28faadc7ab2ed20274f8d1d6623)