This isn't backported from master because there are too many changes
between stable-3.2 and other newer branches.
NOTE:
This playbook *doesn't* add podman support in stable-3.2 at all.
This is a TripleO-dedicated playbook which is intended to be run
early in the FFU workflow in order to prepare the OS upgrade.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1853457
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Those tasks aren't needed in docker-common since the introduction of
the `ceph-infra` role; they are duplicated tasks.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810376
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cd0195c562)
To avoid confusion, let's change the default value from `0.0.0.0` to
`x.x.x.x`.
Users might think setting `0.0.0.0` will make the daemon bind on all
interfaces.
Fixes: #4827
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc02fc98eb)
Change the default value of `radosgw_address` to keep consistency with
`monitor_address`.
Moreover, `ceph-validate` checks whether the value is '0.0.0.0' to
determine if it has to run `check_eth_rgw.yml`.
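For illustration, the resulting default could look like this in group_vars (a sketch; the file layout and the interface alternative are assumptions):
```
# placeholder default, consistent with monitor_address; users are
# expected to override it with a real address (or use an interface
# variable instead)
radosgw_address: x.x.x.x
```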
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e4869ac8bd)
Currently a throw-away container is built to run ceph client
commands to set up users, pools & auth keys. This utilises
the same base ceph container which has all the ceph services
inside it.
This PR allows the use of a separate container if the deployer
wishes, but defaults to using the same full ceph container.
This can be used for different architectures or distributions
which may support the Ceph client but not the Ceph server,
and allows the deployer to build and specify a separate client
container if need be.
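For illustration, the override point might look like this (a sketch; the `ceph_client_*` variable names are assumptions):
```
# sketch: the client image defaults to the full ceph container but can
# be overridden, e.g. for an architecture lacking ceph server packages
ceph_client_docker_registry: "{{ ceph_docker_registry }}"
ceph_client_docker_image: "{{ ceph_docker_image }}"
ceph_client_docker_image_tag: "{{ ceph_docker_image_tag }}"
```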
Signed-off-by: Andy McCrae <andy.mccrae@gmail.com>
Fixes the deprecation warning:
[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
using `result|search` use `result is search`.
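For illustration (a sketch; the registered `result` variable is an assumption):
```
# before (deprecated filter syntax)
- debug:
    msg: "pool exists"
  when: result.stdout | search('already exists')

# after (test syntax)
- debug:
    msg: "pool exists"
  when: result.stdout is search('already exists')
```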
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
The first 14.x tag has been cut, so this needs to be added for
version detection to keep working on the master branch of ceph.
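For illustration, the release map would gain the new entry (a sketch of the `ceph_release_num` mapping; neighbouring entries shown for context):
```
ceph_release_num:
  luminous: 12
  mimic: 13
  nautilus: 14
```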
Fixes: https://github.com/ceph/ceph-ansible/issues/2671
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The apt-cache update can fail due to transient issues, since the
action is a network operation. To reduce the impact of these
transient failures this patch adds a retry to the update_cache task.
However, the apt_repository tasks which would perform an apt update
won't retry it on failure in the same way, so this PR moves the
apt update into an individual task, run once per role.
Finally, the apt_repository tasks no longer have `changed_when: false`,
and the apt cache update is only performed once per role, if the
repositories change. Otherwise the cache is updated by the "apt" install
tasks if the cache_timeout has been reached.
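For illustration, the retried cache update could look like this (a sketch; retry counts are assumptions):
```
- name: update apt cache
  apt:
    update_cache: yes
  register: update_apt_cache
  until: update_apt_cache is succeeded
  retries: 5
  delay: 2
```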
This check has been left in `ceph-docker-common` since a previous code
refactor.
Moving this check to `ceph-defaults` allows us to run `ceph-clients`
without having to run `ceph-docker-common`, even in non-containerized
deployments.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This fact is already set in site-docker.yml, so there's no need to check
it again in ceph-docker-common.
Signed-off-by: Paul Bourke <paul.bourke@oracle.com>
We now look for any existing containers; if any are found, we compare
their running image with the latest pulled container image.
For OSDs, we iterate over the list of running OSDs; this handles the
case where the first OSD of the list has been updated (runs the new
image) and the others have not.
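For illustration (a sketch; `osd_ids` and `pulled_image_id` are assumed facts):
```
- name: get the image of each running osd container
  shell: docker inspect --format '{% raw %}{{ .Image }}{% endraw %}' ceph-osd-{{ item }}
  with_items: "{{ osd_ids }}"
  register: running_osd_images
  changed_when: false

- name: flag osds still running the old image
  set_fact:
    osd_restart_required: true
  with_items: "{{ running_osd_images.results }}"
  when: item.stdout != pulled_image_id
```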
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526513
Signed-off-by: Sébastien Han <seb@redhat.com>
Use a nicer syntax for `local_action` tasks.
We used to have oneliners like this:
```
local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500
```
The usual syntax:
```
local_action:
  module: wait_for
  port: 22
  host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
  state: started
  delay: 10
  timeout: 500
```
is nicer and keeps consistency across the whole playbook.
This also fixes a potential issue with a missing quotation:
```
Traceback (most recent call last):
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module>
    main()
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main
    rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin)
  File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command
  File "/usr/lib64/python2.7/shlex.py", line 279, in split
    return list(lex)
  File "/usr/lib64/python2.7/shlex.py", line 269, in next
    token = self.get_token()
  File "/usr/lib64/python2.7/shlex.py", line 96, in get_token
    raw = self.read_token()
  File "/usr/lib64/python2.7/shlex.py", line 172, in read_token
    raise ValueError, "No closing quotation"
ValueError: No closing quotation
```
Writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf`
can cause trouble because Ansible complains about the missing quotes; this fix solves the issue.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Having handlers in both the ceph-defaults and ceph-docker-common roles can
make the playbook restart services twice: handlers can be triggered a
first time because of a change in ceph.conf, and a second time because a
new image has been pulled.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
There was no good way to implement this.
We had several options and none of them were ideal, since handlers can
not be triggered cross-roles.
We could have achieved that by doing:
* option 1 was to add a dependency in the meta of the ceph-docker-common
role. We had that long ago and we decided to stop, so everything is
managed via site.yml
* option 2 was to import files from another role. This is messy and we
don't do that anywhere in the current code base; we will keep it that way.
There is option 3 where we pull the image from the ceph-config role.
This is not suitable either, since the docker command won't be available
unless you run an Atomic distro. This would also mean that you're trying
to pull twice: first in ceph-config, second in ceph-docker-common.
The only option I came up with was to duplicate a bit of the ceph-config
handlers code.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526513
Signed-off-by: Sébastien Han <seb@redhat.com>
The name docker_version is very generic and is also used by other
roles. As a result, there may be name conflicts. To avoid this, a
`ceph_` prefix should be used for this fact. Since it is an internal
fact, renaming it is not a problem.
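For illustration (a sketch; the registered variable name is an assumption):
```
# before: generic fact name, prone to collisions with other roles
- set_fact:
    docker_version: "{{ docker_version_out.stdout.split(' ')[2] }}"

# after: internal fact carrying the ceph_ prefix
- set_fact:
    ceph_docker_version: "{{ docker_version_out.stdout.split(' ')[2] }}"
```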
During the initial implementation of this 'old' scenario we were
hitting https://github.com/moby/moby/issues/30341 without noticing it
and were blindly using --rm. Now that this bug is fixed, the prepare
container disappears and activation therefore fails.
I'm fixing this for old jewel images.
This also fixes the machine-reboot case where the docker logs are
purged. In the old scenario, we now store the log locally in the same
directory as the ceph-osd-run.sh script.
Signed-off-by: Sébastien Han <seb@redhat.com>
stable-3.0 brought numerous changes to ceph-ansible variables. This PR
aims to maintain backward compatibility for someone running stable-2.2
who upgrades to stable-3.0 while keeping their group_vars untouched.
We will then determine the right options to make sure the upgrade works,
but we expect the new variables to be used.
We will drop this in the near future, maybe 3.1 or 3.2.
Signed-off-by: Sébastien Han <seb@redhat.com>
The container deployment is serialized, so add this task as a best
effort: if docker is already present we pull the image, otherwise we
wait for the role to play.
Signed-off-by: Sébastien Han <seb@redhat.com>
Ansible throws warnings when using yum/dnf/rpm with the command
module:
[WARNING]: Consider using yum module rather than running yum
This patch adds the `warn: no` argument to suppress the warnings
in the Ansible output.
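For illustration (a sketch; the task and package names are assumptions, `warn: no` is the argument the patch adds):
```
- name: check if ceph-common is installed
  command: rpm -q ceph-common
  args:
    warn: no
  register: rpm_check
  changed_when: false
  failed_when: false
```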
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:
[DEPRECATION WARNING]: always_run is deprecated.
This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
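For illustration (a sketch; the task shown is an assumption):
```
# before (deprecated, removed in Ansible 2.4)
- name: include default vars
  include_vars: defaults.yml
  always_run: yes

# after
- name: include default vars
  include_vars: defaults.yml
  tags:
    - always
```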
All keyrings are getting copied to all nodes.
This commit fixes a leftover from a previous code refactor.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498583
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When Ansible is not run with verbose options it's difficult to see which
include and/or set_fact does what. Adding a name to each clarifies this.
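For illustration (a sketch; the fact and variable are assumptions):
```
# an unnamed set_fact prints only "set_fact" in non-verbose runs;
# a named one tells the operator what actually happened
- name: set the cluster fsid fact
  set_fact:
    fsid: "{{ cluster_uuid.stdout }}"
```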
Signed-off-by: Sébastien Han <seb@redhat.com>
The variable "statleftover" was removed by commit
a60c74f61e
and never added back to the new playbook,
yet it is still being referenced.
Adding it back.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492224
Signed-off-by: Sébastien Han <seb@redhat.com>
All keys are copied to all nodes.
This commit splits that task across the roles so keys are copied only to
their respective nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We must mask the image so we are sure that even if the system reboots
the OSDs won't start.
Also remove Ceph udev rules if found on the system prior to deploying
containers. If we don't do this we are exposed to conflicts between udev
rules and systemd unit files.
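For illustration (a sketch; masking presumably targets the systemd units backing the image, and the unit name, `osd_ids` and rule paths are assumptions):
```
- name: mask the ceph-osd systemd units so OSDs don't start on reboot
  systemd:
    name: "ceph-osd@{{ item }}"
    masked: yes
  with_items: "{{ osd_ids }}"

- name: remove ceph udev rules to avoid conflicts with the unit files
  file:
    path: "{{ item }}"
    state: absent
  with_items:
    - /usr/lib/udev/rules.d/95-ceph-osd.rules
    - /usr/lib/udev/rules.d/60-ceph-by-parttypeuuid.rules
```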
Also, the CI will now test the migration from a non-containerized
cluster to a containerized cluster.
Signed-off-by: Sébastien Han <seb@redhat.com>
The list can not be evaluated properly if it contains '[]', which is
the case when using the filter "default([])". To fix this, we have to
properly merge the lists.
This fixes the issue: "list object has no element 1".
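For illustration (a sketch; variable names are assumptions):
```
# wrapping the two variables as "[ devices, dedicated_devices ]" can
# leave a literal [] element; concatenating the defaulted lists yields
# one flat, indexable list instead
combined_devices: "{{ (devices | default([])) + (dedicated_devices | default([])) }}"
```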
Signed-off-by: Sébastien Han <seb@redhat.com>
By detecting the ceph version running in the container we can easily
apply conditions like:
ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous
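For illustration, such a guard looks like this once the facts are set (a sketch):
```
- name: run a step that requires luminous or newer
  debug:
    msg: "ceph {{ ceph_release }} is recent enough"
  when: ceph_release_num[ceph_release] >= ceph_release_num['luminous']
```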
We do that already, in ceph-docker-common/tasks/fetch_configs.yml.
This fixes the error:
TASK [ceph-docker-common : register rbd bootstrap key]
******************************************************
fatal: [magna005]: FAILED! => {"failed": true, "msg": "The conditional
check 'ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous'
failed. The error was: error while evaluating conditional
(ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous):
'dict object' has no attribute 'dummy'\n\nThe error appears to have been
in
'/home/ubuntu/ceph-ansible/roles/ceph-docker-common/tasks/fetch_configs.yml':
line 2, column 3, but may\nbe elsewhere in the file depending on the
exact syntax problem.\n\nThe offending line appears to be:\n\n---\n-
name: register rbd bootstrap key\n ^ here\n"}
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1486062
Signed-off-by: Sébastien Han <seb@redhat.com>
This will give us more flexibility and the possibility to deploy a client node
for an external ceph cluster.
related BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1469426
Fixes: #1670
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In containerized deployments, if you try to update your `ceph.conf` file
it won't actually be updated on your nodes, because it is overwritten by
the copy of the file which is present in your fetch directory.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>