ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Noah Watkins	75c9130865	Avoid using tests as filter Fixes the deprecation warning: [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using `result|search` use `result is search`. Signed-off-by: Noah Watkins <nwatkins@redhat.com> (cherry picked from commit `306e308f13`)	2018-10-16 14:35:08 +02:00
Guillaume Abrioux	904a0a4017	fail if fqdn deployment attempted fqdn configuration possibility caused a lot of trouble, it's adding a lot of complexity because of multiple cases and the relation between ceph-ansible and ceph-container. Moreover, there is no benefit for such a feature. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613155 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-08-13 18:55:06 +02:00
Andrew Schoen	53dfd050c5	ceph-defaults: add the nautilus 14.x entry to ceph_release_num The first 14.x tag has been cut so this needs to be added so that version detection will still work on the master branch of ceph. Fixes: https://github.com/ceph/ceph-ansible/issues/2671 Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `c2423e2c48`) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-01 19:21:59 +02:00
Andy McCrae	d142be0422	Move apt cache update to individual task per role The apt-cache update can fail due to transient issues related to the action being a network operation. To reduce the impact of these transient failures this patch adds a retry to the update_cache task. However, the apt_repository tasks which would perform an apt_update won't retry the apt_update on a failure in the same way, as such this PR moves the apt_update into an individual task, once per role. Finally, the apt_repository tasks no longer have a changed_when: false, and the apt_cache update is only performed once per role, if the repositories change. Otherwise the cache is updated on the "apt" install tasks if the cache_timeout has been reached.	2018-05-03 14:02:15 +02:00
Guillaume Abrioux	cf27c5e941	move selinux check to `ceph-defaults` This check is alone in `ceph-docker-common` since a previous code refactor. Moving this check in `ceph-defaults` allows us to run `ceph-clients` without having to run `ceph-docker-common` even in non-containerized deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-04 10:51:17 +02:00
Paul Bourke	463b5c6b22	Remove redundant task to check if atomic This fact is already set in site-docker.yml so there's no need to check it again in ceph-docker-common Signed-off-by: Paul Bourke <paul.bourke@oracle.com>	2018-02-19 10:10:46 +01:00
Sébastien Han	d47d02a5eb	docker-common: fix container restart on new image We now look for any existing containers; if there are any, we compare their running image with the latest pulled container image. For OSDs, we iterate over the list of running OSDs; this handles the case where the first OSD of the list has been updated (runs the new image) and not the others. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526513 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-14 02:01:29 +01:00
Sébastien Han	ebc195487c	default: remove duplicate code This is already defined in ceph-defaults. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-14 02:01:29 +01:00
Guillaume Abrioux	deaf273b25	syntax: change local_action syntax Use a nicer syntax for `local_action` tasks. We used to have one-liners like this: ``` local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }} ``` The usual syntax: ``` local_action: module: wait_for port: 22 host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}" state: started delay: 10 timeout: 500 ``` is nicer and keeps the whole playbook consistent. This also fixes a potential issue with missing quotation marks: ``` Traceback (most recent call last): File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module> main() File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin) File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command File "/usr/lib64/python2.7/shlex.py", line 279, in split return list(lex) File "/usr/lib64/python2.7/shlex.py", line 269, in next token = self.get_token() File "/usr/lib64/python2.7/shlex.py", line 96, in get_token raw = self.read_token() File "/usr/lib64/python2.7/shlex.py", line 172, in read_token raise ValueError, "No closing quotation" ValueError: No closing quotation ``` Writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf` can cause trouble because it fails complaining about missing quotes; this change solves the issue. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-31 10:45:34 +01:00
Guillaume Abrioux	b29a42cba6	handlers: avoid duplicate handler Having handlers in both ceph-defaults and ceph-docker-common roles can make the playbook restarting two times services. Handlers can be triggered first time because of a change in ceph.conf and a second time because a new image has been pulled. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-10 16:46:42 +01:00
Sébastien Han	8a19a83354	container: restart container when there is a new image There was no good way to implement this. We had several options and none of them were ideal since handlers can not be triggered cross-roles. We could have achieved that by doing: * option 1 was to add a dependency in the meta of the ceph-docker-common role. We had that long ago and we decided to stop so everything is managed via site.yml * option 2 was to import files from another role. This is messy and we don't do that anywhere in the current code base, and we will keep it that way. There is option 3 where we pull the image from the ceph-config role. This is not suitable either since the docker command won't be available unless you run an Atomic distro. This would also mean that you're trying to pull twice: first time in ceph-config, second time in ceph-docker-common. The only option I came up with was to duplicate a bit of the ceph-config handlers code. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526513 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-01-10 16:46:42 +01:00
Sébastien Han	c2e04623a5	container: change the way we force no logs inside the container Previously we were using ceph_conf_overrides; however, this doesn't play nicely with software like TripleO that uses ceph_conf_overrides inside its own code. For now, and since this is the only occurrence of this, we can ensure no logs through the ceph conf template. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1532619 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-01-10 16:21:47 +01:00
Christian Berendt	50a848dc40	Rename fact docker_version to ceph_docker_version The name docker_version is very generic and is also used by other roles. As a result, there may be name conflicts. To avoid this a ceph_ prefix should be used for this fact. Since it is an internal fact renaming is not a problem.	2017-12-15 20:12:21 +01:00
Markos Chandras	f8e3d4bb76	ceph-docker-common: Add support for openSUSE Leap distributions Add support for the openSUSE Leap distributions. Signed-off-by: Markos Chandras <mchandras@suse.de>	2017-11-14 10:51:23 +00:00
Sébastien Han	d4ed9a2064	osd: enhance backward compatibility During the initial implementation of this 'old' thing we were falling into this issue without noticing https://github.com/moby/moby/issues/30341 and were blindly using --rm; now that this is fixed, the prepare container disappears and thus activation fails. I'm fixing this for old jewel images. Also this fixes the machine reboot case where the docker logs are purged. In the old scenario, we now store the log locally in the same directory as the ceph-osd-run.sh script. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-11-03 11:15:23 +01:00
Sébastien Han	d6a0d2f9be	Merge pull request #2071 from jtaleric/master Docker image pull retry	2017-10-27 09:49:03 +02:00
Joe Talerico	ab58764288	Docker image pull retry This change sets a default timeout of 300s for the image pull. If the image pull times out (300s), we will retry 3 times by default. fixes 1954	2017-10-25 13:37:10 -04:00
Major Hayden	f73232caa4	Use check_mode instead of always_run This patch changes the `always_run: yes` task option to `check_mode: no` to avoid Ansible warnings.	2017-10-25 09:53:34 -05:00
Major Hayden	c2b5118c1b	Revert "Avoid deprecated always_run" This reverts commit `620fb37dd4`.	2017-10-25 09:48:09 -05:00
Sébastien Han	4413511b66	all: backward compatibility between stable-2.2 and 3.0 stable-3.0 brought numerous changes in ceph-ansible variables, this PR aims to maintain backward compatibility for someone running stable-2.2 upgrading to stable-3.0 but keeps its groups_vars untouched. We will then determine the right options to make sure the upgrade works but we are expecting that new variables should be used. We will drop this in a near future, maybe 3.1 or 3.2. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-20 11:54:10 +02:00
Christian Berendt	4c380c9ef8	Cleanup readme files in roles directories The contents of the README files are no longer up to date. Documentation for all roles is located below the docs directory.	2017-10-17 11:22:06 +02:00
Sébastien Han	b34a04ea41	site-docker.yml try to fetch images in // The container deployment is serialized, adding this task as a best effort. If docker is already present we pull the image otherwise we wait for the role to play. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-13 11:24:40 +02:00
Sébastien Han	9c3d749f7c	Merge pull request #2038 from major/fix-cmd-warning Suppress yum/dnf/rpm command warnings	2017-10-12 18:46:52 +02:00
Major Hayden	33b200d43a	Suppress yum/dnf/rpm command warnings Ansible throws warnings when using yum/dnf/rpm with the command module: [WARNING]: Consider using yum module rather than running yum This patch adds the `warn: no` argument to suppress the warnings in the Ansible output.	2017-10-12 08:38:05 -05:00
Major Hayden	620fb37dd4	Avoid deprecated always_run The `always_run` key is deprecated and being removed in Ansible 2.4. Using it causes a warning to be displayed: [DEPRECATION WARNING]: always_run is deprecated. This patch changes all instances of `always_run` to use the `always` tag, which causes the task to run each time the playbook runs.	2017-10-12 08:29:44 -05:00
Guillaume Abrioux	70e2787fe2	docker: fix keyrings copied on all nodes All keyring are getting copied to all nodes. This commit fixes a leftover from a previous code refactor. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498583 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-05 09:23:22 +02:00
Guillaume Abrioux	d20dc54202	docker-common: fix wrong syntax there is no need to backslash the quotes here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-28 00:30:08 +02:00
Guillaume Abrioux	1886a69b8b	docker-common: refact `stat_ceph_files.yml` there is no need to build the `ceph_config_keys` fact in several steps for rbd-mirror keyring. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-21 09:56:37 +02:00
Guillaume Abrioux	295c1b0610	docker-common: fix ceph_health check `docker ps` will always return `0` (see: https://github.com/docker/cli/issues/538). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-21 09:56:37 +02:00
Sébastien Han	d100b4e596	name includes and set_fact for clarity When Ansible is not run with verbose options it's difficult to see which include and/or set_fact does what. So adding a name for each clarifies. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-18 23:39:46 +02:00
Sébastien Han	aa5d94fc87	docker-common: re-introduce state for leftover files The variable "statleftover" was removed by commit `a60c74f61e` and never added back to the new playbook, yet it is still being referenced. Adding it back Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492224 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-18 15:01:32 +02:00
Guillaume Abrioux	0f506f4f0a	Docker: split the task 'copy ceph configs&keys' All keys are copied to all nodes. This commit splits that task across the roles so keys are copied to their respective nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-11 21:14:13 +02:00
Sébastien Han	cf88c136f5	Merge pull request #1859 from ceph/container-limit container: introduce resource limitation for containers	2017-09-07 12:51:34 +02:00
Sébastien Han	2fa151b9e8	container: introduce resource limitation for containers This can be controlled via 2 options: * ceph_$DAEMON_docker_memory_limit * ceph_$DAEMON_docker_cpu_limit All daemons default to 1GB for memory and 1 CPU by default. Recommendations from: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-06 14:52:21 +02:00
Sébastien Han	b7db600caa	switch-from-non-containerized-to-containerized: mask unit files We must mask the image so we are sure that even if the system reboots then the OSDs won't start. Also remove Ceph udev rules if found on the system prior to deploying containers. If we don't do this we are exposed to conflicts between udev rules and systemd unit files. Also, the CI will now test the migration from a non-containerized cluster to a containerized cluster. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-05 15:20:31 +02:00
Sébastien Han	5ed1a91aeb	Merge pull request #1819 from ceph/no-container-log ceph-docker-common: do not log inside the container	2017-09-05 11:47:11 +02:00
Sébastien Han	b05271f464	Merge pull request #1724 from ceph/container-multi-journal osd: allow multi dedicated journals for containers	2017-08-30 17:41:42 +02:00
Sébastien Han	a60c74f61e	ceph-docker-common: re-organize stat ceph file Use a single file to run the checks instead of duplicating code. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 14:44:34 +02:00
Sébastien Han	e0a264c7e9	osd: allow multi dedicated journals for containers Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 12:34:06 +02:00
Sébastien Han	5743916092	common: add mimic release facts Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-29 17:21:37 +02:00
Sébastien Han	fa9f2313d5	Merge pull request #1822 from ceph/rhcs-container-release ceph-docker-common: detect ceph version	2017-08-29 12:16:20 +02:00
Sébastien Han	cfddd2903c	ceph-docker-common: fix empty array The list can not be evaluated properly if it contains '[]', which is the case when using the filter "default([])". To fix this, we have to properly merge the lists. This is fixing the issue: "list object has no element 1" Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-29 10:25:46 +02:00
Sébastien Han	764e697186	ceph-docker-common: detect ceph version By detecting the ceph version running in the container we can easily apply conditions like: ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous We do that already, in ceph-docker-common/tasks/fetch_configs.yml. This fixes the error: TASK [ceph-docker-common : register rbd bootstrap key] ****************************************************** fatal: [magna005]: FAILED! => {"failed": true, "msg": "The conditional check 'ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous): 'dict object' has no attribute 'dummy'\n\nThe error appears to have been in '/home/ubuntu/ceph-ansible/roles/ceph-docker-common/tasks/fetch_configs.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: register rbd bootstrap key\n ^ here\n"} Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1486062 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-28 23:28:47 +02:00
Sébastien Han	aa69c2c007	ceph-docker-common: do not log inside the container Logging inside the container is not useful since it writes to the overlayfs partition, resulting in potential performance degradation on the container. If you need to check the logs, just look at journald. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-28 12:04:49 +02:00
Sébastien Han	972eb45d31	ceph-docker-common: apply 0600 to key permissions Keys should only be readable and writable by their respective owners and that's all. Closes: https://github.com/ceph/ceph-ansible/issues/1760 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-25 18:14:28 +02:00
Sébastien Han	1f4082f200	update meta for ansible galaxy Closes: https://github.com/ceph/ceph-ansible/issues/1637 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-25 00:05:44 +02:00
Guillaume Abrioux	539197a2fc	Introduce new role ceph-config. This will give us more flexibility and the possibility to deploy a client node for an external ceph-cluster. related BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1469426 Fixes: #1670 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-24 11:33:03 +02:00
Jason Dillaman	70c2b934ca	distribute rbd bootstrap key if available Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-08-22 18:55:29 -04:00
Guillaume Abrioux	608bad901d	docker-common: Fix bug when updating config in containerized deployment, if you try to update your `ceph.conf` file it won't be actually updated on your nodes because it is overwritten by the copy of the file which is present in your fetch directory. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	7a333d05ce	Add handlers for containerized deployment Until now, there were no handlers for containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00


68 Commits (143cdd731a9ee38f8279e55a7614deb5d101b123)