ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	899b0eb451	defaults: check only 1 time if there is a running cluster There is no need to check for a running cluster n*nodes time in `ceph-defaults` so let's add a `run_once: true` to save some resources and time. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-16 11:23:00 +02:00
Sébastien Han	82ccbdafbc	ceph-defaults: bring backward compatibility for old syntax If people keep on using the mon_cap, osd_cap etc the playbook will translate this old syntax on the flight. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Sébastien Han	bb60f2fea4	ceph-defaults: fix ceoh_uid for container image tag latest According to our recent change, we now use "CentOS" as a latest container image. We need to reflect this on the ceph_uid. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-09 13:54:55 +02:00
Attila Fazekas	ecd3563c21	Deploying without managed monitors failed Tripleo deployment failed when the monitors not manged by tripleo itself with: FAILED! => {"msg": "list object has no element 0"} The failing play item was introduced by `f46217b69a` . fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1552327 Signed-off-by: Attila Fazekas <afazekas@redhat.com>	2018-04-04 18:16:46 +02:00
Guillaume Abrioux	dcf6a246a4	defaults: remove `run_once: true` when creating fetch_directory because of `serial: 1`, it can be an issue when the playbook is being run on client nodes. Since the refact of `ceph-client` we skip the role `ceph-defaults` on every node except the first client node, it means that the task is not going to be played because of `run_once: true`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-04 10:51:17 +02:00
Guillaume Abrioux	cf27c5e941	move selinux check to `ceph-defaults` This check is alone in `ceph-docker-common` since a previous code refactor. Moving this check in `ceph-defaults` allows us to run `ceph-clients` without having to run `ceph-docker-common` even in non-containerized deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-04 10:51:17 +02:00
Andrew Schoen	6cffbd5409	ceph-defaults: set is_atomic variable This variable is needed for containerized clusters and is required for the ceph-docker-common role. Typically the is_atomic variable is set in site-docker.yml.sample though so if ceph-docker-common is used outside of that playbook it needs set in another way. Moving the creation of the variable inside this role means playbooks don't need to worry about setting it. fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1558252 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-03-21 19:16:11 +01:00
Sébastien Han	22f843e3d4	default: define 'osd_scenario' variable osd_scenario does not exist in the ceph-default role so if we try to play ceph-default on an OSD node, the playbook will fail with undefined variable. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-08 17:42:12 +01:00
Guillaume Abrioux	deaf273b25	syntax: change local_action syntax Use a nicer syntax for `local_action` tasks. We used to have oneliner like this: ``` local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }} ``` The usual syntax: ``` local_action: module: wait_for port: 22 host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}" state: started delay: 10 timeout: 500 ``` is nicer and kind of way to keep consistency regarding the whole playbook. This also fix a potential issue about missing quotation : ``` Traceback (most recent call last): File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module> main() File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin) File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command File "/usr/lib64/python2.7/shlex.py", line 279, in split return list(lex) File "/usr/lib64/python2.7/shlex.py", line 269, in next token = self.get_token() File "/usr/lib64/python2.7/shlex.py", line 96, in get_token raw = self.read_token() File "/usr/lib64/python2.7/shlex.py", line 172, in read_token raise ValueError, "No closing quotation" ValueError: No closing quotation ``` writing `local_action: shell echo {{ fsid }} \| tee {{ fetch_directory }}/ceph_cluster_uuid.conf` can cause trouble because it's complaining with missing quotes, this fix solves this issue. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-31 10:45:34 +01:00
Guillaume Abrioux	ec16cbdb1a	defaults: avoid getting stuck (ceph --connect-timeout) Sometime the playbook gets stuck because even with `--connect-timeout=` option, the connexion to the existing ceph cluster never timeout. As a workaround, using `timeout` command provided by coreutils will actually timeout if we can't connect to the cluster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1537003 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-25 10:15:59 +01:00
Guillaume Abrioux	900f447c82	containers: fix bug when looking for existing cluster When containerized deployment, `docker_exec_cmd` is not set before the task which try to retrieve the current fsid is played, it means it considers there is no existing fsid and try to generate a new one. Typical error: ``` ok: [mon0 -> mon0] => { "changed": false, "cmd": [ "ceph", "--connect-timeout", "3", "--cluster", "test", "fsid" ], "delta": "0:00:00.179909", "end": "2018-01-09 10:36:58.759846", "failed": false, "failed_when_result": false, "rc": 1, "start": "2018-01-09 10:36:58.579937" } STDERR: Error initializing cluster client: Error('error calling conf_read_file: errno EINVAL',) ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-10 16:23:18 +01:00
Guillaume Abrioux	acfbebe67e	defaults: rename check_socket files for containers When containerized deployment, we are not looking for a socket but for a running container. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-10 15:44:47 +01:00
Sébastien Han	7eaf444328	default: look for the right return code on socket stat in-use As reported in https://github.com/ceph/ceph-ansible/issues/2254, the check with fuser is not ideal. If fuser is not available the return code is 127. Here we want to make sure that we looking for the correct return code, so 1. Closes: https://github.com/ceph/ceph-ansible/issues/2254 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-12-14 16:59:14 +01:00
Guillaume Abrioux	6a9b5c9632	defaults: fix CI issue with ceph_uid fact The CI complains because of `ceph_uid` fact which doesn't exist since the docker image tag used in the CI doesn't match with this condition. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-12-12 09:02:37 +01:00
John Fulton	ffae294288	Set tighter permissions on keyrings when containerized During a containerized deployment, set the permissions of ceph.client.admin.keyring and other keyrings to chmod 600 and chown it to ceph.	2017-12-06 19:22:28 -05:00
Guillaume Abrioux	238754a844	osd: skip some set_fact when osd_scenario=lvm these tasks are not needed when using `osd_scenario: lvm` Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1509230 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-07 15:30:08 +01:00
Major Hayden	f73232caa4	Use check_mode instead of always_run This patch changes the `always_run: yes` task option to `check_mode: no` to avoid Ansible warnings.	2017-10-25 09:53:34 -05:00
Major Hayden	c2b5118c1b	Revert "Avoid deprecated always_run" This reverts commit `620fb37dd4`.	2017-10-25 09:48:09 -05:00
Sébastien Han	a53aa9e8b4	ci: new osd scenarios This commit add new osd scenarios, it aims to simplify the CI setup and brings a better coverage on the OSD scenarios. We decided to differentiate between filestore and bluestore, thinking ahead when filestore won't be supported anymore. So we now have two classes of tests: * Filestore * Bluestore In each of those classes we have container and non-container. Then for each we test the following: * collocated * collocated dmcrypt * non-collocated * non-collocated dmcrypt * auto discovery collocated * auto discovery collocated dmcrypt This gives us a nice coverage and also reduces the footprint on the CI. We are now up to 4 scenarios, each containing 6 OSD VMs. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-18 09:26:06 +02:00
Major Hayden	620fb37dd4	Avoid deprecated always_run The `always_run` key is deprecated and being removed in Ansible 2.4. Using it causes a warning to be displayed: [DEPRECATION WARNING]: always_run is deprecated. This patch changes all instances of `always_run` to use the `always` tag, which causes the task to run each time the playbook runs.	2017-10-12 08:29:44 -05:00
Sébastien Han	cac7d034bf	defaults: fix check socket non-container handler Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-04 15:33:52 +02:00
Sébastien Han	5968cf09b1	ci: add collocation scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-04 11:19:12 +02:00
Sébastien Han	0ce76113bf	Merge pull request #1956 from ceph/osd-container-id Osd container	2017-10-03 18:52:24 +02:00
Sébastien Han	3bd341f6c0	osd: container use id instead of dev name Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1494127 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 14:44:00 +02:00
Guillaume Abrioux	081f226106	defaults: change running order in main.yml The task which sets `ceph_current_fsid` in `facts.yml` in case of containerized deployment, will definitely fail because `docker_exec_cmd` is not set yet. This commits simply makes `facts.yml` played after `check_socket.yml` so `docker_exec_cmd` is set properly. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-02 18:42:43 +02:00
Sébastien Han	048b55be4a	defaults: only run socket checks on their specific roles Running the socket check on all the hosts will override the default value of docker_exec_cmd, leaving it with the last value (currently rbd-mirror), as a result the subsequent docker_exec_cmd usage for the :x Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-29 02:38:24 +02:00
Sébastien Han	8b6456dc8a	handler: enhance socket detection We have seen issues with leftover socker. So now, if a socket is found we also check if it's accessed by a process. If so, we can run the handler, if not we remove it and continue the playbook. Signed-off-by: Sébastien Han <seb@redhat.com> Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-25 13:44:51 +02:00
Sébastien Han	d100b4e596	name includes and set_fact for clarity When Ansible is not run with verbose options it's difficult to see which include and/or set_fact does what. So adding a name for each clarifies. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-18 23:39:46 +02:00
Sébastien Han	12f6e53090	defaults: do not restart unconfigured (yet) daemons In a collocated scenario, where you might put a rgw, a mds and a mon on the same node you don't want the handler blindly restart all the daemons on the node. Indeed some of them might not be configured yet. Implementing a more precise socket detection, for each daemon type. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488813 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-08 12:02:37 +02:00
Guillaume Abrioux	539197a2fc	Introduce new role ceph-config. This will give us more flexibility and the possibility to deploy a client node for an external ceph-cluster. related BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1469426 Fixes: #1670 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-24 11:33:03 +02:00
Guillaume Abrioux	7a333d05ce	Add handlers for containerized deployment Until now, there is no handlers for containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	fc6b6e9859	Move basics facts to `ceph-defaults` Move `fsid`,`monitor_name`,`docker_exec_cmd` and `ceph_release` set_fact to `ceph-defaults` role. It will allow to reuse these facts without having to play `ceph-common` or `ceph-docker-common`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00

32 Commits (a98885a71ec63ff129d7001301a0323bfaadad8a)