Commit Graph

1968 Commits (a74f4204cdc54ea1343c410252ead0cfa66db997)

Author SHA1 Message Date
Artur Fijalkowski 52d9d406b1 Fix the regular expression matching the OSD ID on non-containerized
deployments.
restart_osd_daemon.sh is used to discover and restart all OSDs on a
host. To do this, the script loops over the list of ceph-osd@ services on the
system. This commit fixes a bug in the regular expression responsible for
extracting the OSD IDs - the previous version used the `[0-9]{1,2}`
expression, which ignores all OSDs whose IDs are greater than 99 (thus
longer than 2 digits). The fix removes the upper limit on the number of
digits. This problem existed in two places in the script.

Closes: #2964

Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com>
2018-08-06 15:53:49 +00:00
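A minimal Ansible sketch of the idea above (not the actual restart_osd_daemon.sh, which is a shell script); the task name and registered variable are illustrative:

```
# Illustrative only: extract OSD IDs from the ceph-osd@ units with an
# unbounded digit match, so OSDs numbered 100 and above are matched too.
- name: collect ids of running ceph-osd units
  shell: systemctl list-units 'ceph-osd@*' --no-legend | grep -oE 'ceph-osd@[0-9]+' | grep -oE '[0-9]+'
  register: osd_ids
  changed_when: false
```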
Guillaume Abrioux 1164cdc002 iscsigw: install ceph-iscsi-cli package
Install ceph-iscsi-cli in order to provide the `gwcli` command tool.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1602785

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-08-06 14:11:52 +02:00
Guillaume Abrioux 0a6ff6bbf8 defaults: backward compatibility with fqdn deployments
This commit ensures we are backward compatible with fqdn deployments.
Since ceph-container enforces deployments to be done with shortnames, we
must keep backward compatibility with clusters already deployed with an
fqdn configuration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-08-06 10:14:58 +00:00
Sébastien Han ea9e60d48d config: enforce socket name
This was introduced by
59ee2e8d3b
and made our socket checks impossible to run. The PID could be found,
but the cctid could not.
This happens during upgrade to mimic and on cluster running on mimic.

So let's force the admin socket name back to the way it was so we can
properly check for existing instances. The $cluster-$name.$pid.$cctid.asok
form is only needed when running multiple instances of the same daemon,
something ceph-ansible cannot do at the time of writing.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1610220
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-31 10:58:04 +02:00
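A hedged sketch of the idea via `ceph_conf_overrides`; the exact place where ceph-ansible pins the socket name may differ:

```
# Pin the admin socket to the simpler form so the socket check can find it;
# the $pid/$cctid variant is only needed when running several instances of
# the same daemon on one host.
ceph_conf_overrides:
  global:
    admin socket: "$run_dir/$cluster-$name.asok"
```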
Mike Christie 6f72f96dad igw: do not fail purge on rbd removal errors
Instead of failing the entire purge operation when the rbd command fails,
just log an error. This will allow the higher-level target and config
cleanup to complete, and the user only has to manually delete the rbd
images.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2018-07-31 10:08:26 +02:00
Mike Christie d572a9a602 igw: fix image removal during purge
We were not passing the ceph conf info into the rbd image removal
command, so if the clustername was not the default, the igw purge would
fail due to the rbd rm command failing.

This just fixes the bug by passing in the ceph conf info which has the
clustername to use.

This fixes Red Hat bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1601949

Signed-off-by: Mike Christie <mchristi@redhat.com>
2018-07-31 10:08:26 +02:00
Sébastien Han 2ca8c51906 osd: do not remove expose_partition container
The container runs with --rm which means it will be deleted by Docker
when exiting. Also 'docker rm -f' is not idempotent and returns 1 if the
container does not exist.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1609007
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-30 10:38:15 +02:00
Guillaume Abrioux 1ecbbbdcfa rbd-mirror: bring back compatibility with jewel deployment
rbd-mirror can't start when deploying jewel because it needs the admin
keyring.
Bringing back this task restores backward compatibility for jewel
deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-26 18:47:10 +00:00
Guillaume Abrioux 053709da97 ceph-osds: backward compatibility with jewel for osp pools creation
If we want to be backward compatible with releases prior to luminous, we
have to set the rule name according to the default values used in jewel.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-26 18:47:10 +00:00
Guillaume Abrioux 2597a557c5 client: fix an incorrect title in a task
This task would be run on both containerized *and* non-containerized
deployments.
Let's give it a proper title to avoid confusion.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-26 15:57:41 +02:00
Sébastien Han e2ea5bac51 rgw: add more config option for civetweb frontend
In containerized deployments we now inherit from the
radosgw_civetweb_options options when bootstrapping the container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1582411
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-25 13:19:14 +00:00
Giulio Fidente e85e5ea781 Run creation of empty rados index object on the first monitor
When distributing the ceph-nfs role, creation of the rados index object
fails as it assumes availability of client.admin locally.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1607970
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
2018-07-25 11:40:11 +02:00
Sébastien Han 235d1b3f55 validate: add checks for interfaces
Check if the interface provided:

* exists in the gathered facts (thus on the system)
* is active
* has an IP address (depending on ip_version)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-24 17:59:30 +02:00
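A rough sketch of what such checks can look like, assuming `monitor_interface`, `ip_version` and the standard Ansible network facts (interface names containing dashes would need to be translated to underscores first); the real tasks live in the validation role and may differ:

```
- name: validate monitor_interface
  assert:
    that:
      - monitor_interface in ansible_interfaces
      - hostvars[inventory_hostname]['ansible_' + monitor_interface]['active'] | bool
      - "'ipv4' in hostvars[inventory_hostname]['ansible_' + monitor_interface]"
  when: ip_version == 'ipv4'
```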
Guillaume Abrioux af82e7523d tests: test master against ansible 2.6
Ansible 2.4 is currently end-of-life.
Ansible 2.5 will go end-of-life after Ansible 2.7 is released.

Fixes: #2901

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-23 11:59:15 +00:00
Sébastien Han 7fc13bc9d5 validate: only run osd test on osd node
Do not run device validation on every hosts, only on OSD nodes.

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-19 12:46:18 +00:00
Sébastien Han cf01e596b6 validate: improve device check
We now make sure that:

* devices are actually block special files
* length of dedicated_device is identical to devices

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-18 14:26:22 +00:00
Guillaume Abrioux 1a626d3c61 nfs: change default stable branch for nfs-ganesha repo
Since `V2.6-stable` is available and has packages for `mimic`, let's
update this default value accordingly so nfs nodes can be deployed with
mimic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-13 08:20:27 +00:00
Sébastien Han e61ca882a1 validate: force ansible version
We currently only support Ansible 2.4.X so let's fail if the version is
different.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-13 07:52:56 +00:00
Guillaume Abrioux 5ef5fcd0b6 client: do not rely on copy_admin_key to import keys
Relying on `copy_admin_key` to import created keys on client nodes makes
us obliged to copy the admin key to those nodes, which is not something
we necessarily want.
We should use the fact `condition_copy_admin_key`, which will be set to
`True` when the delegated node is a mon, which means we can import keys
without having to care about the admin keyring.

Fixes: #2867

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-13 06:52:00 +00:00
Guillaume Abrioux ce5ac930c5 mgr: fix condition to add modules to ceph-mgr
Follow up on #2784

We must check the generated fact `_disabled_ceph_mgr_modules` to
enable disabled mgr modules.
Otherwise, this task will be skipped because it's not comparing the
right list.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600155

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-12 21:04:01 +00:00
Guillaume Abrioux 9f54b3b4a7 mon: ensure socket is purged when mon is stopped
On containerized deployment, if a mon is stopped, the socket is not
purged and can cause failure when a cluster is redeployed after the
purge playbook has been run.

Typical error:

```
fatal: [osd0]: FAILED! => {}

MSG:

'dict object' has no attribute 'osd_pool_default_pg_num'
```

The fact is not set because of this earlier failure:

```
ok: [mon0] => {
    "changed": false,
    "cmd": "docker exec ceph-mon-mon0 ceph --cluster test daemon mon.mon0 config get osd_pool_default_pg_num",
    "delta": "0:00:00.217382",
    "end": "2018-07-09 22:25:53.155969",
    "failed_when_result": false,
    "rc": 22,
    "start": "2018-07-09 22:25:52.938587"
}

STDERR:

admin_socket: exception getting command descriptions: [Errno 111] Connection refused

MSG:

non-zero return code
```

This failure happens when the ceph-mon service is stopped: since the
socket isn't purged, it's a leftover which confuses the process.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-10 20:08:07 +00:00
Guillaume Abrioux d0746e0858 common: switch from docker module to docker_container
As of ansible 2.4, the `docker` module has been removed (it had been
deprecated since ansible 2.1).
We must switch to `docker_container` instead.

See: https://docs.ansible.com/ansible/latest/modules/docker_module.html#docker-module

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-10 20:08:07 +00:00
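A minimal sketch of the module switch; the container name and image variables follow ceph-ansible conventions, but the exact task options are illustrative:

```
# docker_container replaces the removed docker module.
- name: start the ceph-mon container
  docker_container:
    name: "ceph-mon-{{ ansible_hostname }}"
    image: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
    state: started
    network_mode: host
```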
Shilpa Jagannath 07852ed039 Remove zone from zonegroup and update period before deleting the zone to avoid inconsistent period information across other zones.
When you delete a zone without removing it from the zonegroup, the period
update would fail since that command needs to load the zone and zonegroup
to be able to update the master. The period update would fail with an
error like this:

```
radosgw-admin period update --commit
-1 Cannot find zone id= (name=), switching to local zonegroup configuration
-1 Cannot find zone id= (name=)
```

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2018-07-09 12:27:24 +00:00
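A sketch of the ordering described above, wrapped in command tasks for illustration; the zone and zonegroup variables are placeholders:

```
# Remove the zone from its zonegroup and commit the period *before* deleting it.
- name: remove the zone from the zonegroup
  command: radosgw-admin zonegroup remove --rgw-zonegroup={{ rgw_zonegroup }} --rgw-zone={{ rgw_zone }}

- name: update and commit the period
  command: radosgw-admin period update --commit

- name: delete the zone
  command: radosgw-admin zone delete --rgw-zone={{ rgw_zone }}
```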
Sébastien Han b9f7df7ba2 common: remove hdparm
As of Kraken, the journal code does not use the hdparm command anymore
so we can remove it from our package dependency list.

Fixes: https://github.com/ceph/ceph-ansible/issues/1402
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit f6910efa24389c264062963b2054c7cd29ffebb3)
2018-07-07 08:53:47 +00:00
Sébastien Han 713b9fcf9b ceph-config: do not log cluster log on container
The container image recently merged both cluster and mon log into a
single stream. Following this, we now see this warning coming from the
container image:

2018-06-19 13:44:01.542990 7ff75b024700  1 mon.vm02@1(peon).log
v57928205 unable to write to '/var/log/ceph/ceph.log' for channel
'cluster': (2) No such file or directory

So we now tell the mon not to log the cluster log to the filesystem.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1591771
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-05 15:11:45 +00:00
Sébastien Han fcf11ecc35 ceph-common: fix rhcs condition
We forgot to add mgr_group_name when checking for the mon repo, thus the
conditional on the next task was failing.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1598185
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-04 17:17:21 +02:00
Guillaume Abrioux 3abc253fec mgr: fix enabling of mgr module on mimic
The data structure has slightly changed on mimic.

Prior to mimic, it used to be:

```
{
    "enabled_modules": [
        "status"
    ],
    "disabled_modules": [
        "balancer",
        "dashboard",
        "influx",
        "localpool",
        "prometheus",
        "restful",
        "selftest",
        "zabbix"
    ]
}
```

From mimic it looks like this:

```
{
    "enabled_modules": [
        "status"
    ],
    "disabled_modules": [
        {
            "name": "balancer",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "dashboard",
            "can_run": true,
            "error_string": ""
        }
    ]
}
```

This means we can't simply check whether `item` is in
`_ceph_mgr_modules.disabled_modules`.

The idea here is to use the `map(attribute='name')` filter to build a
list of module names when deploying mimic.

Fixes: #2766

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-03 21:19:16 +00:00
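A hedged sketch of the normalisation, assuming `_ceph_mgr_modules` holds the parsed JSON shown above and `disabled_modules` is non-empty:

```
# On mimic, disabled_modules is a list of dicts; flatten it to plain names so
# a simple `item in _disabled_ceph_mgr_modules` works on every release.
- name: build a flat list of disabled mgr module names
  set_fact:
    _disabled_ceph_mgr_modules: "{{ _ceph_mgr_modules.disabled_modules | map(attribute='name') | list }}"
  when: _ceph_mgr_modules.disabled_modules | first is mapping
```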
Sébastien Han 63658c05c7 ceph-client: do not kill the dummy container
The container runs for 300 sec, then dies and removes itself thanks to
the '--rm' option, so there is no point in removing it. Also, this was
causing failures under some circumstances.

Closing: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-03 16:09:52 +00:00
Sébastien Han a629408967 ceph-mds: enable application pool
We now enable the application type 'cephfs' for each cephfs pool we
create.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590275
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-02 10:28:34 +00:00
Sébastien Han 103c279c21 ceph-defaults: add default application to pool
We now add a default 'rbd' application type to each pool we create. This
will remove the warning: "application not enabled on N pool(s)".

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590275
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-02 10:28:34 +00:00
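A sketch of the corresponding ceph call, wrapped in a command task and assuming `docker_exec_cmd` and a `pools` list whose items may carry an `application` key:

```
# Enable an application type on each created pool, defaulting to 'rbd'.
- name: assign application to pool(s)
  command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} osd pool application enable {{ item.name }} {{ item.application | default('rbd') }}"
  with_items: "{{ pools }}"
  changed_when: false
```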
Vasu Kulkarni 1d454b611f Enable monitor repo for mgr nodes and Tools repo for iscsi/nfs/clients
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2018-06-29 18:09:26 +00:00
Sébastien Han abdb53e16a ceph-osd: trigger osd container restart on script change
The script ceph-osd-run.sh holds the config options used to start the
container; if one of these options is modified we must restart the
container. This was not the case before because the 'notify' flag
wasn't present.

Closing: https://bugzilla.redhat.com/show_bug.cgi?id=1596061
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-28 17:54:13 +02:00
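A minimal sketch of the template task with the missing `notify`; the handler name is illustrative:

```
# Re-templating the start script now triggers a container restart.
- name: generate ceph-osd-run.sh
  template:
    src: ceph-osd-run.sh.j2
    dest: "{{ ceph_osd_docker_run_script_path }}/ceph-osd-run.sh"
    owner: root
    group: root
    mode: "0744"
  notify: restart ceph osds
```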
Sébastien Han f623997271 systemd: remove changed_when: false
When using a module there is no need to apply this Ansible option. The
module will handle the idempotency on its own, so the module decides
whether or not the task has changed during the execution.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-28 17:54:13 +02:00
George Shuklin 653b483fc3 Add ceph_keyring_permissions variable to control permissions for
keyring files in /etc/ceph. The default value is the same as it was (0600),
but this variable allows the user to override it (e.g. set it to 0640).

Signed-off-by: George Shuklin <george.shuklin@gmail.com>
2018-06-28 15:48:39 +00:00
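A sketch of how such a variable can be consumed; the task shown is illustrative, not the exact one touched by the commit:

```
# Keyring files in /etc/ceph get the configurable mode, defaulting to 0600.
- name: set keyring permissions
  file:
    path: "/etc/ceph/{{ cluster }}.client.admin.keyring"
    owner: ceph
    group: ceph
    mode: "{{ ceph_keyring_permissions | default('0600') }}"
```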
Ha Phan a7b7735b6f ceph-mon: Generate initial keyring
Minor fix so that initial keyring can be generated using python3.

Signed-off-by: Ha Phan <thanhha.work@gmail.com>
2018-06-28 10:39:56 +02:00
Ha Phan b7b8aba47b Generate a copy of ceph.conf locally
Refers to #2697

This change creates a copy of `ceph.conf` on the ansible server.

Signed-off-by: Ha Phan <thanhha.work@gmail.com>
2018-06-28 07:39:30 +00:00
Andy McCrae a4a3d9a01b Fix package state for upgrades on SuSE/RHEL
During 226f80c22b only Debian package
installs had the correct state set to ensure packages were upgraded when
the "upgrade_ceph_packages" var was set to true.

Signed-off-by: Andy McCrae <andy.mccrae@gmail.com>
2018-06-27 18:55:22 +00:00
Sébastien Han 322e2de7d2 mon: honour mon_docker_net_host option
--net=host was hardcoded in the startup line so even though
mon_docker_net_host was set to False the net option would always be
activated.
mon_docker_net_host is set to True by default so this commit does not
change the behaviour.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-27 13:44:41 +00:00
Michel Rode 7774935707 Added 'squash' as a parameter to nfs-ganesha.
Set the default to 'root_squash' - which is the default of nfs-ganesha.

Signed-off-by: Michel Rode <rmichel@devnu11.net>
2018-06-25 09:13:17 +02:00
Christian Zunker 48394597c9 reset failed count of ceph-mgr
Depending on your setup, ceph-mgr might get restarted multiple times.
When this is done too fast, systemd will prevent further restarts because
of the configured limits in the ceph-mgr systemd unit file.

Resetting the failure count will prevent this problem. The reset is done
before the restart, so in case of a real problem during the restart it
still fails.

Fixes: #2768

Signed-off-by: Christian Zunker <christian.zunker@codecentric.cloud>
2018-06-20 13:59:16 +02:00
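A sketch of the reset, assuming a systemd host and the ceph-mgr@<hostname> unit naming:

```
# Clear systemd's failure counter so the start limit in the unit file does
# not block the restart that follows.
- name: reset failed count for ceph-mgr
  command: systemctl reset-failed ceph-mgr@{{ ansible_hostname }}
  changed_when: false
  failed_when: false
```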
Sébastien Han bea4027f0c common: start firewalld if configure_firewall
Currently, if configure_firewall is set to True, we expect firewalld to
be enabled and running. Let's enforce that.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1589146
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-18 04:02:50 -04:00
Sébastien Han a9ed3579ae mon/osd: bump container memory limit
As discussed with the cores, the current limits are too low and should
be bumped to higher values.
So now by default monitors get 3GB and OSDs get 5GB.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1591876
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-17 11:20:27 -04:00
Guillaume Abrioux 51cf3b7fa0 client: try to kill dummy container only on first client node
The 'dummy' container is created only on the first client node, which
means we must destroy this container only on that node, otherwise this
can cause failures like the following:
```
fatal: [192.168.24.8]: FAILED! => {"changed": false, "cmd": ["docker", "rm",
"-f", "ceph-create-keys"], "delta": "0:00:00.023692", "end": "2018-06-12
20:56:07.261278", "msg": "non-zero return code", "rc": 1, "start":
"2018-06-12 20:56:07.237586", "stderr": "Error response from daemon: No such
container: ceph-create-keys", "stderr_lines": ["Error response from daemon: No
such container: ceph-create-keys"], "stdout": "", "stdout_lines": []}

```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590746

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-13 16:10:46 +02:00
Patrick Donnelly 9ce81ae845 ceph-mds: do not enable multimds on jewel
Multiple active MDS became stable in Luminous.

Introduced-by: c8573fe0d7
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-12 10:47:34 +02:00
Sébastien Han 2e8412734a common: ability to enable/disable fw configuration
Prior to this patch, if you were running on a Red Hat system,
ceph-ansible would try to configure firewalld for you without the
operator's consent.
Now you can enable or disable the fw configuration by setting
configure_firewall to either true or false.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1589146
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-11 21:51:59 +02:00
Konstantin Shalygin 3a07568496 ceph-osd: set 'openstack_keys_tmp' only when 'openstack_config' is defined.
If 'openstack_config' is false this task shouldn't be executed.

Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>
2018-06-11 13:03:55 +02:00
Vishal Kanaujia 1a610df02b Fix to run secure cluster only once in a run
The current secure cluster play runs with all the monitors. The rerun
of this task is unnecessary and can be skipped.

Fixes: #2737

Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
2018-06-11 08:37:29 +02:00
Guillaume Abrioux 090ecff94e client: keyrings aren't created when single client node
Combining `run_once: true` with `inventory_hostname ==
groups.get(client_group_name) | first` might cause a bug when the only
node being run is not the first in the group.

In a deployment with a single client node it might cause an issue because
sometimes the keyring won't be created since the task could be
definitively skipped.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1588093

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-08 15:05:47 +02:00
Sébastien Han 20c8065e48 ceph-iscsi: rename group iscsi_gws
Let's try to avoid using dashes, as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes;
using an underscore fixes the issue.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Sébastien Han 91bf53ee93 ceph-iscsi: support for containerized deployment
We now have the ability to deploy a containerized version of ceph-iscsi.
The result is similar to the non-containerized version; you simply have
3 containers running for the following services:

* rbd-target-api
* rbd-target-gw
* tcmu-runner

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508144
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Guillaume Abrioux 8a653cacd5 client: add a default value for keyring file
Potential error if someone doesn't pass the mode in the `keys` dict for
client nodes:

```
fatal: [client2]: FAILED! => {}

MSG:

The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'mode'

The error appears to have been in '/home/guits/ceph-ansible/roles/ceph-client/tasks/create_users_keys.yml': line 117, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: get client cephx keys
  ^ here

exception type: <class 'ansible.errors.AnsibleUndefinedVariable'>
exception: 'dict object' has no attribute 'mode'

```

Adding a default value will avoid the deployment failing because of this.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 17:26:35 +02:00
Guillaume Abrioux 5eacc8f8d8 tests: add a dummy value for 'dev' release
Functional tests are broken when testing against the 'dev' release (ceph).
Adding a dummy value here will make it possible to run the ceph-ansible CI
against the dev ceph release.

Typical error:

```
>       if request.node.get_marker("from_luminous") and ceph_release_num[ceph_stable_release] < ceph_release_num['luminous']:
E       KeyError: 'dev'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fd1487d93f21b609a637053f5b33cd2a4e408d00)
2018-06-07 13:59:17 +02:00
Andrew Schoen 24ef47b0e5 ceph-common: move firewall checks after package installation
We need to do this because on dev or rhcs installs ceph_stable_release
is not mandatory and the firewall checks include a task that is
conditional on the installed version of ceph. If we perform those
checks after package install then they will not fail on dev or rhcs
installs.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-06-07 13:59:17 +02:00
Guillaume Abrioux 7b156deb67 client: use dummy created container when there is no mon in inventory
The `docker_exec_cmd` fact set in the client role when there is no
monitor in the inventory is wrong: `ceph-client-{{ hostname }}` is never
created, so it will fail anyway.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 16:16:38 +08:00
Guillaume Abrioux 433ecc7cbc osd: copy openstack keys over to all mon
When configuring openstack, the created keyrings aren't copied over to
all monitor nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1588093

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 13:58:57 +08:00
Patrick Donnelly 91f9da530f change max_mds default to 1
Otherwise, with the removal of mds_allow_multimds, the default of 3 will be set
on every new FS.

Introduced by: c8573fe0d7

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1583020
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-06 12:16:42 +08:00
Vishal Kanaujia 2cdb0d1812 Syntax error fix in rgw multisite role
This checkin fixes a syntax error in the RGW multisite role's when
clause.

Fixes: #2704

Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
2018-06-05 16:01:07 +05:30
Guillaume Abrioux 2cf06b515f rgw: refact rgw pools creation
Refactor of 8704144e31
There is no need to have duplicated tasks for this. The rgw pools
creation should be delegated to a monitor node so we don't have to care
whether the admin keyring is present on the rgw node.
By the way, only one task is needed to create the pools; we just need to
use the `docker_exec_cmd` fact already defined in `ceph-defaults` to
achieve it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1550281

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-05 15:00:20 +08:00
Ha Phan 1f3c9ce4f3 Use python instead of python2
The initial keyring is generated locally on the ansible server and the snippet works well for both v2 and v3 of python.

I don't see any reason why we should explicitly invoke `python2` instead of just `python`.

In some setups, `python2` is not symlinked to `python`, while `python` and `python3` refer to v2 and v3 respectively.

Signed-off-by: Ha Phan <thanhha.work@gmail.com>
2018-06-04 14:24:10 +02:00
Sébastien Han db50aec13d ceph-common: add firewall rules for ceph-mgr
Prior to this commit the firewall tasks were not opening the ceph-mgr
ports. This would lead to an unclean configuration since the ceph-mgr
daemons can not connect to the OSDs.
This commit opens the right ports on the ceph-mgr nodes to talk with the
OSDs.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526400
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-04 12:11:41 +02:00
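A sketch of the kind of rule this adds, assuming firewalld and the standard 6800-7300/tcp ceph daemon port range; the zone and the configure_firewall toggle are as used elsewhere in ceph-ansible:

```
# Open the ceph daemon port range on ceph-mgr nodes so mgr can reach the OSDs.
- name: open ceph-mgr ports
  firewalld:
    port: "6800-7300/tcp"
    zone: public
    permanent: true
    immediate: true
    state: enabled
  when: configure_firewall | bool
```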
jtudelag 600e1e2c26 rgws: renames create_pools variable with rgw_create_pools.
Renamed to be consistent with the role (rgw) and have a meaningful name.

Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
2018-06-04 06:23:42 +02:00
jtudelag 8704144e31 Adds RGWs pool creation to containerized installation.
The ceph command has to be executed from one of the monitor containers
if no admin copy is present on the RGWs, so the task has to be delegated.

Adds a test to check proper RGW pool creation for Docker container scenarios.

Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
2018-06-04 06:23:42 +02:00
Guillaume Abrioux aae37b44f5 mons: move set_fact of openstack_keys in ceph-osd
Since openstack_config.yml has been moved to `ceph-osd`, we must move
this `set_fact` to ceph-osd as well, otherwise the tasks in
`openstack_config.yml` using `openstack_keys` will actually use the
default value from `ceph-defaults`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1585139

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-01 17:12:01 +02:00
Andrew Schoen c2423e2c48 ceph-defaults: add the nautilus 14.x entry to ceph_release_num
The first 14.x tag has been cut so this needs to be added so that
version detection will still work on the master branch of ceph.

Fixes: https://github.com/ceph/ceph-ansible/issues/2671

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-06-01 16:51:23 +02:00
Guillaume Abrioux 9d5265fe11 osds: wait for osds to be up before creating pools
This is a follow up on #2628.
Even with the openstack pools creation moved later in the playbook,
there is still an issue because OSDs are not all UP when trying to
create pools.

Adding a task which checks for all OSDs to be UP with a `retries/until`
condition should definitively fix this issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-01 15:46:52 +02:00
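A sketch of the retries/until check, assuming `docker_exec_cmd` and the JSON output of `ceph osd stat`; field names and retry counts are illustrative:

```
# Retry until every OSD reported by the cluster is up.
- name: wait for all osd to be up
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd stat -f json"
  register: osd_stat
  retries: 30
  delay: 10
  until: >-
    (osd_stat.stdout | from_json).num_osds | int > 0 and
    (osd_stat.stdout | from_json).num_osds == (osd_stat.stdout | from_json).num_up_osds
  changed_when: false
```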
Guillaume Abrioux c68126d6fd mdss: do not make pg_num a mandatory param
When playing the ceph-mds role, mon nodes have already set a fact with
the default pg num for osd pools, so we can simply default to this value
for cephfs pools (the `cephfs_pools` variable).

At the moment the variable definition for `cephfs_pools` looks like:

```
cephfs_pools:
  - { name: "{{ cephfs_data }}", pgs: "" }
  - { name: "{{ cephfs_metadata }}", pgs: "" }
```

and we have a task in `ceph-validate` to ensure `pgs` has been set to a
valid value.

We could simply avoid this check by setting the default value of `pgs`
to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and
let users override this value.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-30 16:20:34 +02:00
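A sketch of the resulting default, reusing the mon fact named above:

```
cephfs_pools:
  - { name: "{{ cephfs_data }}", pgs: "{{ hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num'] }}" }
  - { name: "{{ cephfs_metadata }}", pgs: "{{ hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num'] }}" }
```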
Guillaume Abrioux 34e646e767 osds: do not set docker_exec_cmd fact
In `ceph-osd` there is no need to set `docker_exec_cmd` since the only
place where this fact is used is in `openstack_config.yml`, which
delegates all docker commands to a monitor node. It means we need the
`docker_exec_cmd` fact that refers to the `ceph-mon-*` containers, and
this fact is already set earlier in `ceph-defaults`.

By the way, when collocating an OSD with a MON it fails because the container
`ceph-osd-{{ ansible_hostname }}` doesn't exist.

Removing this task will allow collocating an OSD with a MON.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1584179

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-30 16:17:29 +02:00
Guillaume Abrioux 608ea947a9 mds: move mds fs pools creation
When collocating an mds on a monitor node, the cephfs pools creation will
fail because `docker_exec_cmd` is reset to `ceph-mds-monXX`, which is
incorrect because we need to delegate the task to `ceph-mon-monXX`.
In addition, it wouldn't have worked since the `ceph-mds-monXX` container
isn't started yet.

Moving the task earlier in the `ceph-mds` role will fix this issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-25 11:16:56 +02:00
Sébastien Han 1c084efb3c rgw: container add option to configure multi-site zone
You can now use RGW_ZONE and RGW_ZONEGROUP on each rgw host in your
inventory and assign them a value. Once the rgw container starts it'll
pick up the info and add itself to the right zone.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1551637
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-24 11:32:05 -07:00
Guillaume Abrioux 3a0e168a76 mdss: move cephfs pools creation in ceph-mds
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.

The idea here is to move cephfs pools creation in `ceph-mds` role.

[1] e59258943b/src/mon/OSDMonitor.cc (L5673)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Guillaume Abrioux 564a662baf osds: move openstack pools creation in ceph-osd
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.

The idea here is to move openstack pools creation at the end of `ceph-osd` role.

[1] e59258943b/src/mon/OSDMonitor.cc (L5673)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Luigi Toscano 43e96c1f98 ceph-radosgw: disable NSS PKI db when SSL is disabled
The NSS PKI database is needed only if radosgw_keystone_ssl
is explicitly set to true, otherwise the SSL integration is
not enabled.

It is worth noting that the PKI support was removed from Keystone
starting from the Ocata release, so some code paths should be
changed anyway.

Also, remove radosgw_keystone, which is not useful anymore.
This variable was used until fcba2c801a.
Now profiles drive the setting of the rgw keystone * options.

Signed-off-by: Luigi Toscano <ltoscano@redhat.com>
2018-05-23 23:24:09 -07:00
Vishal Kanaujia ef5f52b1f3 Skip GPT header creation for lvm osd scenario
The LVM lvcreate fails if the disk already has a GPT header.
We create a GPT header regardless of the OSD scenario. The fix is to
skip header creation for the lvm scenario.

fixes: https://github.com/ceph/ceph-ansible/issues/2592

Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
2018-05-23 11:44:09 -07:00
Subhachandra Chandra c7e269fcf5 Fix restarting OSDs twice during a rolling update.
During a rolling update, OSDs are restarted twice currently. Once, by the
handler in roles/ceph-defaults/handlers/main.yml and a second time by tasks
in the rolling_update playbook. This change turns off restarts by the handler.
Further, the restart initiated by the rolling_update playbook is more
efficient as it restarts all the OSDs on a host as one operation and waits
for them to rejoin the cluster. The restart task in the handler restarts one
OSD at a time and waits for it to join the cluster.
2018-05-22 19:23:07 +02:00
Andrew Schoen a9ad8eb5f3 ceph-validate: do not check ceph version on dev or rhcs installs
A dev or rhcs install does not require ceph_stable_release to be set and
instead generates that by looking at the installed ceph-version.
However, at this point in the playbook ceph may not have been installed
yet and ceph-common has not been run.

Fixes: https://github.com/ceph/ceph-ansible/issues/2618

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-21 23:11:04 +02:00
Andrew Schoen e7d02a50d8 ceph-validate: move system checks from ceph-common to ceph-validate
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 645f61c351 ceph-defaults: remove backwards compat for containerized_deployment
The validation module does not get config options with the template
syntax rendered, so we're going to remove that and just default it to
False. The backwards compat was scheduled to be removed in 3.1 anyway.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen d30a99c350 validate: add support for containerized_deployment
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen f84c2ba27b ceph-defaults: fix failing tasks when osd_scenario was not set correctly
When devices is not defined because you want to use the 'lvm'
osd_scenario, but you've made a mistake selecting that scenario, these
tasks should not fail.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 1f15a81c48 ceph-defaults: move cephfs vars from the ceph-mon role
We're doing this so we can validate this in the ceph-validate role

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen ffe05872ac validate: only validate cephfs_pools on mon nodes
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 48c2a4fda8 validate: check rados config options
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 377fe81c10 validate: make sure ceph_stable_release is set to the correct value
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen ba7f09c0a7 ceph-validate: move var checks from ceph-common into this role
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 32bac6b491 ceph-validate: move var checks from ceph-osd into this role
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 29a9dffc83 ceph-validate: move ceph-mon config checks into this role
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen d87a32347f adds a new ceph-validate role
This will be used to validate config given to ceph-ansible.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Sébastien Han 2f43e9dab5 defaults: restart_osd_daemon unit spaces
Extra space in systemctl list-units can cause restart_osd_daemon.sh to
fail

It looks like if you have more services enabled on the node, the space
between "loaded" and "active" gets wider than the single space given in
the command [1].

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1573317
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-18 17:53:47 +02:00
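An illustrative fragment (wrapped in a task, not the literal script content) of a match that tolerates the extra padding between the columns:

```
# Match "loaded ... active" even when systemctl pads the columns with more
# than one space.
- name: list active ceph-osd units
  shell: systemctl list-units | grep -E "ceph-osd@[0-9]+.service +loaded +active"
  register: osd_units
  changed_when: false
  failed_when: false
```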
Michael Vollman ed050bf3f6 Do nothing when mgr module is in good state
Check whether a mgr module is supposed to be disabled before disabling
it and whether it is already enabled before enabling it.

Signed-off-by: Michael Vollman <michael.b.vollman@gmail.com>
2018-05-18 15:21:45 +02:00
Andy McCrae f45662e270 Fix template reference for ganesha.conf
We can simply reference the template name since it exists within the
role that we are calling. We don't need to check the ANSIBLE_ROLE_PATH
or playbooks directory for the file.
2018-05-17 15:23:52 +02:00
Andy McCrae 226f80c22b Install packages as a list
To make the package installation more efficient we should install
packages as a list rather than as individual tasks or using a
"with_items" loop. The package managers can handle a list passed to them
to install in one go.

We can use a specified list and substitute any packages that are not to
be installed with the ceph-common package, which is installed on every
package install, then apply the unique filter to the package install
list.
2018-05-16 09:59:00 +02:00
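A hedged sketch of the pattern; the package list variable is illustrative, `upgrade_ceph_packages` is the existing toggle:

```
# One package transaction instead of a with_items loop; the list is
# de-duplicated before being handed to the package manager.
- name: install ceph packages
  package:
    name: "{{ ceph_pkgs | unique }}"
    state: "{{ (upgrade_ceph_packages | bool) | ternary('latest', 'present') }}"
```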
Guillaume Abrioux f749830897 mon: refactor of mgr key fetching
There is no need to stat for created mgr keyrings since they are created
anyway when deploying a ceph cluster > jewel. In case of a jewel
deployment we won't enter that block.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-16 09:44:58 +02:00
Guillaume Abrioux 926be51b44 mgr: delete copy_configs.yml (containerized)
This file is a leftover from PR ceph/ceph-ansible#2516
It is not used anymore so it can be removed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-15 19:30:34 +02:00
Sébastien Han 52fc8a0385 rolling_update: move mgr key creation
Until all the mons have been updated to Luminous, there is no way to
create a key. So we should do the key creation in the mon role only if
we are not part of an update.
If we are, then the key creation is done after the mons upgrade to
Luminous.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-15 09:01:42 +02:00
Sébastien Han e810fb217f Revert "mon: fix mgr keyring creation when upgrading from jewel"
This reverts commit 259fae931d.
2018-05-15 09:01:42 +02:00
Guillaume Abrioux a145caf947 iscsi-gw: fix issue when trying to mask target
Trying to mask the target when `/etc/systemd/system/target.service`
doesn't exist seems to be a bug.
There is no need to mask a unit file which doesn't exist.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-14 21:42:22 +02:00
Sébastien Han 8c7c11b774 iscsi: add python-rtslib repository
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-14 21:42:22 +02:00
Andy McCrae 08a2b58d39 Allow os_tuning_params to overwrite fs.aio-max-nr
The order of fs.aio-max-nr (which is hard-coded to 1048576) means that
if you set fs.aio-max-nr in os_tuning_params it will effectively be
ignored for bluestore scenarios.

To resolve this we should move the setting of fs.aio-max-nr above the
setting of os_tuning_params, in this way the operator can define the
value of fs.aio-max-nr to be something other than 1048576 if they want
to.

Additionally, we can make the sysctl settings happen in 1 task rather
than multiple.
2018-05-11 10:49:37 +01:00
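A sketch of the ordering with the sysctl module; `osd_objectstore` and `os_tuning_params` are existing ceph-ansible variables, the rest is illustrative:

```
# Set the bluestore default first...
- name: set default fs.aio-max-nr for bluestore
  sysctl:
    name: fs.aio-max-nr
    value: "1048576"
    state: present
    sysctl_set: yes
  when: osd_objectstore == 'bluestore'

# ...then apply the operator's settings in one loop so an os_tuning_params
# entry for fs.aio-max-nr takes precedence.
- name: apply operating system tuning
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
    sysctl_set: yes
  with_items: "{{ os_tuning_params }}"
```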
Guillaume Abrioux f60b049ae5 client: remove default value for pg_num in pools creation
Trying to set the default value for pg_num to
`hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` will
break in the case of an external client nodes deployment.
The `pg_num` attribute should be mandatory and tested in a future
`ceph-validate` role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-10 11:51:02 -07:00
Gregory Meno 26f6a65042 adds missing state needed to upgrade nfs-ganesha
In the tasks for os_family Red Hat we were missing this.

fixes: bz1575859
Signed-off-by: Gregory Meno <gmeno@redhat.com>
2018-05-09 19:58:04 +00:00
Guillaume Abrioux 259fae931d mon: fix mgr keyring creation when upgrading from jewel
On containerized deployments,
when upgrading from jewel to luminous, mgr keyring creation fails because
the command to create the mgr keyring is executed in a container that is
still running jewel (the container is restarted later to run the new
image); therefore, it fails with a bad entity error.

To get around this situation, we can delegate the command to create
these keyrings to the first monitor when we are running the playbook on
the last monitor. That way we ensure we issue the command in a container
that has been properly restarted with the new image.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-09 10:29:48 -07:00
Guillaume Abrioux 7b387b506a osd: clean legacy syntax in ceph-osd-run.sh.j2
Quick cleanup of a legacy syntax due to e0a264c7e

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-09 07:29:33 +02:00
Simone Caronni b12bf62c36 Make sure the restart_mds_daemon script is created with the correct MDS name 2018-05-08 20:53:15 +02:00
Sébastien Han 07ca91b5cb common: enable Tools repo for rhcs clients
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574458
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-08 16:12:30 +02:00
Andy McCrae e99351b95b Fix install of nfs-ganesha-ceph for Debian/SuSE
The Debian and SuSE installs for nfs-ganesha from the non-rhcs repository
require you to use allow_unauthenticated for Debian, and disable_gpg_check
for SuSE. The nfs-ganesha-rgw package already does this, but the
nfs-ganesha-ceph package will fail to install because of this same
issue.

This PR moves the installations to happen when the appropriate flags are
set to True (nfs_obj_gw & nfs_file_gw), but does it per distro (one for
SuSE and one for Debian) so that the appropriate flag can be passed to
ignore the GPG check.
2018-05-04 15:13:59 +02:00
Ramana Raja 31762dede3 ceph-nfs: disable attribute caching
When 'ceph_nfs_disable_caching' is set to True, disable attribute
caching done by Ganesha for all Ganesha exports.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2018-05-04 09:47:54 +02:00
Sébastien Han 4a186237e6 common: copy iso files if rolling_update
If we are in the middle of an update we want to get the new package
version being installed, so the task that copies the repo files should
not be skipped.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1572032
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-03 17:18:55 +02:00
Andy McCrae d142be0422 Move apt cache update to individual task per role
The apt-cache update can fail due to transient issues related to the
action being a network operation. To reduce the impact of these
transient failures this patch adds a retry to the update_cache task.

However, the apt_repository tasks which would perform an apt_update
won't retry the apt_update on a failure in the same way, as such this PR
moves the apt_update into an individual task, once per role.

Finally, the apt_repository tasks no longer have a changed_when: false,
and the apt_cache update is only performed once per role, if the
repositories change. Otherwise the cache is updated on the "apt" install
tasks if the cache_timeout has been reached.
2018-05-03 14:02:15 +02:00
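A sketch of the retried cache update as its own task; retry counts are illustrative:

```
# A transient network failure no longer kills the play: retry the cache update.
- name: update apt cache
  apt:
    update_cache: yes
  register: apt_cache_update
  retries: 5
  delay: 10
  until: apt_cache_update is succeeded
```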
Guillaume Abrioux 6fe8df627b client: fix pool creation
The value in `docker_exec_client_cmd` doesn't allow checking for
existing pools because it's set with a wrong value for the entrypoint
that is going to be used.
It means the check was going to fail anyway, even if the pools actually exist.

Using jinja syntax to set `docker_exec_cmd` allows handling the case
where you don't have monitors in your inventory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-03 08:22:40 +02:00
Sébastien Han 43e23ffe4d mon: change application pool support
If an openstack_pools item contains an application key, it will be used
to apply this application type to the pool.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1562220
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-30 09:42:58 +02:00
Guillaume Abrioux 75ed437d4e check if pools already exist before creating them
Add a task to check if pools already exist before we create them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
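A sketch of the check, assuming `docker_exec_cmd` points at a monitor and the pool items carry `name` and `pg_num`:

```
- name: list existing pools
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd pool ls"
  register: existing_pools
  changed_when: false

- name: create pools that do not exist yet
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd pool create {{ item.name }} {{ item.pg_num }}"
  with_items: "{{ pools }}"
  when: item.name not in existing_pools.stdout_lines
```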
Guillaume Abrioux a68091c923 tests: update the type for the rule used in pools
As of ceph 12.2.5 the `type` parameter is no longer a name but an id,
therefore an `int` is expected, otherwise it will fail with an error.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
Sébastien Han 12eebc31fb mon/client: honor key mode when copying it to other nodes
The last mon creates the keys with a particular mode; while copying them
to the other mons (first and second) we must re-use the mode that was
set.

The same applies to the client node: the slurp preserves the initial
'item' so we can get the mode for the copy.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Sébastien Han 74494253fa mon: remove redundant copy task
We had the same task twice, and one was overriding the mode.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Sébastien Han 85732d11b9 mon/client: remove acl code
Applying ACL on the keyrings is not used anymore so let's remove this
code.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Sébastien Han cfe8e51d99 mon/client: apply mode from ceph_key
Do not use a dedicated task for this but use the ceph_key module
capability to set file mode.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Di Xu 113eb25424 add AArch64 to supported architectures
works on the AArch64 platform
2018-04-23 10:23:21 +02:00
Sébastien Han 949507d304 mon: remove mgr key from ceph_config_keys
This key is created after the last mon is up, so there is no need to try
to push it from the first mon. The initial mon container does not create
the mgr key; ansible does. So this key will never exist.
The key will go into the fetch dir once the last mon is up, then when
the ceph-mgr plays it will try to get it from the fetch directory.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 10:17:24 +02:00
Sébastien Han 35c1eb7183 mon: remove mon map from ceph_config_keys
During the initial bootstrap of the first mon, the monmap file is
destroyed so it's not available and ansible will never find it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 10:17:24 +02:00
Sébastien Han 65ba85aff6 Expose /var/run/ceph
Useful for software that does data collection/monitoring, like collectd.
It can connect to the socket and then retrieve information.

Even though the sockets are exposed now, I'm keeping the docker exec to
check the socket; this will allow newer versions of ceph-ansible to work
with older versions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1563280
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han bf1e70e8cf default: extend ceph_uid and gid
We now have the ability to detect the uid/gid of the ceph user depending
on the distribution we are running on, including for non-container
deployments.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han f3656ad167 move create ceph initial directories to default
This is needed for both non-container and container deployments.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han 641f141c0f selinux: remove chcon calls
We now bindmount with the :z option at the end of the -v command, so
this will basically run the exact same command as we used to run. So to
speak:

chcon -Rt svirt_sandbox_file_t /var/lib/ceph

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-19 14:59:37 +02:00
Sébastien Han 90e47c5fb0 client: add a --rm option to run the container
This fixes the case where the playbook died and never removed the
container. So now, once the container exits it will remove itself from
the container list.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-19 14:59:37 +02:00
Sébastien Han 6c742376fd client: import the key in ceph if copy_admin_key is true
If the user has set copy_admin_key to true we assume they want to
import the key into Ceph and not only create the key on the filesystem.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 17:46:54 +02:00
Sébastien Han 424815501a client: add quotes to the dict values
ceph-authtool does not support raw arguments, so we have to quote caps
declarations like this: allow 'bla bla' instead of allow bla bla

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 17:46:54 +02:00
Sébastien Han d2a2793cb0 refactor the way we copy keys
This commit does a couple of things:

* use a common.yml file that contains things that can be played on both
container and non-container

* refactor the ability to copy the admin key to the nodes

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 16:46:33 +02:00
Randy J. Martinez 127a643fd0 ceph-defaults: fix ceph_uid fact on container deployments
Red Hat is now using the tags [3,latest] for the image rhceph/rhceph-3-rhel7.
Because of this, the ceph_uid conditional passes for Debian
when 'ceph_docker_image_tag: latest' is used on RH deployments.
I've added an additional task to check for the rhceph image specifically,
and also updated the RH family task for the ceph/daemon [centos|fedora] tags.

Signed-off-by: Randy J. Martinez <ramartin@redhat.com>
2018-04-17 16:54:51 +02:00
Sébastien Han a98885a71e rhcs: re-add apt-pinning
When installing rhcs on Debian systems the Red Hat repos must have the
highest priority so we avoid package conflicts and install the rhcs
version.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1565850
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-17 16:07:06 +02:00
Guillaume Abrioux 899b0eb451 defaults: check only 1 time if there is a running cluster
There is no need to check for a running cluster n*nodes times in
`ceph-defaults`, so let's add a `run_once: true` to save some resources
and time.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-16 11:23:00 +02:00
Sébastien Han 5bbbce527e osd: do not do anything if the dev has a partition
Regardless of whether the partition is 'ceph' or something else, we don't
want to be as strict as checking for a particular partition.
If the drive has a partition, we just don't do anything.

This solves the case where the server reboots and disks get a different
/dev/sda (node) allocation. In this case, prior to restarting the server
/dev/sda was an OSD, but now it's /dev/sdb and the other way around.
In such a scenario, we would try to prepare the OSD and create a new
partition, so let's not mess around with devices that have partitions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498303
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-13 19:11:15 +02:00
Sébastien Han 37117071eb common: add tools repo for iscsi gw
To install iscsi gw packages we need to enable the tools repo.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1547849
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-12 13:38:34 +02:00
Douglas Fuller c8573fe0d7 Remove deprecated allow_multimds
allow_multimds will be officially deprecated in Mimic; specify it
only for the versions of Ceph where it was declared stable. Going
forward, specify only max_mds.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2018-04-12 10:29:17 +02:00
vasishta p shastry 020e66c1b4 Fixed a typo (extra space) 2018-04-11 14:21:15 +02:00
vasishta p shastry e1a1f81b6f osd: to support copy_admin_key 2018-04-11 14:21:15 +02:00
vasishta p shastry db3a5ce6d9 mds: to support copy_admin_keyring 2018-04-11 14:21:15 +02:00
vasishta p shastry 6b59416f75 nfs: to support copy_admin_key - containerized 2018-04-11 14:21:15 +02:00
Ali Maredia 01c58695fc nfs: ensure nfs-server server is stopped
NFS-ganesha cannot start if the nfs-server service
is running. This commit stops nfs-server in case it
is running on a (debian, redhat, suse) node before
the nfs-ganesha service starts up.

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2018-04-11 14:00:48 +02:00
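A sketch of the stop, assuming the usual service names (`nfs-server` on RedHat/SuSE, `nfs-kernel-server` on Debian):

```
# Make sure the kernel NFS server is down before nfs-ganesha starts.
- name: stop kernel nfs server
  service:
    name: "{{ 'nfs-kernel-server' if ansible_os_family == 'Debian' else 'nfs-server' }}"
    state: stopped
  failed_when: false
```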
Ramana Raja 4a430ae29a ceph-nfs: allow disabling ganesha caching
Add a variable, ceph_nfs_disable_caching, that if set to true
disables ganesha's directory and attribute caching as much as
possible.

Also, disable caching done by ganesha, when 'nfs_file_gw'
variable is true, i.e., when Ganesha is used as CephFS's gateway.
This is the recommended Ganesha setting as libcephfs already caches
information. And doing so helps avoid cache incoherency issues
especially with clustered ganesha over CephFS.

Fixes: https://tracker.ceph.com/issues/23393

Signed-off-by: Ramana Raja <rraja@redhat.com>
2018-04-11 13:56:40 +02:00
Sébastien Han 82ccbdafbc ceph-defaults: bring backward compatibility for old syntax
If people keep on using mon_cap, osd_cap etc., the playbook will
translate this old syntax on the fly.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Sébastien Han 9657e4d6fa ceph_key: use ceph_key in the playbook
Replaced all the occurrences of raw commands using the 'command' module
with the ceph_key module instead.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Guillaume Abrioux 66c4118dcd defaults: fix backward compatibility
Backward compatibility with `ceph_mon_docker_interface` and
`ceph_mon_docker_subnet` was not working since there wasn't a lookup on
`monitor_interface` and `public_network`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-10 00:19:11 +02:00
Ken Dreyer 3752cc6f38 common: upgrade/install ceph-test RPM first
Prior to this change, if a user had ceph-test-12.2.1 installed, and
upgraded to ceph v12.2.3 or newer, the RPM upgrade process would
fail.

The problem is that the ceph-test RPM did not depend on an exact version
of ceph-common until v12.2.3.

In Ceph v12.2.3, ceph-{osdomap,kvstore,monstore}-tool binaries moved
from ceph-test into ceph-base. When ceph-test is not yet up-to-date, Yum
encounters package conflicts between the older ceph-test and newer
ceph-base.

When all users have upgraded beyond Ceph < 12.2.3, this is no longer
relevant.
2018-04-09 18:09:52 +02:00
Sébastien Han bb60f2fea4 ceph-defaults: fix ceph_uid for container image tag latest
According to our recent change, we now use "CentOS" as the latest
container image. We need to reflect this on the ceph_uid.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-09 13:54:55 +02:00
Zack Cerza 0123d790cd Use the CentOS repo for Red Hat dev packages
No use even trying to use something that doesn't exist.

Signed-off-by: Zack Cerza <zack@redhat.com>
2018-04-09 10:05:57 +02:00
Attila Fazekas ecd3563c21 Deploying without managed monitors failed
Tripleo deployment failed when the monitors are not managed
by tripleo itself, with:
    FAILED! => {"msg": "list object has no element 0"}

The failing play item was introduced by
 f46217b69a .

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1552327

Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2018-04-04 18:16:46 +02:00
Guillaume Abrioux dcf6a246a4 defaults: remove `run_once: true` when creating fetch_directory
Because of `serial: 1`, it can be an issue when the playbook is being
run on client nodes.
Since the refactor of `ceph-client` we skip the role `ceph-defaults` on
every node except the first client node, which means that the task is
not going to be played because of `run_once: true`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-04 10:51:17 +02:00
Guillaume Abrioux 18c0c7a508 config: use fact `ceph_uid`
Use fact `ceph_uid` in the task which ensures `/etc/ceph` exists in
containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-04 10:51:17 +02:00
Guillaume Abrioux 9c979c6390 clients: refact `ceph-clients` role
This commit refactors the role so we don't have to pull the container
image on client nodes just to create pools and keys.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1550977

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-04 10:51:17 +02:00
Guillaume Abrioux cefd471967 client: remove legacy code
This seems to be a leftover.
This commit removes an unnecessary 'set linux permissions' on
`/var/lib/ceph`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-04 10:51:17 +02:00