ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	b3266c5be2	rolling_update: set osd sortbitwise Upgrading RHCS 2 -> RHCS 3 will fail if the cluster still has sortnibblewise set; it stays stuck on "TASK [waiting for clean pgs...]" as RHCS 3 osds will not start if nibblewise is set. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600943 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-24 17:19:02 +02:00
Guillaume Abrioux	0a88bccf87	tests: followup on `b89cc1746f` Update network subnets in group_vars/all Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-24 16:55:15 +02:00
Guillaume Abrioux	b89cc1746f	tests: do not deploy all daemons for shrink osds scenarios Let's create a dedicated environment for these scenarios, there is no need to deploy everything. Doing so will also save some time. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-23 18:30:06 +02:00
Guillaume Abrioux	af82e7523d	tests: test master against ansible 2.6 Ansible 2.4 is currently end-of-life. Ansible 2.5 will go end-of-life after Ansible 2.7 is released. Fixes: #2901 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-23 11:59:15 +00:00
Guillaume Abrioux	0c863a3783	tests: add support of 'ooo-collocation' scenario when testing against ceph dev The group_vars/all file is not available on the 'ooo-collocation' scenario, which makes `dev_setup.yml` fail because this path is hardcoded. The idea here is to check if the pattern 'ooo-collocation' is present in the `change_dir` variable so we can set this path properly according to the scenario being run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-20 07:47:33 +02:00
Guillaume Abrioux	d8281e50f1	tests: support update scenarios in test_rbd_mirror_is_up() `test_rbd_mirror_is_up()` is failing on update scenarios because it assumes the `ceph_stable_release` is still set to the value of the original ceph release, it means it won't enter in the right part of the condition and fails. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-20 07:46:41 +02:00
Guillaume Abrioux	651ee469f6	tests: stop hardcoding ansible version In addition to ceph/ceph-build#1082 Let's set the ansible version in each ceph-ansible branch's respective requirements.txt. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-19 23:10:37 +02:00
Sébastien Han	7fc13bc9d5	validate: only run osd test on osd node Do not run device validation on every hosts, only on OSD nodes. Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-19 12:46:18 +00:00
Guillaume Abrioux	774980227a	docs: update doc - stable-2.1 and stable-2.2 shouldn't be referenced anymore. - add stable-3.1 branch reference - update the different Ansible versions supported Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-18 17:49:44 +02:00
Sébastien Han	ce1dd8d2b3	shrink-osd: purge osd on containerized deployment Prior to this commit we were only stopping the container, but now we also purge the devices. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1572933 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-18 14:26:22 +00:00
Sébastien Han	cf01e596b6	validate: improve device check We now make sure that: * devices are actually block special files * length of dedicated_device is identical to devices Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-18 14:26:22 +00:00
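The two checks listed in the commit above can be sketched in Python as follows. This is only an illustration of the idea, not the playbook's actual implementation; `is_block_device` and `validate_devices` are hypothetical helper names, and the parameter names simply mirror the variables mentioned in the message.

```python
import os
import stat


def is_block_device(path):
    """Return True if `path` exists and is a block special file."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        # A missing or unreadable path cannot be a usable device.
        return False


def validate_devices(devices, dedicated_devices):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for dev in devices:
        if not is_block_device(dev):
            errors.append("{} is not a block special file".format(dev))
    # Hypothetical rule taken from the commit text: both lists must match in length.
    if len(devices) != len(dedicated_devices):
        errors.append("devices and dedicated devices differ in length")
    return errors


# A path that does not exist fails the block-device check.
print(validate_devices(["/nonexistent-device-for-example"], []))
```

Returning a list of errors rather than failing on the first one lets a caller report every problem in a single run.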
Guillaume Abrioux	05852b0301	tests: add latest-bis-jewel for jewel tests Since no latest-bis-jewel exists, it's using latest-bis which points to ceph mimic. In our testing, using it for idempotency/handlers tests means upgrading from jewel to mimic, which is not what we want to do. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-17 09:10:15 +00:00
Guillaume Abrioux	1a626d3c61	nfs: change default stable branch for nfs-ganesha repo Since `V2.6-stable` is available and has packages for `mimic`, let's update this default value accordingly so nfs nodes can be deployed with mimic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-13 08:20:27 +00:00
Guillaume Abrioux	cc71bb96cc	tests: followup on #2656 `34f70428` has introduced a fix using `command` module while this could have been achieved by using `lvol` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-13 07:55:14 +00:00
Sébastien Han	e61ca882a1	validate: force ansible version We currently only support Ansible 2.4.X so let's fail if the version is different. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-13 07:52:56 +00:00
Guillaume Abrioux	5ef5fcd0b6	client: do not rely on copy_admin_key to import keys Relying on `copy_admin_key` to import created keys on client nodes obliges us to copy the admin key to those nodes, which is not something we might want. We should use the fact `condition_copy_admin_key`, which will be set to `True` when the delegated node is a mon, which means we can import keys without taking care of the admin keyring. Fixes: #2867 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-13 06:52:00 +00:00
Andy McCrae	a1b3d5b7c3	Sync config_template with upstream for Ansible 2.6 The original_basename option in the copy module changed to be _original_basename in Ansible 2.6+, this PR resyncs the config_template module to allow this to work with both Ansible 2.6+ and before. Additionally, this PR removes the _v1_config_template.py file, since ceph-ansible no longer supports versions of Ansible before version 2, and so we shouldn't continue to carry that code. Closes: #2843 Signed-off-by: Andy McCrae <andy.mccrae@gmail.com>	2018-07-12 21:07:41 +00:00
Guillaume Abrioux	ce5ac930c5	mgr: fix condition to add modules to ceph-mgr Follow up on #2784 We must check in the generated fact `_disabled_ceph_mgr_modules` to enable disabled mgr module. Otherwise, this task will be skipped because it's not comparing the right list. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600155 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-12 21:04:01 +00:00
Sébastien Han	c45195a18a	Update issue templates Introduce templates for issues and feature requests. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-12 14:10:15 +02:00
Guillaume Abrioux	9f54b3b4a7	mon: ensure socket is purged when mon is stopped On containerized deployment, if a mon is stopped, the socket is not purged and can cause failure when a cluster is redeployed after the purge playbook has been run. Typical error: ``` fatal: [osd0]: FAILED! => {} MSG: 'dict object' has no attribute 'osd_pool_default_pg_num' ``` The fact is not set because of this earlier failure: ``` ok: [mon0] => { "changed": false, "cmd": "docker exec ceph-mon-mon0 ceph --cluster test daemon mon.mon0 config get osd_pool_default_pg_num", "delta": "0:00:00.217382", "end": "2018-07-09 22:25:53.155969", "failed_when_result": false, "rc": 22, "start": "2018-07-09 22:25:52.938587" } STDERR: admin_socket: exception getting command descriptions: [Errno 111] Connection refused MSG: non-zero return code ``` This failure happens when the ceph-mon service is stopped: since the socket isn't purged, the leftover socket confuses the process. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-10 20:08:07 +00:00
Guillaume Abrioux	d0746e0858	common: switch from docker module to docker_container As of ansible 2.4, `docker` module has been removed (was deprecated since ansible 2.1). We must switch to `docker_container` instead. See: https://docs.ansible.com/ansible/latest/modules/docker_module.html#docker-module Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-10 20:08:07 +00:00
Guillaume Abrioux	9a65ec231d	tests: fix `_get_osd_id_from_host()` in TestOSDs() We must initialize `children` variable in `_get_osd_id_from_host()`, otherwise, if for any reason the deployment has failed and result with an osd host with no OSD registered, we won't enter in the condition, therefore, `children` is never set and the function tries to return something undefined. Typical error: ``` E UnboundLocalError: local variable 'children' referenced before assignment ``` Fixes: #2860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-10 13:06:23 +00:00
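The failure mode described in the commit above, and the fix of initializing the variable up front, can be shown with a minimal Python sketch. The helper name `get_osd_ids_from_host` and the shape of the `osd_tree` dictionary (a `nodes` list whose entries carry `name`, `type` and `children`) are assumptions made for illustration, not the test suite's actual code.

```python
def get_osd_ids_from_host(hostname, osd_tree):
    """Return the OSD ids registered under `hostname`, or [] if there are none."""
    # Initialize before the loop: without this line, a host that never matches
    # would leave `children` unbound and the return would raise UnboundLocalError.
    children = []
    for node in osd_tree.get("nodes", []):
        if node.get("name") == hostname and node.get("type") == "host":
            children = node.get("children", [])
    return children


# Assumed sample data: one host with two OSDs registered.
tree = {"nodes": [{"name": "osd0", "type": "host", "children": [0, 1]}]}
print(get_osd_ids_from_host("osd0", tree))
print(get_osd_ids_from_host("osd1", tree))
```

With the initialization in place, a host with no registered OSD yields an empty list, so the test can fail with a meaningful assertion instead of an unrelated exception.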
Shilpa Jagannath	07852ed039	Remove zone from zonegroup and update period before deleting the zone to avoid inconsistent period information across other zones. When you delete a zone without removing from zonegroup, the period update would fail since that command needs to load the zone and zonegroup to be able to update the master. Period update would fail with an error like this: radosgw-admin period update --commit -1 Cannot find zone id= (name=), switching to local zonegroup configuration -1 Cannot find zone id= (name=) Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>	2018-07-09 12:27:24 +00:00
Sébastien Han	b9f7df7ba2	common: remove hdparm As of Kraken, the journal code does not use the hdparm command anymore so we can remove it from our package dependency list. Fixes: https://github.com/ceph/ceph-ansible/issues/1402 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit f6910efa24389c264062963b2054c7cd29ffebb3)	2018-07-07 08:53:47 +00:00
Guillaume Abrioux	103529c172	tox: test mimic deployment Let's try to deploy mimic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d2cdfd830c`)	2018-07-06 16:31:49 +00:00
Guillaume Abrioux	b6d09b510f	tests: refact ci testing master We should test ceph-ansible against the latest ansible stable version on master. This commit also removes the pinning to version 1.7.1 of testinfra because ansible 2.5 requires a newer version. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-06 16:31:49 +00:00
Guillaume Abrioux	09d795b5b7	tests: add mimic support for test_rbd_mirror_is_up() Prior to mimic, the data structure returned by `ceph -s -f json` used to gather information about rbd-mirror daemons looked like below: ``` "servicemap": { "epoch": 8, "modified": "2018-07-05 13:21:06.207483", "services": { "rbd-mirror": { "daemons": { "summary": "", "ceph-nano-luminous-faa32aebf00b": { "start_epoch": 8, "start_stamp": "2018-07-05 13:21:04.668450", "gid": 14107, "addr": "172.17.0.2:0/2229952892", "metadata": { "arch": "x86_64", "ceph_version": "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)", "cpu": "Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz", "distro": "centos", "distro_description": "CentOS Linux 7 (Core)", "distro_version": "7", "hostname": "ceph-nano-luminous-faa32aebf00b", "instance_id": "14107", "kernel_description": "#1 SMP Wed Mar 14 15:12:16 UTC 2018", "kernel_version": "4.9.87-linuxkit-aufs", "mem_swap_kb": "1048572", "mem_total_kb": "2046652", "os": "Linux" } } } } } } ``` This part has changed from mimic and became: ``` "servicemap": { "epoch": 2, "modified": "2018-07-04 09:54:36.164786", "services": { "rbd-mirror": { "daemons": { "summary": "", "14151": { "start_epoch": 2, "start_stamp": "2018-07-04 09:54:35.541272", "gid": 14151, "addr": "192.168.1.80:0/240942528", "metadata": { "arch": "x86_64", "ceph_release": "mimic", "ceph_version": "ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)", "ceph_version_short": "13.2.0", "cpu": "Intel(R) Xeon(R) CPU X5650 @ 2.67GHz", "distro": "centos", "distro_description": "CentOS Linux 7 (Core)", "distro_version": "7", "hostname": "ceph-rbd-mirror0", "id": "ceph-rbd-mirror0", "instance_id": "14151", "kernel_description": "#1 SMP Wed May 9 18:05:47 UTC 2018", "kernel_version": "3.10.0-862.2.3.el7.x86_64", "mem_swap_kb": "1572860", "mem_total_kb": "1015548", "os": "Linux" } } } } } } ``` This patch modifies the function `test_rbd_mirror_is_up()` in `test_rbd_mirror.py` so it works with `mimic` and keeps backward compatibility with `luminous`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-06 14:39:13 +02:00
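In the two samples above, the daemon entry is keyed by hostname before mimic and by gid from mimic on, while `metadata.hostname` is present in both. One way to read the daemon hostnames independently of the release is sketched below. This is an assumption-based illustration, not necessarily how the patch itself is written; `rbd_mirror_hostnames` is a hypothetical helper and the sample dictionaries are trimmed copies of the structures quoted in the commit.

```python
def rbd_mirror_hostnames(status):
    """Return rbd-mirror daemon hostnames from parsed `ceph -s -f json` output."""
    daemons = status["servicemap"]["services"]["rbd-mirror"]["daemons"]
    hostnames = []
    for key, daemon in daemons.items():
        # "summary" is a plain string stored next to the daemon entries.
        if key == "summary":
            continue
        # Ignore the key (hostname or gid) and rely on the metadata instead.
        hostnames.append(daemon["metadata"]["hostname"])
    return hostnames


def make_status(daemons):
    """Wrap a daemons mapping in the nesting shown in the commit message."""
    return {"servicemap": {"services": {"rbd-mirror": {"daemons": daemons}}}}


luminous_status = make_status({
    "summary": "",
    "ceph-nano-luminous-faa32aebf00b": {
        "gid": 14107,
        "metadata": {"hostname": "ceph-nano-luminous-faa32aebf00b"},
    },
})
mimic_status = make_status({
    "summary": "",
    "14151": {"gid": 14151, "metadata": {"hostname": "ceph-rbd-mirror0"}},
})

print(rbd_mirror_hostnames(luminous_status))
print(rbd_mirror_hostnames(mimic_status))
```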
Sébastien Han	713b9fcf9b	ceph-config: do not log cluster log on container The container image recently merged both cluster and mon log into a single stream. Following this, we now see this warning coming from the container image: 2018-06-19 13:44:01.542990 7ff75b024700 1 mon.vm02@1(peon).log v57928205 unable to write to '/var/log/ceph/ceph.log' for channel 'cluster': (2) No such file or directory So we now tell the mon to not log cluster log on the filesystem. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1591771 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-05 15:11:45 +00:00
Sébastien Han	fcf11ecc35	ceph-common: fix rhcs condition We forgot to add mgr_group_name when checking for the mon repo, thus the conditional on the next task was failing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1598185 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-04 17:17:21 +02:00
Christian Berendt	f8fd590fe8	docs: make "OSD Configuration" a subsection "OSD Configuration" should be part of the "Configuration and Usage" section. Signed-off-by: Christian Berendt <berendt@b1-systems.de>	2018-07-04 13:05:57 +02:00
Christian Berendt	0fdb2c4fd4	docs: change github/Github to GitHub Signed-off-by: Christian Berendt <berendt@b1-systems.de>	2018-07-04 13:05:27 +02:00
Christian Berendt	a8b163a84f	docs: use apt instead of apt-get Signed-off-by: Christian Berendt <berendt@b1-systems.de>	2018-07-04 10:48:44 +00:00
Guillaume Abrioux	3abc253fec	mgr: fix enabling of mgr module on mimic The data structure has slightly changed on mimic. Prior to mimic, it used to be: ``` { "enabled_modules": [ "status" ], "disabled_modules": [ "balancer", "dashboard", "influx", "localpool", "prometheus", "restful", "selftest", "zabbix" ] } ``` From mimic it looks like this: ``` { "enabled_modules": [ "status" ], "disabled_modules": [ { "name": "balancer", "can_run": true, "error_string": "" }, { "name": "dashboard", "can_run": true, "error_string": "" } ] } ``` This means we can't simply check `item in _ceph_mgr_modules.disabled_modules`. The idea here is to use the filter `map(attribute='name')` to build a list of names when deploying mimic. Fixes: #2766 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-03 21:19:16 +00:00
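The effect of the `map(attribute='name')` filter mentioned in the commit above can be mirrored in plain Python. The sketch below normalizes both layouts to a flat list of names so that a simple membership test works again; `disabled_module_names` is a hypothetical helper, not part of ceph-ansible, and the sample data is shortened from the commit message.

```python
def disabled_module_names(mgr_modules):
    """Return disabled mgr module names for both pre-mimic and mimic layouts."""
    names = []
    for entry in mgr_modules.get("disabled_modules", []):
        if isinstance(entry, dict):
            # mimic layout: each entry is a dict carrying a "name" key
            names.append(entry["name"])
        else:
            # pre-mimic layout: each entry is already a plain string
            names.append(entry)
    return names


pre_mimic = {
    "enabled_modules": ["status"],
    "disabled_modules": ["balancer", "dashboard"],
}
mimic = {
    "enabled_modules": ["status"],
    "disabled_modules": [
        {"name": "balancer", "can_run": True, "error_string": ""},
        {"name": "dashboard", "can_run": True, "error_string": ""},
    ],
}

print(disabled_module_names(pre_mimic))
print(disabled_module_names(mimic))
```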
Vishal Kanaujia	44d514850a	Rolling upgrades: Migrate to ceph-key module This change moves ceph-mgr upgrades to using ceph-key library. Fixes: #2758 Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>	2018-07-03 18:22:14 +02:00
Sébastien Han	63658c05c7	ceph-client: do not kill the dummy container The container runs for 300 sec, then dies and removes itself thanks to the '--rm' option, so there is no point in removing it. Also this is causing failure under some circumstances. Closing: https://bugzilla.redhat.com/show_bug.cgi?id=1568157 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-03 16:09:52 +00:00
Sébastien Han	96b187b30e	ci: remove DCO We now have a Signed-off-by check inside our pipeline so this bot is not needed anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-02 16:08:18 +02:00
Sébastien Han	a629408967	ceph-mds: enable application pool We now enable the application type 'cephfs' for each cephfs pools we create. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590275 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-02 10:28:34 +00:00
Sébastien Han	103c279c21	ceph-defaults: add default application to pool We now add a default 'rbd' application type to each pool we create. This will remove the warning: " application not enabled on N pool(s) " Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590275 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-07-02 10:28:34 +00:00
Vasu Kulkarni	1d454b611f	Enable monitor repo for mgr nodes and Tools repo for iscsi/nfs/clients Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2018-06-29 18:09:26 +00:00
Andy McCrae	eb836e7c31	Sync config_template with upstream Some fixes have gone into git.openstack.org/openstack/ansible-config_template to deal with a few bugs we have run into. This PR brings the ceph-ansible config_template version up to the same as the ansible-config_template openstack repo. Closes: #2742 Signed-off-by: Andy McCrae <andy.mccrae@gmail.com>	2018-06-29 09:41:13 +00:00
Sébastien Han	abdb53e16a	ceph-osd: trigger osd container restart on script change The script ceph-osd-run.sh holds the config options to start the container, if one of these options is modified we must restart the container. This was not the case before because the 'notify' flag wasn't present. Closing: https://bugzilla.redhat.com/show_bug.cgi?id=1596061 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-28 17:54:13 +02:00
Sébastien Han	f623997271	systemd: remove changed_when: false When using a module there is no need to apply this Ansible option. The module will handle the idempotency on its own. So the module decides whether or not the task has changed during the execution. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-28 17:54:13 +02:00
George Shuklin	653b483fc3	Add ceph_keyring_permissions variable to control permissions for keyring files in /etc/ceph. Default value is the same as it was (0600), but this variable allows user to override it (f.e. set it to 0640). Signed-off-by: George Shuklin <george.shuklin@gmail.com>	2018-06-28 15:48:39 +00:00
Ha Phan	a7b7735b6f	ceph-mon: Generate initial keyring Minor fix so that initial keyring can be generated using python3. Signed-off-by: Ha Phan <thanhha.work@gmail.com>	2018-06-28 10:39:56 +02:00
Ha Phan	b7b8aba47b	Generate a copy of ceph.conf locally Refers to #2697 This change creates a copy of `ceph.conf` in ansible server. Signed-off-by: Ha Phan <thanhha.work@gmail.com>	2018-06-28 07:39:30 +00:00
Andy McCrae	a4a3d9a01b	Fix package state for upgrades on SuSE/RHEL During `226f80c22b` only Debian package installs had the correct state set to ensure packages were upgraded when the "upgrade_ceph_packages" var was set to true. Signed-off-by: Andy McCrae <andy.mccrae@gmail.com>	2018-06-27 18:55:22 +00:00
Sébastien Han	322e2de7d2	mon: honour mon_docker_net_host option --net=host was hardcoded in the startup line so even though mon_docker_net_host was set to False the net option would always be activated. mon_docker_net_host is set to True by default so this commit does not change the behaviour. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-27 13:44:41 +00:00
Guillaume Abrioux	081600842f	tests: reduce the amount of time we wait This sleep 120 looks a bit long, let's reduce this to 30sec and see if things go faster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-27 10:13:26 +00:00
Guillaume Abrioux	f2e57a56db	tests: factorize docker tests using docker_exec_cmd logic Avoid duplicating tests unnecessarily just because of docker exec syntax. Using the same logic as in the playbook with `docker_exec_cmd` allows us to execute the same test on both containerized and non-containerized environments. The idea is to set a variable `docker_exec_cmd` with the 'docker exec <container-name>' string when containerized and set it to '' when non-containerized. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-27 07:00:14 +00:00
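The `docker_exec_cmd` idea described in the commit above amounts to an optional command prefix. A minimal Python sketch, assuming a hypothetical helper `build_cmd` and using the container name `ceph-mon-mon0` purely as an example value:

```python
def build_cmd(ceph_cmd, containerized, container_name=""):
    """Prefix `ceph_cmd` with 'docker exec <container-name>' only when containerized."""
    if containerized:
        docker_exec_cmd = "docker exec {}".format(container_name)
    else:
        # Empty prefix: the command runs directly on the host.
        docker_exec_cmd = ""
    # strip() removes the leading space left by an empty prefix.
    return "{} {}".format(docker_exec_cmd, ceph_cmd).strip()


print(build_cmd("ceph --cluster test -s", containerized=False))
print(build_cmd("ceph --cluster test -s", containerized=True, container_name="ceph-mon-mon0"))
```

Because the prefix is the only thing that varies, a single test body can serve both deployment types.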
Guillaume Abrioux	fe79a5d240	tests: refact test_all__osds_are_up_and_in these tests are skipped on bluestore osds scenarios. they were going to fail anyway since they are run on mon nodes and `devices` is defined in inventory for each osd node. It means `num_devices num_osd_hosts` returns `0`. The result is that the test expects to have 0 OSDs up. The idea here is to move these tests so they are run on OSD nodes. Each OSD node checks their respective OSD to be UP, if an OSD has 2 devices defined in `devices` variable, it means we are checking for 2 OSD to be up on that node, if each node has all its OSD up, we can say all OSD are up. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00


3810 Commits (b3266c5be2f88210589cfa56a5fe0a5092f79ee6)
