ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	988b5a81d3	take-over-existing-cluster: do not call var_files We were using var_files long ago when default variables were not in ceph-defaults, now the role exists this is not need. Moreover having these two var files added: - roles/ceph-defaults/defaults/main.yml - group_vars/all.yml Will create collision and override necessary variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1555305 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `b738706810`)	2018-08-20 14:47:32 +02:00
Markos Chandras	b2de642c8e	roles: ceph-defaults: Delegate cluster information task to monitor node Since commit `f422efb1d6` ("config: ensure rgw section has the correct name") we observe the following failures in new Ceph deployment with OpenStack-Ansible fatal: [aio1_ceph-rgw_container-fc588f0a]: FAILED! => {"changed": false, "cmd": "ceph --cluster ceph -s -f json", "msg": "[Errno 2] No such file or directory" This is because the task executes 'ceph' but at this point no package installation has happened. Packages are normally installed in the 'ceph-common' role which runs after the 'ceph-defaults' one. Since we are looking to obtain cluster information, the task should be delegated to a monitor node similar to other tasks in that role Signed-off-by: Markos Chandras <mchandras@suse.de> (cherry picked from commit `37e50114de`)	2018-08-20 14:18:07 +02:00
Markos Chandras	e9433afd6c	roles: ceph-defaults: Check if 'rgw' attribute exists for rgw_hostname If there are no services on the cluster, then the 'rgw' could be missing and the task is failing with the following problem: msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'rgw' We fix this by checking the existence of the 'rgw' attribute. If it's missing, we skip the task since the role already contains code to set a good default rgw_hostname. Signed-off-by: Markos Chandras <mchandras@suse.de> (cherry picked from commit `126e2e3f92`)	2018-08-20 14:18:07 +02:00
Dardo D Kleiner	2c77e1ac4e	mgr: improve/fix disabled modules check Follow up on `36942af698` "disabled_modules" is always a list, it's the items in the list that can be dicts in mimic. Many ways to fix this, here's one. Signed-off-by: Dardo D Kleiner <dardokleiner@gmail.com> (cherry picked from commit `f6519e4003`)	2018-08-20 11:49:30 +00:00
Andrew Schoen	f183be0328	lv-create: use copy instead of the template module The copy module does in fact do variable interpolation so we do not need to use the template module or keep a template in the source. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `04df3f0802`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	1decd53eb0	tests: cat the contents of lv-create.log in infra_lv_create Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `f5a4c89869`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	6081aea5a1	lv-create: add an example logfile_path config option in lv_vars.yml Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `131796f275`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	c119150946	tests: adds a testing scenario for lv-create and lv-teardown Using an explicitly named testing environment name allows us to have a specific [testenv] block for this test. This greatly simplifies how it will work as it doesn't really anything from the ceph cluster tests. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `810cc47892`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	634cc14393	lv-teardown: fail silently if lv_vars.yml is not found This allows user to opt out of using lv_vars.yml and load configuration from other sources. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `b0bfc17351`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	09e4ef3371	lv-teardown: set become: true at the playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `8424858b40`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	293aaaf758	lv-create: fail silenty if lv_vars.yml is not found If a user decides to to use the lv_vars.yml file then it should fail silenty so that configuration can be picked up from other places. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `e43eec57bb`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	2648751488	lv-create: set become: true at the playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `fde47be13c`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Andrew Schoen	9af842467e	lv-create: use the template module to write log file The copy module will not expand the template and render the variables included, so we must use template. Creating a temp file and using it locally means that you must run the playbook with sudo privledges, which I don't think we want to require. This introduces a logfile_path variable that the user can use to control where the logfile is written to, defaulting to the cwd. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `35301b35af`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	7f44244d23	infrastructure-playbooks/vars/lv_vars.yaml: minor fixes Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `909b38da82`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	db0e06cbb6	infrastructure-playbooks/lv-create.yml: use tempfile to create logfile Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `f65f3ea89f`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	89d950fd3c	infrastructure-playbooks/lv-create.yml: add lvm_volumes to suggested paste Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `65fdad0723`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	1a0f7baf21	infrastructure-playbooks/lv-create.yml: copy without using a template file Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `50a6d8141c`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	f1245e6011	infrastructure-playbooks/lv-create.yml: don't use action to copy Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `186c4e11c7`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	21902f0113	infrastructure-playbooks: standardize variable usage with a space after brackets Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `9d43806df9`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Neha Ojha	fb06c6cb80	vars/lv_vars.yaml: remove journal_device Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `e0293de3e7`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Ali Maredia	10da777634	infrastructure-playbooks: playbooks for creating LVs for bucket indexes and journals These playbooks create and tear down logical volumes for OSD data on HDDs and for a bucket index and journals on 1 NVMe device. Users should follow the guidelines set in var/lv_vars.yaml After the lv-create.yml playbook is run, output is sent to /tmp/logfile.txt for copy and paste into osds.yml Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `1f018d8612`) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 17:01:41 +02:00
Sébastien Han	28fc45e346	Revert "osd: generate device list for osd_auto_discovery on rolling_update" This reverts commit `e84f11e99e`. This commit was giving a new failure later during the rolling_update process. Basically, this was modifying the list of devices and started impacting the ceph-osd itself. The modification to accomodate the osd_auto_discovery parameter should happen outside of the ceph-osd. Also we are trying to not play ceph-osd role during the rolling_update process so we can speed up the upgrade. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `3149b2564f`)	2018-08-16 13:35:23 +00:00
Sébastien Han	6f1499800f	rolling_update: register container osd units Before running the upgrade, let's call systemd to collect unit names instead of relaying on the device list. This is more accurate and fix the osd_auto_discovery scenario too. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613626 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `dad10e8f3f`)	2018-08-16 13:35:23 +00:00
Sébastien Han	51de29046b	contrib: fix generate group_vars samples For ceph-iscsi-gw and ceph-rbd-mirror roles the group_name are named differently (by default) than the role name so we have to change the script to generate the correct name. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1602327 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `315ab08b16`)	2018-08-14 17:51:41 +00:00
Jeffrey Zhang	19c7ca1983	Use /var/lib/ceph/osd folder to filter osd mount point In some case, use may mount a partition to /var/lib/ceph, and umount it will be failure and no need to do so too. Signed-off-by: Jeffrey Zhang <zhang.lei.fly@gmail.com> (cherry picked from commit `85cc61a6d9`)	2018-08-14 14:55:56 +00:00
Mike Christie	c44638ae7e	stable 3.1 igw: add api setting support Port the parts of this upstream commit: commit `91bf53ee93` Author: Sébastien Han <seb@redhat.com> Date: Fri Mar 23 11:24:56 2018 +0800 ceph-iscsi: support for containerize deployment that allows configuration of API settings in roles/ceph-iscsi-gw/templates/iscsi-gateway.cfg.j2 using the iscsi-gws.yml. This fixes Red Hat BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1613963 Signed-off-by: Mike Christie <mchristi@redhat.com>	2018-08-14 10:23:12 +02:00
Mike Christie	2b76e3771d	stable 3.1 igw: enable and start rbd-target-api Backport https://github.com/ceph/ceph-ansible/pull/2984 to stable 3.1. From upstream commit: commit `1164cdc002` Author: Guillaume Abrioux <gabrioux@redhat.com> Date: Thu Aug 2 11:58:47 2018 +0200 iscsigw: install ceph-iscsi-cli package installs the cli package but does not start and enable the rbd-target-api daemon needed for gwcli to communicate with the igw nodes. This just enables and starts it. This fixes Red Hat BZ https://bugzilla.redhat.com/show_bug.cgi?id=1613963. Signed-off-by: Mike Christie <mchristi@redhat.com>	2018-08-14 10:23:12 +02:00
Sébastien Han	e7596d565f	group_vars: resync missing options resync group_vars file with the defaults/main.yml files. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `2dd75a1e6e`)	2018-08-13 18:55:06 +02:00
Guillaume Abrioux	904a0a4017	fail if fqdn deployment attempted fqdn configuration possibility caused a lot of trouble, it's adding a lot of complexity because of multiple cases and the relation between ceph-ansible and ceph-container. Moreover, there is no benefit for such a feature. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613155 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-08-13 18:55:06 +02:00
Guillaume Abrioux	97cf08e897	config: ensure rgw section has the correct name the ceph.conf.j2 always assumes the hostname used to register the radosgw in the servicemap is equivalent to `{{ ansible_hostname }}` which returns the shortname form. We need to detect which form of the hostname was used in case of already deployed cluster and update the ceph.conf accordingly. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1580408 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f422efb1d6`)	2018-08-13 18:55:06 +02:00
Guillaume Abrioux	95c28e78d1	mgr: backward compatibility for module management Follow up on `3abc253fec` The structure had even changed within `luminous` release. It was first: ``` { "enabled_modules": [ "balancer", "dashboard", "restful", "status" ], "disabled_modules": [ "influx", "localpool", "prometheus", "selftest", "zabbix" ] } ``` Then it changed for: ``` { "enabled_modules": [ "status" ], "disabled_modules": [ "balancer", "dashboard", "influx", "localpool", "prometheus", "restful", "selftest", "zabbix" ] } ``` and finally: ``` { "enabled_modules": [ "status" ], "disabled_modules": [ { "name": "balancer", "can_run": true, "error_string": "" }, { "name": "dashboard", "can_run": true, "error_string": "" } ] } ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `36942af698`)	2018-08-13 16:05:21 +00:00
Guillaume Abrioux	9a013ab333	tests: resync iscsigw group name with master let's align the name of that group in stable-3.1 with master branch. Not having the same group name on different branches is confusing and make some nightlies job failing in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-08-13 12:24:59 +02:00
Guillaume Abrioux	32ef06e80f	tests: fix a typo in testinfra for iscsigws and jewel scenario group name for iscsi-gw nodes in testing is `iscsi-gws`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-08-13 12:24:59 +02:00
Sébastien Han	8ea9d14050	osd: generate device list for osd_auto_discovery on rolling_update rolling_update relies on the list of devices when performing the restart of the OSDs. The task that is builind the devices list out of the ansible_devices dict only runs when there are no partitions on the drives. However during an upgrade the OSD are already configured, they have been prepared and have partitions so this task won't run and thus the devices list will be empty, skipping the restart during rolling_update. We now run the same task under different requirements when rolling_update is true and build a list when: * osd_auto_discovery is true * rolling_update is true * ansible_devices exists * no dm/lv are part of the discovery * the device is not removable * the device has more than 1 sector Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613626 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `e84f11e99e`)	2018-08-10 16:30:40 +02:00
Sébastien Han	4785799110	rolling_update: add role ceph-iscsi-gw Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1575829 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `e91648a7af`)	2018-08-10 14:38:19 +02:00
Sébastien Han	12083bdab4	mon: fix calamari initialisation If calamari is already installed and ceph has been upgraded to a higher version the initialisation will fail later. So if we detect the calamari-server is too old compare to ceph_rhcs_version we try to update it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1601755 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `4c9e24a90f`)	2018-08-10 14:15:16 +02:00
Sébastien Han	651058bd1b	rgw: remove useless condition The include does not need a condition on containerized_deployment since we are already in an include than has the same condition. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `5a89479abe`)	2018-08-09 15:38:17 +02:00
Sébastien Han	eba9547a6e	rgw: remove unused file copy_configs.yml was not including and is a leftover so let's remove it. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `3bce117de2`)	2018-08-09 15:38:17 +02:00
Sébastien Han	a16dc0e1de	rgw: ability to use ceph-ansible vars into containers Since the container now simply reads the ceph.conf, we remove all the unnecessary options. Also this PR is the foundation to support multiple backend, such as the new 'beast' from Ceph Mimic. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1582411 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `4d64dd4686`) # Conflicts: # roles/ceph-rgw/tasks/docker/main.yml	2018-08-09 15:38:17 +02:00
Ken Dreyer	1a2c6a3572	common: upgrade/install ceph-test deb first When we deploy a Jewel cluster on Ubuntu with ceph_test: True, we're unable to upgrade that cluster to Luminous. "apt-get install ceph-common" fails to upgrade to luminous if a jewel ceph-test package is installed: Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: ceph-base : Breaks: ceph-test (< 12.2.2-14) but 10.2.11-1xenial is to be installed ceph-mon : Breaks: ceph-test (< 12.2.2-14) but 10.2.11-1xenial is to be installed In ceph-ansible master, we resolve this whole class of problem by installing all the packages in one operation (see `b338fafd90`). For the stable-3.1 branch, take a less-invasive approach, and upgrade ceph-test prior to any other package. This matches the approach I took for RPMs in `3752cc6f38`, before we had the better solution in `b338fafd90`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1610997 Signed-off-by: Ken Dreyer <kdreyer@redhat.com>	2018-08-09 14:39:33 +02:00
Graeme Gillies	19958f5c27	Allow mgr bootstrap keyring to be defined In environments where we wish to have manual/greater control over how the bootstrap keyrings are used, we need to able to externally define what the mgr keyring secret will be and have ceph-ansible use it, instead of it being autogenerated Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1610213 Signed-off-by: Graeme Gillies <ggillies@akamai.com> (cherry picked from commit `a46025820d`)	2018-08-09 08:25:27 +00:00
Sébastien Han	b00d2d0439	Resync rhcs_edits.txt We were missing an option so let's add it back. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1519835 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `19518656a7`)	2018-08-08 15:54:32 +02:00
Sébastien Han	a31ce962f7	test: remove osd_crush_location from shrink scenarios This is not needed since this is already covered by docker_cluster and centos_cluster scenarios. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `50be3fd9e8`)	2018-08-07 19:09:58 +00:00
Sébastien Han	b76c7c3afe	test: follow up on osd_crush_location for containers This was fixed by `578aa5c2d5` on non-container, we need to apply the same fix for containers. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `77d4023fbe`)	2018-08-07 19:09:58 +00:00
Guillaume Abrioux	9403a3df09	iscsigw: install ceph-iscsi-cli package Install ceph-iscsi-cli in order to provide the `gwcli` command tool. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1602785 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1164cdc002`)	2018-08-07 09:46:25 +02:00
Artur Fijalkowski	290035171f	Fix in regular expression matching OSD ID on non-contenerized deployment. restart_osd_daemon.sh is used to discover and restart all OSDs on a host. To do it the scripts loops the list of ceph-osd@ services in the system. This commit fixes bug in the regular expression responsile for extraction of OSDs - prior version uses `[0-9]{1,2}` expression which is ignoring all OSDS which numbers are greater than 99 (thus longer than 2 digits). Fix removed upper limit of digits in the number. This problem existed in two places in the script. Closes: #2964 Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com> (cherry picked from commit `52d9d406b1`)	2018-08-06 18:50:39 +00:00
Guillaume Abrioux	706d0b8289	defaults: backward compatibility with fqdn deployments This commit ensures we are backward compatible with fqdn deployments. Since ceph-container enforces deployment to be done with shortname, we must keep backward compatibility with clusters already deployed with fqdn configuration Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a6ff6bbf8`)	2018-08-06 14:09:35 +00:00
Sébastien Han	31dd4eeecf	rolling_update: set osd sortbitwise upgrade RHCS 2 -> RHCS 3 will fail if cluster has still set sortnibblewise, it stay stuck on "TASK [waiting for clean pgs...]" as RHCS 3 osds will not start if nibblewise is set. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600943 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `b3266c5be2`)	2018-08-02 14:53:06 +00:00
Sébastien Han	2d5ed5ef8e	config: enforce socket name This was introduced by `59ee2e8d3b` and made our socket checks impossible to run. The PID could be found, but the cctid cannot. This happens during upgrade to mimic and on cluster running on mimic. So let's force the admin socket the way it was so we can properly check for existing instances also the line $cluster-$name.$pid.$cctid.asok is only needed when running multiple instances of the same daemon, thing ceph-ansible cannot do at the time of writing Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1610220 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `ea9e60d48d`)	2018-08-02 12:34:48 +00:00
Guillaume Abrioux	826da2c385	tests: support update scenarios in test_rbd_mirror_is_up() `test_rbd_mirror_is_up()` is failing on update scenarios because it assumes the `ceph_stable_release` is still set to the value of the original ceph release, it means it won't enter in the right part of the condition and fails. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d8281e50f1`)	2018-08-02 10:06:55 +00:00
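
The mgr commits above (95c28e78d1 and its follow-up 2c77e1ac4e) describe `disabled_modules` as always being a list, whose items are plain strings in older output and dicts with a `name` key in newer output. The following is a minimal Python sketch of handling both shapes. It is not the code from ceph-ansible: the function name `disabled_module_names` is hypothetical, and the sample data is shortened from the JSON quoted in the commit message.

```python
import json


def disabled_module_names(mgr_module_ls):
    """Return disabled module names, whatever shape the list items have.

    Hypothetical helper for illustration: items may be plain strings
    (older output) or dicts carrying a "name" key (newer output).
    """
    names = []
    for entry in mgr_module_ls.get("disabled_modules", []):
        if isinstance(entry, dict):
            names.append(entry["name"])
        else:
            names.append(entry)
    return names


# Samples shortened from the structures quoted in commit 95c28e78d1.
older = json.loads(
    '{"enabled_modules": ["status"],'
    ' "disabled_modules": ["balancer", "dashboard"]}'
)
newer = json.loads(
    '{"enabled_modules": ["status"], "disabled_modules": ['
    '{"name": "balancer", "can_run": true, "error_string": ""},'
    '{"name": "dashboard", "can_run": true, "error_string": ""}]}'
)

print(disabled_module_names(older))
print(disabled_module_names(newer))
```

Normalizing to a flat list of names first means a later membership check does not need to know which output shape produced the data.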

3771 Commits (988b5a81d3aafbe581436100d7b8c6ee7ea8ffe2)