During the tests, the remote epel repository generates a lot of
errors leading to broken jobs (issue #2666).
This patch is about using a local repository instead of a random one.
To achieve that, we make a preliminary install of epel-release, remove
the metalink and enforce a baseurl pointing to our local http mirror.
That should speed up the build process and also avoid the random errors
we face.
This patch is part of a patch series that tries to remove all possible yum failures.
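A minimal sketch of the approach (module choice, repo file path and the
mirror URL below are assumptions for illustration, not the exact tasks
from this patch):
```
- name: install epel-release first
  package:
    name: epel-release
    state: present

- name: drop the metalink so yum stops picking a random mirror
  ini_file:
    path: /etc/yum.repos.d/epel.repo
    section: epel
    option: metalink
    state: absent

- name: enforce a baseurl pointing to the local http mirror
  ini_file:
    path: /etc/yum.repos.d/epel.repo
    section: epel
    option: baseurl
    value: "http://local-mirror.example.com/epel/$releasever/$basearch/"  # hypothetical mirror URL
```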
Signed-off-by: Erwan Velu <erwan@redhat.com>
(cherry picked from commit 493f615eae)
We can simply reference the template name since it exists within the
role that we are calling. We don't need to check the ANSIBLE_ROLE_PATH
or playbooks directory for the file.
Signed-off-by: Lionel Sausin <ls@initiatives.fr>
The first 14.x tag has been cut, so this needs to be added so that
version detection still works on the master branch of ceph.
Fixes: https://github.com/ceph/ceph-ansible/issues/2671
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit c2423e2c48)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since openstack_config.yml has been moved to `ceph-osd`, we must move
this `set_fact` into ceph-osd, otherwise the tasks in
`openstack_config.yml` using `openstack_keys` will actually use the
default value from `ceph-defaults`.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1585139
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aae37b44f5)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This is a follow-up to #2628.
Even with the openstack pools creation moved later in the playbook,
there is still an issue because OSDs are not all UP when trying to
create pools.
Adding a task which checks that all OSDs are UP with a `retries/until`
condition should definitively fix this issue.
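A sketch of what such a check could look like (the JSON field names and
the exact command are assumptions, not necessarily the task added here):
```
- name: wait for all osds to be up
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd stat -f json"
  register: osd_stat
  delegate_to: "{{ groups[mon_group_name][0] }}"
  retries: 60
  delay: 10
  until: >
    (osd_stat.stdout | from_json)['num_osds'] | int > 0 and
    (osd_stat.stdout | from_json)['num_osds'] == (osd_stat.stdout | from_json)['num_up_osds']
```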
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9d5265fe11)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Fix a typo in the `tag` target: double quotes are missing here.
Without them, the `make tag` command fails like this:
```
if [[ "v3.0.35" == ]]; then \
echo "e5f2df8 on stable-3.0 is already tagged as v3.0.35"; \
exit 1; \
fi
/bin/sh: -c: line 0: unexpected argument `]]' to conditional binary operator
/bin/sh: -c: line 0: syntax error near `;'
/bin/sh: -c: line 0: `if [[ "v3.0.35" == ]]; then echo "e5f2df8 on stable-3.0 is already tagged as v3.0.35"; exit 1; fi'
make: *** [tag] Error 2
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0b67f42feb)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a new "make tag" command. This automates some common operations:
1) Automatically determine the next Git tag version number to create.
For example:
"3.2.0beta1 -> "3.2.0beta2"
"3.2.0rc1 -> "3.2.0rc2"
"3.2.0" -> "3.2.1"
2) Create the Git tag, and print instructions for the user to push it to
GitHub.
3) Sanity check that HEAD is a stable-* branch or master (bail on
everything else).
4) Sanity check that HEAD is not already tagged.
Note, we will still need to tag manually once each time we change the
format, for example when moving from tagging "betas" to tagging "rcs",
or "rcs" to "stable point releases".
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fcea568495)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
You can now use RGW_ZONE and RGW_ZONEGROUP on each rgw host in your
inventory and assign them a value. Once the rgw container starts, it'll
pick up this information and add itself to the right zone.
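For example, a YAML inventory could carry these per-host values like
this (host and zone names below are made up):
```
rgws:
  hosts:
    rgw0:
      RGW_ZONE: us-east-1
      RGW_ZONEGROUP: us
    rgw1:
      RGW_ZONE: us-east-2
      RGW_ZONEGROUP: us
```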
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1551637
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 1c084efb3c)
Signed-off-by: Sébastien Han <seb@redhat.com>
It should have been backported from 29a9dff, but for better clarity I
think it's better to create a new commit for this.
c68126d6 aims to not make the `pgs` attribute mandatory for each element
of `cephfs_pools`. Therefore, we must remove the check in
`roles/ceph-mon/tasks/check_mandatory_vars.yml`.
This task has been removed by 29a9dff, but I've chosen not to backport
that commit since it's part of a bunch of commits belonging to a PR
implementing the `ceph-validate` role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When playing the ceph-mds role, mon nodes have already set a fact with
the default pg num for osd pools, so we can simply default to this value
for cephfs pools (`cephfs_pools` variable).
At the moment the variable definition for `cephfs_pools` looks like:
```
cephfs_pools:
- { name: "{{ cephfs_data }}", pgs: "" }
- { name: "{{ cephfs_metadata }}", pgs: "" }
```
and we have a task in `ceph-validate` to ensure `pgs` has been set to a
valid value.
We could simply avoid this check by setting the default value of `pgs`
to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and
leave users the possibility to override this value.
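In other words, the defaulted definition could look roughly like this
(still overridable by the user):
```
cephfs_pools:
  - { name: "{{ cephfs_data }}", pgs: "{{ hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num'] }}" }
  - { name: "{{ cephfs_metadata }}", pgs: "{{ hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num'] }}" }
```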
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c68126d6fd)
`requirements2.5.txt` is pointing to `tests/requirements2.4.txt` while
it should point to `requirements2.4.txt` since they are in the same
directory.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6f489015e4)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In `ceph-osd` there is no need to set `docker_exec_cmd` since the only
place where this fact is used is in `openstack_config.yml`, which
delegates all docker commands to a monitor node. It means we need the
`docker_exec_cmd` fact that has been set referring to the `ceph-mon-*`
containers; this fact is already set earlier in `ceph-defaults`.
Moreover, when collocating an OSD with a MON it fails because the
container `ceph-osd-{{ ansible_hostname }}` doesn't exist.
Removing this task allows collocating an OSD with a MON.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1584179
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 34e646e767)
For some time now we have seen failures in the CI for containerized
scenarios because VMs are running out of space at some point.
The default in the images used is to have only 3GB for the root
partition, which doesn't sound like a lot.
Typical error seen:
```
STDERR:
failed to register layer: Error processing tar file(exit status 1): open /usr/share/zoneinfo/Atlantic/Canary: no space left on device
```
Indeed, on the machine we can see:
```
Every 2.0s: df -h Tue May 29 17:21:13 2018
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/atomicos-root 3.0G 3.0G 14M 100% /
```
The idea here is to expand this partition with all the available space
remaining by issuing an `lvresize` followed by an `xfs_growfs`.
```
-bash-4.2# lvresize -l +100%FREE /dev/atomicos/root
Size of logical volume atomicos/root changed from <2.93 GiB (750 extents) to 9.70 GiB (2484 extents).
Logical volume atomicos/root successfully resized.
```
```
-bash-4.2# xfs_growfs /
meta-data=/dev/mapper/atomicos-root isize=512 agcount=4, agsize=192000 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0 spinodes=0
data = bsize=4096 blocks=768000, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
data blocks changed from 768000 to 2543616
```
```
-bash-4.2# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/atomicos-root 9.7G 1.4G 8.4G 14% /
```
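Expressed as Ansible tasks, the provisioning step could look like the
sketch below (task names and the failed_when handling are assumptions;
the real change may differ):
```
- name: resize the root logical volume with all remaining free space
  command: lvresize -l +100%FREE /dev/atomicos/root
  failed_when: false  # lvresize exits non-zero when there is no free space left to claim

- name: grow the xfs filesystem to match the new LV size
  command: xfs_growfs /
```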
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 34f7042852)
In the CI we can see, from time to time, failures like the following:
`Failure talking to yum: Cannot find a valid baseurl for repo:
base/7/x86_64`
It seems the fastest mirror detection is sometimes counterproductive and
leads yum to fail.
This fix has been added in `setup.yml`.
Until now this playbook was only used just before running `testinfra`,
but it could also be run before ceph-ansible, so we can add some
provisioning tasks to it.
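A minimal sketch of such a provisioning task, assuming the stock CentOS
fastestmirror plugin config path (the actual content of `setup.yml` may
differ):
```
- name: disable the yum fastestmirror plugin
  lineinfile:
    path: /etc/yum/pluginconf.d/fastestmirror.conf
    regexp: '^enabled='
    line: 'enabled=0'
```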
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Erwan Velu <evelu@redhat.com>
(cherry picked from commit 98cb6ed8f6)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When collocating an mds on a monitor node, the cephfs creation will fail
because `docker_exec_cmd` is reset to `ceph-mds-monXX`, which is
incorrect because we need to delegate the task to `ceph-mon-monXX`.
In addition, it wouldn't have worked anyway since the `ceph-mds-monXX`
container isn't started yet.
Moving the task earlier in the `ceph-mds` role will fix this issue.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 608ea947a9)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Without the escalation, invocation from non-root
users will fail when accessing the rados config
object, or when attempting to log to /var/log.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1549004
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit 2890b57cfc)
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we fixed the `gather and delegate facts` task, this exception is
not needed anymore. It's a leftover that should be removed to save some
time when deploying a cluster with a large number of clients.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 828848017c)
Signed-off-by: Sébastien Han <seb@redhat.com>
We're doing this so we can validate it in the ceph-validate role.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1f15a81c48)
The previous commit changed the content of roles/$ROLE/default/main.yml
so we have to regenerate the group_vars files.
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 3c32280ca1)
Signed-off-by: Sébastien Han <seb@redhat.com>
When deploying a large number of OSD nodes this can be an issue because
the protection check [1] won't pass, since we try to create pools before
all OSDs are active.
The idea here is to move cephfs pools creation in `ceph-mds` role.
[1] e59258943b/src/mon/OSDMonitor.cc (L5673)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3a0e168a76)
Signed-off-by: Sébastien Han <seb@redhat.com>
Let's move this variable into group_vars/all.yml in all testing
scenarios, in accordance with commit 1f15a81c48, so
we keep consistency between the playbook and the tests.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a10e73d78d)
Signed-off-by: Sébastien Han <seb@redhat.com>
When deploying a large number of OSD nodes this can be an issue because
the protection check [1] won't pass, since we try to create pools before
all OSDs are active.
The idea here is to move openstack pools creation at the end of `ceph-osd` role.
[1] e59258943b/src/mon/OSDMonitor.cc (L5673)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 564a662baf)
Signed-off-by: Sébastien Han <seb@redhat.com>
6644dba5e3 and
1f15a81c48 introduced some changes
in the defaults variables files but it seems we've forgotten to
regenerate the sample files.
This commit aims to resync the content of `all.yml.sample`,
`mons.yml.sample` and `rhcs.yml.sample`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f8260119cd)
Signed-off-by: Sébastien Han <seb@redhat.com>
The NSS PKI database is needed only if radosgw_keystone_ssl
is explicitly set to true, otherwise the SSL integration is
not enabled.
It is worth noting that the PKI support was removed from Keystone
starting from the Ocata release, so some code paths should be
changed anyway.
Also, remove radosgw_keystone, which is not useful anymore.
This variable was used until fcba2c801a.
Now profiles drive the setting of the rgw keystone * options.
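The practical effect on the NSS part is that the PKI database tasks are
now gated on the flag, roughly like this sketch (the directory variable
name is an assumption):
```
- name: create the NSS PKI database directory for keystone
  file:
    path: "{{ radosgw_nss_db_path }}"  # assumed variable name for the NSS db location
    state: directory
  when: radosgw_keystone_ssl | bool
```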
Signed-off-by: Luigi Toscano <ltoscano@redhat.com>
(cherry picked from commit 43e96c1f98)
Signed-off-by: Sébastien Han <seb@redhat.com>
The LVM lvcreate fails if the disk already has a GPT header.
We create a GPT header regardless of the OSD scenario. The fix is to
skip header creation for the lvm scenario.
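A sketch of the skip (the exact task and device list variable may differ
in the real fix):
```
- name: create gpt disk label on raw devices
  command: parted --script {{ item }} mklabel gpt
  with_items: "{{ devices }}"
  when: osd_scenario != 'lvm'
```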
fixes: https://github.com/ceph/ceph-ansible/issues/2592
Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
(cherry picked from commit ef5f52b1f3)
Signed-off-by: Sébastien Han <seb@redhat.com>
When running ansible2.4-update_docker_cluster there is an issue on the
"get current fsid" task. The current task only works for
non-containerized deployment but will run all the time (even for
containerized). This currently results in the following error:
TASK [get current fsid] ********************************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-luminous-ansible2.4-update_docker_cluster/rolling_update.yml:214
Tuesday 22 May 2018 22:48:32 +0000 (0:00:02.615) 0:11:01.035 ***********
fatal: [mgr0 -> mon0]: FAILED! => {
"changed": true,
"cmd": [
"ceph",
"--cluster",
"test",
"fsid"
],
"delta": "0:05:00.260674",
"end": "2018-05-22 22:53:34.555743",
"rc": 1,
"start": "2018-05-22 22:48:34.295069"
}
STDERR:
2018-05-22 22:48:34.495651 7f89482c6700 0 -- 192.168.17.10:0/1022712 >> 192.168.17.12:6789/0 pipe(0x7f8944067010 sd=4 :42654 s=1 pgs=0 cs=0 l=1 c=0x7f894405d510).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000
2018-05-22 22:48:34.495684 7f89482c6700 0 -- 192.168.17.10:0/1022712 >> 192.168.17.12:6789/0 pipe(0x7f8944067010 sd=4 :42654 s=1 pgs=0 cs=0 l=1 c=0x7f894405d510).fault
This is not really representative of the real error since the 'ceph' cli is available on that machine.
On other environments we would get something like "command not found: ceph".
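Either way, the task should only run for non-containerized deployments;
a sketch of the guard, assuming the usual `containerized_deployment`
flag (the containerized path is handled separately):
```
- name: get current fsid
  command: "ceph --cluster {{ cluster }} fsid"
  register: current_fsid
  delegate_to: "{{ groups[mon_group_name][0] }}"
  when: not containerized_deployment
```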
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit da5b104098)
During a rolling update, OSDs are restarted twice currently. Once, by the
handler in roles/ceph-defaults/handlers/main.yml and a second time by tasks
in the rolling_update playbook. This change turns off restarts by the handler.
Further, the restart initiated by the rolling_update playbook is more
efficient as it restarts all the OSDs on a host as one operation and waits
for them to rejoin the cluster. The restart task in the handler restarts one
OSD at a time and waits for it to join the cluster.
(cherry picked from commit c7e269fcf5)
Signed-off-by: Sébastien Han <seb@redhat.com>
During the transition from jewel non-container to container, old ceph
units are disabled. ceph-disk units can still remain in some cases and
will appear as 'loaded failed'; this is not a problem, although
operators might not like to see these units failing. That's why we
remove them if we find them.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1577846
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 49a4712485)
Signed-off-by: Sébastien Han <seb@redhat.com>
In order to ensure there is no leftover after having purged a cluster,
we must wipe all partitions properly.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a9247c4de7)
There is some leftover on devices when purging osds because of an
invalid device list construction.
Typical error:
```
changed: [osd3] => (item=/dev/sda sda1) => {
"changed": true,
"cmd": "# if the disk passed is a raw device AND the boot system disk\n if parted -s \"/dev/sda sda1\" print | grep -sq boot; then\n echo \"Looks like /dev/sda sda1 has a boot partition,\"\n echo \"if you want to delete specific partitions point to the partition instead of the raw device\"\n echo \"Do not use your system disk!\"\n exit 1\n fi\n echo sgdisk -Z \"/dev/sda sda1\"\n echo dd if=/dev/zero of=\"/dev/sda sda1\" bs=1M count=200\n echo udevadm settle --timeout=600",
"delta": "0:00:00.015188",
"end": "2018-05-16 12:41:40.408597",
"item": "/dev/sda sda1",
"rc": 0,
"start": "2018-05-16 12:41:40.393409"
}
STDOUT:
sgdisk -Z /dev/sda sda1
dd if=/dev/zero of=/dev/sda sda1 bs=1M count=200
udevadm settle --timeout=600
STDERR:
Error: Could not stat device /dev/sda sda1 - No such file or directory.
```
The devices list in the task `resolve parent device` isn't built
properly because the command used to resolve the parent device doesn't
return the expected output,
e.g.:
```
changed: [osd3] => (item=/dev/sda1) => {
"changed": true,
"cmd": "echo /dev/$(lsblk -no pkname \"/dev/sda1\")",
"delta": "0:00:00.013634",
"end": "2018-05-16 12:41:09.068166",
"item": "/dev/sda1",
"rc": 0,
"start": "2018-05-16 12:41:09.054532"
}
STDOUT:
/dev/sda sda1
```
For instance, it will result in a devices list like:
`['/dev/sda sda1', '/dev/sdb', '/dev/sdc sdc1']`
where we expect to have:
`['/dev/sda', '/dev/sdb', '/dev/sdc']`
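One way to make sure a single parent name is returned per partition is
to keep only the first line of the `lsblk` output; this is illustrative
and not necessarily the exact command used by the fix (the loop variable
name is made up):
```
- name: resolve parent device
  shell: echo /dev/$(lsblk -no pkname "{{ item }}" | head -n1)
  register: resolved_parent_device
  with_items: "{{ partitions_to_wipe }}"  # hypothetical variable name
```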
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9cad113e2f)
Extra spaces in the `systemctl list-units` output can cause
restart_osd_daemon.sh to fail.
It looks like if you have more services enabled on the node, the space
between "loaded" and "active" becomes wider than the single space
expected by the command [1].
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1573317
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 2f43e9dab5)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Check whether a mgr module is supposed to be disabled before disabling
it and whether it is already enabled before enabling it.
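A sketch of the enable-side check (the JSON key name is an assumption
about the `ceph mgr module ls` output):
```
- name: list mgr modules
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} mgr module ls -f json"
  register: _mgr_modules
  delegate_to: "{{ groups[mon_group_name][0] }}"

- name: enable mgr modules that are not enabled yet
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} mgr module enable {{ item }}"
  with_items: "{{ ceph_mgr_modules }}"
  delegate_to: "{{ groups[mon_group_name][0] }}"
  when: item not in (_mgr_modules.stdout | from_json)['enabled_modules']
```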
Signed-off-by: Michael Vollman <michael.b.vollman@gmail.com>
(cherry picked from commit ed050bf3f6)
Signed-off-by: Sébastien Han <seb@redhat.com>
A customer has been facing an issue when trying to override
`monitor_interface` in the inventory host file.
In their use case, all nodes had the same interface name for
`monitor_interface` except one. Therefore, they tried to override
this variable for that node in the inventory host file, but the
take-over-existing-cluster playbook was failing when trying to generate
the new ceph.conf file because of an undefined variable.
Typical error:
```
fatal: [srvcto103cnodep01]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute u'ansible_bond0.15'"}
```
Including variables like this, `include_vars: group_vars/all.yml`,
prevents us from overriding anything in the inventory host file because
it overwrites everything you would have defined in the inventory.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1575915
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 415dc0a29b)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
During a minor update from a jewel version to a higher jewel version
(10.2.9 to 10.2.10 for example), osd flags don't get applied because
they were set in the mgr section, which is skipped in jewel since this
daemon does not exist.
Moving the set flag section after all the mons have been updated solves
that problem.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1548071
Co-authored-by: Tomas Petr <tpetr@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit d80a871a07)
Trying to set the default value for pg_num to
`hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` will
break in the case of an external client nodes deployment.
The `pg_num` attribute should be mandatory and be tested in the future
`ceph-validate` role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f60b049ae5)
Signed-off-by: Sébastien Han <seb@redhat.com>
Until all the mons have been updated to Luminous, there is no way to
create a key. So we should do the key creation in the mon role only if
we are not part of an update.
If we are, then the key creation is done after the mons upgrade to
Luminous.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 52fc8a0385)
Signed-off-by: Sébastien Han <seb@redhat.com>
The `ceph-mgr` role that is played later in the playbook fails because
the destination path for the fetched keys is wrong.
This patch fixes the destination path used in the task `fetch ceph mgr
key(s)` so there is no mismatch.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1b4c3f292d)
Signed-off-by: Sébastien Han <seb@redhat.com>
Trying to mask a target when `/etc/systemd/system/target.service`
doesn't exist seems to be a bug.
There is no need to mask a unit file which doesn't exist.
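A minimal sketch of the guard, using a `stat` before the mask (the unit
name below is just a placeholder, as in the paragraph above):
```
- name: check whether the unit file exists
  stat:
    path: /etc/systemd/system/target.service  # placeholder unit name
  register: target_unit_file

- name: mask the unit only when its file is present
  systemd:
    name: target.service
    masked: yes
  when: target_unit_file.stat.exists
```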
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a145caf947)
The order of fs.aio-max-nr (which is hard-coded to 1048576) means that
if you set fs.aio-max-nr in os_tuning_params it will effectively be
ignored for bluestore scenarios.
To resolve this we should move the setting of fs.aio-max-nr above the
setting of os_tuning_params, in this way the operator can define the
value of fs.aio-max-nr to be something other than 1048576 if they want
to.
Additionally, we can make the sysctl settings happen in 1 task rather
than multiple.
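One possible shape for the single task: loop over the hard-coded default
first and `os_tuning_params` last, so an operator-supplied
`fs.aio-max-nr` is applied afterwards and wins (a sketch, not the exact
task):
```
- name: apply operating system tuning in one task
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
  with_items: "{{ [{'name': 'fs.aio-max-nr', 'value': 1048576}] + os_tuning_params }}"
```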
(cherry picked from commit 08a2b58d39)
In the tasks for os_family Red Hat we were missing this.
fixes: bz1575859
Signed-off-by: Gregory Meno <gmeno@redhat.com>
(cherry picked from commit 26f6a65042)
Signed-off-by: Sébastien Han <seb@redhat.com>
On containerized deployments,
when upgrading from jewel to luminous, mgr keyring creation fails
because the command to create the mgr keyring is executed in a container
that is still running jewel, since the container is restarted later to
run the new image; therefore, it fails with a 'bad entity' error.
To get around this situation, we can delegate the command that creates
these keyrings to the first monitor when we are running the playbook on
the last monitor.
That way we ensure the command is issued in a container that has already
been restarted with the new image.
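A sketch of the delegation logic; the auth caps shown are the standard
mgr caps, but the concrete task arguments in the playbook differ:
```
- name: create ceph mgr keyring(s) from a mon already running the new image
  command: >
    {{ hostvars[groups[mon_group_name][0]]['docker_exec_cmd'] }}
    ceph --cluster {{ cluster }} auth get-or-create
    mgr.{{ hostvars[item]['ansible_hostname'] }}
    mon 'allow profile mgr' osd 'allow *' mds 'allow *'
  with_items: "{{ groups[mgr_group_name] | default([]) }}"
  delegate_to: "{{ groups[mon_group_name][0] }}"
  when: inventory_hostname == groups[mon_group_name] | last
```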
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The Debian and SuSE installs of nfs-ganesha from the non-rhcs repository
require allow_unauthenticated for Debian and disable_gpg_check
for SuSE. The nfs-ganesha-rgw package already does this, but the
nfs-ganesha-ceph package will fail to install because of this same
issue.
This PR moves the installations so they happen when the appropriate
flags are set to True (nfs_obj_gw & nfs_file_gw), but does it per distro
(one for SuSE and one for Debian) so that the appropriate flag can be
passed to ignore the GPG check.
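A sketch of the per-distro split (package and flag names follow the
description above; the real tasks may be organized differently):
```
- name: install nfs-ganesha-ceph on Debian
  apt:
    name: nfs-ganesha-ceph
    state: present
    allow_unauthenticated: yes
  when:
    - ansible_os_family == 'Debian'
    - nfs_file_gw | bool

- name: install nfs-ganesha-ceph on SuSE
  zypper:
    name: nfs-ganesha-ceph
    state: present
    disable_gpg_check: yes
  when:
    - ansible_os_family == 'Suse'
    - nfs_file_gw | bool
```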