During a rolling update, OSDs are restarted twice currently. Once, by the
handler in roles/ceph-defaults/handlers/main.yml and a second time by tasks
in the rolling_update playbook. This change turns off restarts by the handler.
Further, the restart initiated by the rolling_update playbook is more
efficient as it restarts all the OSDs on a host as one operation and waits
for them to rejoin the cluster. The restart task in the handler restarts one
OSD at a time and waits for it to join the cluster.
The bluestore lvm osd scenario does not require a journal entry. For
this reason we need to have a separate schema for that and filestore or
notario will fail validation for the bluestore lvm scenario because the
journal key does not exist in lvm_volumes.
Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit d916246bfeb927779fa920bab2e0cc736128c8a7)
A dev or rhcs install does not require ceph_stable_release to be set and
instead generates that by looking at the installed ceph-version.
However, at this point in the playbook ceph may not have been installed
yet and ceph-common has not be run.
Fixes: https://github.com/ceph/ceph-ansible/issues/2618
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
objectstore is not a valid option, it's osd_objectstore and it's already
validated in install_options
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The validation module does not get config options with the template
syntax rendered, so we're gonna remove that and just default it to
False. The backwards compat was schedule to be removed in 3.1 anyway.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
With the addition of the validate module we need to ensure
that notario is installed. This will be done with the use
of this requirments.txt file and pip.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
When devices is not defined because you want to use the 'lvm'
osd_scenario but you've made a mistake selecting that scenario these
tasks should not fail.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This needs to be in it's own play with ceph-defaults included
so that I can validate things that might be defaulted in that
role.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Extra space in systemctl list-units can cause restart_osd_daemon.sh to
fail
It looks like if you have more services enabled in the node space
between "loaded" and "active" get more space as compared to one space
given in command the command[1].
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1573317
Signed-off-by: Sébastien Han <seb@redhat.com>
Check whether a mgr module is supposed to be disabled before disabling
it and whether it is already enabled before enabling it.
Signed-off-by: Michael Vollman <michael.b.vollman@gmail.com>
A customer has been facing an issue when trying to override
`monitor_interface` in inventory host file.
In his use case, all nodes had the same interface for
`monitor_interface` name except one. Therefore, they tried to override
this variable for that node in the inventory host file but the
take-over-existing-cluster playbook was failing when trying to generate
the new ceph.conf file because of undefined variable.
Typical error:
```
fatal: [srvcto103cnodep01]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute u'ansible_bond0.15'"}
```
Including variables like this `include_vars: group_vars/all.yml` prevent
us from overriding anything in inventory host file because it
overwrites everything you would have defined in inventory.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1575915
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We can simply reference the template name since it exists within the
role that we are calling. We don't need to check the ANSIBLE_ROLE_PATH
or playbooks directory for the file.
During the transition from jewel non-container to container old ceph
units are disabled. ceph-disk can still remain in some cases and will
appear as 'loaded failed', this is not a problem although operators
might not like to see these units failing. That's why we remove them if
we find them.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1577846
Signed-off-by: Sébastien Han <seb@redhat.com>
In order to ensure there is no leftover after having purged a cluster,
we must wipe all partitions properly.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
there is some leftover on devices when purging osds because of a invalid
device list construction.
typical error:
```
changed: [osd3] => (item=/dev/sda sda1) => {
"changed": true,
"cmd": "# if the disk passed is a raw device AND the boot system disk\n if parted -s \"/dev/sda sda1\" print | grep -sq boot; then\n echo \"Looks like /dev/sda sda1 has a boot partition,\"\n echo \"if you want to delete specific partitions point to the partition instead of the raw device\"\n echo \"Do not use your system disk!\"\n exit 1\n fi\n echo sgdisk -Z \"/dev/sda sda1\"\n echo dd if=/dev/zero of=\"/dev/sda sda1\" bs=1M count=200\n echo udevadm settle --timeout=600",
"delta": "0:00:00.015188",
"end": "2018-05-16 12:41:40.408597",
"item": "/dev/sda sda1",
"rc": 0,
"start": "2018-05-16 12:41:40.393409"
}
STDOUT:
sgdisk -Z /dev/sda sda1
dd if=/dev/zero of=/dev/sda sda1 bs=1M count=200
udevadm settle --timeout=600
STDERR:
Error: Could not stat device /dev/sda sda1 - No such file or directory.
```
the devices list in the task `resolve parent device` isn't built
properly because the command used to resolve the parent device doesn't
return the expected output
eg:
```
changed: [osd3] => (item=/dev/sda1) => {
"changed": true,
"cmd": "echo /dev/$(lsblk -no pkname \"/dev/sda1\")",
"delta": "0:00:00.013634",
"end": "2018-05-16 12:41:09.068166",
"item": "/dev/sda1",
"rc": 0,
"start": "2018-05-16 12:41:09.054532"
}
STDOUT:
/dev/sda sda1
```
For instance, it will result with a devices list like:
`['/dev/sda sda1', '/dev/sdb', '/dev/sdc sdc1']`
where we expect to have:
`['/dev/sda', '/dev/sdb', '/dev/sdc']`
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>