On containerized deployment, the OSD entrypoint runs some ceph-volume
commands (lvm/simple scan and/or activate) which perform badly without
the ulimit option.
This option was added for all previous ceph-volume commands but not on
the ceph-osd container startup.
Also updating hard limit value to 4096 to reflect default baremetal
value.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
just like `ceph_osd_pool_default_size`, a pool size might change after an
initial deployment. Having this condition prevents from customizing the
pool in that case.
This is not needed so let's remove it.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
let's use `until` instead of doing test in bash using python oneliner
also, use `command` instead of `shell`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
the data structure has changed in octopus.
eg: the path to `num_osds` is now `["osdmap"]["num_osds"]`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When creating OpenStack pools, we only check if the return code from
the pool list command isn't 0 (ie: if it doesn't exist). In that case,
the return code will be 2. That's why the next condition is rc != 0 for
the pool creation.
But in containerized deployment, the return code could be different if
there's a failure on the container engine command (like container not
running). In that case, the return code could but either 1 (docker) or
125 (podman) so we should fail at this point and not in the next tasks.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This environment variable was added in cb381b4 but was removed in
4d35e9e.
This commit reintroduces the change.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The ceph-volume lvm list command takes ages to complete when having
a lot of LV devices on containerized deployment.
For instance, with 25 OSDs on a node it takes 3 mins 44s to list the
OSD.
Adding the max open files limit to the container engine cli when
executing the ceph-volume command seems to improve a lot thee
execution time ~30s.
This was impacting the OSDs creation with ceph-volume (both filestore
and bluestore) when using multiple LV devices.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We already set the become flag to true at a play level in the site*
playbooks so we don't need to set it at a task level.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
`parted_results` isn't used anymore in the playbook.
By the way, `parted` seems to cause issue because it changes the
ownership on devices:
```
root@osd0 ~]# ls -l /dev/sdc*
brw-rw----. 1 root disk 8, 32 Jun 11 08:53 /dev/sdc
brw-rw----. 1 ceph ceph 8, 33 Jun 11 08:53 /dev/sdc1
brw-rw----. 1 ceph ceph 8, 34 Jun 11 08:53 /dev/sdc2
[root@osd0 ~]# parted -s /dev/sdc print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdc: 53.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 1075MB 1074MB ceph block.db
2 1075MB 2149MB 1074MB ceph block.db
[root@osd0 ~]# #We can see ownerships have changed from ceph:ceph to root:disk:
[root@osd0 ~]# ls -l /dev/sdc*
brw-rw----. 1 root disk 8, 32 Jun 11 08:57 /dev/sdc
brw-rw----. 1 root disk 8, 33 Jun 11 08:57 /dev/sdc1
brw-rw----. 1 root disk 8, 34 Jun 11 08:57 /dev/sdc2
[root@osd0 ~]#
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When using podman, the systemd unit scripts don't have a dependency
on the network. So we're not sure that the network is up and running
when the containers are starting.
With docker this behaviour is already handled because the systemd
unit scripts depend on docker service which is started after the
network.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```
Now appended ``| bool`` on a lot of the affected variables.
Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.
Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
Otherwise content in /run/udev is mislabeled and prevent some services
like NetworkManager from starting.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Except for some corner case, it's not correct to access some other
node's copy of variable docker_exec_cmd. Therefore replace
"hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" by
"docker_exec_cmd".
Signed-off-by: Rishabh Dave <ridave@redhat.com>
In containerized deployment the default osd cpu quota is too low
for production environment using NVMe devices.
This is causing performance degradation compared to bare-metal.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When performing a rolling update do not try to create
any new osds with `ceph-volume lvm batch`. This is troublesome
because when upgrading to nautilus the devices list might contain
devices that are currently being used by ceph-disk and have GPT
headers on them, which will cause ceph-volume to fail when
trying to use such a device. Any devices originally created
by ceph-disk will need to be removed from the devices list
before any new osds can be created.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This variable was related to ceph-disk scenarios.
Since we are entirely dropping ceph-disk support as of stable-4.0, let's
remove this variable.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
osd_scenario has become obsolete and defaults to lvm. With lvm there is
no such things has collocated and non-collocated.
Signed-off-by: Sébastien Han <seb@redhat.com>
We don't support the preparation of OSD with ceph-disk. ceph-volume is
only supported. However, the start operation of OSD is still supported.
So let's say you change a config option, the handlers will be able to
restart all the OSDs via their respective systemd unit files.
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Since https://github.com/ceph/ceph/commit/77912c0 ceph-volume uses
stdout encoding based on LC_CTYPE and PYTHONIOENCODING environment
variables.
Thoses variables aren't set when using ansible.
Currently this commit breaks non containerized deployment on Ubuntu.
TASK [use ceph-volume to create bluestore osds] ********************
cmd:
- ceph-volume
- --cluster
- ceph
- lvm
- create
- --bluestore
- --data
- /dev/sdb
rc: 1
stderr: |-
Traceback (most recent call last):
(...)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in
position 132: ordinal not in range(128)
Note that the task is failing on ansible side due to the stdout
decoding but the osd creation is successful.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This prevents the packaging from restarting services before we do need
to restart them in the rolling update sequence.
We want to handle services restart at rolling_update playbook.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When using osd_scenario lvm, we never check if the lvm2 package is
present on the host.
When using containerized deployment and docker on CentOS/RedHat this
package will be automatically installed as a dependency but not for
Ubuntu distribution.
OSD deployed via ceph-volume require the lvmetad.socket to be active
and running.
Resolves: #3728
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since all files in container image have moved to `/opt/ceph-container`
this check must look for new AND the old path so it's backward
compatible. Otherwise it could end up by templating an inconsistent
`ceph-osd-run.sh`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
With 3e32dce we can run OSD containers with numactl support.
When using numactl command in a containerized deployment we need to
be sure that the corresponding package is installed on the host.
The package installation is only executed when the
ceph_osd_numactl_opts variable isn't empty.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We don't need to set After=docker.service when the container_binary
variable isn't set to docker.
It doesn't break anything currently but it could be confusing when
using podman.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
After b8d580b and e9e5d5a we could have either item.min_size or
osd_pool_default_min_size using string instead of int causing the
condition to be true when it's false.
As a result, the task could try to set the pool min_size value to
0 which leads to:
Error EINVAL: pool min_size must be between 1 and 1
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
b8d580b3f4 introduced a bug when
`min_size` isn't set (default to 0).
Typical error:
```
Error EINVAL: pool min_size must be between 1 and 1
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The following lint issues have been resolved:
[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:2
[305] Use shell only when shell functionality is required
/home/travis/build/ceph/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:47
[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:2
[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:7
[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:14
[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:19
[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:24
Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
The "get osd ids" statement only registers the osd_ids_non_container variable. Running "ls /var/lib/ceph/osd/ | sed 's/.*-//'" should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible.
Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
introduce two new variables to make the check that 'wait for all osd to
be up' configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing.
The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0).
This is related to Bugzilla 1578086.
In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.
Signed-off-by: David Waiting <david_waiting@comcast.com>
This reverts commit bb2bbeb941.
Looks like when not passing `--pid=host` we are facing some issues when
deploying more than 2 OSDs in containerized environment.
At the moment, we are still troubleshooting this issue but we prefer to
revert this commit so it doesn't block any PR in the CI.
As soon as we have a fix; we will push a new PR to remove `--pid=host`
(a revert of revert...)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
instead of using `RuntimeDirectory` parameter in systemd unit files,
let's use a systemd `tmpfiles.d` to ensure `/run/ceph`.
Explanation:
`podman` doesn't create the `/var/run/ceph` if it doesn't exist the time
where the container is run while `docker` used to create it.
In case of `switch_to_containers` scenario, `/run/ceph` gets created by
a tmpfiles.d systemd file; when switching to containers, the systemd
unit file complains because `/run/ceph` already exists
The better fix would be to ensure `/usr/lib/tmpfiles.d/ceph-common.conf`
is removed and only rely on `RuntimeDirectory` from systemd unit file parameter
but we come from a non-containerized environment which is already running,
it means `/run/ceph` is already created and when starting the unit to
start the container, systemd will still complain and we can't simply
remove the directory if daemons are collocated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
/var/run/ceph resides in a non persistent filesystem (tmpfs)
After a reboot, all daemons won't start because this directory will be
missing.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This task used to live in ceph-osd, but we need it defined here to that
ceph-config can use it when trying to determine the number of osds.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The code is now able (again) to start osds that where configured with
ceph-disk on a non-container scenario.
Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 452069cb3a)
Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.
Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activate in this case detects the osd objecstore which
prevents failures like the one described above.
Signed-off-by: Sébastien Han <seb@redhat.com>
If an existing cluster runs this config, and has ceph-disk OSD, the
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit unifies the container and non-container code, which in the
meantime gives use the ability to deploy N mon container at the same
time without having to serialized the deployment. This will drastically
reduces the time needed to bootstrap the cluster.
Note, this is only possible since Nautilus because the monitors are
bootstrap the initial keys on their own once they reach quorum. In the
Nautilus version of the ceph-container mon, we stopped generating the
keys 'manually' from inside the container, for more detail see: https://github.com/ceph/ceph-container/pull/1238
Signed-off-by: Sébastien Han <seb@redhat.com>
This is false, `./defaults/main.yml` is not supposed to be modified
directly. groups_vars a/o host_vars should always be preferred.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In order to be able to retrieve udev information, we must expose its
socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will
start consuming udev output.
Signed-off-by: Sébastien Han <seb@redhat.com>
Add real default value for osd pool size customization.
Ceph itself has an `osd_pool_default_size` default value to `3`.
If users don't specify a pool size in various pools definition within
ceph-ansible, we should default to `3`.
By the way, this kind of condition isn't really clear:
```
when:
- rbd_pool_size | default ("")
```
we should try to get the customized value then default to what is in
`osd_pool_default_size` (which has its default value pointing to
`ceph_osd_pool_default_size` (`3`) as well) and compare it to
`ceph_osd_pool_default_size`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When using ceph-ansible with `--limit` on a specifc group of nodes, it
will fail when trying to access this variables since it wouldn't be
defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If you use python3 based ansible then keys() returns a dict_keys object,
not a list of keys. This breaks the installation on such a system. Using
the list filter provides a more robust solution that should work on both
python2 and python3 based ansible. You can find some more information
about the issue, here:
https://github.com/ansible/ansible/issues/19514
Signed-off-by: Boris Ranto <branto@redhat.com>
since `ceph-volume` introduction, there is no need to split those tasks.
Let's refact this part of the code so it's clearer.
By the way, this was breaking rolling_update.yml when `openstack_config:
true` playbook because nothing ensured OSDs were started in ceph-osd role (In
`openstack_config.yml` there is a check ensuring all OSD are UP which was
obviously failing) and resulted with OSDs on the last OSD node not started
anyway.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
description = 'Use `when: var` rather than `when: var != ""` (or ' \ 'conversely `when: not var` rather than `when: var == ""`)'
Signed-off-by: Sébastien Han <seb@redhat.com>
Update the meta with the relavant support such as:
* ansible version: min 2.4
* distro supported (tested on) centos 7
Signed-off-by: Sébastien Han <seb@redhat.com>
Calling command should have changed_when false otherwise each time it
runs it will show as 'changed' and this is irrelevant.
Commands should not change things if nothing needs doing
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we do not have enough data to put valid upper bounds for the memory
usage of these daemons, do not put artificial limits by default. This will
help us avoid failures like OOM kills due to low default values.
Whenever required, these limits can be manually enforced by the user.
More details in
https://bugzilla.redhat.com/show_bug.cgi?id=1638148
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1638148
Signed-off-by: Neha Ojha <nojha@redhat.com>
The playbook has various improvements:
* run ceph-validate role before doing anything
* run ceph-fetch-keys only on the first monitor of the inventory list
* set noup flag so PGs get distributed once all the new OSDs have been
added to the cluster and unset it when they are up and running
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1624962
Signed-off-by: Sébastien Han <seb@redhat.com>
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit does a couple of things:
* Avoid code duplication
* Clarify the code
* add more unit tests
* add myself to the author of the module
Signed-off-by: Sébastien Han <seb@redhat.com>
This task was created for ceph-disk based deployments so it's not needed
when osd are prepared with ceph-volume.
Signed-off-by: Sébastien Han <seb@redhat.com>
We don't need to pass the device and discover the OSD ID. We have a
task that gathers all the OSD ID present on that machine, so we simply
re-use them and activate them. This also handles the situation when you
have multiple OSDs running on the same device.
Signed-off-by: Sébastien Han <seb@redhat.com>
We don't need to pass the hostname on the container name but we can keep
it simple and just call it ceph-osd-$id.
Signed-off-by: Sébastien Han <seb@redhat.com>
expose_partitions is only needed on ceph-disk OSDs so we don't need to
activate this code when running lvm prepared OSDs.
Signed-off-by: Sébastien Han <seb@redhat.com>
The batch option got recently added, while rebasing this patch it was
necessary to implement it. So now, the batch option can work on
containerized environments.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630977
Signed-off-by: Sébastien Han <seb@redhat.com>
Fixes the deprecation warning:
[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
using `result|search` use `result is search`.
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
Instead used "import_tasks" and "include_tasks" to tell whether tasks
must be included statically or dynamically.
Fixes: https://github.com/ceph/ceph-ansible/issues/2998
Signed-off-by: Rishabh Dave <ridave@redhat.com>
If this is set to anything other than the default value of 1 then the
--osds-per-device flag will be used by the batch command to define how
many osds will be created per device.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This reverts commit e84f11e99e.
This commit was giving a new failure later during the rolling_update
process. Basically, this was modifying the list of devices and started
impacting the ceph-osd itself. The modification to accomodate the
osd_auto_discovery parameter should happen outside of the ceph-osd.
Also we are trying to not play ceph-osd role during the rolling_update
process so we can speed up the upgrade.
Signed-off-by: Sébastien Han <seb@redhat.com>
rolling_update relies on the list of devices when performing the restart
of the OSDs. The task that is builind the devices list out of the
ansible_devices dict only runs when there are no partitions on the
drives. However during an upgrade the OSD are already configured, they
have been prepared and have partitions so this task won't run and thus
the devices list will be empty, skipping the restart during
rolling_update. We now run the same task under different requirements
when rolling_update is true and build a list when:
* osd_auto_discovery is true
* rolling_update is true
* ansible_devices exists
* no dm/lv are part of the discovery
* the device is not removable
* the device has more than 1 sector
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613626
Signed-off-by: Sébastien Han <seb@redhat.com>
This is used with the lvm osd scenario. When using devices you need the
option to set the crush device class for all of the OSDs that are
created from those devices.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This adds the action 'batch' to the ceph-volume module so that we can
run the new 'ceph-volume lvm batch' subcommand. A functional test is
also included.
If devices is defind and osd_scenario is lvm then the 'ceph-volume lvm
batch' command will be used to create the OSDs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The container runs with --rm which means it will be deleted by Docker
when exiting. Also 'docker rm -f' is not idempotent and returns 1 if the
container does not exist.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1609007
Signed-off-by: Sébastien Han <seb@redhat.com>
If we want to be backward compatible with release prior to luminous, we
have to set the rule name accordingly to default values used in jewel.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The script ceph-osd-run.sh holds the config options to start the
container, if one of these options are modified we must restart the
container. This was not the case before becauase the 'notify' flag
wasn't present.
Closing: https://bugzilla.redhat.com/show_bug.cgi?id=1596061
Signed-off-by: Sébastien Han <seb@redhat.com>
keyring files in /etc/ceph. Default value is the same as it was (0600),
but this variable allows user to override it (f.e. set it to 0640).
Signed-off-by: George Shuklin <george.shuklin@gmail.com>
As discussed with the cores, the current limits are too low and should
be bumped to higher value.
So now by default monitors get 3GB and OSDs get 5GB.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1591876
Signed-off-by: Sébastien Han <seb@redhat.com>
When configuring openstack, the created keyrings aren't copied over to
all monitors nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1588093
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since the openstack_config.yml has been moved to `ceph-osd` we must move
this `set_fact` in ceph-osd otherwise the tasks in
`openstack_config.yml` using `openstack_keys` will actually use the
defaults value from `ceph-defaults`.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1585139
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This is a follow up on #2628.
Even with the openstack pools creation moved later in the playbook,
there is still an issue because OSDs are not all UP when trying to
create pools.
Adding a task which checks for all OSDs to be UP with a `retries/until`
condition should definitively fix this issue.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
in `ceph-osd` there is no need to set `docker_exec_cmd` since the only
place where this fact is used is in `openstack_config.yml` which
delegate all docker command to a monitor node. It means we need the
`docker_exec_cmd` fact that has been set referring to `ceph-mon-*`
containers, this fact is already set earlier in `ceph-defaults`.
By the way, when collocating an OSD with a MON it fails because the container
`ceph-osd-{{ ansible_hostname }}` doesn't exist.
Removing this task will allow to collocate an OSD with a MON.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1584179
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.
The idea here is to move cephfs pools creation in `ceph-mds` role.
[1] e59258943b/src/mon/OSDMonitor.cc (L5673)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.
The idea here is to move openstack pools creation at the end of `ceph-osd` role.
[1] e59258943b/src/mon/OSDMonitor.cc (L5673)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The LVM lvcreate fails if the disk already has a GPT header.
We create GPT header regardless of OSD scenario. The fix is to
skip header creation for lvm scenario.
fixes: https://github.com/ceph/ceph-ansible/issues/2592
Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
The order of fs.aio-max-nr (which is hard-coded to 1048576) means that
if you set fs.aio-max-nr in os_tuning_params it will effectively be
ignored for bluestore scenarios.
To resolve this we should move the setting of fs.aio-max-nr above the
setting of os_tuning_params, in this way the operator can define the
value of fs.aio-max-nr to be something other than 1048576 if they want
to.
Additionally, we can make the sysctl settings happen in 1 task rather
than multiple.
Useful for softwares that do data collection/monitoring like collectd.
They can connect to the socket and then retrieve information.
Even though the sockets are exposed now, I'm keeping the docker exec to
check the socket, this will allow newer version of ceph-ansible to work
with older versions.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1563280
Signed-off-by: Sébastien Han <seb@redhat.com>
We know bindmount with the :z option at the end of the -v command so
this will basically run the exact same command as we used to run. So to
speak:
chcon -Rt svirt_sandbox_file_t /var/lib/ceph
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit does a couple of things:
* use a common.yml file that contains things that can be played on both
container and non-container
* refactor the ability to copy the admin key to the nodes
Signed-off-by: Sébastien Han <seb@redhat.com>
Regardless if the partition is 'ceph' or something else, we don't want
to be as strick as checking for a particular partition.
If the drive has a partition, we just don't do anything.
This solves the case where the server reboots, disks get a different
/dev/sda (node) allocation. In this case, prior to restarting the server
/dev/sda was an OSD, but now it's /dev/sdb and the other way around.
In such scenario, we will try to prepare the OSD and create a new
partition, so let's not mess around with devices that have partitions.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498303
Signed-off-by: Sébastien Han <seb@redhat.com>
This was causing a lot of pain with the handlers. Also the
implementation was not ideal since we were assembling files. Everything
can now be done with the ceph_crush module so let's remove that.
Signed-off-by: Sébastien Han <seb@redhat.com>
Multipath disks have partitions with a different format than what
ceph-ansible currently supports, this update makes ceph-ansible
aware of that format so multipath disks can be used as OSDs
Signed-off-by: Caleb Boylan <caleb.boylan@ormuco.com>
This commit fixes a bug that occurs especially for dmcrypt scenarios.
There is an issue where the 'disk_list' container can't reach the ceph
cluster because it's not launched with `--net=host`.
If this container can't reach the cluster, it will hang on this step
(when trying to retrieve the dm-crypt key) :
```
+common_functions.sh:448: open_encrypted_part(): ceph --cluster abc12 --name \
client.osd-lockbox.9138767f-7445-49e0-baad-35e19adca8bb --keyring \
/var/lib/ceph/osd-lockbox/9138767f-7445-49e0-baad-35e19adca8bb/keyring \
config-key get dm-crypt/osd/9138767f-7445-49e0-baad-35e19adca8bb/luks
+common_functions.sh:452: open_encrypted_part(): base64 -d
+common_functions.sh:452: open_encrypted_part(): cryptsetup --key-file \
-luksOpen /dev/sdb1 9138767f-7445-49e0-baad-35e19adca8bb
```
It means the `ceph-run-osd.sh` script won't be able to start the
`osd_disk_activate` process in ceph-container because he won't have
filled the `$DOCKER_ENV` environment variable properly.
Adding `--net=host` to the 'disk_list' container fixes this issue.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1543284
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Use a nicer syntax for `local_action` tasks.
We used to have oneliner like this:
```
local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }}
```
The usual syntax:
```
local_action:
module: wait_for
port: 22
host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
state: started
delay: 10
timeout: 500
```
is nicer and kind of way to keep consistency regarding the whole
playbook.
This also fix a potential issue about missing quotation :
```
Traceback (most recent call last):
File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module>
main()
File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main
rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin)
File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command
File "/usr/lib64/python2.7/shlex.py", line 279, in split
return list(lex) File "/usr/lib64/python2.7/shlex.py", line 269, in next
token = self.get_token()
File "/usr/lib64/python2.7/shlex.py", line 96, in get_token
raw = self.read_token()
File "/usr/lib64/python2.7/shlex.py", line 172, in read_token
raise ValueError, "No closing quotation"
ValueError: No closing quotation
```
writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf`
can cause trouble because it's complaining with missing quotes, this fix solves this issue.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Description of problem: The 'get osd id' task goes through all the 10 times (and its respective timeouts) to make sure that the number of OSDs in the osd directory match the number of devices.
This happens always, regardless if the setup and deployment is correct.
Version-Release number of selected component (if applicable): Surely the latest. But any ceph-ansible version that contains ceph-volume support is affected.
How reproducible: 100%
Steps to Reproduce:
1. Use ceph-volume (LVM) to deploy OSDs
2. Avoid using anything in the 'devices' section
3. Deploy the cluster
Actual results:
TASK [ceph-osd : get osd id _uses_shell=True, _raw_params=ls /var/lib/ceph/osd/ | sed 's/.*-//'] **********************************************************************************************************************************************
task path: /Users/alfredo/python/upstream/ceph/src/ceph-volume/ceph_volume/tests/functional/lvm/.tox/xenial-filestore-dmcrypt/tmp/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:6
FAILED - RETRYING: get osd id (10 retries left).
FAILED - RETRYING: get osd id (9 retries left).
FAILED - RETRYING: get osd id (8 retries left).
FAILED - RETRYING: get osd id (7 retries left).
FAILED - RETRYING: get osd id (6 retries left).
FAILED - RETRYING: get osd id (5 retries left).
FAILED - RETRYING: get osd id (4 retries left).
FAILED - RETRYING: get osd id (3 retries left).
FAILED - RETRYING: get osd id (2 retries left).
FAILED - RETRYING: get osd id (1 retries left).
ok: [osd0] => {
"attempts": 10,
"changed": false,
"cmd": "ls /var/lib/ceph/osd/ | sed 's/.*-//'",
"delta": "0:00:00.002717",
"end": "2018-01-21 18:10:31.237933",
"failed": true,
"failed_when_result": false,
"rc": 0,
"start": "2018-01-21 18:10:31.235216"
}
STDOUT:
0
1
2
Expected results:
There aren't any (or just a few) timeouts while the OSDs are found
Additional info:
This is happening because the check is mapping the number of "devices" defined for ceph-disk (in this case it would be 0) to match the number of OSDs found.
Basically this line:
until: osd_id.stdout_lines|length == devices|unique|length
Means in this 2 OSD case it is trying to ensure the following incorrect condition:
until: 2 == 0
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1537103
This is to keep backward compatibility with stable-2.2 and satisfy the
check "verify dedicated devices have been provided" in
`check_mandatory_vars.yml`. This check is looking for
`dedicated_devices` so we need to default it's value to
`raw_journal_devices` when `raw_multi_journal` is set to `True`.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1536098
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
On a non-collocated scenario, if a drive is faulty we can't really
remove it from the list of 'devices' without messing up or having to
re-arrange the order of the 'dedicated_devices'. We want to keep this
device list ordered. This will prevent the activation failing on a
device that we know is failing but we can't remove it yet to not mess up
the dedicated_devices mapping with devices.
Signed-off-by: Sébastien Han <seb@redhat.com>
the gpt label creation doesn't work even with parted module.
This commit fixes the gpt label creation by using parted command
instead.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>