Problem: fail to deploy a containerized Ceph cluster with ipv6
Solution: do not hardcode ipv4 when bootstrapping the container.
Now use ip_version: ipv6 to get a containerized cluster deployed with
ipv6.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1451786
Signed-off-by: Sébastien Han <seb@redhat.com>
In addition to `196fa7e` this commit check if a container has been
already launched and delete it before retrying the ceph osd prepare
process.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The CI on Docker is reporting the following error:
STDERR:
Error EINVAL: bad entity name
This is due to the fact that this auth entity name does not exist on
Jewel so we should not create that key when running Jewel containers.
Fixes: https://github.com/ceph/ceph-ansible/issues/1514
Signed-off-by: Sébastien Han <seb@redhat.com>
Already documented in the Red Hat Ceph Storage 2 Installation Guide
for Red Hat Enterprise Linux, but not here
Signed-off-by: Florian Klink <flokli@flokli.de>
"rgw override bucket index max shards" and
"rgw bucket default quota max objects" were in the
client section of the ceph.conf and not being
applied, this commit moves them to global
Resolves: bz#1391500
Signed-off-by: Ali Maredia <amaredia@redhat.com>
We shouldn't need this anymore as the upgrade bug that
debian_ceph_packages was used to workaround should have
been fixed as of jewel.
See https://github.com/ceph/ceph-ansible/issues/1481 for more
detailed information.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Change civetweb_num_thread default to 100
Add capability to override number of pgs for
rgw pools.
Add ceph.conf vars to enable default bucket
object quota at users choosing into the ceph.conf.j2
template
Resolves: rhbz#1437173
Resolves: rhbz#1391500
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Restore the check_socket that was removed by `5bec62b`.
This commit also improves the logging in `restart_*_daemon.sh` scripts
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Prior to this change, ansible was only checking for the existence of the
package, now if upgrade_ceph_packages is true this means we are
performing an upgrade.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1442016
Signed-off-by: Sébastien Han <seb@redhat.com>
Proof-of-concept clusters or actual production clusters will never want to use this. We also do not test it anywhere for this same reason.
Signed-off-by: Gregory Meno <gmeno@redhat.com>
This is to allow ceph-mgr daemons to remote control
osd and mds daemons with MCommand messages.
Fixes: http://tracker.ceph.com/issues/19713
Signed-off-by: John Spray <john.spray@redhat.com>
Without this, we don't test the mgr role so we need to add it.
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
This is the same fix as bc846b7da6
applied to the other part of the code-base that builds ceph.conf (I'd
missed that 349b9ab3e7 had duplicated
this code).
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
Ansible's assemble module by default will put all files in the src
directory together into dest. We only want to put {{ cluster }}.conf
and osd.conf together, not anything that might have found its way into
/etc/ceph/ceph.d (e.g. files left by the sysadmin taking backups
before an ansible run). So specify a regexp that matches only those
two files.
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
Ansible evaluates the 'with_items' before the 'when' so if the inventory
does not have the group declared it'll fail. To fix this, we set an
empty array to make the with_items happy and then evaluate with the
'when'.
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this change we were deploying a monitor using tis fqdn name but
we were checking its state and performing actions on it using its
shortname.
Signed-off-by: Sébastien Han <seb@redhat.com>
The Ceph Manager daemon (ceph-mgr) runs alongside monitor daemons, to
provide additional monitoring and interfaces to external monitoring and
management systems.
Only works as of the Kraken release.
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-create-keys unit file was removed here:
* 8bcb4646b6
* dc5fe8d415
As a consequence the systemctl preset command now fails to run since the
unit does not exist anymore. Due to the redirection in /dev/null we
don't know what's happening.
Ultimately the mon unit doesn't get enabled and the mon service won't
start after reboot.
Removing the old/non-existent unit makes the command succeed now.
ceph fix: https://github.com/ceph/ceph/pull/14226
Signed-off-by: WingkaiHo <sanguosfiang@163.com>
Co-Authored-By: Sébastien Han <seb@redhat.com>
Until now, only the first task were executed.
The idea here is to use `listen` statement to be able to notify multiple
handler and regroup all of them in `./handlers/main.yml` as notifying an
included handler task is not possible.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
As reported in
https://github.com/ceph/ceph-ansible/issues/1403 when devices are held
by lvm and `osd_auto_discovery` is set to true, it's not enough to check
for a partition count = 0 since Ansible does not report.
This patch also looks for 'holders' which in a case of lvm corresponds
to the name of the pv. Now we also look for holders = 0.
Fixes: #1403
Signed-off-by: Sébastien Han <seb@redhat.com>
Problem: too many different commands to do the same thing. The 'cut'
command on infrastructure-playbooks/purge-cluster.yml was also wrong.
This sed command from osixia in ceph-docker
https://github.com/ceph/ceph-docker/pull/580/ addresses all the
scenarios.
Signed-off-by: Sébastien Han <seb@redhat.com>
ntp is still installed even if ntp_service_enabled is set to false.
That could be a problem if the time synchronization is managed by
something else than ceph-ansible or if you want to use different NTP
implementation as suggested in #1354.
Fixes: #1354
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Guits <gabrioux@redhat.com>
If a group of hosts is empty, (for instance 'mdss', in case of a
deployment without any mds node), the playbook will fails when trying
to restart service with `"'dict object' has no attribute u'XXX'"` error.
The idea here is to force the `with_items` statements in all included handler tasks
to get at least an empty array.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
systctl tuning should be in the sysctl.d directory. This creates
a seperation from what values were set specific to ceph, and what
values were set by the operator.
Signed-off-by: Tyler Brekke <tbrekke@redhat.com>
With ' in osd_crush_location, systemd will show this error:
ceph-osd-prestart.sh[2931]: Invalid command: invalid chars ' in 'root=
Signed-off-by: Christian Zunker <christian.zunker@codecentric.de>
This fixes issue #1299. According to @ktdreyer s comment in the ticket,
he fixed the web server config so also older (non-SNI) python clients
can use the uri module here.
After the jewel release the mon startup does not generate keys, but it's
still harmless to call ceph-create-keys with jewel because this task has
a 'creates' argument that will cause it not to run if the keys already
exist.
Removing this when condition also allows the downstream CI tests to
install kraken or luminous without resetting ceph_stable_release, which does not
pertain to rhcs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This is not only for monitors, but also mds, rgw and rbd mirror so
making the var name more generic:
ceph_docker_enable_centos_extra_repo
Signed-off-by: Sébastien Han <seb@redhat.com>
Sometimes the socket appears during the 5th attempt and sometimes not so
increasing the timeout a little bit.
Signed-off-by: Sébastien Han <seb@redhat.com>
Add the possibility to create openstack pools and keys even for containerized deployments
Fix: #1321
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This removes the implicit order requirement when using OSD fragments.
When you use OSD fragments and ceph-osd role is not the last one,
the fragments get removed from ceph.conf by ceph-common.
It is not nice to have this code at two locations, but this is
necessary to prevent problems, when ceph-osd is the last role as
ceph-common gets executed before ceph-osd.
This could be prevented when ceph-common would be explicitly called
at the end of the playbook.
Signed-off-by: Christian Zunker <christian.zunker@codecentric.de>
This option was missing for rrgw, mds, rbd mirror and nfs making these
daemon impossible to run on a kv deployment with containers.
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this change, ceph-ansible would install the main NFS Ganesha
server daemon on Ubuntu, but it would skip the Ceph FSALs.
Running "apt-get install nfs-ganesha" will only install the main NFS Ganesha
server. It does *not* pull in the RGW FSAL
(/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so)
Running "apt-get install nfs-ganesha-fsal" will install the RGW FSAL as
well as the main NFS Ganesha server package.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
This patch introduces calamari_debug option which will turn on debugging
for calamari before initializing and running it.
Signed-off-by: Boris Ranto <branto@redhat.com>
From Josh Durgin, "I'd recommend not setting vfs_cache_pressure in
ceph-ansible. The syncfs issue is still there, and has caused real
problems in the past, whereas there hasn't been good data showing lower
vfs_cache_pressure is very helpful - the only cases I'm aware of have
shown it makes little difference to performance."
https://bugzilla.redhat.com/show_bug.cgi?id=1395451
Install package from official repos rather than pip when using RHEL.
This commit fix https://bugzilla.redhat.com/show_bug.cgi?id=1420855
Also this commit Refact all `roles/ceph-*/tasks/docker/pre_requisite.yml`
to avoid a lot of duplicated code.
Fix: #1303
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This was needed for Hammer and older version, not needed anymore since
we have a 'ceph' user to run ceph processes.
Signed-off-by: Sébastien Han <seb@redhat.com>
Check if ceph filesystem already exists before creating it.
If the ceph filesystem doesn't exist, execute the task only on one node.
Fix: #1314
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since distro will not allow /usr/share to be writable (e.g: atomic) so
we let the operator decide where to put that script.
Signed-off-by: Sébastien Han <seb@redhat.com>
Oh yeah! This patch adds more fine grained control on how we run the
activation osd container. We now use --device to give a read, write and
mknodaccess to a specific device to be consumed by Ceph. We also use
SYS_ADMIN cap to allow mount operations, ceph-disk needs to temporary
mount the osd data directory during the activation sequence.
This patch also enables the support of dedicated journal devices when
deploying ceph-docker with ceph-ansible.
Depends on https://github.com/ceph/ceph-docker/pull/478
Signed-off-by: Sébastien Han <seb@redhat.com>