Commit Graph

2355 Commits (c7950d5539fd90a1e6613213c29ccac12501320d)

Author SHA1 Message Date
Dimitri Savineau bcafb182c4 common: install dependencies for apt modules
When using a minimal Debian/Ubuntu distribution there's no
ca-certificates and gpg packages installed so the apt modules will
fail:

Failed to find required executable gpg in paths:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

apt.cache.FetchFailedException:
W:https://download.ceph.com/debian-luminous/dists/bionic/InRelease:
No system certificates available. Try installing ca-certificates.

Resolves: #3994

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 494746b7a6)
2019-05-20 10:45:46 +02:00
Guillaume Abrioux 1e2f8cd909 dashboard: move defaults variables to ceph-defaults
There is no need to have default values for these variables in each roles
since there is no corresponding host groups

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f0d4d6847)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux e29fd842a6 rename docker_exec_cmd variable
This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e74d80e72f)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux aa80895d19 dashboard: align the way containers are managed
This commit aligns the way the different containers are managed with how
it's currently done with the other ceph daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cc285c417a)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux 567c6ceb43 dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool
make `dashboard_rgw_api_no_ssl_verify` a bool variable since it seems to
be used as it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cd5f3fca64)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux c38c72d914 dashboard: remove legacy file
this file seems to be no longer used, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8bbcc46ae4)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux 79ad697af7 dashboard: set less permissive permissions on dashboard certificate/key
use `0440` instead of `0644` is enough

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 14f381200d)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux c45906e0ac dashboard: simplify config-key command
since stable-4.0 isn't to deploy ceph releases prior to nautilus,
there's no need to add this complexity here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4405f50c85)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux fe5bcc2f9f dashboard: do not call ceph-container-common from other role
use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments since there's no RPM available to
deploy the differents applications needed for ceph-dashboard.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cdff0da7d4)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux c48c3776be dashboard: use existing variable to detect containerized deployment
there is no need to add more complexity for this, let's use
`containerized_deployment` in order to detect if we are running a
containerized deployment.
The idea is to use `container_exec_cmd` the same way we do in the rest of
the playbook to run the different ceph commands needed to deploy the
ceph-dashboard role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 742bb6214c)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux 4702194d6e facts: set container_binary fact in non-containerized deployment
This is needed for the ceph-dashboard implementation since it requires
to run containerized application which aren't packaged as RPMs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9dbb1d39)
2019-05-17 16:05:58 +02:00
Guillaume Abrioux 997d179b7c dashboard: rename template files
add .j2 to all templates file related to dashboard roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3578d576a4)
2019-05-17 16:05:58 +02:00
Boris Ranto db3f0088fc dashboard: Support podman
This adds support for podman in dashboard-related roles. It also drops
the creation of custom network for the dashboard-related roles as this
functionality works in a different way with podman.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit b4d1c3693b)
2019-05-17 16:05:58 +02:00
Boris Ranto 5a85be9502 dashboard: Set ssl_server_port if it is supported
We cannot use the old fashioned config-key way, here. It was not
supported when the option was introduced (post 14.2.0). Since the option
is not always supported we can simply ignore the potential failure on
ceph clusters that do not support it.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit e737a1f83e)
2019-05-17 16:05:58 +02:00
Boris Ranto fda901fff9 dashboard: Add and copy alerting rules
This commit adds a list of alerting rules for ceph-dashboard from the
old cephmetrics project. It also installs the configuration file so that
the rules get recognized by the prometheus server.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 8f77caa932)
2019-05-17 16:05:58 +02:00
Boris Ranto 5ac7559736 Merge cephmetrics/dashboard-ansible repo
This commit will merge dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to setup ceph-dashboard
and the underlying technologies like prometheus and grafana server.

Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f141a6e80)
2019-05-17 16:05:58 +02:00
Dimitri Savineau bd33bcef2b container-common: allow podman for other distros
Currently podman installation is very tied to RHEL 8 even if we're
able to install it on Debian/Ubuntu distribution.
This patch changes the way we are starting or not the (fat) container
daemon. Before the condition was based on the distribution release
and now on the container_service_name variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d2ad191eca)
2019-05-13 10:36:22 -04:00
Bruceforce f34c1dcd9d ceph-nfs: fixed with_items
If we do this in one line we get the error described in #3968

fixes #3968

Signed-off-by: Bruceforce <markus.greis@gmx.de>
(cherry picked from commit c3b0ee30a1)
2019-05-13 10:36:12 -04:00
Dimitri Savineau 6a48ff8a37 Update RHCS version with Nautilus
RHCS 4 will be based on Nautilus and only usable on RHEL 8.
Updated the default ceph_rhcs_version to 4 and update the rhcs
repositories to rhcs 4 with RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ba49225eab)
2019-05-13 16:23:24 +02:00
Bruceforce a007be17b7 ceph-nfs: fixed condition for "stable repos specific tasks"
The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"

now it is
"when": [
          "nfs_ganesha_stable",
          "ceph_repository == 'community'"
        ]

Please backport to stable-4.0

Signed-off-by: Bruceforce <markus.greis@gmx.de>
(cherry picked from commit 29f2c953b4)
2019-05-13 11:05:40 +02:00
Kevin Coakley e1b5b20111 Set the rgw_create_pools pools application to rgw
Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the heath is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
(cherry picked from commit 381c58ca3e)
2019-05-13 11:05:14 +02:00
Rishabh Dave 8959ed50a5 ceph-mds: group similar tasks in create_mds_filesystem.yml
Group similar tasks together using block keyword.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 1a4dccdbb9)
2019-05-10 15:54:40 +02:00
Rishabh Dave 238a2696a6 ceph-rbd-mirror: refactor tasks/main.yml
Use blocks for similar tasks in main.yml. And move when keywords before
block keywords.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 121b5e4184)
2019-05-10 15:54:16 +02:00
Guillaume Abrioux cc6127d669 facts: fix external cluster bug
running an external ceph cluster deployment with (obviously) no
monitors defined in inventory breaks with an undefined error because
`_monitor_addresses` never get defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 936c6fca78)
2019-05-09 08:30:33 +02:00
Rishabh Dave 9e6b2e3bc5 don't access other node's docker_exec_cmd variable
Except for some corner case, it's not correct to access some other
node's copy of variable docker_exec_cmd. Therefore replace
"hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" by
"docker_exec_cmd".

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 89748d579a)
2019-05-07 17:56:30 +02:00
Rishabh Dave df95900913 ceph-mgr: create keys for MGRs
Add code in ceph-mgr for creating a keyring for manager in so that
managers can be deployed on a separate node too.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 56bfec7c58)
2019-05-07 15:12:29 +02:00
Gaudenz Steinlin 29650e71d8 Fix check mode support
Adds "check_mode: no" to commands which register cluster state in a
variable and don't modify anything. These commands have to run in order
to support running the playbook in check mode.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
(cherry picked from commit 3c8987c7a5)
2019-05-07 13:07:45 +02:00
Rishabh Dave 06b3ab2a6b improve coding style
Keywords requiring only one item shouldn't express it by creating a
list with single item.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 739a662c80)

Conflicts:
	roles/ceph-mon/tasks/ceph_keys.yml
	roles/ceph-validate/tasks/check_devices.yml
2019-05-06 15:09:06 +00:00
Dimitri Savineau 4752327340 ansible: remove private and static attribute
This will be removed in ansible 2.8 and breaks the playbook execution
with this release.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ae266c6f2b)
2019-05-02 20:21:26 -04:00
Dimitri Savineau 2eb7642ad3 ceph-mds: Increase cpu limit to 4
In containerized deployment the default mds cpu quota is too low
for production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1999cf3d19)
2019-04-30 12:12:01 -04:00
Dimitri Savineau d8688e0eb9 ceph-osd: Increase cpu limit to 4
In containerized deployment the default osd cpu quota is too low
for production environment using NVMe devices.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c17106874c)
2019-04-30 12:11:42 -04:00
Dimitri Savineau e29a8a1f31 ceph-iscsi: start tcmu-runner for non-container
Only rbd-target-api and rbd-target-gw were started/enabled for non
containerized deployment.
The issue doesn't happen with containerized setup.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4ae5ce399b)
2019-04-29 23:03:59 +00:00
Rishabh Dave ebd2ae520d ceph-config: remove redundant condition on a block
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-25 13:51:58 +02:00
Rishabh Dave cad35d5c52 "when" keyword should precede "block" keyword
Otherwise the reader is forced to search for "when" when blocks are too
long.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit e0beaf123a)

Conflicts:
	roles/ceph-config/tasks/main.yml
	roles/ceph-container-common/tasks/pre_requisites/prerequisites.yml
	roles/ceph-validate/tasks/check_devices.yml
2019-04-24 16:25:43 +02:00
Kyle Bader cd0eddc460 rgw: add cpuset support
1/ The OSD already supports cpuset to be used for containerized deployments
through the use of the ceph_osd_docker_cpuset_cpus variable. This adds similar
support to the RGW service for containerized deployments by setting a new
variable named ceph_rgw_docker_cpuset_cpus. Like the OSD, there are times where
using distinct cores has advantages over using the CFS in kernel scheduler.

ceph_rgw_docker_cpuset_cpus accepts a comma delimited set of CPU ids

2/ Add support for specifying --cpuset-mem variable to restrict the cgroup's memory
allocations to a particular numa node, which should typically correspond with
the cpu ids of that numa node that were provided with --cpuset-cpus. To ensure
the correct cpu ids are used one can run `numactl --hardware`  to list the nodes
and which cpu ids correspond to each.

Signed-off-by: Kyle Bader <kbader@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0bee90b201)
2019-04-23 09:09:32 +02:00
Radu Toader 6e02e5faae Allow CephFS pool to be created with specific rule_name, erasure_profile just like rbd pools
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
(cherry picked from commit b2f242660e)
2019-04-20 06:40:08 +00:00
Dimitri Savineau f770917517 ceph-container-common: modify requirement flow
Until now it was not possible to install a specific container package
because it was somehow hardcoded.
This patch allows to override the container package name (docker.io
vs docker-ce) and refacts the package installation. This could be
achieve via the container_package_name variable.
Instead of using one task per distribution we can set the package and
service name in vars. This allows to have a unified package task.
Also refactorize the debian_prerequisites tasks because the content
was outdated.

https://docs.docker.com/install/linux/docker-ce/debian/
https://docs.docker.com/install/linux/docker-ce/ubuntu/

Resolves: #3609

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8105a1cefb)
2019-04-19 04:07:22 +00:00
Andrew Schoen 545d93aae8 rolling_update: set num_osds to the number of running osds
We do this so that the ceph-config role can most accurately
report the number of osds for the generation of the ceph.conf
file.

We don't want to use ceph-volume to determine the number of
osds because in an upgrade to nautilus ceph-volume won't be able to
accurately count osds created by ceph-disk.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 67453853ff)
2019-04-18 19:12:13 +02:00
Andrew Schoen 1e0e50fc90 ceph-osd: do not run lvm batch tasks during update
When performing a rolling update do not try to create
any new osds with `ceph-volume lvm batch`. This is troublesome
because when upgrading to nautilus the devices list might contain
devices that are currently being used by ceph-disk and have GPT
headers on them, which will cause ceph-volume to fail when
trying to use such a device. Any devices originally created
by ceph-disk will need to be removed from the devices list
before any new osds can be created.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 5e3dfe5021)
2019-04-18 19:12:13 +02:00
Dimitri Savineau 2d3c636fa8 ceph-mgr: Add extra module packages
Since Nautilus there's mgr extra modules not present in ceph-mgr
package but in dedicated packages.

Resolves: #3860

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 86315272c7)
2019-04-18 19:10:31 +02:00
Guillaume Abrioux b4377f6163 update: refact msgr2 migration
this commit refact the msgr2 protocol introduction.

If it's a fresh install, let's go with v2 only.
If we upgrade to nautilus, we should go with v2+v1 syntax to ensure
nothing breaks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a4bc7bda51)
2019-04-18 19:10:10 +02:00
Dimitri Savineau 84d6bb226b ceph-iscsi-gw: Remove library directory
The library directory that contain the custom ceph modules in present
in the ceph-ansible root directory.
All igw_* mocules are already present there so we don't need the one
present in roles/ceph-iscsi-gw/library.
Also remove the associated spec file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c8814d1331)
2019-04-18 16:32:58 +02:00
Guillaume Abrioux 6b5487d1e5 mds: remove legacy task
this task has nothing to do in stable-4.0 and after.
Let's remove it since stable-4.0 and after aren't intended to deploy
luminous.

Closes: #3873

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 58f3851573)
2019-04-18 10:15:43 -04:00
Dimitri Savineau 8edb064606 allow using ansible 2.8
Currently we only support ansible 2.7
We plan to use 2.8 when it will be release so we have to support both
2.7 and 2.8.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1700548

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e471bce76b)
2019-04-17 18:14:58 +02:00
Guillaume Abrioux 3787c9b7ad defaults: refact package dependencies installation.
Because 5c98e361df could be seen as a non
backward compatible change this commit reverts it and bring back package
dependencies installation support.
Let's just modify the default value instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit edfa4310d3)
2019-04-16 12:06:25 -04:00
Guillaume Abrioux 5aca0996ed defaults: remove some package dependencies
These packages aren't needed anymore.
They were needed for ceph-init-detect buti as of ceph-init-detect doesn't exist
anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683885

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5c98e361df)
2019-04-16 12:06:25 -04:00
Rishabh Dave a3e4bf3796 check if mon daemon is installed before restarting it
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 96c180cc0e)
2019-04-16 11:14:21 +02:00
Guillaume Abrioux f8b69694cc mon: check if an initial monitor keyring already exists
When adding a new monitor, we must reuse the existing initial monitor
keyring. Otherwise, the new monitor will issue its 'mkfs' with a new
monitor keyring and it will result with a mismatch between them. The
new monitor will be unable to join the quorum in the end.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit edf1ee2073)
2019-04-16 11:14:21 +02:00
Guillaume Abrioux 22d39591a4 osd: remove legacy file
this file is not used anymore, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f899da3172)
2019-04-12 00:45:21 +00:00
Guillaume Abrioux 692b1a8b9f osd: remove ceph-disk scenarios files
these files aren't needed anymore since we only use lvm scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4f68462009)
2019-04-12 00:45:21 +00:00
Guillaume Abrioux 41e55a840f osd: remove dedicated_devices variable
This variable was related to ceph-disk scenarios.
Since we are entirely dropping ceph-disk support as of stable-4.0, let's
remove this variable.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f0416c8892)
2019-04-12 00:45:21 +00:00
Guillaume Abrioux 4a663e1fc0 osd: remove variable osd_scenario
As of stable-4.0, the only valid scenario is `lvm`.
Thus, this makes this variable useless.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4d35e9eeed)
2019-04-12 00:45:21 +00:00
Guillaume Abrioux 948a5e802e osd: remove legacy file
ceph_disk_cli_options_facts.yml is not used anymore, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4d5637fd8a)
2019-04-12 00:45:21 +00:00
Sébastien Han 89463939f2 validate: only check device when they are devices
We only validate the devices that are passed if there is a list of
devices to validate.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 2888c0825f)
2019-04-12 00:45:21 +00:00
Sébastien Han 343a99c8b7 osd: default osd_scenario to lvm
osd_scenario has become obsolete and defaults to lvm. With lvm there is
no such things has collocated and non-collocated.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 52df15895b)
2019-04-12 00:45:21 +00:00
Sébastien Han 279044155f validate: print a message for old scenarios
ceph-disk is not supported anymore, so all the newly created OSDs will
be configured using ceph-volume.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 9ea1e49407)
2019-04-12 00:45:21 +00:00
Sébastien Han 11c6655f57 osd: remove ceph-disk support
We don't support the preparation of OSD with ceph-disk. ceph-volume is
only supported. However, the start operation of OSD is still supported.
So let's say you change a config option, the handlers will be able to
restart all the OSDs via their respective systemd unit files.

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e2a5aa062e)
2019-04-12 00:45:21 +00:00
Dimitri Savineau c9a3def3a6 ceph-mds: Set application pool to cephfs
We don't need to use the cephfs variable for the application pool
name because it's always cephfs.
If the cephfs variable is set to something else than the default
value it will break the appplication pool task.

Resolves: #3790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d2efb7f02b)
2019-04-11 17:47:21 +02:00
Guillaume Abrioux f5f8d264e2 osds: allow passing devices by path
ceph-volume didn't work when the devices where passed by path.
Since it now support it, let's allow this feature in ceph-ansible

Closes: #3812

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e0adca7a4)
2019-04-11 02:25:15 +00:00
Dimitri Savineau 1e944b6022 rgw: change default frontend on nautilus
As discussed in ceph/ceph#26599, beast is now the default frontend
for rados gateway with nautilus release.
Add rgw_thread_pool_size variable with 512 as default value and keep
backward compatibility with num_threads option when using civetweb.
Update radosgw_civetweb_num_threads to reflect rgw_thread_pool_size
change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d17b1b48b6)
2019-04-10 14:42:33 -04:00
Guillaume Abrioux a718ddec50 mon: remove useless delegate_to
Let's use a condition to run this task only on the first mon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 631e5d3144)
2019-04-10 09:52:29 +00:00
Matthew Vernon a4d75c6ea6 UCA: Uncomment UCA variables in defaults, fix consequent breakage
The Ubuntu Cloud Archive-related (UCA) defaults in
roles/ceph-defaults/defaults/main.yml were commented out, which means
if you set `ceph_repository` to "uca", you get undefined variable
errors, e.g.

```
The task includes an option with an undefined variable. The error was: 'ceph_stable_repo_uca' is undefined

The error appears to have been in '/nfs/users/nfs_m/mv3/software/ceph-ansible/roles/ceph-common/tasks/installs/debian_uca_repository.yml': line 6, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: add ubuntu cloud archive repository
  ^ here

```

Unfortunately, uncommenting these results in some other breakage,
because further roles were written that use the fact of
`ceph_stable_release_uca` being defined as a proxy for "we're using
UCA", so try and install packages from the bionic-updates/queens
release, for example, which doesn't work. So there are a few `apt` tasks
that need modifying to not use `ceph_stable_release_uca` unless
`ceph_origin` is `repository` and `ceph_repository` is `uca`.

Closes: #3475
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 9dd913cf8a)
2019-04-10 03:50:27 +00:00
Dimitri Savineau 4cc318d13c container-common: Enable docker on boot for ubuntu
docker daemon is automatically started during package installation
but the service isn't enabled on boot.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 37816570c6)
2019-04-10 00:02:35 +00:00
Rishabh Dave c60915733a allow adding a MDS to already deployed cluster
Add a tox scenario that adds an new MDS node as a part of already
deployed Ceph cluster and deploys MDS there.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit c0dfa9b61a)
2019-04-09 16:48:59 +02:00
Dimitri Savineau 8715490223 ceph-facts: use last ipv6 address for mon/rgw
When using monitor_address_block or radosgw_address_block variables
to configure the mon/rgw address we're getting the first ip address
from the ansible facts present in that cidr.
When there's VIP on that network the first filter could return the
wrong value.
This seems to affect only IPv6 setup because the VIP addresses are
added to the ansible facts at the beginning of the list. This is the
opposite (at the end) when using IPv4.
This causes the mon/rgw processes to bind on the VIP address.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680155

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fd4b0ec7eb)
2019-04-09 10:48:14 -04:00
François Lafont af78673328 ceph-rgw: Fix bad paths which depend on the clustername
The path of the RGW environment file (in the /var/lib/ceph/radosgw/
directory) depends on the Ceph clustername. It was not taken into
account in the Ansible role `ceph-rgw`.

Signed-off-by: flaf <francois.lafont.1978@gmail.com>
(cherry picked from commit 4c3e77d869)
2019-04-09 10:44:45 -04:00
Guillaume Abrioux bf672f14fe mgr: manage mgr modules when mgr and mon are collocated
When mgrs are implicitly collocated on monitors (no mgrs in mgrs group).
That include was skipped because of this condition :

`inventory_hostname == groups[mgr_group_name][0]`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cbfdbab177)
2019-04-09 10:59:32 +02:00
Guillaume Abrioux 3272c2347f mgr: wait for all mgr to be available
before managing mgr modules, we must ensure all mgr are available
otherwise we can hit failure like following:

```
stdout:Error ENOENT: all mgr daemons do not support module 'restful', pass --force to force enablement
```

It happens because all mgr are not yet available when trying to manage
with mgr modules.

Closes: #3100

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f596cc1711)
2019-04-09 10:59:32 +02:00
Ali Maredia 4b35360876 rgw multisite: add more than 1 rgw to the master or secondary zone
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 37f46a8c5d)
2019-04-07 10:00:18 +00:00
fpantano f8cbc27a83 Check ceph_health_raw.stdout value as string during mon bootstrap
According to rdo testing https://review.rdoproject.org/r/#/c/18721
a check on the output of the ceph_health value is added to
allow the playbook to make several attempts (according to the
retry/delay variables) when waiting the cluster quorum or
when the container bootstrap is not ended.
It avoids the failure of the command execution when it doesn't
receive a valid json object to decode (because cluster is too
slow to boostrap compared to ceph-ansible task execution).

Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit afbb90e4ac)
2019-04-04 19:15:55 +02:00
Dimitri Savineau ace23a1479 radosgw: Raise cpu limit to 8
In containerized deployment the default radosgw quota is too low
for production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680171

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d3ae9fd05f)
2019-04-04 19:15:01 +02:00
Dimitri Savineau 0274b880f1 ceph-volume: Add PYTHONIOENCODING env variable
Since https://github.com/ceph/ceph/commit/77912c0 ceph-volume uses
stdout encoding based on LC_CTYPE and PYTHONIOENCODING environment
variables.
Thoses variables aren't set when using ansible.
Currently this commit breaks non containerized deployment on Ubuntu.

TASK [use ceph-volume to create bluestore osds] ********************
  cmd:
  - ceph-volume
  - --cluster
  - ceph
  - lvm
  - create
  - --bluestore
  - --data
  - /dev/sdb
  rc: 1
  stderr: |-
    Traceback (most recent call last):
    (...)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in
    position 132: ordinal not in range(128)

Note that the task is failing on ansible side due to the stdout
decoding but the osd creation is successful.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7e5e4229b7)
2019-04-03 11:27:46 +02:00
Guillaume Abrioux f55e2b08be remove all NBSPs on master branch
Similar to #3658

Since there's too many changes between master and stable branches let's
commit directly in each branches instead of trying to backport this
commit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-28 11:57:55 +00:00
Dimitri Savineau 40a8e1160c container: Add python3-docker on Ubuntu bionic
When installing python-minimal on Ubuntu bionic, this will add the
/usr/bin/python symlink to the default python interpreter.
On bionic, this isn't python2 but python3.

$ /usr/bin/python --version
Python 3.6.7

The python docker library is only installed for python2 which causes
issues when running the purge-docker-cluster playbook. This playbook
uses the ansible docker modules and requires to have python bindings
installed on the remote host.
Without the bindings we can see python error reported by the docker
module.

msg: Failed to import docker or docker-py - No module named 'docker'.
Try `pip install docker` or `pip install docker-py` (Python 2.6)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-28 08:03:58 +00:00
Guillaume Abrioux 6f47c20c3a rgw: fix a typo
ee2d52d33d introduced a typo.
This commit fixes it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 3c4f464c54 rgw: cleanup legacy task
this task was here for backward compatibility.
It's time to remove it in the next release.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 9134624578 rgw: add a retry on pool related tasks
sometimes those tasks might fail because of a timeout.
I've been facing this several times in the CI, adding this retry might
help and won't hurt in any case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux f6e0185146 update: add containerized deployment upgrade support (L->N)
Add a couple of fixes to allow containerized deployments upgrade support
to upgrade from luminous/mimic to nautilus.

- pass CEPH_CONTAINER_IMAGE and CEPH_CONTAINER_BINARY environment
variable to the ceph_key module,
- fix the docker exec command in 'waiting for the containerized monitor
to join the quorum' task according to the `delegate_to` parameter,
- override `docker_exec_cmd` in `ceph-facts` with `mon_host` when
rolling_update is `True`,
- do not run unnecessarily `create_mds_filesystems.yml` when performing an
upgrade.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 7386249c71 facts: retrieve fsid during rolling_update playbook
otherwise it generates a new cluster fsid and makes the upgrade failing

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 5c3ce4ca77 mon: fetch initial keyring even when running rolling_update
otherwise, the task to copy mgr keyring fails during the rolling_update.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux afdaa70a63 update: enable msgr2 protocol
This commit enable the msgr2 protocol when the cluster is fully upgraded
to nautilus

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 82764afe8d update: mask systemd service units during upgrade
This prevents the packaging from restarting services before we do need
to restart them in the rolling update sequence.
We want to handle services restart at rolling_update playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux b4f14aba8e ceph_key: `lookup_ceph_initial_entities` shouldn't fail on update
As of nautilus, the initial keyrings list has changed, it means when
upgrading from Luminous or Mimic, it is expected there's a mismatch
between what is found on the cluster and the expected initial keyring
list hardcoded in ceph_key module. We shouldn't fail when upgrading to
nautilus.

str_to_bool() took from ceph-volume.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-Authored-by: Alfredo Deza <adeza@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux e99305c684 handlers: do not trigger handlers on rolling_update
rolling_update playbook already takes care of stopping/starting services
during the sequence. There's no need to trigger potential unwanted
services restart.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Dimitri Savineau 179fdfbc19 ceph-osd: Ensure lvm2 is installed
When using osd_scenario lvm, we never check if the lvm2 package is
present on the host.
When using containerized deployment and docker on CentOS/RedHat this
package will be automatically installed as a dependency but not for
Ubuntu distribution.
OSD deployed via ceph-volume require the lvmetad.socket to be active
and running.

Resolves: #3728

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-20 22:26:45 +00:00
Guillaume Abrioux 987bdac963 osd: backward compatibility with old disk_list.sh location
Since all files in container image have moved to `/opt/ceph-container`
this check must look for new AND the old path so it's backward
compatible. Otherwise it could end up by templating an inconsistent
`ceph-osd-run.sh`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-18 17:25:51 +00:00
Dimitri Savineau 5c39735be5 ceph-validate: fail if there's no ipaddr available in monitor_address_block subnet
When using monitor_address_block to determine the ip address of the
monitor node, we need an ip address available in that cidr to be
present in the ansible facts (ansible_all_ipv[46]_addresses).
Currently we don't check if there's an ip address available during
the ceph-validate role.
As a result, the ceph-config role fails due to an empty list during
ceph.conf template creation but the error isn't explicit.

TASK [ceph-config : generate ceph.conf configuration file] *****
fatal: [0]: FAILED! => {"msg": "No first item, sequence was empty."}

With this patch we will fail before the ceph deployment with an
explicit failure message.

Resolves: rhbz#1673687

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-18 16:35:36 +00:00
Dimitri Savineau a7b1e35a16 ceph-common: Install yum plugin priorities
When using community repository we need to set the priority on the
ceph repositories because we could have some conflict with EPEL
packages.
In order to set the priority on the ceph repositories, we need to
install the yum-plugin-priorities package.

http://docs.ceph.com/docs/master/install/get-packages/#rpm-packages

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-16 06:24:55 +00:00
wumingqiao 31617afca9 ceph-mgr: run mgr_modules.yml only on the first mgr host
the task will be delegated to mons[0] for all mgr hosts, so we can just run it on the first host and have the same effect.

Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>
2019-03-14 20:16:33 +00:00
Dimitri Savineau d8538ad4e1 Set the default crush rule in ceph.conf
Currently the default crush rule value is added to the ceph config
on the mon nodes as an extra configuration applied after the template
generation via the ansible ini module.

This implies two behaviors:

1/ On each ceph-ansible run, the ceph.conf will be regenerated via
ceph-config+template and then ceph-mon+ini_file. This leads to a
non necessary daemons restart.

2/ When other ceph daemons are collocated on the monitor nodes
(like mgr or rgw), the default crush rule value will be erased by
the ceph.conf template (mon -> mgr -> rgw).

This patch adds the osd_pool_default_crush_rule config to the ceph
template and only for the monitor nodes (like crush_rules.yml).
The default crush rule id is read (if exist) from the current ceph
configuration.
The default configuration is -1 (ceph default).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1638092

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-14 08:56:52 +00:00
Dimitri Savineau b7f4e3e7c7 ceph-osd: Install numactl package when needed
With 3e32dce we can run OSD containers with numactl support.
When using numactl command in a containerized deployment we need to
be sure that the corresponding package is installed on the host.
The package installation is only executed when the
ceph_osd_numactl_opts variable isn't empty.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-12 07:43:06 +00:00
Guillaume Abrioux b3eb9206fa osd: support numactl options on OSD activate
This commit adds OSD containers activate with numactl support.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1684146

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-11 10:14:50 +01:00
Dimitri Savineau a089e1ec23 systemd/service: Set docker.service conditionally
We don't need to set After=docker.service when the container_binary
variable isn't set to docker.
It doesn't break anything currently but it could be confusing when
using podman.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 20:56:11 +00:00
Dimitri Savineau d6e71d769c common: Use rhsm_repository module for RHCS
Instead of using subscription-manager with command module we can use
the rhsm_repository ansible module.
This module already uses repos list feature to determine if a
repository is enabled or not. That way this module is idempotent so
we don't need changed_when: false anymore.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 19:15:42 +00:00
Dimitri Savineau 53514a5b50 common: Add noarch to community repository
The ceph stable community repository only enables the basearch
packages url.
Adding the noarch url because starting with nautilus release, some
packages are added there and useful for mgr or grafana.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-06 00:25:11 +00:00
Dimitri Savineau 4d32ecc980 Force osd pool min_size value to integer
After b8d580b and e9e5d5a we could have either item.min_size or
osd_pool_default_min_size using string instead of int causing the
condition to be true when it's false.
As a result, the task could try to set the pool min_size value to
0 which leads to:

Error EINVAL: pool min_size must be between 1 and 1

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-05 19:48:09 +00:00
Dimitri Savineau cb381b41fe Add CONTAINER_IMAGE env var to ceph daemons
Ceph daemons will set the CONTAINER_IMAGE environment variable value
in the daemon metadata.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-05 15:07:05 +00:00
Guillaume Abrioux e9e5d5a39a fix pool min_size customization
b8d580b3f4 introduced a bug when
`min_size` isn't set (default to 0).

Typical error:

```
Error EINVAL: pool min_size must be between 1 and 1
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-05 13:29:34 +00:00
Radu Toader b8d580b3f4 Customize pools min_size
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-03-05 10:57:15 +00:00
Radu Toader 2048255f61 When creating pool, read pool.application and make the call to ceph osd pool enable application
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-03-05 09:16:03 +00:00
Kevin Coakley b11dc13476 Updated 7 ansible-lint issues in the ceph-mon, ceph-osd, and ceph-rgw roles
The following lint issues have been resolved:

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:2

[305] Use shell only when shell functionality is required
/home/travis/build/ceph/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:47

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:2

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:7

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:14

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:19

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:24

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-03-04 22:25:35 +00:00
Guillaume Abrioux 359f8a9a4a nfs: fix systemd template service for ubuntu
`mkdir` is located in `/bin` on Ubuntu.
Let's use some jinja to support Ubuntu.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 19:54:25 +00:00
Dimitri Savineau 45a7082712 lint: Fix spaces before and after variables
ansible-lint reports:

[206] Variables should have spaces after {{ and before }}

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-01 17:22:24 +00:00
VasishtaShastry 34c25ef49b Extends check_devices tasks to non-collocated an lvm-batch scenarios
Tuned name of a task and error message to make it more user understandable

Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
2019-03-01 02:13:51 +00:00
Kevin Coakley 038401fef2 Add changed_when: false to the "get osd ids" statement
The "get osd ids" statement only registers the osd_ids_non_container variable. Running "ls /var/lib/ceph/osd/ | sed 's/.*-//'" should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-02-28 22:46:19 +00:00
ToprHarley 573adce7dd Convert interface names to underscores
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881

Signed-off-by: Tomas Petr <tpetr@redhat.com>
2019-02-28 17:07:34 +00:00
Guillaume Abrioux d5be83e504 osd: add ipc=host in systemd template for containers
in addition to 15812970f0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 13:14:09 +00:00
fpantano 21fad7ced3 Removed not needed mountpoint and removed ubuntu section
Referring to BZ#1683290, as dsavineau suggests, being this
bug tripleO specific, removed the ubuntu section and removed
useless mountpoints.

Signed-off-by: fpantano <fpantano@redhat.com>
2019-02-28 09:46:10 +00:00
fpantano 0c1944236b Added to the ceph-radosgw service template the ca-trust
volume avoiding to expose useless information.
This bug is referred to the following bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
2019-02-28 09:46:10 +00:00
Dimitri Savineau 58a9d310d5 mon: Move client admin variable to defaults
There's no need to set the client_admin_ceph_authtool_cap variable
via a set_fact task.
Instead we can set this in the role defaults.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-27 18:39:39 +00:00
Dimitri Savineau dd7b7604de mon: Add mds permissions to client.admin
The administrator keyring needs full capabilities on mds like mon,
osd and mgr.
Whithout this, the client.admin key won't be able to run commands
against mds (like ceph tell mds.0 session ls)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-27 18:39:39 +00:00
Guillaume Abrioux 4ab02d2cd1 tests: set ceph_origin and ceph_repository for non_container-collocation
those variables are mandatory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-27 15:58:35 +00:00
Guillaume Abrioux f68ad10bc9 mon: do not create unnecessarily mgr keyrings
there's no need to generate mgr keyrings 'mgr.monX' when mgrs aren't
collocated with monitors.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-27 15:58:35 +00:00
Kevin Coakley d327681b99 Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive
Set directories to 755 and files to 644 to /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively instead of setting files and directories to 755 recursively. The ceph mon process writes files to this path with permissions 644. This update stops ansible from updating the permissions in /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes a file and increases idempotency.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-02-27 10:48:19 +00:00
Dimitri Savineau dc1c0dcee2 ceph-osd: Drop memory flag with bluestore
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-26 07:27:06 +00:00
Guillaume Abrioux 8f42007272 facts: fix auto_discovery exclude
the previous approach was wrong.
checking if `item.key` is in `osd_auto_discovery_exclude` (`['dm-',
'loop']`) is incorrect because it will obviously not match. Therefore,
the condition will return `True` whatever the device we are checking.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-26 03:16:33 +00:00
Guillaume Abrioux 83d7ef777e osd: add possibility to exclude device in osd_auto_discovery
Add a new `osd_auto_discovery_exclude` to give the possibility of
excluding some devices in auto_discovery scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-25 10:05:34 +00:00
Dimitri Savineau b7338d438a ceph-infra: Remove restart firewalld handler
There's no need to restart firewalld service when a new rule is
added due to the usage of the immediate flag.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-22 18:15:47 +00:00
Guillaume Abrioux 2b60a35634 common: do not override ceph_release when ceph_repository is 'rhcs'
We shouldn't reset `ceph_release` with `ceph_stable_release` when
`ceph_repository` is `rhcs`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-21 11:58:49 +01:00
Guillaume Abrioux 21e5db8982 osd: make the 'wait for all osd to be up' task configurable
introduce two new variables to make the check that 'wait for all osd to
be up' configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-20 16:06:04 +00:00
Patrick Donnelly ed40c5237d delegate key creation to first mon
Otherwise keys get scattered over the mons and the mgr key is not copied properly.

With ansible_inventory:

    [mdss]
            mds-000 ansible_ssh_host=192.168.129.110 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
    [clients]
            client-000 ansible_ssh_host=192.168.143.94 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
    [mgrs]
            mgr-000 ansible_ssh_host=192.168.222.195 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
    [mons]
            mon-000 ansible_ssh_host=192.168.139.173 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa' monitor_address=192.168.139.173
            mon-002 ansible_ssh_host=192.168.212.114 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa' monitor_address=192.168.212.114
            mon-001 ansible_ssh_host=192.168.167.177 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa' monitor_address=192.168.167.177
    [osds]
            osd-001 ansible_ssh_host=192.168.178.128 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
            osd-000 ansible_ssh_host=192.168.138.233 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
            osd-002 ansible_ssh_host=192.168.197.23 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'

We get this failure:

    TASK [ceph-mon : include_tasks ceph_keys.yml] **********************************************************************************************************************************************************************
    included: /root/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml for mon-000, mon-002, mon-001

    TASK [ceph-mon : waiting for the monitor(s) to form the quorum...] *************************************************************************************************************************************************
    changed: [mon-000] => {
        "attempts": 1,
        "changed": true,
        "cmd": [
            "ceph",
            "--cluster",
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li1166-30/keyring",
            "mon_status",
            "--format",
            "json"
        ],
        "delta": "0:00:01.897397",
        "end": "2019-02-14 17:08:09.340534",
        "rc": 0,
        "start": "2019-02-14 17:08:07.443137"
    }

    STDOUT:

    {"name":"li1166-30","rank":0,"state":"leader","election_epoch":4,"quorum":[0,1,2],"quorum_age":0,"features":{"required_con":"2449958747315912708","required_mon":["kraken","luminous","mimic","osdmap-prune","nautilus"],"quorum_con":"4611087854031667199","quorum_mon":["kraken","luminous","mimic","osdmap-prune","nautilus"]},"outside_quorum":[],"extra_probe_peers":[{"addrvec":[{"type":"v2","addr":"192.168.167.177:3300","nonce":0},{"type":"v1","addr":"192.168.167.177:6789","nonce":0}]},{"addrvec":[{"type":"v2","addr":"192.168.212.114:3300","nonce":0},{"type":"v1","addr":"192.168.212.114:6789","nonce":0}]}],"sync_provider":[],"monmap":{"epoch":1,"fsid":"bb401e2a-c524-428e-bba9-8977bc96f04b","modified":"2019-02-14 17:08:05.012133","created":"2019-02-14 17:08:05.012133","features":{"persistent":["kraken","luminous","mimic","osdmap-prune","nautilus"],"optional":[]},"mons":[{"rank":0,"name":"li1166-30","public_addrs":{"addrvec":[{"type":"v2","addr":"192.168.139.173:3300","nonce":0},{"type":"v1","addr":"192.168.139.173:6789","nonce":0}]},"addr":"192.168.139.173:6789/0","public_addr":"192.168.139.173:6789/0"},{"rank":1,"name":"li985-128","public_addrs":{"addrvec":[{"type":"v2","addr":"192.168.167.177:3300","nonce":0},{"type":"v1","addr":"192.168.167.177:6789","nonce":0}]},"addr":"192.168.167.177:6789/0","public_addr":"192.168.167.177:6789/0"},{"rank":2,"name":"li895-17","public_addrs":{"addrvec":[{"type":"v2","addr":"192.168.212.114:3300","nonce":0},{"type":"v1","addr":"192.168.212.114:6789","nonce":0}]},"addr":"192.168.212.114:6789/0","public_addr":"192.168.212.114:6789/0"}]},"feature_map":{"mon":[{"features":"0x3ffddff8ffacffff","release":"luminous","num":1}],"client":[{"features":"0x3ffddff8ffacffff","release":"luminous","num":1}]}}

    TASK [ceph-mon : fetch ceph initial keys] **************************************************************************************************************************************************************************
    changed: [mon-001] => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li985-128/keyring",
            "--cluster",
            "ceph",
            "auth",
            "get",
            "client.bootstrap-rgw",
            "-f",
            "plain",
            "-o",
            "/var/lib/ceph/bootstrap-rgw/ceph.keyring"
        ],
        "delta": "0:00:03.179584",
        "end": "2019-02-14 17:08:14.305348",
        "rc": 0,
        "start": "2019-02-14 17:08:11.125764"
    }

    STDERR:

    exported keyring for client.bootstrap-rgw
    changed: [mon-002] => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li895-17/keyring",
            "--cluster",
            "ceph",
            "auth",
            "get",
            "client.bootstrap-rgw",
            "-f",
            "plain",
            "-o",
            "/var/lib/ceph/bootstrap-rgw/ceph.keyring"
        ],
        "delta": "0:00:03.706169",
        "end": "2019-02-14 17:08:14.041698",
        "rc": 0,
        "start": "2019-02-14 17:08:10.335529"
    }

    STDERR:

    exported keyring for client.bootstrap-rgw
    changed: [mon-000] => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li1166-30/keyring",
            "--cluster",
            "ceph",
            "auth",
            "get",
            "client.bootstrap-rgw",
            "-f",
            "plain",
            "-o",
            "/var/lib/ceph/bootstrap-rgw/ceph.keyring"
        ],
        "delta": "0:00:03.916467",
        "end": "2019-02-14 17:08:13.803999",
        "rc": 0,
        "start": "2019-02-14 17:08:09.887532"
    }

    STDERR:

    exported keyring for client.bootstrap-rgw

    TASK [ceph-mon : create ceph mgr keyring(s)] ***********************************************************************************************************************************************************************
    skipping: [mon-000] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=mon-000)  => {
        "changed": false,
        "item": "mon-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=mon-002)  => {
        "changed": false,
        "item": "mon-002",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=mon-001)  => {
        "changed": false,
        "item": "mon-001",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mon-000)  => {
        "changed": false,
        "item": "mon-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mon-002)  => {
        "changed": false,
        "item": "mon-002",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mon-001)  => {
        "changed": false,
        "item": "mon-001",
        "skip_reason": "Conditional result was False"
    }
    changed: [mon-001] => (item=mgr-000) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li547-145.keyring"
        ],
        "delta": "0:00:05.822460",
        "end": "2019-02-14 17:08:21.422810",
        "item": "mgr-000",
        "rc": 0,
        "start": "2019-02-14 17:08:15.600350"
    }

    STDERR:

    imported keyring
    changed: [mon-001] => (item=mon-000) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li1166-30.keyring"
        ],
        "delta": "0:00:05.814039",
        "end": "2019-02-14 17:08:27.663745",
        "item": "mon-000",
        "rc": 0,
        "start": "2019-02-14 17:08:21.849706"
    }

    STDERR:

    imported keyring
    changed: [mon-001] => (item=mon-002) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li895-17.keyring"
        ],
        "delta": "0:00:05.787291",
        "end": "2019-02-14 17:08:33.921243",
        "item": "mon-002",
        "rc": 0,
        "start": "2019-02-14 17:08:28.133952"
    }

    STDERR:

    imported keyring
    changed: [mon-001] => (item=mon-001) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li985-128.keyring"
        ],
        "delta": "0:00:05.782064",
        "end": "2019-02-14 17:08:40.138706",
        "item": "mon-001",
        "rc": 0,
        "start": "2019-02-14 17:08:34.356642"
    }

    STDERR:

    imported keyring

    TASK [ceph-mon : copy ceph mgr key(s) to the ansible server] *******************************************************************************************************************************************************
    skipping: [mon-000] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    changed: [mon-001] => (item=mgr-000) => {
        "changed": true,
        "checksum": "aa0fa40225c9e09d67fe7700ce9d033f91d46474",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/etc/ceph/ceph.mgr.li547-145.keyring",
        "item": "mgr-000",
        "md5sum": "cd884fb9ddc9b8b4e3cd1ad6a98fb531",
        "remote_checksum": "aa0fa40225c9e09d67fe7700ce9d033f91d46474",
        "remote_md5sum": null
    }

    TASK [ceph-mon : copy keys to the ansible server] ******************************************************************************************************************************************************************
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-osd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-rgw/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-mds/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-rbd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-osd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/etc/ceph/ceph.client.admin.keyring)  => {
        "changed": false,
        "item": "/etc/ceph/ceph.client.admin.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-rgw/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-mds/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-rbd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/etc/ceph/ceph.client.admin.keyring)  => {
        "changed": false,
        "item": "/etc/ceph/ceph.client.admin.keyring",
        "skip_reason": "Conditional result was False"
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-osd/ceph.keyring) => {
        "changed": true,
        "checksum": "095c7868a080b4c53494335d3a2223abbad12605",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "md5sum": "d8f4c4fa564aade81b844e3d92c7cac6",
        "remote_checksum": "095c7868a080b4c53494335d3a2223abbad12605",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-rgw/ceph.keyring) => {
        "changed": true,
        "checksum": "ce7a2d4441626f22e995b37d5131b9e768f18494",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "md5sum": "271e4f90c5853c74264b6b749650c3f2",
        "remote_checksum": "ce7a2d4441626f22e995b37d5131b9e768f18494",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-mds/ceph.keyring) => {
        "changed": true,
        "checksum": "e35e8613076382dd3c9d89b5bc2090e37871aab7",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "md5sum": "ed7c32277914c8e34ad5c532d8293dd2",
        "remote_checksum": "e35e8613076382dd3c9d89b5bc2090e37871aab7",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-rbd/ceph.keyring) => {
        "changed": true,
        "checksum": "ac43101ad249f6b6bb07ceb3287a3693aeae7f6c",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "md5sum": "1460e3c9532b0b7b3a5cb329d77342cd",
        "remote_checksum": "ac43101ad249f6b6bb07ceb3287a3693aeae7f6c",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring) => {
        "changed": true,
        "checksum": "01d74751810f5da621937b10c83d47fc7f1865c5",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "md5sum": "979987f10fd7da5cff67e665f54bfe4d",
        "remote_checksum": "01d74751810f5da621937b10c83d47fc7f1865c5",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/etc/ceph/ceph.client.admin.keyring) => {
        "changed": true,
        "checksum": "482f702cf861b41021d76de655ecf996fe9a4a4a",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/etc/ceph/ceph.client.admin.keyring",
        "item": "/etc/ceph/ceph.client.admin.keyring",
        "md5sum": "7581c187044fd4e0f7a5440244a6b306",
        "remote_checksum": "482f702cf861b41021d76de655ecf996fe9a4a4a",
        "remote_md5sum": null
    }

    TASK [ceph-mon : include secure_cluster.yml] ***********************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mon : crush_rules.yml] **********************************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-001] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mgr : set_fact docker_exec_cmd] *************************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-001] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mgr : include common.yml] *******************************************************************************************************************************************************************************
    included: /root/ceph-ansible/roles/ceph-mgr/tasks/common.yml for mon-000, mon-002, mon-001

    TASK [ceph-mgr : create mgr directory] *****************************************************************************************************************************************************************************
    changed: [mon-000] => {
        "changed": true,
        "gid": 167,
        "group": "ceph",
        "mode": "0755",
        "owner": "ceph",
        "path": "/var/lib/ceph/mgr/ceph-li1166-30",
        "secontext": "unconfined_u:object_r:ceph_var_lib_t:s0",
        "size": 4096,
        "state": "directory",
        "uid": 167
    }
    changed: [mon-002] => {
        "changed": true,
        "gid": 167,
        "group": "ceph",
        "mode": "0755",
        "owner": "ceph",
        "path": "/var/lib/ceph/mgr/ceph-li895-17",
        "secontext": "unconfined_u:object_r:ceph_var_lib_t:s0",
        "size": 4096,
        "state": "directory",
        "uid": 167
    }
    changed: [mon-001] => {
        "changed": true,
        "gid": 167,
        "group": "ceph",
        "mode": "0755",
        "owner": "ceph",
        "path": "/var/lib/ceph/mgr/ceph-li985-128",
        "secontext": "unconfined_u:object_r:ceph_var_lib_t:s0",
        "size": 4096,
        "state": "directory",
        "uid": 167
    }

    TASK [ceph-mgr : fetch ceph mgr keyring] ***************************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-001] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mgr : copy ceph keyring(s) if needed] *******************************************************************************************************************************************************************
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
    failed: [mon-002] (item={'name': '/etc/ceph/ceph.mgr.li895-17.keyring', 'dest': '/var/lib/ceph/mgr/ceph-li895-17/keyring', 'copy_key': True}) => {
        "changed": false,
        "item": {
            "copy_key": true,
            "dest": "/var/lib/ceph/mgr/ceph-li895-17/keyring",
            "name": "/etc/ceph/ceph.mgr.li895-17.keyring"
        }
    }

    MSG:

    Could not find or access 'fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring'
    Searched in:
     /root/ceph-ansible/roles/ceph-mgr/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/roles/ceph-mgr/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring on the Ansible Controller.
    If you are using a module and expect the file to exist on the remote, see the remote_src option
    skipping: [mon-002] => (item={'name': '/etc/ceph/ceph.client.admin.keyring', 'dest': '/etc/ceph/ceph.client.admin.keyring', 'copy_key': False})  => {
        "changed": false,
        "item": {
            "copy_key": false,
            "dest": "/etc/ceph/ceph.client.admin.keyring",
            "name": "/etc/ceph/ceph.client.admin.keyring"
        },
        "skip_reason": "Conditional result was False"
    }
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
    failed: [mon-001] (item={'name': '/etc/ceph/ceph.mgr.li985-128.keyring', 'dest': '/var/lib/ceph/mgr/ceph-li985-128/keyring', 'copy_key': True}) => {
        "changed": false,
        "item": {
            "copy_key": true,
            "dest": "/var/lib/ceph/mgr/ceph-li985-128/keyring",
            "name": "/etc/ceph/ceph.mgr.li985-128.keyring"
        }
    }

    MSG:

    Could not find or access 'fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring'
    Searched in:
     /root/ceph-ansible/roles/ceph-mgr/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/roles/ceph-mgr/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring on the Ansible Controller.
    If you are using a module and expect the file to exist on the remote, see the remote_src option
    skipping: [mon-001] => (item={'name': '/etc/ceph/ceph.client.admin.keyring', 'dest': '/etc/ceph/ceph.client.admin.keyring', 'copy_key': False})  => {
        "changed": false,
        "item": {
            "copy_key": false,
            "dest": "/etc/ceph/ceph.client.admin.keyring",
            "name": "/etc/ceph/ceph.client.admin.keyring"
        },
        "skip_reason": "Conditional result was False"
    }
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
    failed: [mon-000] (item={'name': '/etc/ceph/ceph.mgr.li1166-30.keyring', 'dest': '/var/lib/ceph/mgr/ceph-li1166-30/keyring', 'copy_key': True}) => {
        "changed": false,
        "item": {
            "copy_key": true,
            "dest": "/var/lib/ceph/mgr/ceph-li1166-30/keyring",
            "name": "/etc/ceph/ceph.mgr.li1166-30.keyring"
        }
    }

    MSG:

    Could not find or access 'fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring'
    Searched in:
     /root/ceph-ansible/roles/ceph-mgr/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/roles/ceph-mgr/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring on the Ansible Controller.
    If you are using a module and expect the file to exist on the remote, see the remote_src option
    skipping: [mon-000] => (item={'name': '/etc/ceph/ceph.client.admin.keyring', 'dest': '/etc/ceph/ceph.client.admin.keyring', 'copy_key': False})  => {
        "changed": false,
        "item": {
            "copy_key": false,
            "dest": "/etc/ceph/ceph.client.admin.keyring",
            "name": "/etc/ceph/ceph.client.admin.keyring"
        },
        "skip_reason": "Conditional result was False"
    }

    NO MORE HOSTS LEFT *************************************************************************************************************************************************************************************************
     to retry, use: --limit @/root/ceph-linode/linode.retry

    PLAY RECAP *********************************************************************************************************************************************************************************************************
    client-000                 : ok=30   changed=2    unreachable=0    failed=0
    mds-000                    : ok=32   changed=4    unreachable=0    failed=0
    mgr-000                    : ok=32   changed=4    unreachable=0    failed=0
    mon-000                    : ok=89   changed=21   unreachable=0    failed=1
    mon-001                    : ok=84   changed=20   unreachable=0    failed=1
    mon-002                    : ok=81   changed=17   unreachable=0    failed=1
    osd-000                    : ok=32   changed=4    unreachable=0    failed=0
    osd-001                    : ok=32   changed=4    unreachable=0    failed=0
    osd-002                    : ok=32   changed=4    unreachable=0    failed=0

Also, create all keys on the first mon and copy those to the other mons to be
consistent.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-02-20 11:19:44 +01:00
David Waiting 3930791cb7 ensure at least one osd is up
The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing.

The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0).

This is related to Bugzilla 1578086.

In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.

Signed-off-by: David Waiting <david_waiting@comcast.com>
2019-02-19 18:31:05 +00:00
Guillaume Abrioux c98fd0b9e0 facts: ensure ceph_uid is set when running rhel
when hosts is running on RHEL, let's enforce ceph_uid to 167.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
Guillaume Abrioux e7034402a4 container: fix tmpfiles.d ceph files
- fix uid/gid in ceph tmpfiles
- move file to `/etc/tmpfiles.d`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
Guillaume Abrioux d4e31b90a6 Revert "osd: container remove --pid=host"
This reverts commit bb2bbeb941.

Looks like when not passing `--pid=host` we are facing some issues when
deploying more than 2 OSDs in containerized environment.

At the moment, we are still troubleshooting this issue but we prefer to
revert this commit so it doesn't block any PR in the CI.

As soon as we have a fix; we will push a new PR to remove `--pid=host`
(a revert of revert...)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 500256cdab validate: fix ntp_daemon_type check in validate
is_atomic is defined in ceph-facts or very early in main playbook.

In non containerized deployment, is_atomic is only set in ceph-facts
which is played after ceph-validate.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 76303b457c container: create ceph-common.conf tmpfiles.d if it doesn't exist
Otherwise the task will fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux b24202f6a4 facts: move two set_fact into ceph-facts
those two set_fact tasks should be moved in ceph-facts.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 8c8ec63633 container: use tmpfiles.d to creates /run/ceph
instead of using `RuntimeDirectory` parameter in systemd unit files,
let's use a systemd `tmpfiles.d` to ensure `/run/ceph`.

Explanation:

`podman` doesn't create the `/var/run/ceph` if it doesn't exist the time
where the container is run while `docker` used to create it.
In case of `switch_to_containers` scenario, `/run/ceph` gets created by
a tmpfiles.d systemd file; when switching to containers, the systemd
unit file complains because `/run/ceph` already exists

The better fix would be to ensure `/usr/lib/tmpfiles.d/ceph-common.conf`
is removed and only rely on `RuntimeDirectory` from systemd unit file parameter
but we come from a non-containerized environment which is already running,
it means `/run/ceph` is already created and when starting the unit to
start the container, systemd will still complain and we can't simply
remove the directory if daemons are collocated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 7e0a70f7a8 switch_to_containers: do not try to redeploy monitors
`ceph-mon` tries to redeploy monitors because it assumes it was not yet
deployed since `mon_socket_stat` and `ceph_mon_container_stat` are
undefined (indeed, we stop the daemon before calling `ceph-mon` in the
switch_to_containers playbook).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Rishabh Dave 05ea783eff fix mistake in task that aborts when ntpd is chosen on Atomic
Since it's already confusing whether ntp_daemon_type should be "ntp" or
"ntpd", fix the mistake in the title of the task that aborts if
ntp_daemon_type is set to "ntpd" and OS being used is Atomic.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-02-12 09:09:27 +01:00
Rishabh Dave bdff3e48fd don't install NTPd on Atomic
Since Atomic doesn't allow any installations and NTPd is not present
on Atomic image we are using, abort when ntp_daemon_type is set to ntpd.

https://github.com/ceph/ceph-ansible/issues/3572
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-02-11 12:02:30 +01:00
Sébastien Han c69c8c9ac1 mon: do not hardcode ceph uid
167 is the ceph uid for Red Hat based system, thus trying to deploy a
monitor on Debian fail since the ceph user id on that system is 64045.
This commit uses the ceph_uid variable which contains the right uid
based on system/container detection.

Closes: https://github.com/ceph/ceph-ansible/issues/3589
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-02-11 09:09:40 +00:00
Leah Neukirchen 4fe7f37849 Fix uses of default(omit) with string concatenation
When {{omit}} is concatenated with another string, it expands to something
like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d.
However, in these use-cases we need an empty string.

Regression introduced in d53f55e807.

Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>
2019-02-08 16:18:15 +00:00
Patrick C. F. Ernzer c605ff6a68 setup_ntp: call handler to disable ntpd if chronyd used
The task setup chronyd called the handler disable chronyd, which of
course defeats the purpose.

Changing the task to disable ntpd instead fixes the issue of chronyd
being disabled after it got enabled.

Fixes: #3582

Signed-off-by: Patrick C. F. Ernzer pcfe@redhat.com
2019-02-08 12:04:44 +01:00
Guillaume Abrioux d4b3c1d409 iscsi-gws: remove a leftover
remove leftover introduced by 9d590f4

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-08 01:11:42 +01:00
Guillaume Abrioux 9d590f4339 iscsi: fix permission denied error
Typical error:
```
fatal: [iscsi-gw0]: FAILED! =>
  msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'''
```

`become: True` is not needed on the following task:

`copy crt file(s) to gateway nodes`.

Since it's already set in the main playbook (site.yml/site-container.yml)

The thing is that the files get generated in the 'fetch_directory' with
root user because there is a 'delegate_to' + we run the playbook with
`become: True` (from main playbook).

The idea here is to create files under ansible user so we can open them
later to copy them on the remote machine.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-07 17:57:22 +01:00
Sébastien Han bb2bbeb941 osd: container remove --pid=host
Let's try again with the Nautilus release.

Closes: https://github.com/ceph/ceph-ansible/issues/1297
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-02-07 12:13:51 +00:00
John Fulton cc0bf197e1 Fix CNI error when net=host is not used on OSD calls
Follow up fix that 410abd7 missed.

Related: ceph#3561

Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-05 22:49:01 +00:00
John Fulton 719a25b571 Create Ceph Initial Dirs earlier
Include tasks from create_ceph_initial_dirs earlier during
ceph config role.

Fixes: #3568
Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-05 18:38:05 +00:00
John Fulton dab3f6ee3f Fix CNI error when net=host is not used in some podman calls
With 'podman version 1.0.0' on RHEL8 beta the 'get ceph version' and
'ceph monitor mkfs' commands fail [1] with "error configuring network
namespace for container Missing CNI default network".

When net=host is added these errors are resolved. net=host is used in
many other calls (grep -R net=host | wc -l --> 38).

Fixes: #3561
Signed-off-by: John Fulton <fulton@redhat.com>
(cherry picked from commit 410abd7745)
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 914d94cae8 set RuntimeDirectory in all systemd unit templates
/var/run/ceph resides in a non persistent filesystem (tmpfs)
After a reboot, all daemons won't start because this directory will be
missing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 7ade032807 osd: bind mount /var/run/udev/
without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux fdca29f2a7 facts: set timeout_command fact in ceph-defaults
- also add `--foreground` which seems to fix some issue we are facing when
using timeout with `podman`.
- use this fact in the `is ceph running already?` task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 16efdbc59b podman: support podman installation on rhel8
Add required changes to support podman on rhel8

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1667101

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
John Fulton 37b5d1084a Make python print statements python3 compatible
The restart_osd_daemon.sh generated from the j2 template
contains a python call which uses 'print x' instead of
'print(x)'. Add the missing parentheses to make this call
compatible with both 2 and 3.

Also add parentheses to other python print calls found
in roles/ceph-client/defaults/main.yml and
infrastructure-playbooks/cluster-os-migration.yml.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1671721
Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-01 15:23:27 +00:00
Andrew Schoen 70a4368bc5 ceph-config: do not always assume containers when calculating num_osds
CEPH_CONTAINER_IMAGE should be None if containerized_deployment
is False.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen 88eda479a9 ceph-facts: generate devices when osd_auto_discovery is true
This task used to live in ceph-osd, but we need it defined here to that
ceph-config can use it when trying to determine the number of osds.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
John Fulton cba9b23363 Do not timeout podman/docker pull if timeout value is '0'
If user sets "docker_pull_timeout: '0'" then do not use the
timeout command when running podman/docker pull. Also, use
"timeout -s KILL"; without KILL, podman on RHEL8 beta does
not timeout and deployment can hang.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1670625
Signed-off-by: John Fulton <fulton@redhat.com>
2019-01-31 15:35:09 +00:00
Guillaume Abrioux fe1528adb4 config: support num_osds fact setting in containerized deployment
This part of the code must be supported in containerized deployment

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664112

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 14:30:23 +00:00