Commit Graph

2686 Commits (ed6ae6815d665069c9b6f7d790390dceb0f037c9)

Author SHA1 Message Date
Seena Fallah 10fc2d1d92 ceph-facts: add get default crush rule from running monitor
In case of deploying new monitor node to an existing cluster,
osd_pool_default_crush_rule should be taken from running monitor because
ceph-osd role won't be run and the new monitor will have different
osd_pool_default_crush_role from other monitors.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit ff9f4d138f)
2020-09-29 12:15:09 -04:00
Ali Maredia 9f58d4a3d1 rgw multisite: check connection for realm endpoint
This commit adds connection checks before realm pulls
Curls are performed on the endpoint being pulled from
the mons and the rgws

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731158

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 902575369c)
2020-09-29 09:24:59 -04:00
Tyler Bishop e3284b20ac facts: support device aliases for (dedicated|bluestore_wal)_devices
Just likve `devices`, this commit adds the support for linux device aliases for
`dedicated_devices` and `bluestore_wal_devices`.

Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>
(cherry picked from commit ee4b8804ae)
2020-09-29 09:24:17 -04:00
Dimitri Savineau 6294244c4f ceph-handler: set handler on xxx_stat result
In non containerized deployment we check if the service is running
via the socket file presence.
This is done via the xxx_socket_stat variable that check the file
socket in the /var/run/ceph/ directory.
In some scenarios, we could have the socket file still present in
that directory but not used by any process.
That's why we have the xxx_stat variable which clean those leftovers.

The problem here is that we're set the variable for the handlers status
(like handler_mon_status) based on xxx_socket_stat instead of xxx_stat.
That means we will trigger the handlers if there's an old socket file
present on the system without any process associated.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866834

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 733596582d)
2020-09-29 09:23:45 -04:00
Dimitri Savineau 77c5115ad2 ceph-iscsi: create pool once from monitor
af9f6684 introduced a regression on the ceph iscsi pool creation
because it was delegated to the first monitor node before that change.
This patch restores the initial worflow.
When the iscsi node doesn't have the admin keyring then the pool
creation fails.
This commit also ensures that the pool creation is only executed once
when having multiple iscsi nodes.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 501b8e0fd3)
2020-09-29 09:23:14 -04:00
Dimitri Savineau babc8a05fd ceph-mds: remove unused block condition
Since af9f6684 the cephfs pool(s) creation don't use the fs_pools_created
variable anymore because the ceph_pool module is idempotent.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1db4dc807c)
2020-09-28 20:39:25 -04:00
Seena Fallah 9b0f45431d ceph-facts: check for mon socket in its own host
delegate to its own host after checking mon socket to findout if mon socket is in-use or not.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 69f7e35382)
2020-09-28 20:38:52 -04:00
Guillaume Abrioux 5538dd8b3b Revert "ceph-rgw: remove ceph_pool state and default value"
This reverts commit ba3512a8fc.

(cherry picked from commit bf7b044c9a)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-09-28 16:47:45 -04:00
Dimitri Savineau 7e2e11320d ceph-rgw: remove ceph_pool state and default value
Since the state is now optional and default values are handled in the
ceph_pool module itself.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ba3512a8fc)
2020-09-25 14:05:34 -04:00
Dimitri Savineau 8d49d97582 ceph-config: remove ceph_release from ceph.conf.j2
We don't use ceph_release variable in the ceph.conf jinja template.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 62bd41f0d4)
2020-09-25 13:37:36 -04:00
Dmitriy Rabotyagov f996a232d7 Remove libjemalloc1 installation task
libjemalloc1 package is not required neither for ganesha dependency nor
for the package build process. So this task can be simply dropped.

Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
(cherry picked from commit 297532ca41)
2020-09-24 14:02:27 -04:00
Guillaume Abrioux b714e04d91 facts: refact `ceph_uid` fact
There's no need to set this fact with a `set_fact`
We can achieve this in `ceph-defaults`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bcc673f66c)
2020-09-21 12:26:19 -04:00
Dimitri Savineau c0654fa37a container: quote registry password
When using a quote in the registry password then we have the following
error:

The error was: ValueError: No closing quotation

To fix this we need to use the quote filter.

Close: https://bugzilla.redhat.com/show_bug.cgi?id=1880252

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6dcfdf17d4)
2020-09-18 14:25:44 -04:00
Guillaume Abrioux 7e29b22c76 facts: fix 'set_fact rgw_instances with rgw multisite'
the current condition doesn't work, as soon as the first iteration is
done the condition makes next iterations skip since `rgw_instances` got
set with the first iteration.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1859872

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ff19c1d851)
2020-09-18 10:35:20 -04:00
Dimitri Savineau fb43400903 ceph-infra: include iscsi nodes for logrotate
The iscsi nodes aren't included in the logrotate condition.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 85643edfe3)
2020-09-17 14:49:35 -04:00
Guillaume Abrioux 3e53d2a811 infra: support log rotation for tcmu-runner
This commit adds the log rotation support for tcmu-runner.

ceph-container related PR: ceph/ceph-container#1726

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1873915

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f576c02ff7)
2020-09-16 21:35:48 -04:00
Dimitri Savineau 8f175eb60c container: add optional http(s) proxy option
When using a http(s) proxy with either docker or podman we can rely on
the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
But with ansible, even if those variables are defined in a source file
then they aren't loaded during the container pull/login tasks.
This implements the http(s) proxy support with docker/podman.
Both implementations are different:
  1/ docker doesn't rely en the environment variables with the CLI.
Thos are needed by the docker daemon via systemd.
  2/ podman uses the environment variables so we need to add them to
the login/pull tasks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bda3581294)
2020-09-16 11:32:14 -04:00
Dimitri Savineau 95a073cb3b ceph-prometheus: update pool stat counter
Since [1] The bytes_used pool counter in prometheus has been renamed
to stored.

Closes: #5781

[1] https://github.com/ceph/ceph/commit/71fe9149

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e54b924eaf)
2020-09-16 10:08:46 -04:00
Dimitri Savineau 25ba7f5314 node-exporter: exclude client nodes
We don't need to install node-exporter on client node because there's
no ceph services running on them.
This also makes sure we use the group name variables in the prometheus
service template instead of hardcoding the values.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b105549ed8)
2020-09-14 16:13:11 -04:00
Dimitri Savineau 24698e7f4b dashboard: use run_once at block level
Instead of using run_once: true on each tasks in a block section, we
can use the run_once statement at the block level.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2c4af70abd)
2020-09-14 15:54:22 -04:00
Dimitri Savineau 23522a11e4 ceph_key: set state as optional
Most ansible module using a state parameter default to the present
value (when available) instead of using it as a mandatory option.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit abb4023d76)
2020-09-14 15:37:56 -04:00
Dimitri Savineau e785654632 ceph_pool: set state as optional
Most ansible module using a state parameter default to the present
value (when available) instead of using it as a mandatory option.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3a05aeb6cb)
2020-09-14 15:37:56 -04:00
Dimitri Savineau 6fd6b31305 library: add ceph_dashboard_user module
This adds the ceph_dashboard_user ansible module for replacing the
command module usage with the ceph dashboard ac-user-xxx command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ee6f0547ba)
2020-09-11 09:08:56 -04:00
Guillaume Abrioux 828817489c facts: refact and optimize memory consumption
there's no need to run this task on all nodes.
This uses too much memory for nothing.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f0fe193d8e)
2020-09-10 22:52:38 -04:00
Dimitri Savineau 593264e5f7 ceph-rgw: use ceph_pool module
Since [1] we can use the ceph_pool module instead of using the command
module combined with ceph osd pool commands.

[1] bddcb439ce

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8dacbce68f)
2020-09-10 21:44:19 -04:00
Dimitri Savineau 0c0a930374 ceph-facts: only get fsid when monitor are present
When running the rolling_update playbook with an inventory without
monitor nodes defined (like external scenario) then we can't retrieve
the cluster fsid from the running monitor.
In this scenario we have to pass this information manually (group_vars
or host_vars).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f63022dfec)
2020-09-10 20:57:16 +02:00
Niko Smeds a41c572785 Enable HAProxy backend checks for Ceph RGW
Add the `check` option to server definitions to enable basic HAProxy health
checks for Ceph RADOS gateway backends.

Currently traffic will be forwarded to unhealthly `radosgw.service` servers.
These changes resolve the issue.

Signed-off-by: Niko Smeds nikosmeds@gmail.com
(cherry picked from commit a951c1a3f0)
2020-09-02 09:54:52 -04:00
Guillaume Abrioux 8fde5f7396 dashboard: refact admin user creation task
this commit splits this task in order to avoid using a `shell` module.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 54d3e9650f)
2020-08-21 13:55:54 +02:00
George Shuklin c0d98878ff Make 'disable ssl for dashboard task' idempotent.
This should reduce number of 'changed' tasks during convergence test.

Signed-off-by: George Shuklin <george.shuklin@gmail.com>
(cherry picked from commit 73d4bb6bd6)
2020-08-20 17:16:45 +02:00
Rafał Wądołowski 21a37e23b3 Comment out ceph_custom_key
Since there is a check if ceph_custom_key is defined, there is no reason
to define it by default.

Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
(cherry picked from commit 55cd6e83e4)
2020-08-20 13:43:44 +02:00
John Fulton 489efd5689 Set default permission for prometheus config files
Regardless of the outcome of Ansible 2.9.12 issue 71200
we can set a default permission for these files.

Closes: https://github.com/ceph/ceph-ansible/issues/5677

Signed-off-by: John Fulton <fulton@redhat.com>
(cherry picked from commit 95dee6f1ca)
2020-08-18 18:04:17 -04:00
Guillaume Abrioux 3fad1677d6 infra: only install logrotate on right nodes
For intsance, there is no need to install logrotate on clients nodes.

This also ensure logrotate is installed only for containerized
deployments since the packaging has an explicit dependency to logrotate

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8ed11ea3ee)
2020-08-18 11:10:57 -04:00
Dimitri Savineau e9c6028eb9 ceph-rgw: allow specifying crush rule on pool
We already support specifiying a custom crush rule during pool creation
in ceph-osd role but not in ceph-rgw role.
This patch adds the missing code to implement this feature.
Note this is only available for replicated pool not erasure. The rule
must also exist prior the pool creation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cb8f0237e1)
2020-08-17 23:00:13 +02:00
Ali Maredia 63d991dc3d rgw: allow rgws to be concurrently with or without multisite
Allows rgws in a ceph cluster to be run with
multisite and without multisite at the same time.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 5c1f4b1a1e)
2020-08-17 13:56:45 +02:00
Guillaume Abrioux 2609da6ce7 infra: add missing tag
This commit adds the missing `with_pkg` tag on the logrotate
installation task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e1cb385740)
2020-08-13 10:09:31 -04:00
Guillaume Abrioux 29d4c42f80 infra: add log rotation support (containers)
This commit adds the log rotation support via logrotate in containerized
deployments.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f1aa6cea21)
2020-08-12 22:57:10 +02:00
Guillaume Abrioux 8a7e4193db common: don't enable debug log on ceph-volume calls by default
ceph-volume can generate large logs at some point.

debug logs by definition should be enabled only when debugging.

Let's make it customizable with a variable which is set to `False` by
default.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 448cc280b7)
2020-08-12 22:57:10 +02:00
Guillaume Abrioux 223254e8bf nfs: do not copy rgw keyring when `nfs_obj_gw` is true
This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the
cluster doesn't contain a rgw node, which can be the case given we are
using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the
deployment will fail trying to copy a key that doesn't exist.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dd4b5b0328)
2020-08-12 14:57:56 -04:00
raul 5fc3af5f4d rgw: support 1+ rgw instance in `radosgw_frontend_port`
Change the radosgw_frontend_port to take in account more than 1 RGW instance,
in it's original form `radosgw_frontend_port: radosgw_frontend_port | int`,
it configured the 8080 port to all instances, with the following modification
`radosgw_frontend_port: radosgw_frontend_port | int + item|int` we increase in
1 the port count.

Co-authored-by: Daniel Parkes <dparkes@redhat.com>
Signed-off-by: raul <rmahique@redhat.com>
(cherry picked from commit 110eaf5f9f)
2020-08-12 14:57:35 -04:00
Guillaume Abrioux e0dc56b73c config: only add related rgw section
there's no need to add each rgw section on all rgw nodes.
With this commit, only related rgw section are rendered.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0a581a6e60)
2020-08-05 09:50:09 -04:00
Dimitri Savineau f9a24e2541 dashboard: allow remote TLS cert/key copy
When using TLS on the ceph dashboard or grafana services, we can provide
the TLS certificate and key.
Those files should be present on the ansible controller and they will be
copyied to the right node(s).
In some situation, the TLS certificate and key could be already present
on the target node and not on the ansible controller.
For this scenario, we just need to copy the files locally (on each remote
host).

This patch adds the dashboard_tls_external variable (with default to
false) to allow users to achieve this scenario when configuring this
variable to true.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860815

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0d0f1e71df)
2020-08-04 14:01:59 +02:00
Dimitri Savineau 56cf7168fa ceph-handler: remove iscsigws restart scripts
The iscsigws restart scripts for tcmu-runner and rbd-target-{api,gw}
services only call the systemctl restart command.
We don't really need to copy a shell script to do it when we can use
the ansible service module instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cbe79428e6)
2020-07-25 09:34:25 +02:00
Dimitri Savineau 2faed4c204 podman: always remove container on start
In case of failure, the systemd ExecStop isn't executed so the container
isn't removed. After a reboot of a failed node, the container doesn't
start because the old container is still present in created state.
We should always try to remove the container in ExecStartPre for this
situation.
A normal reboot doesn't trigger this issue and this also doesn't affect
nodes running containers via docker.
This behaviour was introduced by d43769d.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1858865

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 47b7c00287)
2020-07-24 12:47:01 -04:00
Dimitri Savineau d5974086dd ceph-facts: remove mds_name fact
The mds_name fact always gets the ansible_hostname value so we don't
need to have a dedicated fact for this and use the ansible_hostname fact
instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4e84b4beed)
2020-07-24 10:50:44 -04:00
Dimitri Savineau c694454f82 ceph-handler: add missing condition on ceph-crash
The ceph-crash tasks present in the ceph-handler role don't need to be
executed on all nodes.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 18e3c7a0a2)
2020-07-22 18:47:01 -04:00
Guillaume Abrioux c0b32e4a79 crash: rm container in ExecPreStart even with docker
We should ensure the container is removed in `ExecPreStart` even when
`{{ container_binary }}` is docker.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 39bb279a53)
2020-07-22 18:47:01 -04:00
Guillaume Abrioux e6059fdcd3 ceph-crash: introduce new role ceph-crash
This commit introduces a new role `ceph-crash` in order to deploy
everything needed for the ceph-crash daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9d2f2108e1)
2020-07-22 18:47:01 -04:00
Guillaume Abrioux bd12158a1c facts: fix broken facts when using --limit
This commit fixes these tasks when --limit is used.

It makes sure the fact is set on right nodes even when the playbook is
run with `--limit`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f8a951f50c)
2020-07-20 22:49:43 -04:00
Dimitri Savineau b11eeed833 ceph-dashboard: copy TLS cert/key on monitor
The ceph-dashboard role is executed on the mgr nodes so the TLS cert/key
files are copied to those nodes.
But we are running importing the cert/key files into the ceph
configuration on the monitor.

Closes: #5557

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2b8ebf1457)
2020-07-20 22:49:20 -04:00
Guillaume Abrioux c0b8edfea2 rgw: set container memory limit to 4g
This commit changes the container memory limit for rgw daemons.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86edae724f)
2020-07-10 10:10:32 -04:00