Commit Graph

2618 Commits (44e1ebaaff61b5317bae0a3483f5d485c12bd18b)

Author SHA1 Message Date
Dimitri Savineau 44e1ebaaff ceph-nfs: add stable noarch repository
When using the stable nfs ganesha repository, we need have both arch
and noarch repositories enabled.
Currently the noarch repository is missing which cause the non
containerized deployment to fail.

Closes: #5375

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-16 07:34:08 +02:00
Guillaume Abrioux af9f6684f2 common: introduce ceph_pool module calls
This commits calls the `ceph_pool` module for creating ceph pools
everywhere it's needed in the playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-05-16 07:31:57 +02:00
Guillaume Abrioux 8c7a48832c common: fix target_size_ratio task enablement
The condition on this task is wrong, we have to check whether
`target_size_ratio` is set in the pool definition instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-05-15 20:57:32 +02:00
Guillaume Abrioux e5e81843e9 facts: always set ceph_run_cmd and ceph_admin_command
always set these facts on monitor nodes whatever we run with `--limit`.
Otherwise, playbook will fail when using `--limit` on nodes where these
facts are used on a delegated task to monitor.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-05-15 10:53:15 +02:00
Dimitri Savineau 252e78b4e4 docker2podman: manage dashboard nodes
The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus)
were not manage during the docker to podman migration.

This adds the systemd container template of those services to a dedicated
file (systemd.yml) in order to include it in the docker2podman playbook.

This also adds the dashboard container images pull from docker to podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-13 12:02:00 +02:00
Dimitri Savineau b20519efd0 dashboard: allow disabling grafana api ssl verify
When using an untrusted TLS certificate (like self-signed) on grafana
then the grafana dashboards update subcommand will fail.
One solution could be to trust the TLS certificate.
The other one is to disable the TLS verification on the grafana API.

Closes: #5324

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-13 11:56:57 +02:00
Dimitri Savineau 222fe4abd8 ceph-nfs: bind mount ganesha log directory
The current ganesha log directory is only present in the container
and not bind mount on the host.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-13 11:55:38 +02:00
Benoît Knecht 444b46ea24 ceph-validate: Expand templates in rgw_create_pools
Same fix as `ceph-rgw` for `rgw_create_pools` pool names that contain Jinja
templates.

See #5348 for details.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-05-11 11:51:27 -04:00
Benoît Knecht d2b7670c7d ceph-rgw: Make sure pool name templates are expanded
It is common to set templated pool names in `rgw_create_pools`, e.g.

```yaml
rgw_create_pools:
  "{{ rgw_zone }}.rgw.buckets.index":
    pg_num: 16
    size: 3
    type: replicated
```

This worked fine with Ansible 2.8, but broke in Ansible 2.9 due to a change in
the way `with_dict` works [1].

This commit replaces the use of `with_dict` with

```yaml
loop: "{{ rgw_create_pools | dict2items }}"
```

which works as intended and expands the template in the pool name.

[1]: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.9.html#loops

Closes #5348

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-05-11 11:51:27 -04:00
Benoît Knecht b7efca1785 ceph-validate: Fix "fail on unsupported CentOS release"
The `dashboard_enabled` condition used a `true` filter (which doesn't exist)
instead of the `bool` filter.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-05-08 10:21:11 -04:00
Dimitri Savineau 34e6e8e06c ceph-rgw: use match instead of equalto from jinja2
The '==' jinja2 operator (or 'equalto') has been introduced in jinja2
2.8.
On EL7, jinja2 version is 2.7 so the operator isn't present creating
templating error like:

The error was: TemplateRuntimeError: no test named '=='

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1747206

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-06 14:23:10 -04:00
Dimitri Savineau 8a890306ad ceph-nfs: fix internal ganesha deployment
Since ea2b654d9 we're not running the rados command from the monitor
nodes but from the ganesha node. Unfortunately we don't have the
required keyring on that node to run the rados command as we don't
import the right keyring.
This commit restores the workflow for internal ganesha deployment like
before ea2b654d9 but keeps the rados commands from the ganesha node for
external deployment until we have a better design.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-06 11:10:08 -04:00
Dimitri Savineau 748ac4b928 ceph-nfs: fix keyring copy for external ganesha
Fix the condition on the keyring copy task that prevent the ganesha
keyring to be created in the /var/lib/ceph directory.
Also ensure that the directory exists first.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-06 11:10:08 -04:00
Guillaume Abrioux cf460274c7 nfs: fix 2 typo
The condition is missing an index here which makes the playbook failing.

Typical error:
```
The conditional check 'not item.get('skipped', False)' failed. The error was: error while evaluating conditional (not item.get('skipped', False)): 'list object' has no attribute 'get'",
```

Also, adds the missing '/keyring' on the `exec_cmd_nfs` fact.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831342

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-05-06 11:10:08 -04:00
Dimitri Savineau ed4f23d530 ceph-facts: fix IPv6 _radosgw_address interface
When using radosgw_interface and IPv6 setup then the _radosgw_address
fact doesn't use square brackets compared to the radosgw_address and
radosgw_address_block configuration.

Closes: #5325

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-28 14:35:16 -04:00
fmount 5eb363e033 Refresh ceph dashboard user role
This change allows the operator to refresh the
ceph dashboard admin role on multiple ceph-ansible
executions.
In the current state the role is set only when the
user is created, and there's no way to change it if
the user exists.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826002
Signed-off-by: fmount <fpantano@redhat.com>
2020-04-23 16:28:49 -04:00
Dimitri Savineau f1728929cd ceph-dashboard: fix mgr dashboard IPv6 fact
15ed9ee introduced a regression for the mgr dashboard daemon using
IPv6 since the mgr dashboard configuration doesn't support brackets.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1827299

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-23 14:44:46 -04:00
Dimitri Savineau 2547ab601a Readd CentOS 7 with conditions
The CentOS 7 distribution could still be used be deploying ceph if
  - it's a containerized deployment
  - it's a non containerized deployment without the dashboard (due to
missing python3 libraries).

The ceph_stable_redhat_distro variable has been remove because we can
rely on the ansible_distribution_major_version fact instead.

The copr el8 repository configuration is only applied for CentOS 8.

The ceph-mgr-dashboard package is only installed when the
dashboard_enabled variable is set to true.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-23 13:31:11 +02:00
Guillaume Abrioux 86dc6f8206 mds: don't enable application pool on cephfs pools
this commit removes the task which enable application on cephfs pools.

See: https://tracker.ceph.com/issues/43761

Fixes: #5278

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-23 13:23:10 +02:00
ianwatsonrh ccf6a7f153 typo: updating type check on rc
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826884
Signed-off-by: ianwatsonrh <ianwatson@redhat.com>
2020-04-23 13:20:35 +02:00
abaird-rh eb71244bfd Updated use of deprecated filter
This was removed in Ansible 2.9.

[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
using `result|version_compare` use `result is version_compare`. This
feature will be removed in version 2.9. Deprecation warnings can be
disabled by setting deprecation_warnings=False in ansible.cfg.

Rename 'version_compare' to the function 'version'.

version_compose was renamed to version since ansible 2.5

Signed-off-by: abaird-rh <abaird@redhat.com>
2020-04-20 15:29:29 +02:00
Guillaume Abrioux 378405e328 mds: fix --limit run against mds nodes
This commit fixes --limit runs against mds nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-14 10:42:43 -04:00
Guillaume Abrioux ea2b654d95 nfs: create empty rados index object for nfs standalone
This commit creates an empty rados index object even when deploying
standalone nfs-ganesha.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-14 10:40:37 -04:00
Dimitri Savineau 5de74fe512 ceph-validate: update RHEL requirement for RHCS
We were not testing the right ansible_distribution fact value for RHEL
distribution.
This commit also updates the minial RHEL version supported by RHCS.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-09 20:43:22 +02:00
Guillaume Abrioux 4bcc52cb2a osd: fix monitor_name error when scaling out OSDs
This commit fixes a bug when trying to scale out osd nodes with
`crush_rule_config` is enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-09 13:46:40 -04:00
Paulo Matias 38ce02c2ea Allow user to specify grafana_server_fqdn
This is needed to get a TLS certificate to validate correctly.

If unspecified, auto-detected grafana_server_addr is used.

Signed-off-by: Paulo Matias <matias@ufscar.br>
2020-04-07 20:51:23 +02:00
Paulo Matias dac8e1d0a9 Prometheus APIs are only available through plain http
Trying to access these APIs through TLS produces "Could not reach
external API" errors in Ceph dashboard.

Signed-off-by: Paulo Matias <matias@ufscar.br>
2020-04-07 20:51:23 +02:00
Matthew Vernon 7963a76c7a Use a tempfile directory to store restart scripts
Make a tempfile directory and copy the restart scripts there (and then
execute them from there), rather than using insecure known filenames
in /tmp/

This is a partial fix for ceph/ceph-ansible#2937

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
2020-04-06 22:55:51 +02:00
Dimitri Savineau 6617d90733 ceph-mgr: add saml python lib for dashboard SSO
The dashboard SSO mgr module requires the saml python library to be
installed. This is only a valid scenario for RHCS deployment because
the saml python library isn't available in other classic repositories.
This package is present in RHCS Tools repository so we also need to
enable it on the mgr nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1820233

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-06 10:11:00 -04:00
Guillaume Abrioux 1bb9860dfd osd: use default crush rule name when needed
When `rule_name` isn't set in `crush_rules` the osd pool creation will
fail.
This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with
the default crush rule name.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-31 14:49:38 -04:00
Guillaume Abrioux 8c1c34b201 tests: add more coverage in external_clients scenario
Run create_users_keys.yml in external_clients scenario

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-31 14:49:38 -04:00
Guillaume Abrioux 5b0476385c osd: support changing default rule even when osd_crush_location isn't defined
Creating crush rules even with no crush hierarchy configuration is a
valid scenario so we shouldn't be bound to the first task result (which
configure crush hierarchy) to be able to add new crush rules.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-31 14:26:48 -04:00
Dimitri Savineau 64701437de container: remove ulimit nofile parameter
Since Ceph Octopus is python3 only we don't need to specify the max open
files anymore with the container engine.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-30 09:54:23 +02:00
Dimitri Savineau 4ac99223b2 rhcs: drop debian support
Support for debian with RHCS has been dropped starting RHCS 4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-27 04:36:36 +01:00
Dimitri Savineau 90ad110861 rhcs: update release to 5 for octopus
RHCS 5 will be based on Ceph Octopus release and only supported on
RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-26 22:00:08 +01:00
Guillaume Abrioux e551b5ba1a defaults: remove legacy comment
This is no longer true, let's remove this comment given that this option
is not ignored in containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-26 09:19:14 -04:00
Guillaume Abrioux b7ada14cf5 defaults: change nfs_ganesha_stable_branch
In master, even though we are using dev repo, the value here should be closer
from the last stable released.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-25 22:30:15 +01:00
Dimitri Savineau 706de944cf ceph-defaults: update ceph_stable_redhat_distro
Since octopus the ceph_stable_redhat_distro variable should be set to
el8 instead of el7.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-25 21:00:24 +01:00
Dimitri Savineau 0487d21938 ceph-facts: fix rgw_instances_all fact
The rgw_instances_all fact is supposed to be the list of all radosgw
instances from all rgw nodes.
But the fact is always using the local rgw_instances variable so this
won't work on multiple nodes.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-25 08:02:13 +01:00
Guillaume Abrioux 83fdf24caf doc/tests: bump to ansible 2.9 on master
Add testing against ansible 2.9 on master branch.
This commit also updates the documentation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-25 08:01:27 +01:00
Guillaume Abrioux 1b0b7af119 osd: add a default value for 'default' in crush_rules
Let's default to `False` for the `default` attribute in `crush_rules`
variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1797774

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-24 08:41:43 -04:00
Dimitri Savineau df8f853c85 Add pacific release
Add the 16th ceph release: pacific.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-24 09:47:12 +01:00
Guillaume Abrioux 1a7f3caecb facts: fix typo
This commit fixes a typo in some task titles

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-23 14:03:52 -04:00
Guillaume Abrioux cc28d9ec26 nfs: fix nfs with external ceph cluster support
This commit refact and fix the nfs deployment with external ceph cluster
support.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814942

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-19 18:21:16 -04:00
Dimitri Savineau fb69f6990c dashboard: allow to set read-only admin user
This commit allows one to set the role for the admin user as read-only.
This can be controlled via the dashboard_admin_user_ro variable but the
default value is false for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-19 15:34:41 +01:00
Dimitri Savineau 5051e67f8f ceph-defaults: add registry name on dashboard vars
We don't use the registry name when using the community dashboard
container images (grafana, prometheus, alertmanager & node exporter).
This commit adds the docker.io registry explicitly in the default
dashboard container image name values.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-19 14:27:50 +01:00
Dimitri Savineau b97a4d5201 ceph-defaults: update grafana container tag
Since 8e8aa73 we're using grafana 5.4.3 in RHCS 4.1 via [1].
We should also update the grafana container tag from docker.io when
using the community release.

[1] registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8:4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-17 14:35:06 +01:00
petruha 73b3fadb0e ceph-facts: Fix system_secret_key variable handling
This commit fixes the system_secret_key variable not substitued by the
right value and always using the 'system_secret_key' string instead.

$ egrep 'system_(access|secret)_key' group_vars/all.yml
system_access_key: foofoofoofoofoofoofo
system_secret_key: barbarbarbarbarbarbarbarbarbarbarbarbarb

$ ansible-playbook -vv -i hosts site.yml.sample -e rgw_multisite=true
(...)
  - hostname: storage0
    endpoint: http://192.168.100.42:8080
    instance_name: rgw0
    radosgw_address: 192.168.50.3
    radosgw_frontend_port: 8085
    rgw_realm: canada
    rgw_zone: montreal
    rgw_zone_user: justin.trudeau
    rgw_zone_user_display_name: Justin Trudeau
    rgw_zonegroup: quebec
    system_access_key: foofoofoofoofoofoofo
    system_secret_key: system_secret_key

Fixes https://github.com/ceph/ceph-ansible/issues/5150

Signed-off-by: petruha <5363545+p37ruh4@users.noreply.github.com>
2020-03-16 17:38:52 -04:00
Guillaume Abrioux 152c2caa9f config: remove legacy option in ceph.conf.j2
This option has been deprecated (As of 0.51).
By the way, ceph-ansible already sets the
auth_{service,client,cluster}_required variables.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623586

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-16 09:49:36 -04:00
Dimitri Savineau 3626c688cf handler: add rgw multi-instances support
This commit adds the rgw multi-instances support in ceph-handler
(restart_rgw_daemons.sh.j2)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-12 16:44:48 -04:00