Commit Graph

630 Commits (wip-pr-6207-docs-update)

Author SHA1 Message Date
Dimitri Savineau aa6e1f20ea Revert "config: Always use osd_memory_target if set"
This reverts commit 4d1fdd2b05.

This breaks the backward compatibility with previous osd_memory_target
calculation and we could have a value lower than the minimum value allowed
(896M) which causes some ceph commands to fail (like ceph assimilate-conf).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-12-12 06:56:32 +01:00
Dimitri Savineau 5a41026347 monitoring: use config_template module for config
The alertmanager, grafana and prometheus configuration file are
generated with the template module which doesn't allow for using
config overrides.
Instead we could use the config_template plugin action and add a
new variable for overrides (one for each component).

With this patch, one should be able to add configuration to
prometheus with the following:

---
alertmanager_conf_overrides:
  global:
    smtp_smarthost: 'localhost:25'
...

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1902999

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-12-12 06:55:27 +01:00
Dimitri Savineau 0078230f1d rhcs: default to containerized deployment
Starting RHCS 5, only containerized deployment is available.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-11-25 10:46:16 +01:00
Gaudenz Steinlin 4d1fdd2b05 config: Always use osd_memory_target if set
The osd_memory_target variable was only used if it was higher than the
calculated value based on the number of OSDs. This is changed to always
use the value if it is set in the configuration. This allows this value
to be intentionally set lower so that it does not have to be changed
when more OSDs are added later.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
2020-11-13 09:13:58 +01:00
Guillaume Abrioux 5cadfea42e dashboard: change dashboard_grafana_api_no_ssl_verify default value
This sets the `dashboard_grafana_api_no_ssl_verify` default value
according to the length of `dashboard_crt` and `dashboard_key`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-11-04 10:00:48 +01:00
Guillaume Abrioux 767d3c898e dashboard: enable https by default
see linked bz for details

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1889426

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-11-04 10:00:48 +01:00
Guillaume Abrioux 1cc9666c09 common: drop `fetch_directory` feature
This commit drops the `fetch_directory` feature.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-10-21 13:22:16 +02:00
Guillaume Abrioux 900c0f4492 ceph-config: ceph.conf rendering refactor
This commit cleans up the `main.yml` task file of `ceph-config`.
It drops the local ceph.conf generation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-10-21 13:22:16 +02:00
Guillaume Abrioux c101cb3931 defaults: change defaults value
this commit changes defaults value in default pool definitions.

there's no need to define `pg_num`, `pgp_num`, `size` and `min_size`,
`ceph_pool` module will use the current default if needed.

This also drops the 3 following `set_fact` in `ceph-facts`:

- osd_pool_default_pg_num,
- osd_pool_default_pgp_num,
- osd_pool_default_size_num

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-10-02 07:42:40 +02:00
Guillaume Abrioux eefe11d90c defaults: change default grafana-server name
This change default value of grafana-server group name.
Adding some tasks in ceph-defaults in order to keep backward
compatibility.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-09-29 07:42:26 +02:00
Dimitri Savineau e11453c6f5 Remove unused centos docker tasks
The `enable extras on centos` task just doesn't work when using the
variable ceph_docker_enable_centos_extra_repo to true.

fatal: [xxx]; FAILED! => {"changed": false, "msg": "Parameter
'baseurl', 'metalink' or 'mirrorlist' is required."}

The CentOS extras repository is enabled by default so it's pretty
safe to remove this task and the associated variable.

This also removes the ceph_docker_on_openstack variable as it's a
leftover and it is unused.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-29 07:35:10 +02:00
Dimitri Savineau bda3581294 container: add optional http(s) proxy option
When using a http(s) proxy with either docker or podman we can rely on
the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
But with ansible, even if those variables are defined in a source file
then they aren't loaded during the container pull/login tasks.
This implements the http(s) proxy support with docker/podman.
Both implementations are different:
  1/ docker doesn't rely en the environment variables with the CLI.
Thos are needed by the docker daemon via systemd.
  2/ podman uses the environment variables so we need to add them to
the login/pull tasks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-16 06:52:26 +02:00
Rafał Wądołowski 55cd6e83e4 Comment out ceph_custom_key
Since there is a check if ceph_custom_key is defined, there is no reason
to define it by default.

Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
2020-08-20 13:36:24 +02:00
Dimitri Savineau cb8f0237e1 ceph-rgw: allow specifying crush rule on pool
We already support specifiying a custom crush rule during pool creation
in ceph-osd role but not in ceph-rgw role.
This patch adds the missing code to implement this feature.
Note this is only available for replicated pool not erasure. The rule
must also exist prior the pool creation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-08-17 22:59:06 +02:00
Guillaume Abrioux 448cc280b7 common: don't enable debug log on ceph-volume calls by default
ceph-volume can generate large logs at some point.

debug logs by definition should be enabled only when debugging.

Let's make it customizable with a variable which is set to `False` by
default.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-11 15:03:20 +02:00
Dimitri Savineau 0d0f1e71df dashboard: allow remote TLS cert/key copy
When using TLS on the ceph dashboard or grafana services, we can provide
the TLS certificate and key.
Those files should be present on the ansible controller and they will be
copyied to the right node(s).
In some situation, the TLS certificate and key could be already present
on the target node and not on the ansible controller.
For this scenario, we just need to copy the files locally (on each remote
host).

This patch adds the dashboard_tls_external variable (with default to
false) to allow users to achieve this scenario when configuring this
variable to true.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860815

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-08-03 13:39:47 +02:00
Guillaume Abrioux d490968fc8 defaults: remove legacy
These variables aren't consummed anywhere else than in ceph-nfs role so
there is no need to have them in `ceph-defaults`'s defaults

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-21 09:39:15 +02:00
Guillaume Abrioux 86edae724f rgw: set container memory limit to 4g
This commit changes the container memory limit for rgw daemons.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-09 15:31:10 +02:00
Guillaume Abrioux bcc673f66c facts: refact `ceph_uid` fact
There's no need to set this fact with a `set_fact`
We can achieve this in `ceph-defaults`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-09 13:37:29 +02:00
Dimitri Savineau 93754bd70c ceph-defaults: update nfs-ganesha to 3.3
nfs-ganesha 3.3 is the latest 3.x release available for octopus so we
should update to this version.

https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus

This will also match the version used in RHCS 5.

Ceph container already uses that version too.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-03 06:36:49 +02:00
Dimitri Savineau 829990e60d ceph-osd: remove ceph-osd-run.sh script
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-18 17:51:13 +02:00
Dimitri Savineau b20519efd0 dashboard: allow disabling grafana api ssl verify
When using an untrusted TLS certificate (like self-signed) on grafana
then the grafana dashboards update subcommand will fail.
One solution could be to trust the TLS certificate.
The other one is to disable the TLS verification on the grafana API.

Closes: #5324

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-13 11:56:57 +02:00
Dimitri Savineau 2547ab601a Readd CentOS 7 with conditions
The CentOS 7 distribution could still be used be deploying ceph if
  - it's a containerized deployment
  - it's a non containerized deployment without the dashboard (due to
missing python3 libraries).

The ceph_stable_redhat_distro variable has been remove because we can
rely on the ansible_distribution_major_version fact instead.

The copr el8 repository configuration is only applied for CentOS 8.

The ceph-mgr-dashboard package is only installed when the
dashboard_enabled variable is set to true.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-23 13:31:11 +02:00
Guillaume Abrioux 86dc6f8206 mds: don't enable application pool on cephfs pools
this commit removes the task which enable application on cephfs pools.

See: https://tracker.ceph.com/issues/43761

Fixes: #5278

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-23 13:23:10 +02:00
Paulo Matias 38ce02c2ea Allow user to specify grafana_server_fqdn
This is needed to get a TLS certificate to validate correctly.

If unspecified, auto-detected grafana_server_addr is used.

Signed-off-by: Paulo Matias <matias@ufscar.br>
2020-04-07 20:51:23 +02:00
Dimitri Savineau 4ac99223b2 rhcs: drop debian support
Support for debian with RHCS has been dropped starting RHCS 4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-27 04:36:36 +01:00
Dimitri Savineau 90ad110861 rhcs: update release to 5 for octopus
RHCS 5 will be based on Ceph Octopus release and only supported on
RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-26 22:00:08 +01:00
Guillaume Abrioux e551b5ba1a defaults: remove legacy comment
This is no longer true, let's remove this comment given that this option
is not ignored in containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-26 09:19:14 -04:00
Guillaume Abrioux b7ada14cf5 defaults: change nfs_ganesha_stable_branch
In master, even though we are using dev repo, the value here should be closer
from the last stable released.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-25 22:30:15 +01:00
Dimitri Savineau 706de944cf ceph-defaults: update ceph_stable_redhat_distro
Since octopus the ceph_stable_redhat_distro variable should be set to
el8 instead of el7.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-25 21:00:24 +01:00
Dimitri Savineau df8f853c85 Add pacific release
Add the 16th ceph release: pacific.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-24 09:47:12 +01:00
Guillaume Abrioux cc28d9ec26 nfs: fix nfs with external ceph cluster support
This commit refact and fix the nfs deployment with external ceph cluster
support.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814942

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-19 18:21:16 -04:00
Dimitri Savineau fb69f6990c dashboard: allow to set read-only admin user
This commit allows one to set the role for the admin user as read-only.
This can be controlled via the dashboard_admin_user_ro variable but the
default value is false for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-19 15:34:41 +01:00
Dimitri Savineau 5051e67f8f ceph-defaults: add registry name on dashboard vars
We don't use the registry name when using the community dashboard
container images (grafana, prometheus, alertmanager & node exporter).
This commit adds the docker.io registry explicitly in the default
dashboard container image name values.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-19 14:27:50 +01:00
Dimitri Savineau b97a4d5201 ceph-defaults: update grafana container tag
Since 8e8aa73 we're using grafana 5.4.3 in RHCS 4.1 via [1].
We should also update the grafana container tag from docker.io when
using the community release.

[1] registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8:4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-17 14:35:06 +01:00
Boris Ranto 8e8aa735e0 rhcs_edits: Update grafana version
We are planning to release updated grafana image for ceph dashboard in
RHCS 4.1. We need to update the rhcs edut to point to the new image
then.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786107

Signed-off-by: Boris Ranto <branto@redhat.com>
2020-03-16 21:37:44 +01:00
Christian Berendt 608d7188a1 openstack_keys: use openstack_cinder_pool.name
Instead of volumes as a static string the openstack_cinder_pool.name
variable should be used as with the other keys.

Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
2020-03-09 08:17:22 -04:00
Ali Maredia 71f55bd54d rgw multisite: enable more than 1 realm per cluster
Make it so that more than one realm, zonegroup,
or zone can be created during a run of the rgw
multisite ansible playbooks.

The rgw hosts now need to be grouped into zones
and realms in the inventory.

.yml files need to be created in group_vars
for the realms and zones. Sample yaml files
are available.

Also remove multsite destroy playbook
and add --cluster before radosgw-admin commands

remove manually added rgw_zone_endpoints var
and have ceph-ansible automatically add the
correct endpoints of all the rgws in a rgw_zone
from the information provided in that rgws hostvars.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2020-03-04 12:58:13 -05:00
Guillaume Abrioux 47adc2bb08 osd: add pg autoscaler support
This commit adds the pg autoscaler support.

The structure for pool definition has now two additional attributes
`pg_autoscale_mode` and `target_size_ratio`, eg:

```
test:
  name: "test"
  pg_num: "{{ osd_pool_default_pg_num }}"
  pgp_num: "{{ osd_pool_default_pg_num }}"
  rule_name: "replicated_rule"
  application: "rbd"
  type: 1
  erasure_profile: ""
  expected_num_objects: ""
  size: "{{ osd_pool_default_size }}"
  min_size: "{{ osd_pool_default_min_size }}"
  pg_autoscale_mode: False
  target_size_ratio": 0.1
```

when `pg_autoscale_mode` is `True` user has to set a decent value in
`target_size_ratio`.

Given that it's a new feature, it's still disabled by default.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1782253

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Francesco Pantano 15ed9eebf1 Configure ceph dashboard backend and dashboard_frontend_vip
This change introduces a new set of tasks to configure the
ceph dashboard backend and listen just on the mgr related
subnet (and not on '*'). For the same reason the proper
server address is added in both prometheus and alertmanger
systemd units.
This patch also adds the "dashboard_frontend_vip" parameter
to make sure we're able to support the HA model when multiple
grafana instances are deployed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792230
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
2020-02-19 17:52:53 -05:00
Sam Choraria 2a2656a985 ceph-rgw: allow SSL certificate content to supplied
Allow SSL certificate & key contents to be written to the path
specified by radosgw_frontend_ssl_certificate. This permits a
certificate to be deployed & renewal of expired certificates
through ceph-ansible.

Signed-off-by: Sam Choraria <sam.choraria@bbc.co.uk>
2020-02-17 16:22:11 +01:00
Dimitri Savineau c644ea9041 ceph-defaults: remove bootstrap_dirs_xxx vars
Both bootstrap_dirs_owner and bootstrap_dirs_group variables aren't
used anymore in the code.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 16:17:40 +01:00
Ali Maredia 1834c1e48d rgw: extend automatic rgw pool creation capability
Add support for erasure code pools.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148

Signed-off-by: Ali Maredia <amaredia@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 16:07:43 +01:00
Dimitri Savineau 1fc6b33714 ceph-{mon,osd}: move default crush variables
Since ed36a11 we move the crush rules creation code from the ceph-mon to
the ceph-osd role.
To keep the backward compatibility we kept the possibility to set the
crush variables on the mons side but we didn't move the default values.
As a result, when using crush_rule_config set to true and wanted to use
the default values for crush_rules then the crush rule ansible task
creation will fail.

"msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute
'crush_rules'"

This patch move the default crush variables from ceph-mon to ceph-osd
role but also use those default values when nothing is defined on the
mons side.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798864

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 10:50:53 +01:00
Dimitri Savineau b9d975385c ceph-prometheus: add alertmanager HA config
When using multiple alertmanager nodes (via the grafana-server group)
then we need to specify the other peers in the configuration.

https://prometheus.io/docs/alerting/alertmanager/#high-availability
https://github.com/prometheus/alertmanager#high-availability

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792225

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 10:46:21 +01:00
Guillaume Abrioux 3700aa5385 switch_to_containers: increase health check values
This commit increases the default values for the following variable
consumed in switch-from-non-containerized-to-containerized-ceph-daemons.yml
playbook.
This also moves these variables in `ceph-defaults` role so the user can
set different values if needed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783223

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-02-07 14:59:14 -05:00
Dimitri Savineau cba0c8c063 Revert "rhcs: update container image name"
This wasn't necesarry. The container image was fixed on the
RedHat's registry

This reverts commit 3bd250c742.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-05 15:22:56 +01:00
Dimitri Savineau 3bd250c742 rhcs: update container image name
The RHCS 4 container image is rhceph/rhceph-4

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1797743

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-04 16:10:05 -05:00
Dimitri Savineau 9b40a959b9 ceph-common: rhcs 4 repositories for rhel 7
RHCS 4 is available for both RHEL 7 and 8 so we should also enable the
cdn repositories for that distribution.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796853

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-31 09:33:51 -05:00
Dimitri Savineau 2f07b85131 ceph-defaults: remove rgw from ceph_conf_overrides
The [rgw] section in the ceph.conf file or via the ceph_conf_overrides
variable doesn't exist and has no effect.
To apply overrides to all radosgw instances we should use either the
[global] or [client] sections.
Overrides per radosgw instance should still use the
[client.rgw.{instance-name}] section.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794552

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-29 14:11:14 +01:00