Commit Graph

5123 Commits (86d59792696cba9aee848ab5f86f23f64f07717f)
 

Author SHA1 Message Date
Guillaume Abrioux 86d5979269 osd: add a default value for 'default' in crush_rules
Let's default to `False` for the `default` attribute in `crush_rules`
variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1797774

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1b0b7af119)
2020-06-03 13:20:40 -04:00
Dimitri Savineau a97e24fee9 docker2podman: manage dashboard nodes
The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus)
were not manage during the docker to podman migration.

This adds the systemd container template of those services to a dedicated
file (systemd.yml) in order to include it in the docker2podman playbook.

This also adds the dashboard container images pull from docker to podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 252e78b4e4)
2020-06-03 13:20:24 -04:00
Dimitri Savineau 6f893e5ed9 docker2podman: pull images from docker daemon
The docker2podman playbook only installs the podman package and updates
the systemd units with the right container_binary value.

We never pull the container image so if one service is restarted then
the container image will be pulled first before the service can start
which could cause longer downstream.

To avoid to download the container image from internet again we can just
pull it from the local docker daemon.

The container_{binding,package,service}_name variables are removed
because they are only used in the ceph-container-engine role which
isn't call in this playbook.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d38f21aeba)
2020-06-03 13:20:24 -04:00
Dimitri Savineau 8c4865cd14 rolling_update: fix rbdmirror group name
The rbdmirror group name was using the wrong variable definition.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c0a213f928)
2020-06-03 13:20:03 -04:00
Dimitri Savineau ffd28abb45 ceph-nfs: bind mount ganesha log directory
The current ganesha log directory is only present in the container
and not bind mount on the host.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 222fe4abd8)
2020-06-03 13:19:47 -04:00
Dimitri Savineau 1921ace52d docker-to-podman: conditional docker commands
The docker commands should be based on the container_binary variable
otherwise running the playbook on a host without docker (like podman
only) will failed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829985

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-03 13:19:28 -04:00
Benoît Knecht 8ae4bbd8ca ceph-validate: Expand templates in rgw_create_pools
Same fix as `ceph-rgw` for `rgw_create_pools` pool names that contain Jinja
templates.

See #5348 for details.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 444b46ea24)
2020-06-03 13:18:43 -04:00
Benoît Knecht e454b34b92 ceph-rgw: Make sure pool name templates are expanded
It is common to set templated pool names in `rgw_create_pools`, e.g.

```yaml
rgw_create_pools:
  "{{ rgw_zone }}.rgw.buckets.index":
    pg_num: 16
    size: 3
    type: replicated
```

This worked fine with Ansible 2.8, but broke in Ansible 2.9 due to a change in
the way `with_dict` works [1].

This commit replaces the use of `with_dict` with

```yaml
loop: "{{ rgw_create_pools | dict2items }}"
```

which works as intended and expands the template in the pool name.

[1]: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.9.html#loops

Closes #5348

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit d2b7670c7d)
2020-06-03 13:18:43 -04:00
Dimitri Savineau 8c4190e243 ceph-facts: fix IPv6 _radosgw_address interface
When using radosgw_interface and IPv6 setup then the _radosgw_address
fact doesn't use square brackets compared to the radosgw_address and
radosgw_address_block configuration.

Closes: #5325

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ed4f23d530)
2020-06-03 13:18:33 -04:00
fmount 7b5dba4488 Refresh ceph dashboard user role
This change allows the operator to refresh the
ceph dashboard admin role on multiple ceph-ansible
executions.
In the current state the role is set only when the
user is created, and there's no way to change it if
the user exists.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826002
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 5eb363e033)
2020-06-03 13:18:18 -04:00
Guillaume Abrioux c2335b597f mds: don't enable application pool on cephfs pools
this commit removes the task which enable application on cephfs pools.

See: https://tracker.ceph.com/issues/43761

Fixes: #5278

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86dc6f8206)
2020-06-03 13:17:48 -04:00
Guillaume Abrioux 66bdd585da test: set sitepackages=false in tox
Otherwise it might try to use the system installed version of ansible
when there's one available.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9acb5e6d)
2020-05-14 11:35:08 -04:00
Ali Maredia 60124f5bde docs: minor fixes to README-MULTISITE.md
Make all of the hosts start at 1 and not 0,
also make some minor changes in scenario 3 to
remova an inconsistency.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit bd1440f2cd)
2020-05-08 12:14:33 -04:00
Dimitri Savineau 487dcdc3f0 ceph-rgw: use match instead of equalto from jinja2
The '==' jinja2 operator (or 'equalto') has been introduced in jinja2
2.8.
On EL7, jinja2 version is 2.7 so the operator isn't present creating
templating error like:

The error was: TemplateRuntimeError: no test named '=='

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1747206

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 34e6e8e06c)
2020-05-06 15:45:44 -04:00
Dimitri Savineau 3e421ad6e6 ceph-nfs: fix internal ganesha deployment
Since ea2b654d9 we're not running the rados command from the monitor
nodes but from the ganesha node. Unfortunately we don't have the
required keyring on that node to run the rados command as we don't
import the right keyring.
This commit restores the workflow for internal ganesha deployment like
before ea2b654d9 but keeps the rados commands from the ganesha node for
external deployment until we have a better design.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8a890306ad)
2020-05-06 13:30:21 -04:00
Dimitri Savineau da04dfdddf ceph-nfs: fix keyring copy for external ganesha
Fix the condition on the keyring copy task that prevent the ganesha
keyring to be created in the /var/lib/ceph directory.
Also ensure that the directory exists first.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 748ac4b928)
2020-05-06 13:30:21 -04:00
Guillaume Abrioux 19d0db1c7b nfs: fix 2 typo
The condition is missing an index here which makes the playbook failing.

Typical error:
```
The conditional check 'not item.get('skipped', False)' failed. The error was: error while evaluating conditional (not item.get('skipped', False)): 'list object' has no attribute 'get'",
```

Also, adds the missing '/keyring' on the `exec_cmd_nfs` fact.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831342

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cf460274c7)
2020-05-06 13:30:21 -04:00
Dimitri Savineau b77e2b64ce ceph-dashboard: fix mgr dashboard IPv6 fact
15ed9ee introduced a regression for the mgr dashboard daemon using
IPv6 since the mgr dashboard configuration doesn't support brackets.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1827299

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f1728929cd)
2020-04-23 16:23:30 -04:00
Ali Maredia 2e009dcd59 docs: fix multisite docs add endpoints var in rgw_instances section
+ Mention of this variable was missing in the original version.

+ Minor revisions around the concept of secondary zone.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 2b32604577)
2020-04-23 15:27:07 -04:00
Ali Maredia a515937d78 docs: Update and consolidate rgw multisite documentation
Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit afa78bd0c0)
2020-04-23 15:18:34 -04:00
ianwatsonrh ba4144544f typo: updating type check on rc
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826884
Signed-off-by: ianwatsonrh <ianwatson@redhat.com>
(cherry picked from commit ccf6a7f153)
2020-04-23 17:04:17 +02:00
Guillaume Abrioux 8b26cd16f4 doc: update release note
This commit mentions:
- the Debian support removal from RHCS deployment.
- site-docker.yml.sample and purge-docker-cluster.yml symlinks removal.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-23 13:27:24 +02:00
Dimitri Savineau d117d08a6f ceph-container-engine: add CentOS 8 support
This adds CentOS 8 support for containerized deployment allowing podman
installation as the default container engine for this distribution.

Closes: #5130

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-23 13:26:29 +02:00
Guillaume Abrioux 7e68a2a5d8 doc: add day-2 operations documentation
This commit is the first of a serie in order to describe all day-2 operations
that are possible via ceph-ansible using a set of playbook provided in
`infrastructure-playbooks` directory.

Fixes: #5061

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e800303e9)
2020-04-21 10:19:24 -04:00
Dimitri Savineau 46c640c169 filestore-to-bluestore: fix py2 on skipped tasks
When using skipped variables with from_json filter and python2 then we
need to have a default value otherwise the skipped task will fail.

Unexpected templating type error occurred on
({{ (ceph_volume_lvm_list.stdout | from_json) }}): expected string or
buffer

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790472

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2b9edba131)
2020-04-20 13:38:19 -04:00
abaird-rh 6878aab0f9 Updated use of deprecated filter
This was removed in Ansible 2.9.

[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
using `result|version_compare` use `result is version_compare`. This
feature will be removed in version 2.9. Deprecation warnings can be
disabled by setting deprecation_warnings=False in ansible.cfg.

Rename 'version_compare' to the function 'version'.

version_compose was renamed to version since ansible 2.5

Signed-off-by: abaird-rh <abaird@redhat.com>
(cherry picked from commit eb71244bfd)
2020-04-20 13:37:42 -04:00
Rishabh Dave e4c24f3407 library/ceph_volume: look for error messages in stderr
Error message were moved to from stdout in stderr here -
b8d6dcbe9f (diff-20f7c578a4e69ec61a5869d706567a24R137).

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1793542
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 4249d1e02d)
2020-04-20 13:36:15 -04:00
Guillaume Abrioux 0ace5f5f2c mds: fix --limit run against mds nodes
This commit fixes --limit runs against mds nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 378405e328)
2020-04-14 13:42:45 -04:00
Guillaume Abrioux 945266776b nfs: create empty rados index object for nfs standalone
This commit creates an empty rados index object even when deploying
standalone nfs-ganesha.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ea2b654d95)
2020-04-14 12:03:38 -04:00
Dimitri Savineau 1033ad191b ceph-validate: update RHEL requirement for RHCS
We were not testing the right ansible_distribution fact value for RHEL
distribution.
This commit also updates the minial RHEL version supported by RHCS.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5de74fe512)
2020-04-14 11:27:01 -04:00
Guillaume Abrioux 1b79d73729 osd: fix monitor_name error when scaling out OSDs
This commit fixes a bug when trying to scale out osd nodes with
`crush_rule_config` is enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4bcc52cb2a)
2020-04-10 13:44:15 +02:00
Dimitri Savineau e34c95d28f tests: update mgr dashboard socket listening test
Since 15ed9ee the ceph-mgr daemon binds on the IP address on the public
network instead of binding on all addresses.
This commit updates the testinfra code to reflect that change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0f0a14772c)
2020-04-07 15:25:45 +02:00
Dimitri Savineau 47be2a2719 tests: register mark in pytest configuration
Unregister marks generates warnings like:

PytestUnknownMarkWarning: Unknown pytest.mark.docker - is this a typo?
You can register custom marks to avoid this warning

https://docs.pytest.org/en/latest/mark.html

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ac4f8763aa)
2020-04-07 15:25:45 +02:00
Dimitri Savineau 1cfb84ae94 tests: add dashboard testinfra configuration
This commit adds basic tests for grafana, prometheus, node-exporter and
ceph mgr dashboard services.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2c6281207)
2020-04-07 15:25:45 +02:00
Dimitri Savineau 6a2272b9c0 ceph-mgr: add saml python lib for dashboard SSO
The dashboard SSO mgr module requires the saml python library to be
installed. This is only a valid scenario for RHCS deployment because
the saml python library isn't available in other classic repositories.
This package is present in RHCS Tools repository so we also need to
enable it on the mgr nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1820233

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6617d90733)
2020-04-06 11:00:01 -04:00
Guillaume Abrioux 4bc7bb2766 ceph_key: fetch key when needed
Fetch the key when it is present in the cluster but not on the node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccfa249919)
2020-04-03 15:14:54 -04:00
Guillaume Abrioux cbed1eb17a ceph_key: fix idempotency when no secret is passed
553584cbd0 introduced a regression when no
secret is passed, it overwrites the secret each time the task is run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 003defec03)
2020-04-03 11:04:30 -04:00
Dimitri Savineau a472064cb8 tox: replace testinfra by pytest for add-mgrs
The add-mgrs scenario is still using the testinfra command instead of
pytest so the tests exectution are failling.

ERROR: InvocationError for command could not find executable testinfra

This also adds the missing --ssh-config option to testinfra.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 92f538f1af)
2020-04-03 10:43:21 -04:00
Guillaume Abrioux ba6bd3ca3d docker2podman: call `container_options_facts.yml` on osd nodes
We must call `ceph-osd` role from `container_options_facts.yml` because
ceph-osd-run.sh.j2 needs variables set in this file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1819681

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4a4f54f6ee)
2020-04-02 11:01:14 -04:00
Guillaume Abrioux 825aed5ec1 ceph_key: remove 'update' state
With this change, the state `present` is enough to update a keyring.
If the keyring already exist, it will be updated if caps or secret
passed to the module are different.
If the keyring doen't exist, it will be created.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808367

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 553584cbd0)
2020-04-01 18:08:51 -04:00
Guillaume Abrioux 7acd9686ab osd: use default crush rule name when needed
When `rule_name` isn't set in `crush_rules` the osd pool creation will
fail.
This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with
the default crush rule name.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bb9860dfd)
2020-03-31 19:42:40 -04:00
Guillaume Abrioux 03355aec8c tests: add more coverage in external_clients scenario
Run create_users_keys.yml in external_clients scenario

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8c1c34b201)
2020-03-31 19:42:40 -04:00
Guillaume Abrioux 87e1f0cc6c osd: support changing default rule even when osd_crush_location isn't defined
Creating crush rules even with no crush hierarchy configuration is a
valid scenario so we shouldn't be bound to the first task result (which
configure crush hierarchy) to be able to add new crush rules.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5b0476385c)
2020-03-31 23:14:55 +02:00
Guillaume Abrioux 32f879de32 purge-container: get *all* osds id
Adding `--all` to the `systemctl list-units` command in order to get
*all* osds id on the node (including stoppped osds). Otherwise, it will
purge the cluster but there will be leftover after that.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814542

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5e7962ccf6)
2020-03-31 11:00:41 -04:00
Javier Pena 5860335ebe Fixes for Makefile
- Set default mock configuration to epel-8-x86_64, to match the
  default dist value.
- Add support for alpha tags, like the recently added v5.0.0alpha1

Signed-off-by: Javier Pena <jpena@redhat.com>
(cherry picked from commit 19a43ff261)
2020-03-30 22:31:04 +02:00
Javier Pena eb19fe3ece Allow setting dist and mock configuration in Makefile
Curently, the dist and mock configurations are hardcoded in the
Makefile to be el8 and epel-7-x86_64, respectively. This commit
allows the user to override those settings using the DIST and
MOCK_CONFIG environment variables, falling back to the current
defaults if not set.

This provides additional flexibility when building the RPM directly
from the repository.

Signed-off-by: Javier Peña <jpena@redhat.com>
(cherry picked from commit ed8341568b)
2020-03-30 22:31:04 +02:00
Dimitri Savineau dcd02e6494 ceph_volume: fix multiple db/wal devices
When using the lvm batch ceph-volume subcommand with dedicated devices
for bluestore (db/wal) then the list of devices is convert to a string
instead of being extended via an iterable.
This was working with only one dedicated device but starting with more
then the ceph_volume module fails.

TASK [ceph-osd : use ceph-volume lvm batch to create bluestore osds] **
fatal: [xxxxxx]: FAILED! => changed=true
  cmd:
  - ceph-volume
  - --cluster
  - ceph
  - lvm
  - batch
  - --bluestore
  - --yes
  - --prepare
  - --osds-per-device
  - '4'
  - /dev/nvme2n1
  - /dev/nvme3n1
  - /dev/nvme4n1
  - /dev/nvme5n1
  - /dev/nvme6n1
  - --db-devices
  - /dev/nvme0n1 /dev/nvme1n1
  - --report
  - --format=json
  msg: non-zero return code
  rc: 2
  stderr: |2-
     stderr: lsblk: /dev/nvme0n1 /dev/nvme1n1: not a block device
     stderr: error: /dev/nvme0n1 /dev/nvme1n1: No such file or directory
     stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
    usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]]
                                 [--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]]
                                 [--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]]
                                 [--no-auto] [--bluestore] [--filestore]
                                 [--report] [--yes] [--format {json,pretty}]
                                 [--dmcrypt]
                                 [--crush-device-class CRUSH_DEVICE_CLASS]
                                 [--no-systemd]
                                 [--osds-per-device OSDS_PER_DEVICE]
                                 [--block-db-size BLOCK_DB_SIZE]
                                 [--block-wal-size BLOCK_WAL_SIZE]
                                 [--journal-size JOURNAL_SIZE] [--prepare]
                                 [--osd-ids [OSD_IDS [OSD_IDS ...]]]
                                 [DEVICES [DEVICES ...]]
    ceph-volume lvm batch: error: Unable to proceed with non-existing device: /dev/nvme0n1 /dev/nvme1n1

So the dedicated device list is considered as a single string.

This commit also adds the block_db_devices and wal_devices documentation
to the ceph_volume module.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816713

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 760b6cd7b0)
2020-03-30 10:04:26 -04:00
Dimitri Savineau 1318575626 ceph-defaults: update container tag for nautilus
The latest Ceph stable release is now Octopus so the "latest" container
image tag is pointing to Octopus and not Nautilus anymore.
This commit updates the ceph_docker_image_tag with "latest-nautilus".

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-30 09:55:05 +02:00
Dimitri Savineau 62042f370a rhcs: drop debian support
Support for debian with RHCS has been dropped starting RHCS 4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4ac99223b2)
2020-03-27 10:22:43 -04:00
Dimitri Savineau fcbb03b0b5 Add a release note
This adds a release not for the Ceph Nautilus release used in the
stable-4.0 branch.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-26 18:12:16 -04:00