1830 Commits (824ec6d256fc23794d69dd82f789fb05ef5c7bb6)

Author SHA1 Message Date
Michel Rode 7774935707 Added 'squash' as a parameter to nfs-ganesha.
Set the default to 'root_squash' - which is the default of nfs-ganesha.

Signed-off-by: Michel Rode <rmichel@devnu11.net>
2018-06-25 09:13:17 +02:00
Christian Zunker 48394597c9 reset failed count of ceph-mgr
Depending on your setup, ceph-mgr might get restarted multiple times.
When this is done too fast, systemd will prevent further restarts because of
configured limits in the ceph-mgr systemd unit file.

Resetting the failure count will prevent this problem. The reset is done before
the restart so in case of a real problem during the restart it still fails.

Fixes: #2768

Signed-off-by: Christian Zunker <christian.zunker@codecentric.cloud>
2018-06-20 13:59:16 +02:00
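The ordering described in the commit above (reset the failure count, then restart) can be sketched as follows. This is a minimal illustration, not the task from the role: the unit name `ceph-mgr@mgr0` and the helper names are assumptions, and the commands are recorded rather than executed.

```python
def restart_sequence(unit):
    """Return the systemctl commands in the order the commit describes."""
    return [
        ["systemctl", "reset-failed", unit],  # clear the failure counter first
        ["systemctl", "restart", unit],       # then restart; a real failure still surfaces
    ]


def run_sequence(unit, runner):
    """Run each command through `runner`; stop at the first non-zero return code."""
    for cmd in restart_sequence(unit):
        rc = runner(cmd)
        if rc != 0:
            return rc
    return 0


# Dry run: record the commands instead of calling systemctl.
recorded = []


def record(cmd):
    recorded.append(cmd)
    return 0


# "ceph-mgr@mgr0" is a hypothetical unit name used only for illustration.
exit_code = run_sequence("ceph-mgr@mgr0", record)
```

On a real host, passing `subprocess.call` as the runner would execute the same two commands in the same order.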
Sébastien Han bea4027f0c common: start firewalld if configure_firewall
Currently, if configure_firewall is set to True, we expect firewalld to be
enabled and running. Let's enforce that.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1589146
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-18 04:02:50 -04:00
Sébastien Han a9ed3579ae mon/osd: bump container memory limit
As discussed with the cores, the current limits are too low and should
be bumped to a higher value.
So now by default monitors get 3GB and OSDs get 5GB.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1591876
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-17 11:20:27 -04:00
Guillaume Abrioux 51cf3b7fa0 client: try to kill dummy container only on first client node
The 'dummy' container is created only on the first client node, so we
must try to destroy it only on that node. Otherwise this can cause a
failure like the following:
```
fatal: [192.168.24.8]: FAILED! => {"changed": false, "cmd": ["docker", "rm",
"-f", "ceph-create-keys"], "delta": "0:00:00.023692", "end": "2018-06-12
20:56:07.261278", "msg": "non-zero return code", "rc": 1, "start":
"2018-06-12 20:56:07.237586", "stderr": "Error response from daemon: No such
container: ceph-create-keys", "stderr_lines": ["Error response from daemon: No
such container: ceph-create-keys"], "stdout": "", "stdout_lines": []}

```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590746

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-13 16:10:46 +02:00
Patrick Donnelly 9ce81ae845 ceph-mds: do not enable multimds on jewel
Multiple active MDS became stable in Luminous.

Introduced-by: c8573fe0d7
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-12 10:47:34 +02:00
Sébastien Han 2e8412734a common: ability to enable/disable fw configuration
Prior to this patch if you were running on a Red Hat system,
ceph-ansible would try to configure firewalld for you without the
operator's consent.
Now you can enable or disable the fw configuration by setting
configure_firewall to either true or false.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1589146
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-11 21:51:59 +02:00
Konstantin Shalygin 3a07568496 ceph-osd: set 'openstack_keys_tmp' only when 'openstack_config' is defined.
If 'openstack_config' is false this task shouldn't be executed.

Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>
2018-06-11 13:03:55 +02:00
Vishal Kanaujia 1a610df02b Fix to run secure cluster only once in a run
The current secure cluster play runs with all the monitors. The rerun
of this task is unnecessary and can be skipped.

Fixes: #2737

Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
2018-06-11 08:37:29 +02:00
Guillaume Abrioux 090ecff94e client: keyrings aren't created when single client node
Combining `run_once: true` with `inventory_hostname ==
groups.get(client_group_name) | first` might cause a bug when the only
node being run is not the first in the group.

In a deployment with a single client node this can be an issue: the
keyring is sometimes not created because the task may be skipped
entirely.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1588093

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-08 15:05:47 +02:00
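The interaction described in the commit above can be modelled in a few lines. This is a simplified sketch, assuming that with `run_once` the `when` condition is evaluated only on the first host of the current play and its outcome applies to the whole play. The host names and the helper function are hypothetical.

```python
def task_runs(play_hosts, client_group):
    """Model `run_once: true` plus a `when` comparing against the group's first host.

    Assumption: the condition is evaluated once, on the first host of the play.
    """
    first_in_play = play_hosts[0]
    return first_in_play == client_group[0]


# Hypothetical inventory group.
clients = ["client0", "client1"]

# Full run: the first host in the play is also the first in the group.
full_run = task_runs(["client0", "client1"], clients)

# Run limited to a single node that is not the first in the group:
# the condition is false on the only evaluated host, so the task is skipped.
limited_run = task_runs(["client1"], clients)
```

In the limited run nothing ever creates the keyring, which matches the symptom reported in the commit message.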
Sébastien Han 20c8065e48 ceph-iscsi: rename group iscsi_gws
Let's try to avoid using dashes as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes;
using an underscore fixes the issue.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Sébastien Han 91bf53ee93 ceph-iscsi: support for containerized deployment
We now have the ability to deploy a containerized version of ceph-iscsi.
The result is similar to the non-containerized version, you simply have
3 containers running for the following services:

* rbd-target-api
* rbd-target-gw
* tcmu-runner

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508144
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Guillaume Abrioux 8a653cacd5 client: add a default value for keyring file
Potential error if someone doesn't pass the mode in `keys` dict for
client nodes:

```
fatal: [client2]: FAILED! => {}

MSG:

The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'mode'

The error appears to have been in '/home/guits/ceph-ansible/roles/ceph-client/tasks/create_users_keys.yml': line 117, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: get client cephx keys
  ^ here

exception type: <class 'ansible.errors.AnsibleUndefinedVariable'>
exception: 'dict object' has no attribute 'mode'

```

Adding a default value avoids the deployment failure in this case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 17:26:35 +02:00
Guillaume Abrioux 5eacc8f8d8 tests: add a dummy value for 'dev' release
Functional tests are broken when testing against 'dev' release (ceph).
Adding a dummy value here will make it possible to run ceph-ansible CI
against dev ceph release.

Typical error:

```
>       if request.node.get_marker("from_luminous") and ceph_release_num[ceph_stable_release] < ceph_release_num['luminous']:
E       KeyError: 'dev'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fd1487d93f21b609a637053f5b33cd2a4e408d00)
2018-06-07 13:59:17 +02:00
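The `KeyError` in the commit above can be reproduced in isolation. The sketch below assumes a `ceph_release_num` mapping shaped like the one the test references. The dummy value chosen for `dev` (99) and the helper function are illustrative assumptions, not the values taken from the repository.

```python
# Hypothetical release-number mapping, keyed by release name.
ceph_release_num = {
    "jewel": 10,
    "kraken": 11,
    "luminous": 12,
    "mimic": 13,
}


def is_older_than_luminous(release, mapping):
    """Return True when `release` sorts before luminous in `mapping`."""
    return mapping[release] < mapping["luminous"]


# Without an entry for 'dev', the lookup raises KeyError, as in the traceback.
try:
    is_older_than_luminous("dev", ceph_release_num)
    dev_lookup_failed = False
except KeyError:
    dev_lookup_failed = True

# A dummy value that sorts after every named release makes the comparison work.
# The value 99 is an assumption made for this sketch.
patched = dict(ceph_release_num, dev=99)
dev_is_old = is_older_than_luminous("dev", patched)
```

Picking a value larger than any named release means `dev` is treated as newer than luminous, so markers such as `from_luminous` still apply to it.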
Andrew Schoen 24ef47b0e5 ceph-common: move firewall checks after package installation
We need to do this because on dev or rhcs installs ceph_stable_release
is not mandatory and the firewall check tasks have a task that is
conditional based off the installed version of ceph. If we perform those
checks after package install then they will not fail on dev or rhcs
installs.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-06-07 13:59:17 +02:00
Guillaume Abrioux 7b156deb67 client: use dummy created container when there is no mon in inventory
The `docker_exec_cmd` fact set in the client role when there is no monitor
in the inventory is wrong: `ceph-client-{{ hostname }}` is never created, so
it will fail anyway.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 16:16:38 +08:00
Guillaume Abrioux 433ecc7cbc osd: copy openstack keys over to all mon
When configuring openstack, the created keyrings aren't copied over to
all monitor nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1588093

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 13:58:57 +08:00
Patrick Donnelly 91f9da530f change max_mds default to 1
Otherwise, with the removal of mds_allow_multimds, the default of 3 will be set
on every new FS.

Introduced by: c8573fe0d7

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1583020
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-06 12:16:42 +08:00
Vishal Kanaujia 2cdb0d1812 Syntax error fix in rgw multisite role
This commit fixes a syntax error in the `when` clause of the RGW
multisite role.

Fixes: #2704

Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
2018-06-05 16:01:07 +05:30
Guillaume Abrioux 2cf06b515f rgw: refact rgw pools creation
Refact of 8704144e31
There is no need to have duplicated tasks for this. The rgw pools
creation should be delegated to a monitor node so we don't have to care
whether the admin keyring is present on the rgw node.
By the way, only one task is needed to create the pools, we just need to
use the `docker_exec_cmd` fact already defined in `ceph-defaults` to
achieve it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1550281

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-05 15:00:20 +08:00
Ha Phan 1f3c9ce4f3 Use python instead of python2
The initial keyring is generated from ansible server locally and the snippet works well for both v2 and v3 of python.

I don't see any reason why we should explicitly invoke `python2` instead of just `python`.

In some setups, `python2` is not symlinked to `python`; while `python` and `python3` refer to v2 and v3 respectively.

Signed-off-by: Ha Phan <thanhha.work@gmail.com>
2018-06-04 14:24:10 +02:00
Sébastien Han db50aec13d ceph-common: add firewall rules for ceph-mgr
Prior to this commit the firewall tasks were not opening the ceph-mgr
ports. This would lead to unclean configuration since the ceph-mgr
daemons can not connect to the OSDs.
This commit opens the right ports on the ceph-mgr nodes to talk with the
OSDs.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526400
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-04 12:11:41 +02:00
jtudelag 600e1e2c26 rgws: renames create_pools variable with rgw_create_pools.
Renamed to be consistent with the role (rgw) and have a meaningful name.

Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
2018-06-04 06:23:42 +02:00
jtudelag 8704144e31 Adds RGWs pool creation to containerized installation.
The ceph command has to be executed from one of the monitor containers
if no copy of the admin keyring is present on the RGWs, so the task has to be delegated.

Adds test to check proper RGW pool creation for Docker container scenarios.

Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
2018-06-04 06:23:42 +02:00
Guillaume Abrioux aae37b44f5 mons: move set_fact of openstack_keys in ceph-osd
Since the openstack_config.yml has been moved to `ceph-osd` we must move
this `set_fact` in ceph-osd otherwise the tasks in
`openstack_config.yml` using `openstack_keys` will actually use the
default values from `ceph-defaults`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1585139

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-01 17:12:01 +02:00
Andrew Schoen c2423e2c48 ceph-defaults: add the nautilus 14.x entry to ceph_release_num
The first 14.x tag has been cut so this needs to be added so that
version detection will still work on the master branch of ceph.

Fixes: https://github.com/ceph/ceph-ansible/issues/2671

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-06-01 16:51:23 +02:00
Guillaume Abrioux 9d5265fe11 osds: wait for osds to be up before creating pools
This is a follow up on #2628.
Even with the openstack pools creation moved later in the playbook,
there is still an issue because OSDs are not all UP when trying to
create pools.

Adding a task which checks for all OSDs to be UP with a `retries/until`
condition should definitively fix this issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-01 15:46:52 +02:00
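The `retries/until` pattern mentioned in the commit above amounts to polling a condition a bounded number of times. Here is a minimal sketch of that idea. The status samples stand in for successive cluster queries and are invented for illustration, and the helper names are hypothetical.

```python
import time


def wait_until(check, retries, delay):
    """Call `check` up to `retries` times, sleeping `delay` seconds between attempts.

    Returns the attempt number on success, raises TimeoutError otherwise.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            time.sleep(delay)
    raise TimeoutError("condition not met after %d attempts" % retries)


# Assumed sample data: (total OSDs, OSDs reported up) on successive polls.
status_samples = iter([(6, 2), (6, 5), (6, 6)])


def all_osds_up():
    """Return True once every known OSD is reported up."""
    num_osds, num_up_osds = next(status_samples)
    return num_osds > 0 and num_osds == num_up_osds


# delay=0 keeps the sketch fast; a real check would wait between polls.
attempts = wait_until(all_osds_up, retries=5, delay=0)
```

Requiring `num_osds > 0` guards against treating an empty cluster (zero OSDs, zero up) as ready.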
Guillaume Abrioux c68126d6fd mdss: do not make pg_num a mandatory params
When playing ceph-mds role, mon nodes have set a fact with the default
pg num for osd pools, we can simply default to this value for cephfs
pools (`cephfs_pools` variable).

At the moment the variable definition for `cephfs_pools` looks like:

```
cephfs_pools:
  - { name: "{{ cephfs_data }}", pgs: "" }
  - { name: "{{ cephfs_metadata }}", pgs: "" }
```

and we have a task in `ceph-validate` to ensure `pgs` has been set to a
valid value.

We could simply avoid this check by setting the default value of `pgs`
to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and
letting users override this value.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-30 16:20:34 +02:00
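The fallback described in the commit above can be expressed as a small function. This is a sketch under assumptions: the pool names, the default of 8, and `resolve_pgs` itself are illustrative and do not come from the role.

```python
def resolve_pgs(pool, mon_default_pg_num):
    """Return the pool's `pgs` value, or the monitor default when it is unset or empty."""
    pgs = pool.get("pgs")
    if pgs in (None, ""):
        return mon_default_pg_num
    return int(pgs)


# Hypothetical fact as it might be set on the first monitor host.
mon_facts = {"osd_pool_default_pg_num": 8}

cephfs_pools = [
    {"name": "cephfs_data", "pgs": ""},        # unset: falls back to the default
    {"name": "cephfs_metadata", "pgs": "16"},  # user override wins
]

resolved = {
    pool["name"]: resolve_pgs(pool, mon_facts["osd_pool_default_pg_num"])
    for pool in cephfs_pools
}
```

With a fallback of this kind, an empty `pgs` no longer needs to be rejected by a validation step.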
Guillaume Abrioux 34e646e767 osds: do not set docker_exec_cmd fact
In `ceph-osd` there is no need to set `docker_exec_cmd` since the only
place where this fact is used is `openstack_config.yml`, which
delegates all docker commands to a monitor node. This means we need the
`docker_exec_cmd` fact that refers to the `ceph-mon-*` containers,
and this fact is already set earlier in `ceph-defaults`.

By the way, when collocating an OSD with a MON it fails because the container
`ceph-osd-{{ ansible_hostname }}` doesn't exist.

Removing this task allows collocating an OSD with a MON.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1584179

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-30 16:17:29 +02:00
Guillaume Abrioux 608ea947a9 mds: move mds fs pools creation
When collocating mds on a monitor node, the cephfs pools creation will fail
because `docker_exec_cmd` is reset to `ceph-mds-monXX`, which is
incorrect because we need to delegate the task to `ceph-mon-monXX`.
In addition, it wouldn't have worked since `ceph-mds-monXX` container
isn't started yet.

Moving the task earlier in the `ceph-mds` role will fix this issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-25 11:16:56 +02:00
Sébastien Han 1c084efb3c rgw: container add option to configure multi-site zone
You can now use RGW_ZONE and RGW_ZONEGROUP on each rgw host from your
inventory and assign them a value. Once the rgw container starts it'll
pick the info and add itself to the right zone.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1551637
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-24 11:32:05 -07:00
Guillaume Abrioux 3a0e168a76 mdss: move cephfs pools creation in ceph-mds
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.

The idea here is to move cephfs pools creation in `ceph-mds` role.

[1] e59258943b/src/mon/OSDMonitor.cc (L5673)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Guillaume Abrioux 564a662baf osds: move openstack pools creation in ceph-osd
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.

The idea here is to move openstack pools creation at the end of `ceph-osd` role.

[1] e59258943b/src/mon/OSDMonitor.cc (L5673)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Luigi Toscano 43e96c1f98 ceph-radosgw: disable NSS PKI db when SSL is disabled
The NSS PKI database is needed only if radosgw_keystone_ssl
is explicitly set to true, otherwise the SSL integration is
not enabled.

It is worth noting that the PKI support was removed from Keystone
starting from the Ocata release, so some code paths should be
changed anyway.

Also, remove radosgw_keystone, which is not useful anymore.
This variable was used until fcba2c801a.
Now profiles drive the setting of rgw keystone *.

Signed-off-by: Luigi Toscano <ltoscano@redhat.com>
2018-05-23 23:24:09 -07:00
Vishal Kanaujia ef5f52b1f3 Skip GPT header creation for lvm osd scenario
The LVM lvcreate fails if the disk already has a GPT header.
We create GPT header regardless of OSD scenario. The fix is to
skip header creation for lvm scenario.

fixes: https://github.com/ceph/ceph-ansible/issues/2592

Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>
2018-05-23 11:44:09 -07:00
Subhachandra Chandra c7e269fcf5 Fix restarting OSDs twice during a rolling update.
During a rolling update, OSDs are restarted twice currently. Once, by the
handler in roles/ceph-defaults/handlers/main.yml and a second time by tasks
in the rolling_update playbook. This change turns off restarts by the handler.
Further, the restart initiated by the rolling_update playbook is more
efficient as it restarts all the OSDs on a host as one operation and waits
for them to rejoin the cluster. The restart task in the handler restarts one
OSD at a time and waits for it to join the cluster.
2018-05-22 19:23:07 +02:00
Andrew Schoen a9ad8eb5f3 ceph-validate: do not check ceph version on dev or rhcs installs
A dev or rhcs install does not require ceph_stable_release to be set and
instead generates that by looking at the installed ceph-version.
However, at this point in the playbook ceph may not have been installed
yet and ceph-common has not been run.

Fixes: https://github.com/ceph/ceph-ansible/issues/2618

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-21 23:11:04 +02:00
Andrew Schoen e7d02a50d8 ceph-validate: move system checks from ceph-common to ceph-validate
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 645f61c351 ceph-defaults: remove backwards compat for containerized_deployment
The validation module does not get config options with the template
syntax rendered, so we're gonna remove that and just default it to
False. The backwards compat was scheduled to be removed in 3.1 anyway.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen d30a99c350 validate: add support for containerized_deployment
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen f84c2ba27b ceph-defaults: fix failing tasks when osd_scenario was not set correctly
When devices is not defined because you want to use the 'lvm'
osd_scenario but you've made a mistake selecting that scenario these
tasks should not fail.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 1f15a81c48 ceph-defaults: move cephfs vars from the ceph-mon role
We're doing this so we can validate this in the ceph-validate role

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen ffe05872ac validate: only validate cephfs_pools on mon nodes
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 48c2a4fda8 validate: check rados config options
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 377fe81c10 validate: make sure ceph_stable_release is set to the correct value
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen ba7f09c0a7 ceph-validate: move var checks from ceph-common into this role
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 32bac6b491 ceph-validate: move var checks from ceph-osd into this role
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 29a9dffc83 ceph-validate: move ceph-mon config checks into this role
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen d87a32347f adds a new ceph-validate role
This will be used to validate config given to ceph-ansible.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Sébastien Han 2f43e9dab5 defaults: restart_osd_daemon unit spaces
Extra space in systemctl list-units can cause restart_osd_daemon.sh to
fail.

It looks like, when more services are enabled on the node, the gap
between "loaded" and "active" grows beyond the single space expected
by the command [1].

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1573317
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-18 17:53:47 +02:00
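The spacing problem in the commit above is easy to demonstrate: matching the literal string `loaded active` breaks as soon as the columns are padded. The sketch below uses invented sample lines in the style of `systemctl list-units` output and a hypothetical regular expression. It does not reproduce the script's actual command.

```python
import re

# Assumed sample data: the first line uses single spaces, the others are padded.
sample_output = """\
ceph-osd@0.service loaded active running Ceph object storage daemon osd.0
ceph-osd@12.service      loaded    active   running Ceph object storage daemon osd.12
ceph-mon@mon0.service    loaded    active   running Ceph cluster monitor daemon
"""

# Accept any run of whitespace between the columns rather than exactly one space.
OSD_UNIT = re.compile(r"^ceph-osd@(\S+)\.service\s+loaded\s+active\b", re.MULTILINE)


def active_osd_ids(text):
    """Return the ids of OSD units that are loaded and active."""
    return OSD_UNIT.findall(text)


osd_ids = active_osd_ids(sample_output)

# A fixed single-space match only finds the unpadded line.
naive_matches = sum("loaded active" in line for line in sample_output.splitlines())
```

Matching on `\s+` keeps the check independent of how wide the unit-name column happens to be on a given node.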