Commit Graph

80 Commits (b1a3c6e4cc87509ade8ab8044fc53c324182c7b2)

Author SHA1 Message Date
Sébastien Han 6f9dd26caa config: remove any spaces in public_network or cluster_network
With two public networks configured - we found that with
"NETWORK_ADDR_1, NETWORK_ADDR_2" install process consistently became
broken, trying to find docker registry on second network, and not
finding mon container.

but without spaces
"NETWORK_ADDR_1,NETWORK_ADDR_2" install succeeds
so, containerized install is more peculiar with formatting of this line

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1534003
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-01-30 17:47:15 +01:00
Andy McCrae 481173f203 Add default for radosgw_keystone_ssl
This should default to False. The default for Keystone is not to use PKI
keys, additionally, anybody using this setting had to have been manually
setting it before.

Fixes: #2111
2018-01-30 11:30:23 +01:00
Guillaume Abrioux f1232b33fd Revert "monitor_interface: document need to use monitor_address when using IPv6"
This reverts commit 10b91661ce.

This reverts also the same comment added in
1359869497
2018-01-29 14:43:24 +01:00
Guillaume Abrioux ec16cbdb1a defaults: avoid getting stuck (ceph --connect-timeout)
Sometime the playbook gets stuck because even with `--connect-timeout=`
option, the connexion to the existing ceph cluster never timeout.

As a workaround, using `timeout` command provided by coreutils will
actually timeout if we can't connect to the cluster.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1537003

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-01-25 10:15:59 +01:00
Guillaume Abrioux 900f447c82 containers: fix bug when looking for existing cluster
When containerized deployment, `docker_exec_cmd` is not set before the
task which try to retrieve the current fsid is played, it means it
considers there is no existing fsid and try to generate a new one.

Typical error:

```
ok: [mon0 -> mon0] => {
    "changed": false,
    "cmd": [
        "ceph",
        "--connect-timeout",
        "3",
        "--cluster",
        "test",
        "fsid"
    ],
    "delta": "0:00:00.179909",
    "end": "2018-01-09 10:36:58.759846",
    "failed": false,
    "failed_when_result": false,
    "rc": 1,
    "start": "2018-01-09 10:36:58.579937"
}

STDERR:

Error initializing cluster client: Error('error calling conf_read_file: errno EINVAL',)
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-01-10 16:23:18 +01:00
Guillaume Abrioux acfbebe67e defaults: rename check_socket files for containers
When containerized deployment, we are not looking for a socket but for a
running container.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-01-10 15:44:47 +01:00
Sébastien Han 7eaf444328 default: look for the right return code on socket stat in-use
As reported in https://github.com/ceph/ceph-ansible/issues/2254, the
check with fuser is not ideal. If fuser is not available the return code
is 127. Here we want to make sure that we looking for the correct return
code, so 1.

Closes: https://github.com/ceph/ceph-ansible/issues/2254
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-12-14 16:59:14 +01:00
Eduard Egorov a8a2c13f6a firewall: add mds, nfs, restapi and iscsi ports, remove 'configure_firewall' variable used for conditional execution. Include the task only on rpm-based systems.
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
2017-12-12 23:44:55 +01:00
Eduard Egorov 6a5e0da30d firewall: configure firewalld if it's already installed on the host (#2192).
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
2017-12-12 23:44:55 +01:00
Major Hayden 5676fa23b1 Convert interface names to underscores for facts
If a deployer uses an interface name with a dash/hyphen in it, such
as 'br-storage' for the monitor_interface group_var, the ceph.conf.j2
template fails to find the right facts. It looks for
'ansible_br-storage' but only 'ansible_br_storage' exists.

This patch converts the interface name to underscores when the
template does the fact lookup.
2017-12-12 09:03:40 +01:00
Guillaume Abrioux 6a9b5c9632 defaults: fix CI issue with ceph_uid fact
The CI complains because of `ceph_uid` fact which doesn't exist since
the docker image tag used in the CI doesn't match with this condition.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-12-12 09:02:37 +01:00
John Fulton ffae294288 Set tighter permissions on keyrings when containerized
During a containerized deployment, set the permissions
of ceph.client.admin.keyring and other keyrings to
chmod 600 and chown it to ceph.
2017-12-06 19:22:28 -05:00
Guillaume Abrioux b26a840002 handlers: restart daemons only if docker is running
In case where docker CLI is available but docker is not running, we
don't want to trigger the restart of the daemons.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-27 14:59:30 +01:00
Sébastien Han cc264d6ba6
Merge pull request #2151 from hwoarang/add-opensuse
Add openSUSE Leap 42.3 support
2017-11-16 14:35:28 +01:00
Markos Chandras 849786967a ceph-common: Add initial support for openSUSE Leap distributions
openSUSE Leap 42.3 provides support for Ceph Luminous in both the
distribution package and the latest available version in the OBS
repository so add these as the only available installation methods for
openSUSE.

Signed-off-by: Markos Chandras <mchandras@suse.de>
2017-11-14 10:51:22 +00:00
Guillaume Abrioux 44df3f9102 defaults: fix rgw restart script in handlers
Like 80d32dec, the path to the fact is not correct.
In any case, we will retrieve the IP address in hostvars, the variable
is the way we get the interface name according where it has been set
(eg.: inventory host file vs. group_vars/)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1510906

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-13 16:30:03 +01:00
Sébastien Han 7b0743be52
Merge pull request #2144 from ceph/quick_fix_lvm
osd: skip some set_fact when osd_scenario=lvm
2017-11-13 21:50:37 +11:00
Guillaume Abrioux 238754a844 osd: skip some set_fact when osd_scenario=lvm
these tasks are not needed when using `osd_scenario: lvm`

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1509230

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-07 15:30:08 +01:00
Arano-kai 5cde3175ae FIX: run restart scripts in `noexec` /tmp
- One can not run scripts directly in place, that mounted with `noexec`
option. But one can run scripts as arguments for `bash/sh`.

Signed-off-by: Arano-kai <captcha.is.evil@gmail.com>
2017-11-06 16:02:47 +02:00
Sébastien Han 6ea92756c0 Merge pull request #2117 from ceph/rm-dup
default: remove dup variable
2017-10-27 13:49:52 +02:00
Sébastien Han d2575c7f5e default: remove dup variable
ceph_repository_type was declared multiple times. This commit fixes
this.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-27 11:46:15 +02:00
Sébastien Han d6a0d2f9be Merge pull request #2071 from jtaleric/master
Docker image pull retry
2017-10-27 09:49:03 +02:00
Sébastien Han 5a10b048b0 Merge pull request #2105 from major/really-fix-always-run
Really fix always run
2017-10-27 09:33:47 +02:00
Joe Talerico ab58764288 Docker image pull retry
This change sets a default timeout of 300s for the image pull. If the
image pull times out (300s), we will retry 3 times by default.

fixes 1954
2017-10-25 13:37:10 -04:00
Major Hayden f73232caa4
Use check_mode instead of always_run
This patch changes the `always_run: yes` task option to
`check_mode: no` to avoid Ansible warnings.
2017-10-25 09:53:34 -05:00
Major Hayden c2b5118c1b
Revert "Avoid deprecated always_run"
This reverts commit 620fb37dd4.
2017-10-25 09:48:09 -05:00
Andy McCrae 7f6c39102d Option to set TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
Use "ceph_tcmalloc_max_total_thread_cache" to set the
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES value inside /etc/default/ceph for
Debian installs, or /etc/sysconfig/ceph for Red Hat/CentOS installs.

By default this is set to 0, so the default package value will be used,
if specified this value will be changed to match the variable, and ceph
osd services will be restarted.
2017-10-25 14:38:36 +01:00
Sébastien Han 4413511b66 all: backward compatibility between stable-2.2 and 3.0
stable-3.0 brought numerous changes in ceph-ansible variables, this PR
aims to maintain backward compatibility for someone running stable-2.2
upgrading to stable-3.0 but keeps its groups_vars untouched.
We will then determine the right options to make sure the upgrade works
but we are expecting that new variables should be used.

We will drop this in a near future, maybe 3.1 or 3.2.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-20 11:54:10 +02:00
Sébastien Han c527515502 Merge pull request #2000 from ceph/merge-osd-scenarios
[skip ci] ci: new osd scenarios
2017-10-19 09:18:02 +02:00
Sébastien Han 1579f1c5b1 Merge pull request #2073 from ceph/fix_rbd_handler
[skip ci] rbd: fix restart script for jewel
2017-10-18 11:12:05 +02:00
Guillaume Abrioux c2850b11be rbd: fix restart script for jewel
In Jewel, we don't use bootstrap-rbd keyring for rbd-mirror nodes, it
results with a socket path/name different according to which ceph
release you are deploying.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-18 11:10:49 +02:00
Sébastien Han a53aa9e8b4 ci: new osd scenarios
This commit add new osd scenarios, it aims to simplify the CI setup and
brings a better coverage on the OSD scenarios.
We decided to differentiate between filestore and bluestore, thinking
ahead when filestore won't be supported anymore.
So we now have two classes of tests:

* Filestore
* Bluestore

In each of those classes we have container and non-container.
Then for each we test the following:

* collocated
* collocated dmcrypt
* non-collocated
* non-collocated dmcrypt
* auto discovery collocated
* auto discovery collocated dmcrypt

This gives us a nice coverage and also reduces the footprint on the CI.
We are now up to 4 scenarios, each containing 6 OSD VMs.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-18 09:26:06 +02:00
Sébastien Han 90b75185d5 defaults: fix handlers for collocation
When doing collocation the condition "inventory_hostname in play_hosts"
is breaking the restart workflow.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-17 19:23:16 +02:00
Christian Berendt 4c380c9ef8 Cleanup readme files in roles directories
The contents of the README files are no longer up to date.
Documentation for all roles is located below the docs directory.
2017-10-17 11:22:06 +02:00
Major Hayden 620fb37dd4
Avoid deprecated always_run
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:

    [DEPRECATION WARNING]: always_run is deprecated.

This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
2017-10-12 08:29:44 -05:00
Sébastien Han 7054abef99 Merge pull request #2009 from ceph/fix-clean-pg
[skip ci] handler: do not test if pgs_num = 0
2017-10-07 03:39:26 +02:00
Sébastien Han 9f1bd3d6dd handler: add serial restart back
We now restart daemons on each machine in a serialized fashion.

Closes: https://github.com/ceph/ceph-ansible/issues/1989
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 03:39:10 +02:00
Sébastien Han ac29e8f977 Merge pull request #1983 from jprovaznik/suffix
Allow to override systemd service instance id
2017-10-06 22:40:57 +02:00
Sébastien Han 779f642fa8 use get to check stdout_lines
During the initial play, the docker command doesn't not exist and then
there is no stdout_lines to the command. So get allows us to fix this by
declaring an array if the command fails.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-06 16:57:46 +02:00
Sébastien Han d5ae0a3340 handler: do not test if pgs_num = 0
We don't need to wait if they are no PGS.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-06 16:57:46 +02:00
Jan Provaznik 3c16af5ef2 Allow to override systemd service instance id
It's useful to have constant service instance id when ceph-nfs
is managed by pacemaker.
2017-10-06 08:20:37 +02:00
Sébastien Han 425ecb3c7d common: fix ga verison for debian rhcs
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 18:45:30 +02:00
Sébastien Han 639389b9cd Merge pull request #1985 from ceph/debian-rhcs
[skip ci] common: fix rhcs installation on debian
2017-10-05 18:42:46 +02:00
Sébastien Han 9193e88878 common: fix rhcs installation on debian
* Change version from 2 to 3.
* use ceph_rhcs_cdn_debian_repo_version to use other repositories along
* with ceph_rhcs_cdn_debian_repo

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 17:42:21 +02:00
Sébastien Han b6b24a5ca9 iscsi: fix wrong group name for iscsi
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498490
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 17:25:32 +02:00
Sébastien Han cac7d034bf defaults: fix check socket non-container handler
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-04 15:33:52 +02:00
Sébastien Han 5968cf09b1 ci: add collocation scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-04 11:19:12 +02:00
Sébastien Han 0ce76113bf Merge pull request #1956 from ceph/osd-container-id
Osd container
2017-10-03 18:52:24 +02:00
Sébastien Han 3bd341f6c0 osd: container use id instead of dev name
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1494127
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-03 14:44:00 +02:00
Guillaume Abrioux 081f226106 defaults: change running order in main.yml
The task which sets `ceph_current_fsid` in `facts.yml` in case of containerized
deployment, will definitely fail because `docker_exec_cmd` is not set
yet.
This commits simply makes `facts.yml` played after `check_socket.yml` so
`docker_exec_cmd` is set properly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-02 18:42:43 +02:00