When creating pools, it's crucial to expose all the options available as
part of the pool creation command. As explained in:
http://docs.ceph.com/docs/jewel/rados/operations/pools/
Signed-off-by: Sébastien Han <seb@redhat.com>
Running the last portion (insert new default and add new default crush
tasks) of crush_rules.yml only on the last monitor is
wrong, since ceph CLI calls usually end up on the monitor leading the
quorum, which by default is the one with the lowest IP.
So if we run the command and end up on another mon, the creation will
use the default crush rule because that particular mon hasn't been
updated.
To fix this we remove the |last on the include and use run_once: true on
certain tasks, then we let the final two tasks run on all the monitors.
Signed-off-by: Sébastien Han <seb@redhat.com>
On releases after jewel the option
'osd_pool_default_crush_replicated_ruleset' no longer exists; it is now
called osd_pool_default_crush_rule.
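
As a rough illustration of the rename, the option name can be chosen from the release name. This is only a sketch: the function name `crush_rule_option` is hypothetical, and it assumes the release is given as a lowercase string such as "jewel" or "luminous".

```python
def crush_rule_option(release):
    """Return the ceph.conf option that selects the default CRUSH rule.

    Hypothetical helper: only the two option names come from the commit
    message; the release check is an assumption.
    """
    if release == "jewel":
        # Jewel still uses the old "ruleset" option name.
        return "osd_pool_default_crush_replicated_ruleset"
    # Releases after jewel use the renamed option.
    return "osd_pool_default_crush_rule"
```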
Signed-off-by: Sébastien Han <seb@redhat.com>
Instead of creating the CRUSH hierarchy with Ansible tasks using the
command module we now rely on the ceph_crush module.
Signed-off-by: Sébastien Han <seb@redhat.com>
One may want to add new crush rules while keeping the current default rule.
This now works when all rules are defined with "default: false". If multiple rules are defined as default (which should not happen), the last one listed in "crush_rules" is taken as the default.
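
The selection behaviour described above can be sketched in a few lines. This is an illustration, not the playbook logic itself: `pick_default_rule` is a hypothetical name, and each rule is assumed to be a dict with a "name" key and an optional "default" key.

```python
def pick_default_rule(crush_rules):
    """Return the name of the rule flagged as default, or None if there is none."""
    default_name = None
    for rule in crush_rules:
        if rule.get("default", False):
            # A later default overrides an earlier one, so the last wins.
            default_name = rule["name"]
    return default_name


rules = [
    {"name": "replicated_hdd", "default": True},
    {"name": "replicated_ssd", "default": True},
]
print(pick_default_rule(rules))
```

With no rule flagged as default the function returns None, which corresponds to keeping the current default rule untouched.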
Previously it was necessary to provide a value (possibly an
empty string) for the "rule_name" key for each item in
openstack_pools. This change makes the key optional; it defaults to
an empty string when not given.
Since Luminous we need to set the application tag for each pool,
otherwise a CEPH_WARNING is generated when the pools are in use.
We should assign the OpenStack pools to their default application, which
is "rbd". When updating to Luminous this happens automatically for the
vms, images, backups and volumes pools, but for new deployments this is
not the case.
While hostname -f always returns a hostname including its
domain part and -s returns it without the domain part, the output when
no arguments are given may or may not include the domain part
depending on how the system is configured; the socket name might
then not match the instance name.
This was called too early: the container had not started yet, so the commands failed.
Moved the section after the include of docker/main.yml.
Signed-off-by: Greg Charot <gcharot@redhat.com>
Use a nicer syntax for `local_action` tasks.
We used to have one-liners like this:
```
local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500
```
The usual syntax:
```
local_action:
  module: wait_for
  port: 22
  host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
  state: started
  delay: 10
  timeout: 500
```
is nicer and helps keep the whole playbook consistent.
This also fixes a potential issue with missing quotation marks:
```
Traceback (most recent call last):
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module>
    main()
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main
    rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin)
  File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command
  File "/usr/lib64/python2.7/shlex.py", line 279, in split
    return list(lex)
  File "/usr/lib64/python2.7/shlex.py", line 269, in next
    token = self.get_token()
  File "/usr/lib64/python2.7/shlex.py", line 96, in get_token
    raw = self.read_token()
  File "/usr/lib64/python2.7/shlex.py", line 172, in read_token
    raise ValueError, "No closing quotation"
ValueError: No closing quotation
```
Writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf`
can fail with a complaint about missing quotes; this change fixes that.
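
The error in the traceback comes from Python's `shlex` module, which splits a free-form command string into tokens. The snippet below is a minimal reproduction of that error class under an assumed input; the command strings and the helper name `try_split` are made up for illustration and are not taken from the playbook.

```python
import shlex


def try_split(command):
    """Split a command line like a POSIX shell would, reporting quote errors."""
    try:
        return shlex.split(command)
    except ValueError as exc:
        # shlex raises ValueError when a quote is opened but never closed.
        return "error: {}".format(exc)


print(try_split("echo 'balanced quotes'"))
print(try_split("echo it's unbalanced"))
```

Passing the module arguments as a structured mapping avoids this kind of free-form splitting of a single string.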
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
With two public networks configured, we found that with
"NETWORK_ADDR_1, NETWORK_ADDR_2" the install process consistently
broke: it tried to find the docker registry on the second network and
did not find the mon container.
Without the space,
"NETWORK_ADDR_1,NETWORK_ADDR_2", the install succeeds.
So the containerized install is more particular about the formatting of this line.
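
One way to make such a value insensitive to spacing is to normalise it before use. The sketch below shows the idea only; it is not necessarily how this commit addresses the problem, and the function name `normalize_networks` and the sample CIDRs are assumptions.

```python
def normalize_networks(value):
    """Return a comma-separated network list with no whitespace around entries."""
    parts = [part.strip() for part in value.split(",")]
    # Drop empty entries left behind by trailing or doubled commas.
    return ",".join(part for part in parts if part)


print(normalize_networks("192.168.1.0/24, 10.0.0.0/24"))
```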
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1534003
Signed-off-by: Sébastien Han <seb@redhat.com>
Currently, we can define a crush location for each host, but only crush roots and crush rules are created. This commit automates the remaining steps for a complete solution:
1) Creates the rack type crush buckets defined in {{ ceph_crush_rack }} of each osd host. If the user does not define it, a rack named 'default_rack_{{ ceph_crush_root }}' is added and used in the next steps.
2) Moves the rack type crush buckets defined in {{ ceph_crush_rack }} into the crush roots defined in {{ ceph_crush_root }} of each osd host.
3) Moves the hosts defined in {{ ceph_crush_rack }} into the crush roots defined in {{ ceph_crush_root }} of each osd host.
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
There is no reason why we can't use crush rules when deploying
containers, so move the include into main.yml so it can be called.
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-create-keys is idempotent, so it's not an issue to run it each time
we play ansible. This also fixes issues where the 'creates' arg skips the
task and no keys get generated on newer versions, e.g. during an upgrade.
Closes: https://github.com/ceph/ceph-ansible/issues/2228
Signed-off-by: Sébastien Han <seb@redhat.com>
The name docker_version is very generic and is also used by other
roles. As a result, there may be name conflicts. To avoid this a
ceph_ prefix should be used for this fact. Since it is an internal
fact renaming is not a problem.
If a deployer uses an interface name with a dash/hyphen in it, such
as 'br-storage' for the monitor_interface group_var, the ceph.conf.j2
template fails to find the right facts. It looks for
'ansible_br-storage' but only 'ansible_br_storage' exists.
This patch replaces dashes in the interface name with underscores when
the template does the fact lookup.
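
The lookup key can be derived as in the short sketch below. It mirrors the 'br-storage' example from this message; the function name `interface_fact_name` is hypothetical.

```python
def interface_fact_name(interface):
    """Build the fact name used for a network interface.

    The fact is stored with underscores, so dashes in the interface
    name have to be replaced before the lookup.
    """
    return "ansible_" + interface.replace("-", "_")


print(interface_fact_name("br-storage"))
```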
A recent change [1] required that the openstack_keys
param always contain an acls list. However, it's
possible it might not contain that list. Thus, this
change defaults that list to empty if it
is not in the structure as defined by the user.
[1] d65cbaa539
If ceph-ansible deploys a Ceph cluster with "openstack_config: true"
and sets the openstack_keys map to have certain ACLs or permissions [1],
the requested ACLs or permissions are only set on one of the monitor
nodes [2] when they should be set on all of them.
This patch solves [3] the above issue by having the chmod and setfacl
tasks iterate over the list of mon nodes (including the mon node that the
task was delegated to) to apply the chmod or setfacl to the keys in
openstack_keys.
[1]
```
openstack_keys:
- { name: client.openstack, key: "$(ceph-authtool --gen-print-key)", mon_cap: "allow r", osd_cap: "allow class-read object_prefix rbd_children, allow rwx pool=images, allow rwx pool=vms, allow rwx pool=volumes, allow rwx pool=backups", mode: "0600", acls: ["u:nova:r--", "u:cinder:r--", "u:glance:r--", "u:gnocchi:r--"] }
```
[2]
```
$ ansible mons -m shell -b -a "ls -l /etc/ceph/ceph.client.openstack.keyring ; getfacl /etc/ceph/ceph.client.openstack.keyring"
192.168.1.26 | SUCCESS | rc=0 >>
-rw-r-----+ 1 root root 253 Nov 3 20:30 /etc/ceph/ceph.client.openstack.keyring
user::rw-
user:glance:r--
user:nova:r--
user:cinder:r--
user:gnocchi:r--
group::---
mask::r--
other::---getfacl: Removing leading '/' from absolute path names
192.168.1.29 | SUCCESS | rc=0 >>
-rw-r--r--. 1 root root 253 Nov 3 20:30 /etc/ceph/ceph.client.openstack.keyring
user::rw-
group::r--
other::r--getfacl: Removing leading '/' from absolute path names
192.168.1.23 | SUCCESS | rc=0 >>
-rw-r--r--. 1 root root 253 Nov 3 20:30 /etc/ceph/ceph.client.openstack.keyring
user::rw-
group::r--
other::r--getfacl: Removing leading '/' from absolute path names
$
```
[3]
```
(undercloud) [stack@hci-director ceph-ansible]$ ansible mons -m shell -b -a "ls -l /etc/ceph/ceph.client.openstack.keyring ; getfacl /etc/ceph/ceph.client.openstack.keyring"
192.168.1.25 | SUCCESS | rc=0 >>
-rw-r-----+ 1 root root 253 Nov 14 01:12 /etc/ceph/ceph.client.openstack.keyring
user::rw-
user:glance:r--
user:nova:r--
user:cinder:r--
user:gnocchi:r--
group::---
mask::r--
other::---getfacl: Removing leading '/' from absolute path names
192.168.1.29 | SUCCESS | rc=0 >>
-rw-r-----+ 1 root root 253 Nov 14 01:12 /etc/ceph/ceph.client.openstack.keyring
user::rw-
user:glance:r--
user:nova:r--
user:cinder:r--
user:gnocchi:r--
group::---
mask::r--
other::---getfacl: Removing leading '/' from absolute path names
192.168.1.27 | SUCCESS | rc=0 >>
-rw-r-----+ 1 root root 253 Nov 14 01:12 /etc/ceph/ceph.client.openstack.keyring
user::rw-
user:glance:r--
user:nova:r--
user:cinder:r--
user:gnocchi:r--
group::---
mask::r--
other::---getfacl: Removing leading '/' from absolute path names
(undercloud) [stack@hci-director ceph-ansible]$
```
The path to the fact is not correct.
In any case, we retrieve the IP address from hostvars; what varies
is how we get the interface name, depending on where it has been set
(e.g. inventory host file vs. group_vars/).
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1510906
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Setting monitor_interface in group_vars/all.yml leaves
hostvars[host]['monitor_interface'] undefined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1507922
Signed-off-by: Sébastien Han <seb@redhat.com>
Only chmod or setfacl the requested keyring(s) in the
openstack_keys data structure when the mode or acls keys
of that data structure exist.
The user may specify four permission combinations for the
keyring file(s): 1. only set ACL, 2. only set mode,
3. set neither mode nor ACL, 4. set mode and then ACL.
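
The four combinations can be sketched as a small function that lists the steps for one entry. This is an illustration under assumptions: `keyring_actions` is a hypothetical name, and an entry is assumed to be a dict whose optional "mode" and "acls" keys match the openstack_keys structure.

```python
def keyring_actions(key):
    """List the permission steps for one openstack_keys entry, in order."""
    actions = []
    if "mode" in key:
        # The mode is applied first so ACLs are layered on top of it.
        actions.append(("chmod", key["mode"]))
    for acl in key.get("acls", []):
        actions.append(("setfacl", acl))
    return actions


entry = {"name": "client.openstack", "mode": "0600", "acls": ["u:nova:r--"]}
print(keyring_actions(entry))
```

An entry with neither key yields an empty list, so no permission task would run for it.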
Fixes: #2092
stable-3.0 brought numerous changes to ceph-ansible variables. This PR
aims to maintain backward compatibility for someone running stable-2.2
who upgrades to stable-3.0 but keeps their group_vars untouched.
We will then determine the right options to make sure the upgrade works,
but we expect that the new variables will be used.
We will drop this in the near future, maybe 3.1 or 3.2.
Signed-off-by: Sébastien Han <seb@redhat.com>
This will solve the following issue when starting docker containers on Ubuntu:
invalid argument "1\u00a0" for --cpus=1 : failed to parse 1 as a rational number
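
The "\u00a0" in the error is a non-breaking space (U+00A0) trailing the value. The sketch below shows how such a character can be stripped from a string; it only illustrates the failure mode and is not the change made by this commit. The function name `clean_cpu_limit` is hypothetical.

```python
def clean_cpu_limit(value):
    """Remove surrounding whitespace, including U+00A0, from a CPU limit string."""
    # Map the non-breaking space to a plain space first, then strip both ends.
    return value.replace("\u00a0", " ").strip()


print(repr(clean_cpu_limit("1\u00a0")))
```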
Closes-bug: #2056
1. add the variables to docker_collocation
2. trigger the check when a MDS is part of the inventory file, not when
we run on an MDS...
Signed-off-by: Sébastien Han <seb@redhat.com>
This patch changes the `when:` keys so that they have no jinja2
delimiters. This avoids Ansible warnings which could turn into
errors in a future Ansible release.
We now have a variable called ceph_pools that is mandatory when
deploying an MDS.
It's a dictionary that contains a pool name and a PG count. The PG count
is mandatory and must be set; the playbook will fail otherwise.
Closes: https://github.com/ceph/ceph-ansible/issues/2017
Signed-off-by: Sébastien Han <seb@redhat.com>
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:
[DEPRECATION WARNING]: always_run is deprecated.
This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
The value of doing this is fairly low compared to the added complexity.
So we remove these tasks; if the rbd pool on Jewel doesn't have the right
PG value, you can always increase it.
Signed-off-by: Sébastien Han <seb@redhat.com>
All keyrings were being copied to all nodes.
This commit fixes a leftover from a previous code refactor.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498583
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>