Commit Graph

1770 Commits (9a9aa2479f8f006bc611490273209a4dd7e4bdab)

Author SHA1 Message Date
Guillaume Abrioux f60b049ae5 client: remove default value for pg_num in pools creation
trying to set the default value for pg_num to
`hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num'])` will
break in case of external client nodes deployment.
the `pg_num` attribute should be mandatory and be tested in future
`ceph-validate` role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-10 11:51:02 -07:00
Gregory Meno 26f6a65042 adds missing state needed to upgrade nfs-ganesha
in tasks for os_family Red Hat we were missing this

fixes: bz1575859
Signed-off-by: Gregory Meno <gmeno@redhat.com>
2018-05-09 19:58:04 +00:00
Guillaume Abrioux 259fae931d mon: fix mgr keyring creation when upgrading from jewel
On containerized deployment,
when upgrading from jewel to luminous, mgr keyring creation fails because the
command to create mgr keyring is executed on a container that is still
running jewel since the container is restarted later to run the new
image, therefore, it fails with bad entity error.

To get around this situation, we can delegate the command to create
these keyrings on the first monitor when we are running the playbook on the last monitor.
That way we ensure we will issue the command on a container that has
been well restarted with the new image.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-09 10:29:48 -07:00
Guillaume Abrioux 7b387b506a osd: clean legacy syntax in ceph-osd-run.sh.j2
Quick clean on a legacy syntax due to e0a264c7e

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-09 07:29:33 +02:00
Simone Caronni b12bf62c36 Make sure the restart_mds_daemon script is created with the correct MDS name 2018-05-08 20:53:15 +02:00
Sébastien Han 07ca91b5cb common: enable Tools repo for rhcs clients
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574458
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-08 16:12:30 +02:00
Andy McCrae e99351b95b Fix install of nfs-ganesha-ceph for Debian/SuSE
The Debian and SuSE installs for nfs-ganesha on the non-rhcs repository
requires you to allow_unauthenticated for Debian, and disable_gpg_check
for SuSE. The nfs-ganesha-rgw package already does this, but the
nfs-ganesha-ceph package will fail to install because of this same
issue.

This PR moves the installations to happen when the appropriate flags are
set to True (nfs_obj_gw & nfs_file_gw), but does it per distro (one for
SuSE and one for Debian) so that the appropriate flag can be passed to
ignore the GPG check.
2018-05-04 15:13:59 +02:00
Ramana Raja 31762dede3 ceph-nfs: disable attribute caching
When 'ceph_nfs_disable_caching' is set to True, disable attribute
caching done by Ganesha for all Ganesha exports.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2018-05-04 09:47:54 +02:00
Sébastien Han 4a186237e6 common: copy iso files if rolling_update
If we are in a middle of an update we want to get the new package
version being installed so the task that copies the repo files should
not be skipped.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1572032
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-05-03 17:18:55 +02:00
Andy McCrae d142be0422 Move apt cache update to individual task per role
The apt-cache update can fail due to transient issues related to the
action being a network operation. To reduce the impact of these
transient failures this patch adds a retry to the update_cache task.

However, the apt_repository tasks which would perform an apt_update
won't retry the apt_update on a failure in the same way, as such this PR
moves the apt_update into an individual task, once per role.

Finally, the apt_repository tasks no longer have a changed_when: false,
and the apt_cache update is only performed once per role, if the
repositories change. Otherwise the cache is updated on the "apt" install
tasks if the cache_timeout has been reached.
2018-05-03 14:02:15 +02:00
Guillaume Abrioux 6fe8df627b client: fix pool creation
the value in `docker_exec_client_cmd` doesn't allow to check for
existing pools because it's set with a wrong value for the entrypoint
that is going to be used.
It means the check were going to fail anyway even if pools actually exist.

Using jinja syntax to set `docker_exec_cmd` allows to handle the case
where you don't have monitors in your inventory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-03 08:22:40 +02:00
Sébastien Han 43e23ffe4d mon: change application pool support
If openstack_pools contains an application key it will be used to apply
this application pool type to a pool.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1562220
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-30 09:42:58 +02:00
Guillaume Abrioux 75ed437d4e check if pools already exist before creating them
Add a task to check if pools already exist before we create them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
Guillaume Abrioux a68091c923 tests: update the type for the rule used in pools
As of ceph 12.2.5 the type of the parameter `type` is not a name anymore but
an id, therefore an `int` is expected otherwise it will fail with the
following error

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
Sébastien Han 12eebc31fb mon/client: honor key mode when copying it to other nodes
The last mon creates the keys with a particular mode, while copying them
to the other mons (first and second) we must re-use the mode that was
set.

The same applies for the client node, the slurp preserves the initial
'item' so we can get the mode for the copy.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Sébastien Han 74494253fa mon: remove redundant copy task
We had twice the same task, also one was overriding the mode.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Sébastien Han 85732d11b9 mon/client: remove acl code
Applying ACL on the keyrings is not used anymore so let's remove this
code.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Sébastien Han cfe8e51d99 mon/client: apply mode from ceph_key
Do not use a dedicated task for this but use the ceph_key module
capability to set file mode.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Di Xu 113eb25424 add AArch64 to supported architecture
works on AArch64 platform
2018-04-23 10:23:21 +02:00
Sébastien Han 949507d304 mon: remove mgr key from ceph_config_keys
This key is created after the last mon is up so there is no need to try
to push it from the first mon. The initia mon container is not creating
the mgr key, ansible does. So this key will never exist.
The key will go into the fetch dir once the last mon is up, then when
the ceph-mgr plays it will try to get it from the fetch directory.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 10:17:24 +02:00
Sébastien Han 35c1eb7183 mon: remove mon map from ceph_config_keys
During the initial bootstrap of the first mon, the monmap file is
destroyed so it's not available and ansible will never find it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 10:17:24 +02:00
Sébastien Han 65ba85aff6 Expose /var/run/ceph
Useful for softwares that do data collection/monitoring like collectd.
They can connect to the socket and then retrieve information.

Even though the sockets are exposed now, I'm keeping the docker exec to
check the socket, this will allow newer version of ceph-ansible to work
with older versions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1563280
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han bf1e70e8cf default: extent ceph_uid and gid
We now have the ability to detect the uid/gid of the ceph user depending
on the distribution we are running on and so we are doing non-container
deployements.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han f3656ad167 move create ceph initial directories to default
This is needed for both non-container and container deployments.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han 641f141c0f selinux: remove chcon calls
We know bindmount with the :z option at the end of the -v command so
this will basically run the exact same command as we used to run. So to
speak:

chcon -Rt svirt_sandbox_file_t /var/lib/ceph

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-19 14:59:37 +02:00
Sébastien Han 90e47c5fb0 client: add a --rm option to run the container
This fixes the case where the playbook died and never removed the
container. So now, once the container exits it will remove itself from
the container list.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-19 14:59:37 +02:00
Sébastien Han 6c742376fd client: import the key in ceph is copy_admin_key is true
If the user has set copy_admin_key to true we assume he/she wants to
import the key in Ceph and not only create the key on the filesystem.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 17:46:54 +02:00
Sébastien Han 424815501a client: add quotes to the dict values
ceph-authtool does not support raw arguements so we have to quote caps
declaration like this allow 'bla bla' instead of allow bla bla

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 17:46:54 +02:00
Sébastien Han d2a2793cb0 refactor the way we copy keys
This commit does a couple of things:

* use a common.yml file that contains things that can be played on both
container and non-container

* refactor the ability to copy the admin key to the nodes

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 16:46:33 +02:00
Randy J. Martinez 127a643fd0 ceph-defaults: fix ceph_uid fact on container deployments
Red Hat is now using tags[3,latest] for image rhceph/rhceph-3-rhel7.
Because of this, the ceph_uid conditional passes for Debian
when 'ceph_docker_image_tag: latest' on RH deployments.
I've added an additional task to check for rhceph image specifically,
and also updated the RH family task for ceph/daemon [centos|fedora]tags.

Signed-off-by: Randy J. Martinez <ramartin@redhat.com>
2018-04-17 16:54:51 +02:00
Sébastien Han a98885a71e rhcs: re-add apt-pining
When installing rhcs on Debian systems the red hat repos must have the
highest priority so we avoid packages conflicts and install the rhcs
version.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1565850
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-17 16:07:06 +02:00
Guillaume Abrioux 899b0eb451 defaults: check only 1 time if there is a running cluster
There is no need to check for a running cluster n*nodes time in
`ceph-defaults` so let's add a `run_once: true` to save some resources
and time.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-16 11:23:00 +02:00
Sébastien Han 5bbbce527e osd: do not do anything if the dev has a partition
Regardless if the partition is 'ceph' or something else, we don't want
to be as strick as checking for a particular partition.
If the drive has a partition, we just don't do anything.

This solves the case where the server reboots, disks get a different
/dev/sda (node) allocation. In this case, prior to restarting the server
/dev/sda was an OSD, but now it's /dev/sdb and the other way around.
In such scenario, we will try to prepare the OSD and create a new
partition, so let's not mess around with devices that have partitions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498303
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-13 19:11:15 +02:00
Sébastien Han 37117071eb common: add tools repo for iscsi gw
To install iscsi gw packages we need to enable the tools repo.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1547849
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-12 13:38:34 +02:00
Douglas Fuller c8573fe0d7 Remove deprecated allow_multimds
allow_multimds will be officially deprecated in Mimic, specify it
only for all versions of Ceph where it was declared stable. Going
forward, specify only max_mds.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2018-04-12 10:29:17 +02:00
vasishta p shastry 020e66c1b4 Fixed a typo (extra space) 2018-04-11 14:21:15 +02:00
vasishta p shastry e1a1f81b6f osd: to support copy_admin_key 2018-04-11 14:21:15 +02:00
vasishta p shastry db3a5ce6d9 mds: to support copy_admin_keyring 2018-04-11 14:21:15 +02:00
vasishta p shastry 6b59416f75 nfs: to support copy_admin_key - containerized 2018-04-11 14:21:15 +02:00
Ali Maredia 01c58695fc nfs: ensure nfs-server server is stopped
NFS-ganesha cannot start is the nfs-server service
is running. This commit stops nfs-server in case it
is running on a (debian, redhat, suse) node before
the nfs-ganesha service starts up

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2018-04-11 14:00:48 +02:00
Ramana Raja 4a430ae29a ceph-nfs: allow disabling ganesha caching
Add a variable, ceph_nfs_disable_caching, that if set to true
disables ganesha's directory and attribute caching as much as
possible.

Also, disable caching done by ganesha, when 'nfs_file_gw'
variable is true, i.e., when Ganesha is used as CephFS's gateway.
This is the recommended Ganesha setting as libcephfs already caches
information. And doing so helps avoid cache incoherency issues
especially with clustered ganesha over CephFS.

Fixes: https://tracker.ceph.com/issues/23393

Signed-off-by: Ramana Raja <rraja@redhat.com>
2018-04-11 13:56:40 +02:00
Sébastien Han 82ccbdafbc ceph-defaults: bring backward compatibility for old syntax
If people keep on using the mon_cap, osd_cap etc the playbook will
translate this old syntax on the flight.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Sébastien Han 9657e4d6fa ceph_key: use ceph_key in the playbook
Replaced all the occurence of raw command using the 'command' module
with the ceph_key module instead.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Guillaume Abrioux 66c4118dcd defaults: fix backward compatibility
backward compatibility with `ceph_mon_docker_interface` and
`ceph_mon_docker_subnet` was not working since there wasn't lookup on
`monitor_interface` and `public_network`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-10 00:19:11 +02:00
Ken Dreyer 3752cc6f38 common: upgrade/install ceph-test RPM first
Prior to this change, if a user had ceph-test-12.2.1 installed, and
upgraded to ceph v12.2.3 or newer, the RPM upgrade process would
fail.

The problem is that the ceph-test RPM did not depend on an exact version
of ceph-common until v12.2.3.

In Ceph v12.2.3, ceph-{osdomap,kvstore,monstore}-tool binaries moved
from ceph-test into ceph-base. When ceph-test is not yet up-to-date, Yum
encounters package conflicts between the older ceph-test and newer
ceph-base.

When all users have upgraded beyond Ceph < 12.2.3, this is no longer
relevant.
2018-04-09 18:09:52 +02:00
Sébastien Han bb60f2fea4 ceph-defaults: fix ceoh_uid for container image tag latest
According to our recent change, we now use "CentOS" as a latest
container image. We need to reflect this on the ceph_uid.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-09 13:54:55 +02:00
Zack Cerza 0123d790cd Use the CentOS repo for Red Hat dev packages
No use even trying to use something that doesn't exist.

Signed-off-by: Zack Cerza <zack@redhat.com>
2018-04-09 10:05:57 +02:00
Attila Fazekas ecd3563c21 Deploying without managed monitors failed
Tripleo deployment failed when the monitors not manged
by tripleo itself with:
    FAILED! => {"msg": "list object has no element 0"}

The failing play item was introduced by
 f46217b69a .

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1552327

Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2018-04-04 18:16:46 +02:00
Guillaume Abrioux dcf6a246a4 defaults: remove `run_once: true` when creating fetch_directory
because of `serial: 1`, it can be an issue when the playbook is being
run on client nodes.
Since the refact of `ceph-client` we skip the role `ceph-defaults` on
every node except the first client node, it means that the task is not
going to be played because of `run_once: true`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-04 10:51:17 +02:00
Guillaume Abrioux 18c0c7a508 config: use fact `ceph_uid`
Use fact `ceph_uid` in the task which ensures `/etc/ceph` exists in
containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-04 10:51:17 +02:00