Regardless if the partition is 'ceph' or something else, we don't want
to be as strick as checking for a particular partition.
If the drive has a partition, we just don't do anything.
This solves the case where the server reboots, disks get a different
/dev/sda (node) allocation. In this case, prior to restarting the server
/dev/sda was an OSD, but now it's /dev/sdb and the other way around.
In such scenario, we will try to prepare the OSD and create a new
partition, so let's not mess around with devices that have partitions.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498303
Signed-off-by: Sébastien Han <seb@redhat.com>
in case of multimds we must check for the number of mds up instead of
just checking if the hostname of the node is in the fsmap.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
allow_multimds will be officially deprecated in Mimic, specify it
only for all versions of Ceph where it was declared stable. Going
forward, specify only max_mds.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
NFS-ganesha cannot start is the nfs-server service
is running. This commit stops nfs-server in case it
is running on a (debian, redhat, suse) node before
the nfs-ganesha service starts up
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Add a variable, ceph_nfs_disable_caching, that if set to true
disables ganesha's directory and attribute caching as much as
possible.
Also, disable caching done by ganesha, when 'nfs_file_gw'
variable is true, i.e., when Ganesha is used as CephFS's gateway.
This is the recommended Ganesha setting as libcephfs already caches
information. And doing so helps avoid cache incoherency issues
especially with clustered ganesha over CephFS.
Fixes: https://tracker.ceph.com/issues/23393
Signed-off-by: Ramana Raja <rraja@redhat.com>
If people keep on using the mon_cap, osd_cap etc the playbook will
translate this old syntax on the flight.
Signed-off-by: Sébastien Han <seb@redhat.com>
These are already handled by ceph-client/defaults/main.yml so the keys
will be created once user_config is set to True.
Signed-off-by: Sébastien Han <seb@redhat.com>
This changes state to action and gives the options 'create'
or 'zap'. The zap parameter is also removed.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Because we have many commands we might need to run the
ANSIBLE_STDOUT_CALLBACK won't format these nicely because we're
not reporting these back at the root level of the json result.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
I want a default value of 'present' for state, so it can not
be made required. Othewise it'll throw a 'Module alias error'
from ansible.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Now that we are using ceph_volume_zap the partitions are
kept around and should be able to be reused.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This really isn't needed currently and I don't believe is a good
mechanism for switching subcommands anwyay. The user of this module
should not have to be familar with all ceph-volume subcommands.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Added 'ignore_errors: true' to multiple lines which run docker commands; even in cases where docker is no longer installed. Because of this, certain tasks in the purge-docker-cluster.yml will cause the playbook to fail if re-run and stop the purge. This leaves behind a dirty environment, and a playbook which can no longer be run.
Fix Regex line 275: Sometimes 'list-units' will output 4 spaces between loaded+active. The update will account for both scenarios.
purge fetch_directory: in other roles fetch_directory is hard linked ex.: "{{ fetch_directory }}"/"{{ somedir }}". That being said, fetch_directory will never have a trailing slash in the all.yml so this task was never being run(causing failures when trying to re-deploy).
Signed-off-by: Randy J. Martinez <ramartin@redhat.com>
The path of ceph.conf sample template moved to ceph-config.
Therefore docs needs to be changed to the right directory.
Signed-off-by: JohnHaan <yongiman@gmail.com>
backward compatibility with `ceph_mon_docker_interface` and
`ceph_mon_docker_subnet` was not working since there wasn't lookup on
`monitor_interface` and `public_network`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Prior to this change, if a user had ceph-test-12.2.1 installed, and
upgraded to ceph v12.2.3 or newer, the RPM upgrade process would
fail.
The problem is that the ceph-test RPM did not depend on an exact version
of ceph-common until v12.2.3.
In Ceph v12.2.3, ceph-{osdomap,kvstore,monstore}-tool binaries moved
from ceph-test into ceph-base. When ceph-test is not yet up-to-date, Yum
encounters package conflicts between the older ceph-test and newer
ceph-base.
When all users have upgraded beyond Ceph < 12.2.3, this is no longer
relevant.
According to our recent change, we now use "CentOS" as a latest
container image. We need to reflect this on the ceph_uid.
Signed-off-by: Sébastien Han <seb@redhat.com>
Currently tag-build-master-luminous-ubuntu-16.04 is not used anymore.
Also now, 'latest' points to CentOS so we need to make that switch here
too.
We know have latest tags for each stable release so let's use them and
point tox at them to deploy the right version.
Signed-off-by: Sébastien Han <seb@redhat.com>
get a non empty array as default value for `groups.get('clients')`,
otherwise `| first` filter will complain because it can't work with
empty array.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
I personally dev on vscode and I have some preferences to save when it
comes to running the python unit tests. So escaping this directory is
actually useful.
Signed-off-by: Sébastien Han <seb@redhat.com>
Tripleo deployment failed when the monitors not manged
by tripleo itself with:
FAILED! => {"msg": "list object has no element 0"}
The failing play item was introduced by
f46217b69a .
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1552327
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
because of `serial: 1`, it can be an issue when the playbook is being
run on client nodes.
Since the refact of `ceph-client` we skip the role `ceph-defaults` on
every node except the first client node, it means that the task is not
going to be played because of `run_once: true`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit refacts this role so we don't have to pull container image
on client nodes just to create pools and keys.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1550977
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This seems to be a leftover.
This commit removes an unnecessary 'set linux permissions' on
`/var/lib/ceph`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit aims to set the default behavior to play
`ceph-docker-common` only on first node in clients group.
Currently, we play docker-common to pull container image so we can run
ceph commands in order to generate keys or create pools.
On a cluster with a large number of client nodes this can be time consuming
to proceed this way. An alternative would be to pull container image
only a first node and then copy keys on other nodes.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This check is alone in `ceph-docker-common` since a previous code
refactor.
Moving this check in `ceph-defaults` allows us to run `ceph-clients`
without having to run `ceph-docker-common` even in non-containerized
deployment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Prior to this patch, the certificates where being generated on a single
node only (because of the run_once: true). Thus certificates were not
distributed on all the gateway nodes.
This would require a second ansible run to work. This patches fix the
creation and keys's distribution on all the nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540845
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit is a workaround for
https://bugzilla.redhat.com/show_bug.cgi?id=1550977
We iterate over all nodes on each node and we delegate the facts gathering.
This is high memory consuming when having a large number of nodes in the
inventory.
That way of gathering is not necessary for clients node so we can simply
gather local facts for these nodes.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The `remove_packages` prompt is redundant to the `ireallymeanit` prompt
since it does exactly the same thing. I guess the only goal of this task
was to make a break to warn user about `--skip-tags=with_pkg` feature.
This warning should be part of the first prompt.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This update will resolve error['cephfs' is undefined.] in multimds container deployments.
See: roles/ceph-mon/tasks/create_mds_filesystems.yml. The same last two tasks are present there, and actully need to happen in that role since "{{ cephfs }}" gets defined in
roles/ceph-mon/defaults/main.yml, and not roles/ceph-mds/defaults/main.yml.
Signed-off-by: Randy J. Martinez <ramartin@redhat.com>