If someone's cluster name is 'ceph' then the playbook will fail (with no
errors because of ignore_errors) saying it cannot find the variable. So
let's declare the default. If the cluster name is different then it'll
be in group_vars and thus there won't be any failure.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636962
Signed-off-by: Sébastien Han <seb@redhat.com>
The role contains all the handlers for Ceph services. We decided to
leave ceph-defaults role with variables and a few facts only. This is
useful when organizing the site.yml files and also adding the known
variables to infrastructure-playbooks.
Signed-off-by: Sébastien Han <seb@redhat.com>
`mon_group_name` isn't defined here, we must hardcode it.
Typical error:
```
The task includes an option with an undefined variable. The error was: 'mon_group_name' is undefined
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Same problem again... ceph_release_num[ceph_release] is only set in
ceph-docker-common/common roles so putting the condition on that role
will never work. Removing the condition.
The downside of this is we will be installing packages and then skip the
role on the node.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622210
Signed-off-by: Sébastien Han <seb@redhat.com>
If we play site-docker.yml, we are already in a
containerized_deployment. So the condition is not needed.
Signed-off-by: Sébastien Han <seb@redhat.com>
We recently renamed the iSCSI gateway group to iscsigws; it was
previously named iscsi-gws. Existing deployments whose hosts file has an
iscsi-gws section must continue to work.
This commit adds the old group name for backward compatibility. No error
from Ansible should be expected: if the host group is not found, nothing
is played.
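A minimal sketch of what a play targeting both group names could look
like. The role name `ceph-iscsi-gw` and the play options are assumptions
made for illustration, not taken from this commit:
```yaml
# Hypothetical play: both the new and the old group names are listed.
# A group that is absent from the inventory simply matches no hosts.
- hosts:
    - iscsigws
    - iscsi-gws  # old group name, kept for backward compatibility
  become: true
  roles:
    - ceph-iscsi-gw  # assumed role name
```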
Close: https://bugzilla.redhat.com/show_bug.cgi?id=1619167
Signed-off-by: Sébastien Han <seb@redhat.com>
The ceph-rest-api binary has been removed in mimic so we cannot deploy it
anymore. We just keep the role and the compatibility for existing users.
Signed-off-by: Sébastien Han <seb@redhat.com>
Let's try to avoid using dashes as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes,
using an underscore fixes the issue.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we fixed the `gather and delegate facts` task, this exception is
not needed anymore. It's a leftover that should be removed to save some
time when deploying a cluster with a large number of clients.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
There is no need to gather facts in an O(N^2) way.
Only one node should gather facts from the other nodes.
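As a rough sketch of the idea (the exact task in the playbook may
differ), a single host can collect facts on behalf of every other host
by combining `run_once` with `delegate_to` and `delegate_facts`:
```yaml
# Sketch only: one host loops over the inventory and stores each
# delegated host's facts, instead of every host looping over all hosts.
- name: gather and delegate facts
  setup:
  delegate_to: "{{ item }}"
  delegate_facts: true
  with_items: "{{ groups['all'] }}"
  run_once: true
```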
Fixes: #2553
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Use a non-empty array as the default value for `groups.get('clients')`,
otherwise the `| first` filter will complain because it can't work with
an empty array.
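For illustration, a condition of this kind could be written as follows.
The debug task is hypothetical and only shows the expression; the
default `['']` is one possible non-empty fallback:
```yaml
# Hypothetical task: runs only on the first host of the clients group.
# If the group does not exist, `first` receives [''] instead of an
# undefined or empty list, so the condition is simply false.
- name: run only on the first client node
  debug:
    msg: "this is the first client node"
  when: inventory_hostname == groups.get('clients', ['']) | first
```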
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit sets the default behavior to play `ceph-docker-common` only
on the first node in the clients group.
Currently, we play docker-common to pull the container image so we can
run ceph commands in order to generate keys or create pools.
On a cluster with a large number of client nodes, proceeding this way
can be time consuming. An alternative is to pull the container image
only on the first node and then copy the keys to the other nodes.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit is a workaround for
https://bugzilla.redhat.com/show_bug.cgi?id=1550977
We iterate over all nodes on each node and we delegate the facts gathering.
This consumes a lot of memory when there is a large number of nodes in
the inventory.
That way of gathering is not necessary for client nodes so we can simply
gather local facts for these nodes.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This was taken from the openshift ansible repository here:
https://github.com/leseb/openshift-ansible/tree/master/roles/installer_checkpoint
Rationale:
A complete OpenShift cluster installation is comprised of many different
components which can take 30 minutes to several hours to complete. If
the installation should fail, it could be confusing to understand at
which component the failure occurred. Additionally, it may be desired to
re-run only the component which failed instead of starting over from the
beginning. Components which came after the failed component would also
need to be run individually.
Ceph has a similar situation so we can benefit from that
callback_plugin.
Signed-off-by: Sébastien Han <seb@redhat.com>
This fact is already set in site-docker.yml so there's no need to check
it again in ceph-docker-common
Signed-off-by: Paul Bourke <paul.bourke@oracle.com>
You can now generate only the ceph configuration file on the nodes by
running the playbook like this:
ansible-playbook site.yml --tags='ceph_update_config'
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1543434
Signed-off-by: Sébastien Han <seb@redhat.com>
When deploying with Ansible at large scale, the delegate_facts method
consumes a lot of memory on the host that is running Ansible. This can
cause various issues like memory exhaustion on that machine.
You can now run Ansible with "-e delegate_facts_host=False" to disable
the fact sharing.
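A sketch of how such a switch could gate the two gathering modes. The
task names and layout are assumptions; only `delegate_facts_host` comes
from this commit. Note that a value passed as `-e key=value` arrives as
a string, hence the `| bool` filter:
```yaml
# Sketch only: delegated gathering when the switch is on...
- name: gather and delegate facts
  setup:
  delegate_to: "{{ item }}"
  delegate_facts: true
  with_items: "{{ groups['all'] }}"
  run_once: true
  when: delegate_facts_host | bool

# ...and plain local gathering on each host when it is off.
- name: gather facts locally
  setup:
  when: not delegate_facts_host | bool
```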
Signed-off-by: Sébastien Han <seb@redhat.com>
The container deployment is serialized, adding this task as a best
effort. If docker is already present we pull the image otherwise we wait
for the role to play.
Signed-off-by: Sébastien Han <seb@redhat.com>
This patch changes the `when:` keys so that they have no jinja2
delimiters. This avoids Ansible warnings which could turn into
errors in a future Ansible release.
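For example (the task and the condition are hypothetical, only the form
of the `when:` key matters):
```yaml
# Before (triggers the warning):
#   when: "{{ ceph_release == 'luminous' }}"
# After: `when:` is already evaluated as a Jinja2 expression, so the
# delimiters are dropped.
- name: example task with a bare condition
  debug:
    msg: "running on luminous"
  when: ceph_release == 'luminous'
```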
So we can later evaluate the conditions.
Also fix the variable: we are comparing ceph_release, not
ceph_stable_release.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1486062
Signed-off-by: Sébastien Han <seb@redhat.com>
We don't know ceph_stable_release before executing the role so at least
we need to run ceph-defaults and ceph-docker-common or
ceph-common.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1486062
Signed-off-by: Sébastien Han <seb@redhat.com>
If we don't bootstrap the mgr after the mon and the osds handler are
called, we will never be able to reach a clean state since the pgs
stats are handled by the mgr. This also happens when doing daemon
collocation.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1493920
Signed-off-by: Sébastien Han <seb@redhat.com>
On a container env, machines don't have any ceph binaries so we need to
use a container to run the commands.
Signed-off-by: Sébastien Han <seb@redhat.com>
We must mask the image so we are sure that even if the system reboots
then the OSDs won't start.
Also remove Ceph udev rules if found on the system prior to deploying
containers. If we don't do this we are exposed to conflicts between udev
rules and systemd unit files.
Also, the CI will now test the migration from a non-containerized
cluster to a containerized cluster.
Signed-off-by: Sébastien Han <seb@redhat.com>
Now we can use --limit on the container deployment too. This is useful
while deploying client nodes.
e.g.: ansible-playbook -i inventory -l clients site-docker.yml.sample
Signed-off-by: Sébastien Han <seb@redhat.com>
This will give us more flexibility and the possibility to deploy a client node
for an external ceph-cluster.
Related BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1469426
Fixes: #1670
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This will give us more flexibility and avoid a lot of useless `when`
conditions used to skip all tasks from a non-desired role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Without this, we don't test the mgr role so we need to add it.
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
The Ceph Manager daemon (ceph-mgr) runs alongside monitor daemons, to
provide additional monitoring and interfaces to external monitoring and
management systems.
Only works as of the Kraken release.
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
- Gather facts only for mons before processing ceph-mon role serially in
containerized playbook sample
- Updated ceph.conf in order to generate a valid ceph.conf
Signed-off-by: Ivan Font <ivan.font@redhat.com>
Ceph has the ability to export its filesystem via NFS using Ganesha.
Add a ceph-nfs role that will start Ganesha and export the Ceph
filesystems.
Note that, although support is going in to export RGW via NFS, this is
not working yet.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Run containerized daemons in virtual machines.
To enable it simply do:
`cp site-docker.yml.sample site-docker.yml`
and set `docker: true` in `vagrant_variables.yml`
Signed-off-by: Sébastien Han <seb@redhat.com>