It is not enough to check for the mds to exists, it actually always does
because we declare the variable. So we need to make sure that there is a
mds host.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we introduced config_overrides we removed a lot of options from
the default template. In some cases, like mds pool, openstack pools etc
we need to know the amount of PGs required. The idea here is to skip the
task if ceph_conf_overrides.global.osd_pool_default_pg_num is not define
in your `group_vars/all.yml`.
Closes: #1145
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
This allows for the role to be used with ansible-galaxy and to fix the
include in all the meta/main.yml files in the roles.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The libcephfs1 package was removed from ceph-common in
cb1c06901e, however it was not synced
to group_vars/all.yml.sample using the `generate_group_vars_sample.sh`
script. This fixes up the comment formatting in the ceph-common
defaults and brings the group_vars sample back into sync.
Prior to this change, a playbook run with '--tags' or '--skip-tags'
would fail, because the ceph-common role would not include the
release.yml task, and this file defines critical things like
ceph_release.
Thanks Andrew Schoen <aschoen@redhat.com> for help with the fix.
Task put initial mon keyring in mon kv store from
ceph-mon/tasks/ceph_keys.yml is failing when cephx is disabled. The root
cause is that variable monitor_keyring is not populated by any task from
deploy_monitors.yml.
Fixes: #1211
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this patch we had several ways to runs containers, we could use
ansible's docker module on some distro and on containers distros we were
using systemd. We strongly believe threating containers as services with
systemd is the right approach so this patch generalizes to all the
distros. These days most of the distros are running systemd so it's fair
assumption.
Signed-off-by: Sébastien Han <seb@redhat.com>
Once we have our first monitor up and running we need to add it to the
monitor store as a safety measure. Just in case the local file gets
deleted and you need to add a new monitor. Now you can retrieve this key
like this:
ceph config-key get initial_mon_keyring > initial_mon_keyring.txt
Signed-off-by: Sébastien Han <seb@redhat.com>
There is no need to become root on local_action. This will event trigger
an error on some systems as it will try to run a sudo command. If the
current user does not have passwordless sudo, Ansible will fail. Anyway
using the current user is perfectly fine and no elevation privilege is
needed.
Signed-off-by: Sébastien Han <seb@redhat.com>
The Keystone v2 APIs are deprecated and scheduled to be removed in
Q release of Openstack. This adds support for configuring RGW to
use the current Keystone v3 API.
The PKI keys are used to decrypt the Keystone revocation list when
PKI tokens are used. When UUID or Fernet token providers are used in
Keystone, PKI certs may not exist, so we now accommodate this scenario
by allowing the operator to disable the PKI tasks.
Jewel added support for user/pass authentication with Keystone,
allowing deployers to disable Keystone admin token as required
for production deployments.
This implements configuration for the new RGW Keystone user/pass
authentication feature added in Jewel.
See docs here: http://docs.ceph.com/docs/master/radosgw/keystone/
Just for clarity and because we can we now show the name of the
ceph configuration file that is generated.
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit solves the situation where you lost your fetch directory and
you are running ansible against an existing cluster. Since no fetch
directory is present the file containing the initial mon keyring
doesn't exist so we are generating a new one.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We do not need to run another condition for 'ceph_rhcs' since the
include we came from already has it, so we are already inside this
condition.
We also spell red hat entirely instead of rh and we remove capital
letters.
Signed-off-by: Sébastien Han <seb@redhat.com>
When `ceph_stable_rh_storage` is True, every cluster node should have a
`/etc/apt/preferences.d/rhcs.pref` file with the following contents:
```
Explanation: Prefer Red Hat packages
Package: *
Pin: release o=/Red Hat/
Pin-Priority: 999
```
ceph-deploy already did this when used with ice-setup, and we need to do
the same thing with the ceph-ansible stack.
Closes: #1182 and https://bugzilla.redhat.com/show_bug.cgi?id=1404515
Signed-off-by: Sébastien Han <seb@redhat.com>
Only when ceph_origin == "upstream", install_on_redhat.yml will include
redhat_ceph_repository.yml, same as debian.
In redhat_ceph_repository.yml, ceph_custom_repo will be added.
But in check_mandatory_vars.yml, ceph_origin=="upstream" can't be combined
with ceph_custom
If previous check was not run, .stdout_lines is not a valid key on the dictionary.
To get around this, use .get("stdout_lines") instead.
Also add in a default empty list
in hammer, ceph-common depended on libcephfs (indirectly, via
python-cephfs). this is no longer the case in jewel or later, so it can
be removed from debian_ceph_packages
Signed-off-by: Casey Bodley <cbodley@redhat.com>
For readibility and clarity we do not run any tasks directly in the
main.yml file. This file should only contain include, which helps us
later to apply conditionnals if we want to.
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit re-uses some of the existing ceph-ansible variables for a
containirzed deployment. There is no reasons why we should add new
variables for the containerized deployment.
Signed-off-by: Sébastien Han <seb@redhat.com>
mon_group_name variable can be used to override mons group, but
this task assumes the group is always 'mons'. So we need to use
the var to find the group name instead.
Before this patch only the address for the first mon would show
in the ceph.conf even if there were multiple mons in the inventory.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Once the monitor process starts it will also trigger `ceph-create-keys`
which will collect the admin key and bootstrap keys. We used to force
this command because we were having issues on some distros like centos
7.0 and 7.1 not triggering this. This is fixed on centos 7.2 and not an
issue on ubuntu 14.04 or 16.04 so we can remove this task. If the
monitor hangs or fails to start the playbook will fail right after at
the "wait for client.admin key exists" task after 300sec.
Closes: #1161
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit solves the situation where you lost your fetch directory and
you are running ansible against an existing cluster. Since no fetch
directory is present the file containing the fsid doesn't exist so we
are creating a new one. Later the ceph.conf gets updated with a wrong
fsid which causes problems for clients and ceph processes.
Closes: #1148
Signed-off-by: Sébastien Han <seb@redhat.com>
Adding that avoids this bug:
https://github.com/ansible/ansible/issues/18206
Without that you'll get failures like:
TASK [ceph-mon : set keys permissions]
*****************************************
task path:
/home/andrewschoen/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:31
fatal: [mon0]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
We removed the "apache" setting for "radosgw_frontend" in
adfdf6871e.
As part of that change, we removed the final references to
ceph-extra.repo, but I failed to clean up this file itself.
Now that nothing uses this file, delete it.
This file contained the sole reference to redhat_distro_ceph_extra, so
we can drop that variable as well.
Refactor the code using 'package' module
Fix Issue #520
(However it doesn't cover all cases because some cases are not refactorable.
Ex: because of diverging packages name between distribution)
a397922 introduced a syntax error by attempting to default an unquoted
string, which causes execution failures on some ansible versions with:
Failed to template {{ ceph_rhcs_mount_path }}: Failed to template {{ ceph_stable_rh_storage_mount_path | default(/tmp/rh-storage-mount) }}: template error while templating string: unexpected '/'
libfcgi is dead upstream (http://tracker.ceph.com/issues/16784)
The RGW developers intend to remove libfcgi support entirely before the
Luminous release.
Since libfcgi gets little-to-no developer attention or testing, remove
it entirely from ceph-ansible.
Ansible task was not properly fetching OSD cluster keyring causing
the keyring to be missing when we needed to authenticate. Similarly, we
were not properly waiting on the OSD keyring to be available before
continuing.
Signed-off-by: Ivan Font <ifont@redhat.com>
- Update rolling update playbook to support containerized deployments
for mons, osds, mdss, and rgws
- Skip checking if existing cluster is running when performing a rolling
update
- Fixed bug where we were failing to start the mds container because it
was missing the admin keyring. The admin keyring was missing because
it was not being pushed from the mon host to the ansible host due to
the keyring not being available before running the copy_configs.yml
task include file. Now we forcefully wait for the admin keyring to be
generated before continuing with the copy_configs.yml task include file
- Skip pre_requisite.yml when running on atomic host. This technically
no longer requires specifying to skip tasks containing the with_pkg tag
- Add missing variables to all.docker.sample
- Misc. cleanup
Signed-off-by: Ivan Font <ifont@redhat.com>
We have a fact that detects the package manager, so we can detect if
systemd is used. Radosgw was still using some old logic from Ubuntu.
Ubuntu 16.04 now has systemd so we don't need to configure rgw as it was
running on upstart.
Signed-off-by: Sébastien Han <seb@redhat.com>
ansible 2.2 deprecates first_available_file option which is used in
the config_template module by 'generate ceph configuration file' task.
This change syncs the config_module files from their master repository
in github.com/openstack/openstack/ansible-plugins which includes the fix
2f6cac2cf6
Signed-off-by: Alberto Murillo Silva <alberto.murillo.silva@intel.com>
Even for dmcrypt we need to check the "devices" status and
"raw_journal_devices" as well so we can fix them if there is something
wrong with them.
Signed-off-by: Sébastien Han <seb@redhat.com>
Before this commit if you had set monitor_interface in your
inventory file for a specific host it would be ignored and the value
in group_vars/all would have been used.
Also, this enables support for monitor_address again as it had been
broken by previous changes to this template.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This is done for preventing of their use-before-definition for osd scenarios checks (should be removed after a refactor has properly seperated all the checks into appropriate roles).
Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>
Users reported that pool_default_pg_num is not honoured for the default
pool 'rbd'. So now we check the pg num value for the RBD pool and if it
does not match pool_default_pg_num then we delete and recreate it.
We also make sure the pool is empty first, just in case someone changed
the value manually and didn't reflect the change in ceph-ansible.
The only issue with this patch is that the pool ID will not be 0 anymore
but more likely 1.
Signed-off-by: Sébastien Han <seb@redhat.com>
backward compatibility for ceph-ansible version running latest code but
using variables defined before commit: 492518a2
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-fetch-keys role currently works only if cluster name is 'ceph'.
This commit allows to set custom cluster name in 'defaults' in the same
fashion as other roles do.
This RHCS version is now generally available. Default to using it.
Signed-off-by: Alfredo Deza <adeza@redhat.com>
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Related: rhbz#1357631
By overriding the openstack_pools variable introduced by this commit, the
deployer may choose not to create some of the openstack pools, or to add
new pools which were not foreseen by ceph-ansible, e.g. for a gnocchi
storage backend.
For backwards compatibility, we keep the openstack_glance_pool,
openstack_cinder_pool, openstack_nova_pool and
openstack_cinder_backup_pool variables, although the user may now choose
to specify the pools directly as dictionary literals inside the
openstack_pools list.
This allows us to test devices set with persistent naming such as
/dev/disk/by-*
When registering devices we can use persisent (/dev/disk/by-*) or
non-persistent (/dev/sd*). Both declarations are supported by
ceph-ansible. There was just two tasks that were not compatible with
this. Since we support using partitions directly we need to test that
because the device activation will be different. To test if the device
is a partition we use a regular expression which wasn't compatible with
the persistent device naming format (/dev/disk/by-*).
This commit solves this issue by reading the path of the symlink since
devices like /dev/disk/by-* are symlinks to devices like /dev/sd*
Signed-off-by: Sébastien Han <seb@redhat.com>
For some providers (such as upcoming Linode support), some NICs may have
multiple IP addresses. (In the case of Linode, the only NIC has a public
and private IP address.) This is normally okay as we can use the
ceph.conf cluster_network and public_network variables to force the
monitor to listen on the addresses we want. However, we also need
ansible to set the correct monitor IP addresses in "mon hosts" (i.e. the
addresses the monitors will listen on!). This new monitor_address_block
setting tells ansible which IP address to use for each monitor.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
- Move mon_containerized_default_ceph_conf_with_kv config from ceph-mon
to ceph-common defaults as it's used in ceph-nfs
- Update conditional to generate ganesha config when not
mon_containerized_default_ceph_conf_with_kv
- Revert change to store radosgw keyring using ansible_hostname on
ansible server so that ceph-nfs can find it
- Update ceph-ceph-nfs0-rgw-user container to use ansible_hostname
variable
Signed-off-by: Ivan Font <ivan.font@redhat.com>
use the activation scenario instead of the full ceph_disk one, we
already have a task to prepare osds so we just need to activate the
device.
working for me using vagrant :)
Signed-off-by: Sébastien Han <seb@redhat.com>
There is no need to run the actions from
roles/ceph-mon/tasks/docker/create_configs.yml
on the first monitor only since the monitor deployment happens
**serially**.
Moreover with Vagrant it's useful to allow the auto creation of the
cluster fsid, so enabling the option. If this is not desired you can
still set `fsid: 9c9c0448-0551-401d-b55b-e5b3a42bae42` for example.
Signed-off-by: Sébastien Han <seb@redhat.com>
- Gather facts only for mons before processing ceph-mon role serially in
containerized playbook sample
- Updated ceph.conf in order to generate a valid ceph.conf
Signed-off-by: Ivan Font <ivan.font@redhat.com>
- Move fsal_rgw config to ceph-common, as it's shaered with ceph-rgw
- Update all.docker.sample with NFS config
- Rename fsal_rgw to nfs_obj_gw and fsal_ceph to nfs_file_gw, because
the former names mean nothing to non-Ganesha developers
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
-First install ceph into a directory with CMake
cmake -DCMAKE_INSTALL_LIBEXECDIR=/usr/lib -DWITH_SYSTEMD=ON -DCMAKE_INSTALL_PREFIX:PATH:=/usr <ceph_src_dir> && make DESTDIR=<install_dir> install/strip
-Ceph-ansible copies over the install_dir
-User can use rundep_installer.sh to install any runtime dependencies that ceph needs onto the machine from rundep
* changed s/colocation/collocation/
* declare dmcrypt variable in ceph-common so the variables check does
not fail
Signed-off-by: Sébastien Han <seb@redhat.com>
This fixes#845 for containerized deployments. We now also mount the
/etc/localtime volume in the containers in order to synchronize the host
timezone with the container timezone.
Signed-off-by: Ivan Font <ivan.font@redhat.com>
Prior to this change, each ceph cluster node would end up with several
"qemu-client-$pid.log" files owned by root. The [client] section would
capture *all* client activity (for example the "ceph health" command,
etc), not just librbd-in-qemu.
Restrict this section to libvirt clients only so that we don't generate
these spurious log files for other Ceph client traffic.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Deployment fails when the ``secure_cluster`` is false:
TASK [ceph-mon : secure the cluster]
*******************************************
fatal: [saceph-mon.vm.ceph.asheplyakov]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
fatal: [saceph-mon2.vm.ceph.asheplyakov]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
fatal: [saceph-mon3.vm.ceph.asheplyakov]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
A conditional include evaluates all included tasks with the (additional)
conditional applied to every task [1]. Thus all tasks from `secure_cluster.yml'
are always evaluated (with an additional 'when: secure_cluster' condition).
The `secure the cluster' task iterates over ``ceph_pools.stdout_lines``
even if ``secure_cluster`` is false: in loops ansible applies conditional
to every item (by design) [2]. However the `collect all the pools' task
is skipped if the very same condition evaluates to false, which leaves
the ``ceph_pools`` undefined, so the `secure the cluster' task fails:
Provide the default (empty) list to avoid the problem.
[1] http://docs.ansible.com/ansible/playbooks_conditionals.html#applying-when-to-roles-and-includes
[2] http://docs.ansible.com/ansible/playbooks_conditionals.html#loops-and-conditionalsCloses: #913
Signed-off-by: Alexey Sheplyakov <asheplyakov@mirantis.com>
Update each role's task to use the respective role's username, image
name, and image tag to check if a container is already running. This was
causing false failures because we were not matching any running
containers and subsequently running checks.yml to check the status of
cluster files being left behind.
Signed-off-by: Ivan Font <ivan.font@redhat.com>
Journal size is not mandatory anymore, a default from 5GB is being
added. A simple warning message will show up if the size is set to
something below 5GB.
Signed-off-by: Sébastien Han <seb@redhat.com>
The ceph-common role fails when you run ansible with --check. Adding
always_run to a few tasks makes the check go through easier (although
it's not foolproof).
This will help if the path to the iso exists in the originating server but not
in the remote paths. This issue is not seen if using /tmp/file.iso but does
show up when using nested paths.
Signed-off-by: Alfredo Deza <adeza@redhat.com>
Resolves: rhbz#1355762
init_system was getting the value of "systemd\n"
and was later compared to be equal to "systemd"
making the wrong scripts to be executed.
Signed-off-by: Alberto Murillo <alberto.murillo.silva@intel.com>
Add the ability to use a custom repo, rather than just upstream, RHEL,
and distro. This allows ansible to be used for internal testing.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
If the docker image cannot be retrieved we will fail this task silently
and the playbook ultimately succeeds without a successful deployment.
This change makes it so we fail the playbook immediately.
Signed-off-by: Ivan Font <ivan.font@redhat.com>
Ceph has the ability to export it's filesystem via NFS using Ganesha.
Add a ceph-nfs role that will start Ganesha and export the Ceph
filesystems.
Note that, although support is going in to export RGW via NFS, this is
not working yet.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
The config template is in ceph-common, not in the individual roles, so
roles referencing it need to use playbook_dir, not role_path.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
- Check for nmap being available was not running as a local_action, when the checks using nmap were
- Various fixes on Ansible 2.x now that the above is working
This causes ceph-ansible scripts to fail when targeting Centos7 machines.
Installation fails because newer ceph package dependencies provided
by ceph-release-{version}.noarch.rpm were overridden by older
package dependency versions in default distribution repositories,
due to the fact that default distribution repositories have higher
priority.
Docker makes it difficult to use images that are not on signed
registries. This is a problem for developers, who likely won't have
access to a registry with proper signed certificates.
This allows the ability to use any docker image on the machine running
vagrant/ansible. The way it works is that the image in question is
exported locally, then sent to each target box and imported there.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
This will allow nodes to install rhcs that do
not have access to the internet.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Resolves: rhbz#1337601
In order to align all Ansible versions, we now use the full path for the
template. We rely on `role_path` variable. Now all the tasks using
the template module have a uniform syntax.
Might fix issue raised in #483
Signed-off-by: Sébastien Han <seb@redhat.com>
The scenarios were not being accurately compared to ensure that:
* A single scenario was choosen
* ONLY a single scenario was choosen
This solution does not scale for long, but that can be addressed in a
different patchset.
By default, this roles will create a ceph config file and get the admin
key. You can optionnally add other users, keys and pools for your tests.
Closes: #769
Signed-off-by: Sébastien Han <seb@redhat.com>
Add support to allow ceph-ansible to install and
configure Ceph on Debian on the ppc64le architecture.
Canonical has ppc64le Debian packages in Ubuntu distros
and on Ubuntu Cloud Archive. Both of which can be installed
and configured using the 'distro' or 'uca' options in
ceph-ansible when this patch is used.
Signed-off-by: Samuel Matzek <smatzek@us.ibm.com>
Since ##461 we have been having the ability to override ceph default
options. Previously we had to add a new line in the template and then
another variable as well. Doing a PR for one option was such a pain. As
a result, we now have tons of options that we need to maintain across
all the ceph version, yet another painful thing to do.
This commit removes all the ceph options so they are handled by ceph
directly. If you want to add a new option, feel free to to use the
`ceph_conf_overrides` variable of your `group_vars/all`.
Risks, for those who have been managing their ceph using ceph-ansible
this is not a trivial change as it will trigger a change in your
`ceph.conf` and then restart all your ceph services. Moreover if you did
some specific tweaks as well, prior to run ansible you should update the
`ceph_conf_overrides` variable to reflect your previous changes.
To avoid service restart, you need to know a bit of ansible for this,
but generally the idea would be to run ansible on a dummy host to
generate the ceph.conf, then scp this file to all your ceph hosts and
you should be good.
Closes: #693
Signed-off-by: Sébastien Han <seb@redhat.com>
This is purely a refactor. Converts when 'and' conditionals into lists
rather than multiline strings. This does not work for nested
conditionals, but those can be formated with indents.
Moves one line when statements onto the same line as the when command
itself.
A small logic bug was found in ceph-osd/tasks/check_devices.yml which
which was also fixed.
Signed-off-by: Sam Yaple <sam@yaple.net>
Somehow on CentOS 7.2 with Jewel, the service enablement by the Ansible service module
does not seem to work properly.
Signed-off-by: Sébastien Han <seb@redhat.com>
This adds a helper fact that uses the ``init_system`` fact to determine if
we should be using systemd or not when controlling services.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>