For newly created cluster the command: ceph --cluster {{ cluster }} osd
pool get rbd size does not respond properly.
We only want to check if the rbd pool exists, so we know use an ls |
grep approach.
Closes: https://github.com/ceph/ceph-ansible/issues/1547
Signed-off-by: Sébastien Han <seb@redhat.com>
The fact ['ansible_$interface']['ipv4'] is a dictionary where
['ansible_$interface']['ipv6'] is a list. If we use
ansible_default_ipv6|ipv4 is is always a dictionary which allows us to
get the ipv6 and ipv4 address without adding more complexity to the
template.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
`ceph-docker-common`:
At the moment there is a lot of duplicated tasks in each
`./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored in
`./roles/ceph-docker-common/tasks/main.yml`.
`*_containerized_deployment` variables:
All `*_containerized_deployment` have been refactored to a single
variable `containerized_deployment`
duplicate `cephx` variables in `group_vars/* have been removed.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Problem: fail to deploy a containerized Ceph cluster with ipv6
Solution: do not hardcode ipv4 when bootstrapping the container.
Now use ip_version: ipv6 to get a containerized cluster deployed with
ipv6.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1451786
Signed-off-by: Sébastien Han <seb@redhat.com>
The CI on Docker is reporting the following error:
STDERR:
Error EINVAL: bad entity name
This is due to the fact that this auth entity name does not exist on
Jewel so we should not create that key when running Jewel containers.
Fixes: https://github.com/ceph/ceph-ansible/issues/1514
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this change, ansible was only checking for the existence of the
package, now if upgrade_ceph_packages is true this means we are
performing an upgrade.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1442016
Signed-off-by: Sébastien Han <seb@redhat.com>
This is to allow ceph-mgr daemons to remote control
osd and mds daemons with MCommand messages.
Fixes: http://tracker.ceph.com/issues/19713
Signed-off-by: John Spray <john.spray@redhat.com>
Without this, we don't test the mgr role so we need to add it.
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Ansible evaluates the 'with_items' before the 'when' so if the inventory
does not have the group declared it'll fail. To fix this, we set an
empty array to make the with_items happy and then evaluate with the
'when'.
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this change we were deploying a monitor using tis fqdn name but
we were checking its state and performing actions on it using its
shortname.
Signed-off-by: Sébastien Han <seb@redhat.com>
The Ceph Manager daemon (ceph-mgr) runs alongside monitor daemons, to
provide additional monitoring and interfaces to external monitoring and
management systems.
Only works as of the Kraken release.
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-create-keys unit file was removed here:
* 8bcb4646b6
* dc5fe8d415
As a consequence the systemctl preset command now fails to run since the
unit does not exist anymore. Due to the redirection in /dev/null we
don't know what's happening.
Ultimately the mon unit doesn't get enabled and the mon service won't
start after reboot.
Removing the old/non-existent unit makes the command succeed now.
ceph fix: https://github.com/ceph/ceph/pull/14226
Signed-off-by: WingkaiHo <sanguosfiang@163.com>
Co-Authored-By: Sébastien Han <seb@redhat.com>
After the jewel release the mon startup does not generate keys, but it's
still harmless to call ceph-create-keys with jewel because this task has
a 'creates' argument that will cause it not to run if the keys already
exist.
Removing this when condition also allows the downstream CI tests to
install kraken or luminous without resetting ceph_stable_release, which does not
pertain to rhcs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Sometimes the socket appears during the 5th attempt and sometimes not so
increasing the timeout a little bit.
Signed-off-by: Sébastien Han <seb@redhat.com>
Add the possibility to create openstack pools and keys even for containerized deployments
Fix: #1321
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This patch introduces calamari_debug option which will turn on debugging
for calamari before initializing and running it.
Signed-off-by: Boris Ranto <branto@redhat.com>
Install package from official repos rather than pip when using RHEL.
This commit fix https://bugzilla.redhat.com/show_bug.cgi?id=1420855
Also this commit Refact all `roles/ceph-*/tasks/docker/pre_requisite.yml`
to avoid a lot of duplicated code.
Fix: #1303
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Check if ceph filesystem already exists before creating it.
If the ceph filesystem doesn't exist, execute the task only on one node.
Fix: #1314
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We changed the way we declare image.
Prior to this patch we must have a "user/image:tag"
format, which is incompatible with non docker-hub registry where you
usually don't have a "user". On the docker hub a "user" is also
identified as a namespace, so for Ceph the user was "ceph".
Variables have been simplified with only:
* ceph_docker_image
* ceph_docker_image_tag
1. For docker hub images: ceph_docker_name: "ceph/daemon" will give
you the 'daemon' image of the 'ceph' user.
2. For non docker hub images: ceph_docker_name: "daemon" will simply
give you the "daemon" image.
Infrastructure playbooks have been modified as well.
The file group_vars/all.docker.yml.sample has been removed as well.
It is hard to maintain since we have to generate it manually. If
you want to configure specific variables for a specific daemon simply
edit group_vars/$DAEMON.yml
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1420207
Signed-off-by: Sébastien Han <seb@redhat.com>
We shouldn't test directly the value of
`ceph_conf_overrides.global.osd_pool_default_pg_num` because this can
cause the playbook to fail if the key `global` is not present in
`ceph_conf_overrides`. Therefore we have to use the facts that have been
defined earlier.
Fix: #1242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
On ubntu systems mkdir is in /bin where on atomic it is /usr/bin/.
We use the shell built-in function "command" to find its right location.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we now only support systemd has an init system we can finally
treat containers as processes using systemd and this for all the
distros.
Signed-off-by: Sébastien Han <seb@redhat.com>
According to #1216, we need to simply the code by removing the
support of anything before Jewel.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This patch makes sure we set the proper pool size on the rbd pool.
Usually during bootstrap the rbd pool size is not honoured so we need to
add this workaround.
Signed-off-by: Sébastien Han <seb@redhat.com>
could have scenario where different openstack components would
use the same pool, but the logic would create the same pool
more than once
add unique filter to account for this
It is not enough to check for the mds to exists, it actually always does
because we declare the variable. So we need to make sure that there is a
mds host.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we introduced config_overrides we removed a lot of options from
the default template. In some cases, like mds pool, openstack pools etc
we need to know the amount of PGs required. The idea here is to skip the
task if ceph_conf_overrides.global.osd_pool_default_pg_num is not define
in your `group_vars/all.yml`.
Closes: #1145
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
Task put initial mon keyring in mon kv store from
ceph-mon/tasks/ceph_keys.yml is failing when cephx is disabled. The root
cause is that variable monitor_keyring is not populated by any task from
deploy_monitors.yml.
Fixes: #1211
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this patch we had several ways to runs containers, we could use
ansible's docker module on some distro and on containers distros we were
using systemd. We strongly believe threating containers as services with
systemd is the right approach so this patch generalizes to all the
distros. These days most of the distros are running systemd so it's fair
assumption.
Signed-off-by: Sébastien Han <seb@redhat.com>
Once we have our first monitor up and running we need to add it to the
monitor store as a safety measure. Just in case the local file gets
deleted and you need to add a new monitor. Now you can retrieve this key
like this:
ceph config-key get initial_mon_keyring > initial_mon_keyring.txt
Signed-off-by: Sébastien Han <seb@redhat.com>
Just for clarity and because we can we now show the name of the
ceph configuration file that is generated.
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit solves the situation where you lost your fetch directory and
you are running ansible against an existing cluster. Since no fetch
directory is present the file containing the initial mon keyring
doesn't exist so we are generating a new one.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If previous check was not run, .stdout_lines is not a valid key on the dictionary.
To get around this, use .get("stdout_lines") instead.
Also add in a default empty list
For readibility and clarity we do not run any tasks directly in the
main.yml file. This file should only contain include, which helps us
later to apply conditionnals if we want to.
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit re-uses some of the existing ceph-ansible variables for a
containirzed deployment. There is no reasons why we should add new
variables for the containerized deployment.
Signed-off-by: Sébastien Han <seb@redhat.com>
Once the monitor process starts it will also trigger `ceph-create-keys`
which will collect the admin key and bootstrap keys. We used to force
this command because we were having issues on some distros like centos
7.0 and 7.1 not triggering this. This is fixed on centos 7.2 and not an
issue on ubuntu 14.04 or 16.04 so we can remove this task. If the
monitor hangs or fails to start the playbook will fail right after at
the "wait for client.admin key exists" task after 300sec.
Closes: #1161
Signed-off-by: Sébastien Han <seb@redhat.com>
Adding that avoids this bug:
https://github.com/ansible/ansible/issues/18206
Without that you'll get failures like:
TASK [ceph-mon : set keys permissions]
*****************************************
task path:
/home/andrewschoen/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:31
fatal: [mon0]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Refactor the code using 'package' module
Fix Issue #520
(However it doesn't cover all cases because some cases are not refactorable.
Ex: because of diverging packages name between distribution)
- Update rolling update playbook to support containerized deployments
for mons, osds, mdss, and rgws
- Skip checking if existing cluster is running when performing a rolling
update
- Fixed bug where we were failing to start the mds container because it
was missing the admin keyring. The admin keyring was missing because
it was not being pushed from the mon host to the ansible host due to
the keyring not being available before running the copy_configs.yml
task include file. Now we forcefully wait for the admin keyring to be
generated before continuing with the copy_configs.yml task include file
- Skip pre_requisite.yml when running on atomic host. This technically
no longer requires specifying to skip tasks containing the with_pkg tag
- Add missing variables to all.docker.sample
- Misc. cleanup
Signed-off-by: Ivan Font <ifont@redhat.com>
Users reported that pool_default_pg_num is not honoured for the default
pool 'rbd'. So now we check the pg num value for the RBD pool and if it
does not match pool_default_pg_num then we delete and recreate it.
We also make sure the pool is empty first, just in case someone changed
the value manually and didn't reflect the change in ceph-ansible.
The only issue with this patch is that the pool ID will not be 0 anymore
but more likely 1.
Signed-off-by: Sébastien Han <seb@redhat.com>
By overriding the openstack_pools variable introduced by this commit, the
deployer may choose not to create some of the openstack pools, or to add
new pools which were not foreseen by ceph-ansible, e.g. for a gnocchi
storage backend.
For backwards compatibility, we keep the openstack_glance_pool,
openstack_cinder_pool, openstack_nova_pool and
openstack_cinder_backup_pool variables, although the user may now choose
to specify the pools directly as dictionary literals inside the
openstack_pools list.
- Move mon_containerized_default_ceph_conf_with_kv config from ceph-mon
to ceph-common defaults as it's used in ceph-nfs
- Update conditional to generate ganesha config when not
mon_containerized_default_ceph_conf_with_kv
- Revert change to store radosgw keyring using ansible_hostname on
ansible server so that ceph-nfs can find it
- Update ceph-ceph-nfs0-rgw-user container to use ansible_hostname
variable
Signed-off-by: Ivan Font <ivan.font@redhat.com>
There is no need to run the actions from
roles/ceph-mon/tasks/docker/create_configs.yml
on the first monitor only since the monitor deployment happens
**serially**.
Moreover with Vagrant it's useful to allow the auto creation of the
cluster fsid, so enabling the option. If this is not desired you can
still set `fsid: 9c9c0448-0551-401d-b55b-e5b3a42bae42` for example.
Signed-off-by: Sébastien Han <seb@redhat.com>
-First install ceph into a directory with CMake
cmake -DCMAKE_INSTALL_LIBEXECDIR=/usr/lib -DWITH_SYSTEMD=ON -DCMAKE_INSTALL_PREFIX:PATH:=/usr <ceph_src_dir> && make DESTDIR=<install_dir> install/strip
-Ceph-ansible copies over the install_dir
-User can use rundep_installer.sh to install any runtime dependencies that ceph needs onto the machine from rundep
This fixes#845 for containerized deployments. We now also mount the
/etc/localtime volume in the containers in order to synchronize the host
timezone with the container timezone.
Signed-off-by: Ivan Font <ivan.font@redhat.com>
Deployment fails when the ``secure_cluster`` is false:
TASK [ceph-mon : secure the cluster]
*******************************************
fatal: [saceph-mon.vm.ceph.asheplyakov]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
fatal: [saceph-mon2.vm.ceph.asheplyakov]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
fatal: [saceph-mon3.vm.ceph.asheplyakov]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'stdout_lines'"}
A conditional include evaluates all included tasks with the (additional)
conditional applied to every task [1]. Thus all tasks from `secure_cluster.yml'
are always evaluated (with an additional 'when: secure_cluster' condition).
The `secure the cluster' task iterates over ``ceph_pools.stdout_lines``
even if ``secure_cluster`` is false: in loops ansible applies conditional
to every item (by design) [2]. However the `collect all the pools' task
is skipped if the very same condition evaluates to false, which leaves
the ``ceph_pools`` undefined, so the `secure the cluster' task fails:
Provide the default (empty) list to avoid the problem.
[1] http://docs.ansible.com/ansible/playbooks_conditionals.html#applying-when-to-roles-and-includes
[2] http://docs.ansible.com/ansible/playbooks_conditionals.html#loops-and-conditionalsCloses: #913
Signed-off-by: Alexey Sheplyakov <asheplyakov@mirantis.com>
The config template is in ceph-common, not in the individual roles, so
roles referencing it need to use playbook_dir, not role_path.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Docker makes it difficult to use images that are not on signed
registries. This is a problem for developers, who likely won't have
access to a registry with proper signed certificates.
This allows the ability to use any docker image on the machine running
vagrant/ansible. The way it works is that the image in question is
exported locally, then sent to each target box and imported there.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
In order to align all Ansible versions, we now use the full path for the
template. We rely on `role_path` variable. Now all the tasks using
the template module have a uniform syntax.
Might fix issue raised in #483
Signed-off-by: Sébastien Han <seb@redhat.com>
This is purely a refactor. Converts when 'and' conditionals into lists
rather than multiline strings. This does not work for nested
conditionals, but those can be formated with indents.
Moves one line when statements onto the same line as the when command
itself.
A small logic bug was found in ceph-osd/tasks/check_devices.yml which
which was also fixed.
Signed-off-by: Sam Yaple <sam@yaple.net>
Somehow on CentOS 7.2 with Jewel, the service enablement by the Ansible service module
does not seem to work properly.
Signed-off-by: Sébastien Han <seb@redhat.com>
As written, generating the config file for ceph-mon in Docker yielded:
ERROR: config_template is not a legal parameter in an Ansible task or
handler
This fixes that error condition.
fixing the can't open /var/lib/ceph/bootstrap-osd/ceph.keyring: can't
open /var/lib/ceph/bootstrap-osd/ceph.keyring: (13) Permission denied
Signed-off-by: Sébastien Han <seb@redhat.com>
we now have the ability to enable the `cluster` variable with a specific
value that will determine the name of the cluster.
Signed-off-by: Sébastien Han <seb@redhat.com>
this is to allow ceph-authtool to read and write to /var/ and /etc on CentOS Atomic.
Add doc on how to run containerized deployment on RHEL/CentOS Atomic
Signed-off-by: Huamin Chen <hchen@redhat.com>
in order to have a build on the galaxy we need to have a proper
dependency set for ceph-common. On the galaxy ceph-common does not
exist, only ceph.ceph-common is available.
Signed-off-by: Sébastien Han <seb@redhat.com>
This adds a script, generate_group_vars_sample.sh, that generates
group_vars/*.sample from roles/ceph-*/defaults/main.yml to avoid
discrepancies between the sets of files. It also converts the line
endings in the various main.yml from DOS to Unix, since generating the
samples was spreading the line ending plague around to more files.
Previously, creating pools was skipped if cephx was disabled; instead,
we should only skip key creation if cephx is disabled, and create
pools any time openstack_config is true.
Skip a number of ceph keyring-related tasks (or remove the keyring
portion of some tasks) when cephx is disabled. Specifically, avoid
generating the initial keyring, which only clutters up the ansible
repo if cephx is not in use.
If cephx is set to false, the "set keys permissions" task fails with:
file ({# ceph_keys.stdout_lines #}) is absent, cannot continue
This skips that step when cephx is false.
run containerized daemons in virtual machines.
to enable it simply do:
`cp site-docker.yml.sample site-docker.yml`
and set `docker: true` in `vagrant_variables.yml`
Signed-off-by: Sébastien Han <seb@redhat.com>
At the moment, all the tasks using the file module are duplicated to have differents ownerships depending on the fact `is_ceph_infernalis`.
The goal of this commit is to have a new logic for this:
- First set facts depending on the `is_ceph_infernalis` fact
- Create the files or directories using the setted facts as ownerships.
I changed the argument used for starting the mds server. (pre
infernalis)
```
service ceph start mds
```
errors, while
```
service ceph start mds.$hostname
```
correctly starts the service.
I changed the mds directory ownership from ceph:cephh to root:root
again, for pre-infernalis.
And finally, add the ceph_stable_releases checks for the upstart
activation task `for or after infernalis release'.
I have seen a number of failures on this task due to mismatch of
checksum of source file and destination. I suspect this is due to a
race condition caused by several hosts simultaneously copying the same
file to single location on the deployment server.
This change simply updates the 'copy keys to the ansible server' task
by adding 'run_once', which limits the task to being run on a single
MON host.
Closes issue #410
Since we renamed the variables and removed the old 'docker' variable we
can now collocate container daemons with standard bare metal deployment.
For instance, monitors can be containerized but osds can be deployed
traditionally.
Signed-off-by: Sébastien Han <seb@redhat.com>
Currently, the fetch directory is created in your working directory
(where ansible is run from). We prefer to not keep any state in this
directory and would prefer to have the fetch directory configurable so
we can store it outside of our code checkout.
This commit creates a new variable in each role called
`fetch_directory` (defaulting to the previous value of 'fetch/'), and
then updates each reference to 'fetch' to use the new variable instead.
Closes issue #383
Currently the OpenStack pools that get created use the default pg_num.
This commit updates the ceph-mon role to allow the pg_num for each pool
to be customised.
Cool stuff :). We don't need to specify an initial monitor key anymore.
A key will automatically be generated.
The default key can always be overriden with the `monitor_secret`
variable.
Signed-off-by: leseb <seb@redhat.com>
We want to force the user to only enable the options they need. Thus
they shouldn't have to enable one option and then disable another.
Signed-off-by: leseb <seb@redhat.com>
Now we don't need to activate the services through a variable. If the
role is activated in the inventory, actions will occur automatically.
Fixing the repo creation for red hat storage too.
Signed-off-by: leseb <seb@redhat.com>
Following the best practice, we don't create a key from the monitor but
we really on the initial keys created by the mons to bootstrap each
daemon.
Signed-off-by: Sébastien Han <seb@redhat.com>
This branch has been sitting on my local repo for a while. I guess I had
time to spend on a plane :).
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
Depending on the distro, init scripts will look for different files to
be available on the ceph data dir.
Fixing the upstart support here.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
Now the Ceph REST API can be deployed.
Default implementation deploys it on the same nodes as the monitors
which should be fine.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
Fix the usage of Upstart for Ubuntu machines instead of the init.d
script.
Note that because of the way upstart init script looks at the radosgw id
the command 'start radosgw id=' is broken, you should use 'start
radosgw-all' instead.
Keep backard compatibility with the radosgw init script as well by using
client prefixed by 'client.radosgw'.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
The ceph fs new command was introduced in Ceph 0.84. Prior to this
release, no manual steps are required to create a filesystem, and pools
named data and metadata exist by default.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
We isolated the key operations into a file and modified the fetch
function to collect all the new keys.
In the mean time fixed the pool creation since the command is not
indempotent.
Renamed the rgw key to work with the key collection.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
Default behavior is to fail if a variable is not declared however this
can be disable in your ansible.cfg so we force this variable as
mandatory.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
MDS and RGW are not deployed often (RGW more), so we disable them from
the default deployment to only get MONs and OSDs.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>