This adds support to allow the install of Ceph from the
Ubuntu Cloud Archive. The Ubuntu Cloud Archive provides newer
release of Ceph than the normal Ubuntu distro repository.
Signed-off-by: Samuel Matzek <smatzek@us.ibm.com>
Since developement versions of Ceph are after infernalis a package split
happened. So basically ceph-mon, ceph-osd, ceph-mds need to be
installed.
Signed-off-by: Sébastien Han <seb@redhat.com>
This will allow a user to conditionally install the ceph package on rpm
based systems. Installing this package is not required or wanted in
versions passed infernalis.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Introducing a new config option: `radosgw_civetweb_bind_ip` which points
to the `ansible_default_ipv4` by default. You can override this
variable. Use ansible facts to put a proper value.
Signed-off-by: Sébastien Han <seb@redhat.com>
This fixes the ceph.conf template so that it will look for an inventory
defined value for monitor_interface or for monitor_interface defined in
a group_vars file.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This fixes a bug where monitor_interface might be set in your inventory
file and not by using group_vars or --extra-vars causing the template to
use the default address of 0.0.0.0 instead of the defined
monitor_interface.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Instead of creating the RBD client socket path three different places
in three different ways, this creates it once. Ceph on OpenStack users
have the option to customize the permissions of the RBD client
directories.
Fixes#687
As written, generating the config file for ceph-mon in Docker yielded:
ERROR: config_template is not a legal parameter in an Ansible task or
handler
This fixes that error condition.
We now check if the device has already been prepared, if we detect a
ceph partition we do not prepare the device.
Also fixed some issues while running on Atomic or CoreOS.
Signed-off-by: Sébastien Han <seb@redhat.com>
fixing the can't open /var/lib/ceph/bootstrap-osd/ceph.keyring: can't
open /var/lib/ceph/bootstrap-osd/ceph.keyring: (13) Permission denied
Signed-off-by: Sébastien Han <seb@redhat.com>
we now have the ability to enable the `cluster` variable with a specific
value that will determine the name of the cluster.
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph.conf.j2 template requires a new line between mon_containerized_deployment_with_kv and fsid variables
With this commit , i have added a new line for better readablity
ceph.conf file generation task in ceph-common role was getting failed
because it ansible cant find defination of varriable mon_containerized_deployment_with_kv
This fix declare mon_containerized_deployment_with_kv under ceph-common/defaults/main.yml which fixes this issue
Signed-off-by: ksingh7 <karan.singh731987@gmail.com>
With Jewel comes a new store to store Ceph object: BlueStore. Adding an
extra scenario might seem like a useless duplication however the
ultimate goal is remove the other roles later. Thus this is easier to
add new role instead of modifying existing one. Once we drop the support
for release older than Jewel we will just remove all the previous
scenario files.
Signed-off-by: Sébastien Han <seb@redhat.com>
Some versions (?) of libvirt provide a 'libvirt' group instead of
'libvirtd'. (Observed with libvirt-daemon-1.2.17-13.el7_2.2.x86_64.)
This makes the RBD client directory owner and group configurable to
allow for this.
this is to allow ceph-authtool to read and write to /var/ and /etc on CentOS Atomic.
Add doc on how to run containerized deployment on RHEL/CentOS Atomic
Signed-off-by: Huamin Chen <hchen@redhat.com>
This would allow users who don't know what interface to provide to
give an IP address to use for the monitor instead.
Note: the includes are needed in ceph.conf.j2 because without them
jinja2 can not properly evaluate the template and will complain about a
missing 'ansible_interface' variable. The includes allow the template to
be evaluated correctly and then the correct include will be used during
render time.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Since we want to activate the OSD when it's a partition we are looking
for a return code that is equal to 0 which means the device is a
partition.
closes: #636
Signed-off-by: Sébastien Han <seb@redhat.com>
* `/var/run/ceph/rbd-clients` is not created automatically
* because it is missing, ceph-rgw complains about missing client
socket on start up; it is because the containing directory is
not there
* so we just add it to the list of directory pre-requisite
* the client-name is actually `rgw.{{ ansible_hostname }}` instead
of just `{{ ansible_hostname }}`
* it matches the directory created under `/var/lib/ceph/radosgw`
* and, it matches the client-name used to create the keyring in
`pre_requisite.yml`
Currently we don't yet support runnings OSDs w/ selinux in
enforcing mode. Thus its better to ensure that ceph-ansible
explicitly makes selinux permissive. This should help in
scenarios such as hyperconverged where OSDs are colocated
with VMs on compute nodes which needs selinux enforcing, but
OSDs don't.
Signed-off-by: Deepak C Shetty <deepakcs@redhat.com>
Where it was located before meant it might be skipped if you don't run
tasks with the package-install tag. This fixes the situation where you
want to configure an rhcs node, but do not want to do any package
installs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
When installing RHCS there is an option to install from distro provided
packages, this commit modifies the check to allow that to happen.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
changing the name of the directory causes issues with git subtree which
will create new commits. Creating a symlink for vagrant to be happy.
Signed-off-by: Sébastien Han <seb@redhat.com>
in order to have a build on the galaxy we need to have a proper
dependency set for ceph-common. On the galaxy ceph-common does not
exist, only ceph.ceph-common is available.
Signed-off-by: Sébastien Han <seb@redhat.com>
this commit introduces the ability to use fqdn for mon/mds name while
generation the ceph.conf file from the template.
Simply turn mon_use_fqdn and or mds_use_fqdn to true to use FQDN.
Signed-off-by: Sébastien Han <seb@redhat.com>
This adds a script, generate_group_vars_sample.sh, that generates
group_vars/*.sample from roles/ceph-*/defaults/main.yml to avoid
discrepancies between the sets of files. It also converts the line
endings in the various main.yml from DOS to Unix, since generating the
samples was spreading the line ending plague around to more files.
0644 should never be a directory mode. 1777 makes it so that any user
can create a ceph client, not just root. (This is helpful if, for
instance, nova-compute is running as non-root.)
Previously, creating pools was skipped if cephx was disabled; instead,
we should only skip key creation if cephx is disabled, and create
pools any time openstack_config is true.
If using another method to generate a consistent fsid, then we can
skip creation of an (unused) cluster UUID file. If cephx is disabled
as well, we can skip creation of the fetch directory entirely.
Skip a number of ceph keyring-related tasks (or remove the keyring
portion of some tasks) when cephx is disabled. Specifically, avoid
generating the initial keyring, which only clutters up the ansible
repo if cephx is not in use.
This commit allows you to set a new variable to 'true' if you want to
have ceph admin key copied over different kind of hosts such as MDS,
OSD, RGW. To enable this just set `copy_admin_key` to true.
Closes: #555
Signed-off-by: Sébastien Han <seb@redhat.com>
When autodiscovering disks, disks can be skipped if either they are
removable, or if they have partitions on them. Skipped actions have no
'rc' attribute, though, so the 'ceph prepare' conditional fails unless
we first check to ensure that the results were not skipped before
checking the return value.
The firewall checks can fail for any number of reasons -- e.g., the
ceph cluster hostnames are unresolvable from the ansible host, or the
ports are filtered by some intermediate hop, etc. Make two changes to
make those checks better:
* Set pipefail when running the checks, so if nmap itself fails the
command will be marked as 'failed'. Specifically, this fixes the
case where the hostnames cannot be resolved.
* Add a new variable, check_firewall, which can be used to disable
checks entirely. Specifically, this fixes the case where some
intermediate firewall filters the ports, so nmap returns "filtered".
If cephx is set to false, the "set keys permissions" task fails with:
file ({# ceph_keys.stdout_lines #}) is absent, cannot continue
This skips that step when cephx is false.
Installs on RHEL with ceph_origin set to distro previously would fail
because no packages would get installed, but all of the checks passed
fine. This adds support for ceph_origin: distro, simply installing the
packages using yum/dnf and assuming that the sysadmin has provided a
repository containing them.
This also supports the use case where Satellite or a similar local
mirror is in use, and the admin does not or cannot use the additional
repositories the role would otherwise add.
The purpose of this is so we can connect to the mons and gather the keys
needed to configure an OSD or additonal MON without having to reconfigure
the existing mons at the same time.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
In our use case we might only be configuring mons and not osds in the
same call, so we don't want to check variables needed for osds when they
are not needed to configure a mon.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
as stated in https://github.com/ansible/ansible/issues/4297
if we register a variable twice and even if a task is skipped the
register will not get overwritten... So we use the fact variant as
mentionned in the ansible issue.
Signed-off-by: Sébastien Han <seb@redhat.com>
While this is not widly used (AFAIK :p) the feature was broken. Thanks
to @zmc for reporting it. You can now set `osd_auto_discovery` to
true in your group_vars/osd and it will go through all the devices
available and will make them OSDs.
Signed-off-by: Sébastien Han <seb@redhat.com>
Currently deploying a MON fails with "bad symbolic permission for mode"
errors due to the file/directory modes not being interpreted as octal
values. This commit updates roles/ceph-common/tasks/main.yml to set
the file/directory modes to strings so they can be interpreted
correctly.
Closes issue #525
run containerized daemons in virtual machines.
to enable it simply do:
`cp site-docker.yml.sample site-docker.yml`
and set `docker: true` in `vagrant_variables.yml`
Signed-off-by: Sébastien Han <seb@redhat.com>
At the moment, all the tasks using the file module are duplicated to have differents ownerships depending on the fact `is_ceph_infernalis`.
The goal of this commit is to have a new logic for this:
- First set facts depending on the `is_ceph_infernalis` fact
- Create the files or directories using the setted facts as ownerships.
We have a requirement to install the packages first without
configuration. These tags should allow us to target the tasks need to do
that.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
as reported in #510 some systems don't have uuidgen installed so we
better use a more global way to generate it. It sounds like python
should be available in case uuidgen is not.
Otherwise we will have to find another way :)
closes#510
Signed-off-by: Sébastien Han <seb@redhat.com>
Currently, all the ceph package installation resources use
"state=latest", which means subsequent runs of the ceph playbooks
could result in ceph being upgraded if there are package updates
available in the selected repo.
This commit adds a new variable to ceph-common called
'upgrade_ceph_packages' which defaults to False. This variable is used
in the package installation resources for ceph packages to determine if
the resource should use "state=present" or "state=latest". If the
variable gets set to True, "state=latest" will be used.
Additionally, we update rolling_update.yml to override
upgrade_ceph_packages to true to permit package upgrades in this
context specifically.
Closes issue #506
It seems that in ansible 2.0 even if a task is skipped by it's `when`
clause not evaluating to true the variables in the play are still
rendered. Because these were not defined in defaults/main.yml ansible
was failing in installs/install_on_redhat where those variables are
being used in a `with_items` stanza.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This change allows for configurable Ceph Conf Directory permissions. This
is required for integrators of Ceph, like OpenStack Cinder, which needs to
read from /etc/ceph for operation.
Use command module instead of shell since we do not do anything fancy
here. Remove the duplicate register.
Signed-off-by: Sébastien Han <seb@redhat.com>
As raised in #466 it is important in order to avoid unnecessary
troubleshooting to check that ceph ports are allowed on the platform.
The check runs a nmap command from the host running Ansible
to all the ceph nodes with their respective ports.
Signed-off-by: Sébastien Han <seb@redhat.com>
Thanks to @cloudnull great patch at
https://github.com/ansible/ansible/pull/12555
we now have the ability to add more configuration options instead of
having to push a PR to add a new option to the template. So you can
dynamically add and remove flags.
To use it, edit `ceph_conf_overrides` in `group_vars/all` like so:
```
ceph_conf_overrides
global:
foo: 12345
bar: 6789
```
Signed-off-by: Sébastien Han <seb@redhat.com>
Because of some permission issue, likely due to the recent ceph user, if
80 is used for civetweb we get:
set_ports_option: cannot bind to 80: 13 (Permission denied)
Changing the port to 8080 until this gets solved.
Signed-off-by: Sébastien Han <seb@redhat.com>
I changed the argument used for starting the mds server. (pre
infernalis)
```
service ceph start mds
```
errors, while
```
service ceph start mds.$hostname
```
correctly starts the service.
I changed the mds directory ownership from ceph:cephh to root:root
again, for pre-infernalis.
And finally, add the ceph_stable_releases checks for the upstart
activation task `for or after infernalis release'.
I have seen a number of failures on this task due to mismatch of
checksum of source file and destination. I suspect this is due to a
race condition caused by several hosts simultaneously copying the same
file to single location on the deployment server.
This change simply updates the 'copy keys to the ansible server' task
by adding 'run_once', which limits the task to being run on a single
MON host.
Closes issue #410
Verify that partitions (for both osd disks and journal disks) are sane
before attempting to prepare the device. Fail if parted fails for
whatever reason.
Closes: #437
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we renamed the variables and removed the old 'docker' variable we
can now collocate container daemons with standard bare metal deployment.
For instance, monitors can be containerized but osds can be deployed
traditionally.
Signed-off-by: Sébastien Han <seb@redhat.com>
It should be used to disable health warnings about number of PGs
being too low if some pools have very few objects bringing down
the average number of objects per pool. This happens when running RadosGW.
The default is 10 and since the warnings only occur with some use cases,
the default here is 10 as well. Set to 20 or more to silence the warnings.
Currently, the fetch directory is created in your working directory
(where ansible is run from). We prefer to not keep any state in this
directory and would prefer to have the fetch directory configurable so
we can store it outside of our code checkout.
This commit creates a new variable in each role called
`fetch_directory` (defaulting to the previous value of 'fetch/'), and
then updates each reference to 'fetch' to use the new variable instead.
Closes issue #383
When multiple monitor hosts attempt to create the fetch directory there
is the potential for the task to fail with:
"OSError: [Errno 17] File exists: 'fetch'"
This appear to be an issue with the file module trying to create the
same directory at the same time when the tasks has been delegated to a
single host.
This commit enables run_once on the affected task which should address
the issue.
This is a rare case but it happens. Since we're just calling
`monitor_interface` and not `hostvars[host]['monitor_interface'],
an error may occur when the current host's interface does not
exist on the other hosts. (eg. eth0 exists for node0, but it does
not exist on node1 and node2)
Fix for this is to use hostvars[host]['monitor_interface']
I'm removing the ceph paritition check from `activate osd(s) when device
is a disk` because the ceph parition does not exist when parted was
registered (on a fresh install). This was causing the activate step to
be skipped.
Currently the OpenStack pools that get created use the default pg_num.
This commit updates the ceph-mon role to allow the pg_num for each pool
to be customised.
Fix back the rolling update playbook.
However every single time the playbook will run it will check for new
packages and install the latest ones. I don't think this is always the
desired behaviour. We need to find a way to conciliate both...
Signed-off-by: Sébastien Han <seb@redhat.com>
Fix the logic for the mandatory devices check so that it applies to
raw_multi_journal and journal_collocation scenarios separately.
This fails otherwise because whichever var is "first" in the or is most
likely undefined.
This will likely one day or another break something. If ceph-disk
complains about a disk just use the purge-cluster.yml playbook first as
it will wipe all the devices.
Signed-off-by: Sébastien Han <seb@redhat.com>
I'm currently getting a KeyError due to missing 'dependencies' on this
role when I attempt to install it with ansible-galaxy (ansible 1.9.2).
This commit simply defines an empty dependencies list so that
ansible-galaxy executes correctly.
Cool stuff :). We don't need to specify an initial monitor key anymore.
A key will automatically be generated.
The default key can always be overriden with the `monitor_secret`
variable.
Signed-off-by: leseb <seb@redhat.com>
We don't always have a dedicated cluster network so we can by default
re-use the public network value.
This is just laziness :).
Signed-off-by: leseb <seb@redhat.com>
While re-running the playbook we do not want to check for new packages.
We shouldn't perform upgrades, we leave this to the operators.
Signed-off-by: leseb <seb@redhat.com>
Prior to this change, the zap was executed during every play, this was
not ideal. Now we do check if there is a 'ceph' partition. If so we skip
the zap.
Signed-off-by: leseb <seb@redhat.com>
Feel so bad about this one...
Now it's fixed, the rgw section will be activated once the rgws hosts
are part of the inventory.
Signed-off-by: leseb <seb@redhat.com>
Even if the subcription command is indempotent it takes around 15/16sec
to get it done. Where with the simple yum check we lower down this to
3sec.
Signed-off-by: leseb <seb@redhat.com>