The shrink_rgw scenario has been merge just after the PR about enable
ceph dashboard by default.
So right now the shrink_rgw scenrio doesn't have nodes in the grafana
group and fails.
We just need to set dashboard_enabled to false.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 867583d5dd)
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and RGW and then runs shrink-rgw.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 236b081a3a)
# Conflicts:
# tox.ini
Add a playbook named shrink-rgw.yml to infrastructure-playbooks/ that
can remove a RGW from a node in an already deployed Ceph cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 632a44bdf2)
When creating OpenStack pools, we only check if the return code from
the pool list command isn't 0 (ie: if it doesn't exist). In that case,
the return code will be 2. That's why the next condition is rc != 0 for
the pool creation.
But in containerized deployment, the return code could be different if
there's a failure on the container engine command (like container not
running). In that case, the return code could but either 1 (docker) or
125 (podman) so we should fail at this point and not in the next tasks.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d549fffdd2)
there's no dedicated nodes for mgr, let's use monitor nodes.
The mgr0 instance spawned isn't used, so if this node is part of the
inventory for this scenario, testinfra will complain because there's no
ceph.conf on this node.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d649e00893)
The ceph nodes couldn't have the python six library installed which
could lead to error during the ceph_volume custom module execution.
ImportError: No module named six
The six library isn't useful in this module if we're sure that all
action variables passed to the build_ceph_volume_cmd function are a
list and not a string.
Resolves: #4071
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a64a61429d)
Use facility built-in in Ansible to check whether a command was executed
successfully rather looking at its return value.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 5aecdd3ba6)
This commit adds a grafana-server section in order to test dashboard
deployment with podman.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c2fd337d9)
this commit adds two checks:
- check that the `[grafana-server]` group is defined
- check that the `[grafana-server]` contains at least one node.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 02beb00916)
this tasks isn't using the right container_exec_cmd, that's delegating
to the wrong node.
Let's use the right fact to fix this command.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ec33ee7574)
According to this comment [1], this seems to be needed to detect wifi
devices.
In node exporter we can see this:
```
--collector.wifi Enable the wifi collector (default: disabled).
```
since it's enabled by default and we don't even change this in our
systemd templates for node-exporter, we can easily assume in the end
it's not needed. Therefore, let's remove this.
[1] dbf81b6b5b (diff-961545214e21efed3b84a9e178927a08L21-L23)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b9cdf341be)
There's no need to add complexity and trying to fallback on other group.
Let's deploy dashboard on all nodes present in grafana-server group.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d67230b2a2)
Move dashboard, grafana/prometheus and node-exporter plays into a
dedicated playbook in infrastructure-playbook directory.
To avoid using 'dashboard_enabled | bool' condition multiple time
in the main playbook we can just import the dashboard playbook or
not.
This patch also allows to use an unique dashboard playbook for
both baremetal and container playbooks.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 43135840b1)
Some NBSP are still present in the yaml files.
Adding a test in travis CI.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 07c6695d16)
Those 2 directories should be renamed to be more generic (docker vs.
podman).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 19950b5170)
This commit adds a when clause to avoid the setup of grafana
provisioners in a fully containerized scenario.
This is needed when the ceph-grafana-dashboards package is not
installed and this task could result in a wrong grafana
configuration that let the container crash.
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit fac1b030cb)
The dashboard rgw frontend options only need to be applied when there's
some nodes present in the rgw ansible group.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5383c2f7f3)
The Vagrant dashboard scenario creates a dedicated grafana node but
was not use in the ansible inventory.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a9a1f633a9)
The current port value for alertmanager, grafana, node-exporter and
prometheus is hardcoded in the roles so it's not possible to change the
port binding of those services.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8ab9b719fa)
Previously cephfs_pools items used to have a pgs: key but not
pgp_num: nor pg_num:
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
(cherry picked from commit edd1420217)
fbf4ed42ae introduced a bug when
container binary is podman.
podman doesn't support ps -f using regular expression, the container id
is never set in the restart script causing the handler to fail.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721536
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 618dbf271d)
ceph-volume will complain if gpt headers are found on devices.
This commit checks whether a gpt header is present on devices passed in
`devices` variable and fail early.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730541
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 487d701685)
this setting is not needed here since we explicitely set it for
container and non container context.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 87b173d022)
This commits adds a check to ensure the daemon has been removed from the
cluster.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 916dc1f52f)
Add a new functional test that deploys Ceph cluster with three nodes for
MON, OSD and RBD Mirror and, then, runs shrink-rbdmirror.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f80521f773)
# Conflicts:
# tox.ini
Add a playbook named "shrink-rbdmirror.yml" in infrastructure-playbooks/
that removes a RBD Mirror from a node in an already deployed Ceph
cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit c4824acb19)
Both ntp and chrony daemon use variable for the service name because it
could be different depending on the GNU/Linux distribution.
This has been update in 9d88d3199 for chrony but only for the start part
not for the handler.
The commit fixes this for both ntp and chrony.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0ae0193144)
The Prometheus porrt 9090 isn't open in the firewall configuration.
Also the dashboard task on the grafana node was not required because
it's already present on the mgr node.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 41b44dde85)
since everything is already in a block with the same condition, it's not
needed to leave all of them on these tasks.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ee29f7370a)
The message prints the whole content of the registered variable in the
playbook, this is not needed and makes the message pretty unclear and
unreadable.
```
"msg": "{'_ansible_parsed': True, 'changed': False, '_ansible_no_log': False, u'err': u'Error: Could not stat device /dev/sdf - No such file or directory.\\n', 'item': u'/dev/sdf', '_ansible_item_result': True, u'failed': False, '_ansible_item_label': u'/dev/sdf', u'msg': u\"Error while getting device information with parted script: '/sbin/parted -s -m /dev/sdf -- unit 'MiB' print'\", u'rc': 1, u'invocation': {u'module_args': {u'part_start': u'0%', u'part_end': u'100%', u'name': None, u'align': u'optimal', u'number': None, u'label': u'msdos', u'state': u'info', u'part_type': u'primary', u'flags': None, u'device': u'/dev/sdf', u'unit': u'MiB'}}, 'failed_when_result': False, '_ansible_ignore_errors': None, u'out': u''} is not a block special file!"
```
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1719023
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e6dc3ebd8c)
We are currently using incorrect dashboard default port. The upstream
uses 8443 instead of 8234 by default. This should get us closer to the
upstream project.
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 21758fcee8)
Some dashboard_rgw_api_* variables are using the bool filter but those
variables are strings with an empty string as default value.
So we should test the variable against an empty string instead of a
bool.
dashboard_rgw_api_host: ''
dashboard_rgw_api_port: ''
dashboard_rgw_api_scheme: ''
dashboard_rgw_api_admin_resource: ''
Resolves: #4179
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5413274412)
- Remove gateway_keyring from the configuration file because it's
not used in ceph-iscsi 3.x release.
- Use config_template instead of template module for iscsi-gateway
configuration file. Because the file is an ini file and we might want
to override more parameters than those present in ceph-ansible.
- Because we can now set the pool name in the configuration, we should
use a variable for that. This is refact with the iscsi_pool_* variables
also used to configure the pool size.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1f2a4f1910)
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MGR and then runs shrink-mgr.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 5c95c34d4b)
# Conflicts:
# tox.ini
Add a playbook, named "shrink-mgr.yml", in infrastructure-playbooks/
that removes a MGR from a node in an already deployed Ceph cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f4ea75051b)
This commit refacts the way we check the "mds_to_kill" node is well
stopped.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 7df62fde34)
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MDS and then runs shrink-mds.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 324b3b4a6c)
# Conflicts:
# tox.ini
Add a playbook, named "shrink-mds.yml", in infrastructure-playbooks/
that removes a MDS from a node in an already deployed Ceph cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 235b1fccc6)
c90f605b5 introduces the default ceph cluster name value in the rgw
socket path for the rgw restart script. But this should use the
`cluster` variable instead.
This commit also fixes this in the osd restart script.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit de7f948b75)
The ability to add nodes with the monitor role to an existing cluster
whose name differs from the default name is fixed.
Signed-off-by: ilyashestopalov <usr.tester@yandex.ru>
(cherry picked from commit 904532c5e2)
According to the OSP pattern, we need the package-install tag
to control what is installed on the host. This commit just add
the missing tag to meet the TripleO requirements.
See: /issues/4197 for details
Fixes: #4197
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 95bd002b35)
On containerized deployment we need to bind mount the ceph-iscsi
directory to avoid writing the logs in the container.
The /var/log/ceph directory isn't use by rbd-targe-api/gw services
because they have their own log directories.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 91bef94b6c)
This commit moves some old variables into ceph-defaults so we can move
the `use_new_ceph_iscsi` fact in ceph-facts role in order.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a781ce881c)
If the user is still using the older packages and does not setup
the target iqn you will just get a vague error message later on.
This adds a check during the validate task, so it is clear to the
user.
Signed-off-by: Mike Christie <mchristi@redhat.com>
(cherry picked from commit 08a6d10c32)
If the user has manually installed ceph-iscsi but is trying to setup a
iscsi object in iscsigws.yml you will just a python crash. This patch
adds a check and more user friendly error message for the case.
Signed-off-by: Mike Christie <mchristi@redhat.com>
(cherry picked from commit 2e1a4328a8)
gateway_ip_list is depreciated and is only used when using the old
ceph-iscsi-config/cli packages that are no longer being developed
(GH repos are archived). Because ceph-iscsi-config/cli is no longer
being worked on, this modifies the tests to stress the ceph-iscsi
based installs.
Signed-off-by: Mike Christie <mchristi@redhat.com>
(cherry picked from commit 1e64efc2f0)