Based on #718 introduced by rsmitty.
Includes all roles and all options to support deployment of
new hosts in case they were added to inventory.
Main difference here is that master role is evaluated first
so that master components get upgraded first.
Fixes#694
- Refactor 'Check if bootstrap is needed' as ansible loop. This allows
to add new elements easily without refactoring. Add pip to the list.
- Refactor 'Install python 2.x' task to run once if any of rc
codes != 0. Actually, need_bootstrap is array of hashes, so map will
allow to get single array of rc statuses. So if status is not zero it
will be sorted and the last element will be get, converted to bool.
Closes: #961
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
"shell" step doesn't support check mode, which currently leads to failures,
when Ansible is being run in check mode (because Ansible doesn't run command,
assuming that command might have effect, and no "rc" or "output" is registered).
Setting "check_mode: no" allows to run those "shell" commands in check mode
(which is safe, because those shell commands doesn't have side effects).
always_run was deprecated in Ansible 2.2 and will be removed in 2.4
ansible logs contain "[DEPRECATION WARNING]: always_run is deprecated.
Use check_mode = no instead". This patch fix deprecation.
Since systemd kubelet.service has {{ ssl_ca_dirs }}, fact should be
gathered before writing kubelet.service.
Closes: #1007
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
- Exclude kubelet CPU/RAM (kube-reserved) from cgroup. It decreases a
chance of overcommitment
- Add a possibility to modify Kubelet node-status-update-frequency
- Add a posibility to configure node-monitor-grace-period,
node-monitor-period, pod-eviction-timeout for Kubernetes controller
manager
- Add Kubernetes Relaibility Documentation with recomendations for
various scenarios.
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
kubelet lost the ability to load kernel modules. This
puts that back by adding the lib/modules mount to kubelet.
The new variable kubelet_load_modules can be set to true
to enable this item. It is OFF by default.
Daemonsets cannot be simply upgraded through a single API call,
regardless of any kubectl documentation. The resource must be
purged and then recreated in order to make any changes.
Also make no-resolv unconditional again. Otherwise, we may end up in
a resolver loop. The resolver loop was the cause for the piling up
parallel queries.
Reduce election timeout to 5000ms (was 10000ms)
Raise heartbeat interval to 250ms (was 100ms)
Remove etcd cpu share (was 300)
Make etcd_cpu_limit and etcd_memory_limit optional.
Netchecker is rewritten in Go lang with some new args instead of
env variables. Also netchecker-server no longer requires kubectl
container. Updating playbooks accordingly.
- Remove weave CPU limits from .gitlab-ci.yml. Closes: #975
- Fix weave version in documentation
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
- Docker 1.12 and further don't need nsenter hack. This patch removes
it. Also, it bumps the minimal version to 1.12.
Closes#776
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
- Set recommended CPU settings
- Cleans up upgrade to weave 1.82. The original WeaveWorks
daemonset definition uses weave-net name.
- Limit DS creation to master
- Combined 2 tasks into one with better condition
When DNSMasq is configured to read its settings
from a folder ('-7' or '--conf-dir' option) it only
checks that the directory exists and doesn't fail if
it's empty. It could lead to a situation when DNSMasq
is running and handles requests, but not properly
configured, so some of queries can't be resolved.
For consistancy with kubernetes services we should use the same
hostname for nodes, which is 'ansible_hostname'.
Also fixing missed 'kube-node' in templates, Calico is installed
on 'k8s-cluster' roles, not only 'kube-node'.
Calico-rr is broken for deployments with separate k8s-master and
k8s-node roles. In order to fix it we should peer k8s-cluster
nodes with calico-rr, not just k8s-node. The same for peering
with routers.
Closes#925
ndots creates overhead as every pod creates 5 concurrent connections
that are forwarded to sky dns. Under some circumstances dnsmasq may
prevent forwarding traffic with "Maximum number of concurrent DNS
queries reached" in the logs.
This patch allows to configure the number of concurrent forwarded DNS
queries "dns-forward-max" as well as "cache-size" leaving the default
values as they were before.
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>