* Remove leftover files for Coreos
Coreos was replaced by flatcar in 058438a25 but the file was copied
instead of moved.
* Remove workarounds for resolved ansible issues
* boostrap: Use first_found to include per distro
Using directly ID and VARIANT_ID with first_found allow for less manual
includes.
Distro "families" are simply handled by symlinks.
* boostrap: don't set ansible_python_interpreter
- Allows users to override the chosen python_interpreter with group_vars
easily (group_vars have lesser precedence than facts)
- Allows us to use vars at the task scope to use a virtual env
Ansible python discovery has improved, so those workarounds should not
be necessary anymore.
Special workaround for Flatcar, due to upstream ansible not willing to
support it.
* upgrade ansible version
Needed for with_first_found to work correctly:
https://github.com/ansible/ansible/issues/70772 fixed in 2.16
* Remove unused google cloud cloud_playbook
* Fix dpkg_selection on non-existing packages
Needed since ansible-core>2.16, see:
f10d11bcdc
Under the original code, leader election failed for ingress controllers
as a result of mismatch between election-id in the controller config,
and the resourceName in the relevant rule of role 'ingress-nginx'.
This appeared in the controller logs.
To fix the issue, a command-line option was added to container
execution (--election-id=...).
Now, the election-id agrees with the resourceName provided in
the role-ingress-nginx.yml file. A comment in that file was
changed to reflect the new logic.
Co-authored-by: Vasilis Samoladas <vsam@softnet.tuc.gr>
Co-authored-by: Mohamed Omar Zaian <mohamedzaian@gmail.com>
This should avoid permissions problems when the user creating the
directory and the user creating the content are different (when
containers images are saved by root for instances, because the user
can't use the container runtime).
* Refactor of kubeadm images listing
Instead of setting multiples facts, we directly create the dict we need from
kubeadm output.
* Remove useless 'default' filters in roles/download
* Only download kubeadm images where needed
The current state waiting method is bad to implement.
When changing the deployment version, which is execute with the upgrade_cluster in the previous ansible task: "Kubernetes Apps | Install and configure MetalLB", next ansible task: "Kubernetes Apps | Wait for MetalLB controller to be running" may fall with an error.
* [kubernetes] Make kubernetes 1.29.1 default
* [cri-o]: support cri-o 1.29
Use "crio status" instead of "crio-status" for cri-o >=1.29.0
* Remove GAed feature gates SecCompDefault
The SecCompDefault feature gate was removed since k8s 1.29
https://github.com/kubernetes/kubernetes/pull/121246
Handle all old dns domains:
- for nodelocaldns: in the same server block as the current dns_domain
- for coredns: uffix rewrite of each of the old dns domains to the
current one
* Enable configuring mountOptions, reclaimPolicy and volumeBindingMode for cinder-csi StorageClasses
* Check if class.mount_options is defined at all, before generating the option list
* ignore_unreachable for etcd dir cleanup
ignore_errors ignores errors occur within "file" module. However, when
the target node is offline, the playbook will still fail at this task
with node "unreachable" state. Setting "ignore_unreachable: true" allows
the playbook to bypass offline nodes and move on to proceed recovery
tasks on remaining online nodes.
* Re-arrange control plane recovery runbook steps
* Remove suggestion to manually update IP addresses
The suggestion was added in 48a182844c 4
years ago. But a new task added 2 years ago, in
ee0f1e9d58, automatically update API
server arg with updated etcd node ip addresses. This suggestion is no
longer needed.
* update cilium configmap template for new routing mode and tunnel-protocol options
Ryan Lonergan ryan.tlonergan@gmail.com
* add rbac for new cilium crd in 1.14
Ryan Lonergan ryan.tlonergan@gmail.com
* add conditional for cni-install.sh that's no longer included in cilium 1.14
Ryan Lonergan ryan.tlonergan@gmail.com
* Update roles/network_plugin/cilium/templates/cilium/ds.yml.j2
Co-authored-by: Cyclinder <qifeng.guo@daocloud.io>
---------
Co-authored-by: Cyclinder <qifeng.guo@daocloud.io>
* Disable control plane allocating podCIDR for nodes when using calico
Calico does not use the .spec.podCIDR field for its IP address
management.
Furthermore, it can false positives from the kube controller manager if
kube_network_node_prefix and calico_pool_blocksize are unaligned, which
is the case with the default shipped by kubespray.
If the subnets obtained from using kube_network_node_prefix are bigger,
this would result at some point in the control plane thinking it does
not have subnets left for a new node, while calico will work without
problems.
Explicitely set a default value of false for calico_ipam_host_local to
facilitate its use in templates.
* Don't default to kube_network_node_prefix for calico_pool_blocksize
They have different semantics: kube_network_node_prefix is intended to
be the size of the subnet for all pods on a node, while there can be
more than on calico block of the specified size (they are allocated on
demand).
Besides, this commit does not actually change anything, because the
current code is buggy: we don't ever default to
kube_network_node_prefix, since the variable is defined in the role
defaults.
We take advantage of group_by to create the list of nodes needing new
certs, instead of manually looping inside a Jinja template.
This should make the role more readable and less susceptible to
white space problems.
* Decouple role kubespray-defaults from download
Avoids doing re-importing the download role on every invocation of
kubespray-defaults (and skipping everything).
This has a measurable effect on playbook performance.
* Update docs refering to moved download defaults
* Mask systemd swap.target do disable swap
This is a more generic way to disable swap, since it pulls .swap units
in systemd distributions; fstab is only one way to generate .swap units.
* Unconditionally disable swap
We only care to disable it (the "swapon" registered variable is not used
anywhere else.
This allows to get rid of the ignore_errors, since this was added
because swapon.stdout does not exist in check_mode (see issue #6642).
* Don't explicitly disable swapOnZram
We're already masking the swap.target, which would pull the zram unit,
hence no need to handle zram-generator specifically.
* Clean up redondant defaulting
drain_{timeout,grace_period}_after_failure don't exist at this point, so
they always default.
* Remove useless facts
The drain_*_after_failure are never used
* Try both conntrack modules instead of checking kernel version
Depending on kernel distributor, the kernel version might not be a
correct indicator of the conntrack module use.
Instead, we check both (and use the first found).
* Use modproble.persistent rather than manual persistence