Commit Graph

5153 Commits (2214071a82144819a946389e07566021420cf33c)

Author SHA1 Message Date
Max Gautier f5474ec6cc
Don't try to set permissions recursively on cache+staging directory (#10900)
This should avoid permissions problems when the user creating the
directory and the user creating the content are different (when
containers images are saved by root for instances, because the user
can't use the container runtime).
2024-02-09 06:04:28 -08:00
Max Gautier 4b0a134bc9
Only download kubeadm images where needed (#10899)
* Refactor of kubeadm images listing

Instead of setting multiples facts, we directly create the dict we need from
kubeadm output.

* Remove useless 'default' filters in roles/download

* Only download kubeadm images where needed
2024-02-08 02:14:45 -08:00
flxbwr ad565ad922
Fix waiting for MetalLB controller (#10858)
The current state waiting method is bad to implement.
When changing the deployment version, which is execute with the upgrade_cluster in the previous ansible task: "Kubernetes Apps | Install and configure MetalLB", next ansible task: "Kubernetes Apps | Wait for MetalLB controller to be running" may fall with an error.
2024-02-06 02:58:59 -08:00
Max Gautier 6f419aa18e
Revert "implement download mirrors support (#8474)" (#10884)
This reverts commit c6e5314fab.

There is no user of the download mirrors support in kubespray, for a
long time.
2024-02-06 00:48:29 -08:00
anders-elastisys c698790122
add nat_outgoing_ipv6 to calico defaults and docs (#10866) 2024-02-05 23:14:22 -08:00
Gianmarco Mameli 989ba207e9
task description modified (#10875) 2024-02-05 07:59:04 -08:00
Max Gautier f2bdd4bb2f
Fix logical error when checking for boostrap-os (#10867)
Also remove some clutter along the way.
2024-02-05 07:58:55 -08:00
Kay Yan c9a44e4089
make docker 24.0 default (#10873)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2024-02-04 21:55:19 -08:00
kyrie 0dbde7536f
make containerd 1.7.12 default and upgrade runc to v1.1.11 (#10862)
Signed-off-by: KubeKyrie <shaolong.qin@daocloud.io>
2024-02-01 04:06:08 -08:00
Victor Login 8d53c1723c
bump coredns version to 1.11.1 (#10719)
* update version coredns 1.11.1

* Update roles/kubespray-defaults/defaults/main/download.yml

Co-authored-by: Mohamed Omar Zaian <mohamedzaian@gmail.com>

---------

Co-authored-by: Mohamed Omar Zaian <mohamedzaian@gmail.com>
2024-02-01 03:28:20 -08:00
Mohamed Omar Zaian dce68e6839
[feat] Update metrics server to v0.7.0 (#10856) 2024-01-31 05:13:26 -08:00
Takuya Murakami 785366c2de
[kubernetes] Support kubernetes 1.29 (#10820)
* [kubernetes] Make kubernetes 1.29.1 default

* [cri-o]: support cri-o 1.29

Use "crio status" instead of "crio-status" for cri-o >=1.29.0

* Remove GAed feature gates SecCompDefault

The SecCompDefault feature gate was removed since k8s 1.29
https://github.com/kubernetes/kubernetes/pull/121246
2024-01-31 00:57:23 -08:00
Saber 1d119f1a3c
Fixed grammar (#10853) 2024-01-29 17:46:58 -08:00
Ugur Can Ozturk 7863fde552
[apiserver-kubelet/tracing]: add distributed tracing config variables (#10795)
* [apiserver-kubelet/tracing]: add distributed tracing config flags

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [apiserver-kubelet/tracing]: add distributed tracing config flags - fix

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [apiserver-kubelet/tracing]: add distributed tracing config flags - fix

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

---------

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>
2024-01-25 10:24:35 +01:00
kimsehwan96 758d34a7d1 Fix typo mistake in roles/kubernetes/control-plane/tasks/define-first-kube-control.yml
- Fix 'Set fact joined_control_panes' into 'Set fact joined_control_planes'
2024-01-24 13:39:39 +01:00
Max Gautier c80f2cd573
Allow the DNS stack to be backward compatible with an old dns_domain (#10630)
Handle all old dns domains:
- for nodelocaldns: in the same server block as the current dns_domain
- for coredns: uffix rewrite of each of the old dns domains to the
  current one
2024-01-24 06:31:22 +01:00
Maxime Leroy ab0163a3ad
fix(kubernetes): taint nodes with kubectl (#10705)
Signed-off-by: Maxime Leroy <19607336+maxime1907@users.noreply.github.com>
2024-01-23 15:46:13 +01:00
Daniel Strufe 2eb588bed9
Update external huawei cloud controller to 0.26.6 (#10824)
* Update huaweicloud controller to 0.26.6

See <https://github.com/kubernetes-sigs/cloud-provider-huaweicloud/compare/v0.26.3...v0.26.6>

* Update huaweicloud sample to use 0.26.6
2024-01-23 09:28:00 +01:00
Louis Tu a88bad7947
Add scheduler plugins support (#10747)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2024-01-23 07:42:33 +01:00
Max Gautier 89d42a7716
Fix coredns_dual usage (#10821) 2024-01-22 18:36:16 +01:00
yun 13e1f33898
Correct the POLY1305 cipher suites by adding the suffix _SHA256 (#10641) 2024-01-22 18:00:52 +01:00
Alexander de2c4429a4
Enable configuring mountOptions, reclaimPolicy and volumeBindingMode … (#10450)
* Enable configuring mountOptions, reclaimPolicy and volumeBindingMode for cinder-csi StorageClasses

* Check if class.mount_options is defined at all, before generating the option list
2024-01-22 18:00:34 +01:00
Max Gautier 22bb0976d5
Adjust kubelet_event_record_qps to K8S default (#10826)
Also remove redundant check in the kubelet config template (we define a
default, so the setting will always be "true")
2024-01-22 17:49:14 +01:00
my-git9 5a405336ae
Support following k8s version selection pause image (#10756)
Signed-off-by: xin.li <xin.li@daocloud.io>
2024-01-22 17:28:09 +01:00
Yuhao Zhang 0e971a37aa
Offline control plane recover (#10660)
* ignore_unreachable for etcd dir cleanup

ignore_errors ignores errors occur within "file" module. However, when
the target node is offline, the playbook will still fail at this task
with node "unreachable" state. Setting "ignore_unreachable: true" allows
the playbook to bypass offline nodes and move on to proceed recovery
tasks on remaining online nodes.

* Re-arrange control plane recovery runbook steps

* Remove suggestion to manually update IP addresses

The suggestion was added in 48a182844c 4
years ago. But a new task added 2 years ago, in
ee0f1e9d58, automatically update API
server arg with updated etcd node ip addresses. This suggestion is no
longer needed.
2024-01-22 17:22:27 +01:00
Noam 3e7b568d3e
crictl allow setting grace period for stop containers upon reset (#10651)
* crictl allow setting different grace period for stop containers and pods

* correct grace period location
2024-01-22 17:11:08 +01:00
kyrie a45a40a398
update kube-version-min-required to v1.27 (#10817) 2024-01-22 14:26:12 +01:00
Takuya Murakami 4cb1f529d1
[kubernetes] Add hashes for kubernetes 1.29.0 and 1.29.1 (#10778)
* Add hashes of crictl and crio
* Add versions of etcd, crictl, crio and csi-snapshotter
2024-01-22 09:39:15 +01:00
Mohamed Omar Zaian 64447e745e
[kubernetes] Make kubernetes v1.28.6 default (#10810) 2024-01-19 09:07:27 +01:00
Max Gautier b7a83531e7
etcd: update to v3.5.10 (#10798) 2024-01-17 09:50:48 +01:00
Kay Yan a0a2f40295
add containerd config override_path (#10776) 2024-01-16 14:15:53 +01:00
lobiyed.karim 7b7c9f509e
Add PodDisruptionBudget for CoreDNS deployment. Allows users to control disruption behavior and set maximum unavailable pods (#10557) 2024-01-16 10:04:47 +01:00
Louis Tu 3f78bf9298
Fix incorrect ciliumcli binary (#10575)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2024-01-16 05:23:00 +01:00
Gaëtan Trellu 50fbfa2a9a
Fix PyYAML package name on SLES and openSUSE (#10794) 2024-01-15 04:21:08 +01:00
Gaëtan Trellu 747d8bb4c2
Fix ntp installation on SLES and openSUSE (#10786) 2024-01-12 04:03:35 +01:00
Serge Hartmann bb67d9524d
Fix crio_version version comparison (#10780)
Signed-off-by: serge Hartmann <serge.hartmann@gmail.com>
2024-01-11 11:49:35 +01:00
Kay Yan 8c09c3fda2
fix image pull in insecure-registry (#10775) 2024-01-09 10:20:16 +01:00
Louis Tu a656b7ed9a
Add kube_vip_lb_fwdmethod option for kube-vip (#10762)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2024-01-09 08:22:13 +01:00
Kay Yan 2e8b72e278
fix disable swap in centos (#10751) 2024-01-08 17:38:14 +01:00
Louis Tu ddf5c6ee12
Update coredns rolling update strategy (#10748)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2024-01-08 17:38:05 +01:00
Ryan Lonergan eda7ea5695
feat: add support for Cilium 1.14 (#10684)
* update cilium configmap template for new routing mode and tunnel-protocol options
Ryan Lonergan ryan.tlonergan@gmail.com

* add rbac for new cilium crd in 1.14
Ryan Lonergan ryan.tlonergan@gmail.com

* add conditional for cni-install.sh that's no longer included in cilium 1.14
Ryan Lonergan ryan.tlonergan@gmail.com

* Update roles/network_plugin/cilium/templates/cilium/ds.yml.j2

Co-authored-by: Cyclinder <qifeng.guo@daocloud.io>

---------

Co-authored-by: Cyclinder <qifeng.guo@daocloud.io>
2024-01-08 02:43:02 +01:00
刘旭 08c0b34270
[cert-manager] upgrade to v1.13.2 (#10616) 2024-01-05 04:45:10 +01:00
Romain 1a86b4cb6d
Fix download retry when get_url has no status_code. (#10613)
* Fix download retry when get_url has no status_code.

* Fix until clause in download role.
2024-01-04 04:00:47 +01:00
Mohamed Omar Zaian aea150e5dc
[kubernetes] Make kubernetes v1.28.5 default (#10739)
* Add hashes for kubernetes 1.29.0, 1.28.5, 1.27.9, 1.26.12
2023-12-21 17:30:45 +01:00
Andrei Costescu c3b674526d
Fix modprobe module on Flatcar (#10678)
* Fix modprobe module on Flatcar

* Add todo about upstream issue report
2023-12-21 16:16:34 +01:00
Kay Yan 565eab901b
remove containerd registries (#10738) 2023-12-21 10:01:12 +01:00
Max Gautier c3315ac742
systemd-resolved: use a drop-in for kubespray dns (#10732)
This avoid needlessly overriding things and make cleanup easier.
Also simplifies the template a bit.
2023-12-21 09:52:14 +01:00
Olivier Levitt 29ea790c30
Fix calico-node in etcd mode (#10438)
* Calico : add ETCD endpoints to install-cni container

* Calico : remove nodename from configmap in etcd mode
2023-12-19 04:09:06 +01:00
Ugur Can Ozturk ae780e6a9b
[etcd]: add etcd distributed tracing flags (#10666)
* [etcd]: add etcd distributed tracing flags

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [etcd]: add etcd distributed tracing flags - fix

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [etcd]: add etcd distributed tracing flags - fix

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

---------

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>
2023-12-19 04:00:10 +01:00
Max Gautier 471326f458
Remove PodSecurityPolicy support and references (#10723)
This is removed from kubernetes since 1.25, time to cut some dead code.
2023-12-18 14:13:43 +01:00
Michael Kebe d435edefc4
Removed DEPRECATED --logtostderr from metrics-server (#10709)
The --logtostderr is deprecated.

https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components
2023-12-14 22:49:28 +01:00
刘旭 eb73f1d27d
support disable dns autoscaler when use CoreDNS (#10608) 2023-12-14 10:03:34 +01:00
Mohamed Omar Zaian ccb742c7ab
[containerd] add hashes for versions 1.6.25-26 and 1.7.9-11 make v1.7.11 default (#10671) 2023-12-12 17:53:32 +01:00
jandres - moscardo cb848fa7cb
New PR default node selector (#10607) 2023-12-12 14:51:26 +01:00
Max Gautier 8abf49ae13
Disable podCIDR allocation from control-plane when using calico (#10639)
* Disable control plane allocating podCIDR for nodes when using calico

Calico does not use the .spec.podCIDR field for its IP address
management.
Furthermore, it can false positives from the kube controller manager if
kube_network_node_prefix and calico_pool_blocksize are unaligned, which
is the case with the default shipped by kubespray.

If the subnets obtained from using kube_network_node_prefix are bigger,
this would result at some point in the control plane thinking it does
not have subnets left for a new node, while calico will work without
problems.

Explicitely set a default value of false for calico_ipam_host_local to
facilitate its use in templates.

* Don't default to kube_network_node_prefix for calico_pool_blocksize

They have different semantics: kube_network_node_prefix is intended to
be the size of the subnet for all pods on a node, while there can be
more than on calico block of the specified size (they are allocated on
demand).

Besides, this commit does not actually change anything, because the
current code is buggy: we don't ever default to
kube_network_node_prefix, since the variable is defined in the role
defaults.
2023-12-12 14:38:36 +01:00
Max Gautier 81a3f81aa1
Revert "Update etcd-servers for apiserver (#8253)" (#10652)
This reverts commit ee0f1e9d58.

Avoid restarting all api servers at once by changing their config.
2023-12-12 11:22:38 +01:00
Max Gautier 0fb404c775
etcd: use dynamic group for certs generation check (#10610)
We take advantage of group_by to create the list of nodes needing new
certs, instead of manually looping inside a Jinja template.

This should make the role more readable and less susceptible to
white space problems.
2023-12-12 11:22:29 +01:00
Max Gautier 51069223f5
Decouple kubespray-defaults from download (#10626)
* Decouple role kubespray-defaults from download

Avoids doing re-importing the download role on every invocation of
kubespray-defaults (and skipping everything).

This has a measurable effect on playbook performance.

* Update docs refering to moved download defaults
2023-12-11 16:56:17 +01:00
David Leadbeater 17b51240c9
Remove legacy crio packaging cleanup (#10702)
This has now been removed and results in a 404 when trying to remove the
old key, even if it's not present.
2023-12-11 15:41:13 +01:00
piwinkler eb628efbc4
Update 0040-verify-settings.yml (#10699)
remove embedded template
2023-12-11 10:56:13 +01:00
Max Gautier 2c3ea84e6f
Use systemd for disabling swap when it's used (#10587)
* Mask systemd swap.target do disable swap

This is a more generic way to disable swap, since it pulls .swap units
in systemd distributions; fstab is only one way to generate .swap units.

* Unconditionally disable swap

We only care to disable it (the "swapon" registered variable is not used
anywhere else.
This allows to get rid of the ignore_errors, since this was added
because swapon.stdout does not exist in check_mode (see issue #6642).

* Don't explicitly disable swapOnZram

We're already masking the swap.target, which would pull the zram unit,
hence no need to handle zram-generator specifically.
2023-12-07 13:26:21 +01:00
Max Gautier 85f15900a4
Remove unneeded workaround for removing kubeadm DNS (#10695)
Kubeadm dns phase is correctly skipped.
This was a workaround for kubernetes/kubeadm#1557, which was actually
not a bug ; the correct fix was #4867
2023-12-07 12:54:15 +01:00
Mohamed Omar Zaian a9321aaf86
[calico] Add version 3.26.4 and make it default (#10669) 2023-12-06 03:05:33 +01:00
Kay Yan fe02d21d23
update nerdctl to v1.7.1 (#10685)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2023-12-05 19:00:41 +01:00
Kay Yan 5160e7e20b
using ctr pull instead of nerdctl (#10687) 2023-12-05 16:00:55 +01:00
Alexander c440106eff
add dnsPolicy: ClusterFirstWithHostNet to DaemonSets with hostNetwork: true value to avoid DNSConfigFormat events (#10618) 2023-12-05 02:52:17 +01:00
Mohamed Omar Zaian 75fecf1542
Update nodelocaldns version (#10621) 2023-11-29 12:19:36 +01:00
Max Gautier 0d7bdc6cca
pre-upgrade cleanup (#10656)
* Clean up redondant defaulting

drain_{timeout,grace_period}_after_failure don't exist at this point, so
they always default.

* Remove useless facts

The drain_*_after_failure are never used
2023-11-28 22:49:56 +01:00
chansuke c87d70b04b [cert-manager] Upgrade to v1.12.6 2023-11-28 22:42:50 +01:00
Max Gautier 612cfdceb1
Check conntrack module presence instead of kernel version (#10662)
* Try both conntrack modules instead of checking kernel version

Depending on kernel distributor, the kernel version might not be a
correct indicator of the conntrack module use.
Instead, we check both (and use the first found).

* Use modproble.persistent rather than manual persistence
2023-11-28 18:31:02 +01:00
ERIK 70bb19dd23
fix copy etcdctl retries (#10634)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2023-11-28 10:52:03 +01:00
Max Gautier 94d3f65f09
ipaddr (deprecated alias) => ansible.utils.ipaddr (#10650) 2023-11-28 09:56:55 +01:00
Valerii Kretinin cf3ac625da
revert env section deletion (#10655) 2023-11-28 09:47:46 +01:00
Max Gautier c2e3071a33
kubespray-defaults: Check for boostrap-os FQDN (#10590)
When installed as an ansible collection, roles in
ansible_play_role_names will be designated by their FQDN (i.e
'kubernetes-sigs.kubespray.<role-name>).

It means we need to check for both when checking for roles in the play.
2023-11-28 09:23:46 +01:00
Max Gautier 21e8b96e22
Drop the drain check for kubectl > v1.10.0 (#10657)
Older versions are unsupported for a long time.
2023-11-28 03:14:51 +01:00
Samuel Liu 3acacc6150
add kube_apiserver_etcd_compaction_interval (#10644) 2023-11-27 05:37:33 +01:00
Mohamed Omar Zaian b321ca3e64
[kubernetes] Add hashes for kubernetes 1.28.4, 1.27.8, 1.26.11 (#10624) 2023-11-24 03:22:55 +01:00
AbhishekKr 6b1188e3dc
[fix] modprobe_nf_conntrack for new Linux Kernel, when using ipvs (#10625)
Signed-off-by: AbhishekKr <abhikumar163@gmail.com>
2023-11-20 09:48:06 +01:00
Max Gautier 0d4f57aa22
Validate systemd unit files (#10597)
* Validate systemd unit files

This ensure that we fail early if we have a bad systemd unit file
(syntax error, using a version not available in the local version, etc)

* Hack to check systemd version for service files validation

factory-reset.target was introduced in system 250, same version as the
aliasing feature we need for verifying systemd services with ansible.
So we only actually executes the validation if that target is present.

This is an horrible hack which should be reverted as soon as we drop
support for distributions with systemd<250.
2023-11-17 20:01:23 +01:00
刘旭 bc5b38a771
support CoreDNS use host network and config dns port (#10617) 2023-11-17 14:41:53 +01:00
Lukáš Kubín f46910eac3
Add helm support for custom_cni deployment (#10529)
* Add helm support for custom_cni deployment

* Linting correction

* Ansible linting correction

* Add test packet with values

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Add custom_cni configuration file with comments

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Default values cleanup

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Add details to custom_cni configuration file

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Set correct yaml type of helm values

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Set CNI filesystem ownership to root

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Update cilium example parameter name

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

---------

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>
2023-11-16 00:32:21 +01:00
Khanh Ngo Van Kim adb8ff14b9
fix: invalid version check in containerd jinja-template config (#10620) 2023-11-15 16:06:42 +01:00
Arthur Outhenin-Chalandre 7ba85710ad
Update to ansible 2.15 (#10481)
* ansible: upgrade to version >= 2.15.5

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* tests: update requirements

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* contrib/openstack: fix wrong gitignore pattern

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* tests: add missing tzdata requirement

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* tests: remove some molecules tests

Those doesn't work in Ansible 2.15. Ansible can't load builtin now
apparently and these tests are not worth it.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2023-11-15 09:39:09 +01:00
Noam cbd3a83a06
add option to enable cdi for containerd (#10603) 2023-11-14 17:20:19 +01:00
Eeo Jun eb015c0362
configure cluster-name for hubble relay (#10614) 2023-11-13 19:22:40 +01:00
Patrick O'Brien 17681a7e31
fallback_ips: ignore unreachable hosts (#10601)
Sets ignore_unreachable: true to `Gather ansible_default_ipv4 from all hosts`
task from fallback_ips.yml

Without this scale.yml will fail if a single node in the cluster is down, which
for large clusters happens often.
2023-11-10 21:07:18 +01:00
Mohamed Omar Zaian cca7615456
Update checksums (#10606) 2023-11-09 16:43:04 +01:00
Samuel Mutel a4b15690b8
fix: Same nameservers for resolv.conf and dhcp (#10548) 2023-11-08 16:57:45 +01:00
Louis Tu 32743868c7
Add cri-o criu support (#10479)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2023-11-08 16:57:32 +01:00
yun 7d221be408
Remove crio package configuration (#10584)
* Remove crio package configuration

* Remove crio package config directly without loop
2023-11-08 16:29:42 +01:00
Denis 2d75077d4a
fix: (#10197)
Remove cri-o apt repo job has state present but need absent
Uninstall CRI-O packages job has undefined variable crio_packages
replaced by list of packages
2023-11-08 16:22:39 +01:00
borgiacis 802da0bcb0
Create variables for ipvs kernel modules (#10580)
* Create variables for ipvs kernel modules

* Corrected kubernetes role node task missing name

* Added changes as suggested during review by VannTen
2023-11-08 12:44:02 +01:00
Seal1998 6305dd39e9
Metallb --lb-class cmd arg to support multiple LoadBalancer implementations (#10550)
* metallb --lb-class cmd arg to support multiple load balancer implementations

* removed loadbalancer_class from metallb_config; metallb_loadbalancer_class in role defaults
2023-11-08 12:43:48 +01:00
Max Gautier b3f6d05131
Move control plane certs renewal "spread out" into the systemd timer (#10596)
* Use RandomizedDelaySec to spread out control certificates renewal plane

If the number of control plane node is superior to 6, using (index * 10
minutes) will fail (03:60:00 is not a valid timestamp).

Compared to just fixing the jinja expression (to use a modulo for
example), this should avoid having two control planes certificates
update node being triggered at the same time.

* Make k8s-certs-renew.timer Persistent

If the control plane happens to be offline during the scheduled
certificates renewal (node failure or anything like that), we still want
the renewal to happen.
2023-11-08 12:35:20 +01:00
Max Gautier 8ebeb88e57
Refactor "multi" handlers to use listen (#10542)
* containerd: refactor handlers to use 'listen'

* cri-dockerd: refactor handlers to use 'listen'

* cri-o: refactor handlers to use 'listen'

* docker: refactor handlers to use 'listen'

* etcd: refactor handlers to use 'listen'

* control-plane: refactor handlers to use 'listen'

* kubeadm: refactor handlers to use 'listen'

* node: refactor handlers to use 'listen'

* preinstall: refactor handlers to use 'listen'

* calico: refactor handlers to use 'listen'

* kube-router: refactor handlers to use 'listen'

* macvlan: refactor handlers to use 'listen'
2023-11-08 12:28:30 +01:00
Mohamed Omar Zaian f3332af3f2
[containerd] add hashes for version 1.7.8 (#10589) 2023-11-03 16:45:15 +01:00
Boris Barnier 870065517f
[kube-router] set version to 2.0.0 (#10503)
Signed-off-by: Boris Barnier <bozzo@users.noreply.github.com>
2023-11-02 11:19:57 +01:00
Mohamed Omar Zaian 267a8c6025
[ingress-nginx] upgrade to 1.9.4 (#10583) 2023-11-02 04:02:24 +01:00
Hedayat Vatankhah (هدایت) edff3f8afd
Set remove_default_searchdomains to false by default (#10554)
It was not 'false', which made some tasks (e.g. using systemd-resolved
template) to effectively remove default search domains; caused DNS loop
after rebooting the node/restarting cluster, so localdns service didn't
run correctly.
2023-11-01 03:33:57 +01:00
yun cdc8d17d0b
Check nameserver when dns is enable (#10561) 2023-11-01 03:07:06 +01:00
Max Gautier 8f0e553e11
etcd/backup: native ansible modules instead of shell (#10540)
This make native ansible features (dry-run, changed state) easier to
have, and should have a minimal performance impact, since it only runs
on the etcd members.
2023-10-30 20:05:28 +01:00
chansuke 5f9a7b9d49
[cert-manager] Upgrade to v1.12.5 (#10500) 2023-10-30 18:51:35 +01:00
qlijin af7bc17c9a
Spicify the runc path when we use the containerd container engine and change the bin_dir path. (#10154)
* Specify the runc path when we use the containerd container engine
and change the bin_dir path.

Signed-off-by: Jin Li <qlijin@gmail.com>

* Update roles/container-engine/containerd/templates/config.toml.j2

Co-authored-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

---------

Signed-off-by: Jin Li <qlijin@gmail.com>
Co-authored-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2023-10-30 17:54:31 +01:00
yun becb6267fb
Set default remove_default_searchdomains to false (#10533) 2023-10-30 17:37:52 +01:00
Max Gautier 34754ccb38
Use calico_pool_blocksize from cluster when existing (#10516)
The blockSize attribute from Calico IPPool resources cannot be changed
once set [1]. Consequently, we use the one currently defined when
configuring the existing IPPool, avoiding upgrade errors by trying to
change it.

In particular, this can be useful when calico_pool_blocksize default
changes in kubespray, which would otherwise force users to add an
explicit setting to their inventories.

[1]: https://docs.tigera.io/calico/latest/reference/resources/ippool#spec
2023-10-30 17:37:43 +01:00
Mohamed Omar Zaian 7a0030b145
Change default cri-o versions for Kubernetes 1.26 (#10565) 2023-10-30 17:23:32 +01:00
Louis Tu fa9e41047e
Add kubectl alias support (#10552)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2023-10-30 17:23:19 +01:00
Mohamed Omar Zaian f5f1f9478c
[argocd] update argocd to v2.8.4 (#10568) 2023-10-30 12:54:26 +01:00
Mohamed Omar Zaian 6a70f02662
[helm] upgrade to 3.13.1 (#10567) 2023-10-30 04:32:52 +01:00
Mohamed Omar Zaian 3bc0dfb354
[etcd] add 3.5.10 hashes (#10566) 2023-10-30 04:32:45 +01:00
Mohamed Omar Zaian 418df29ff0
Add crictl 1.26.1 for Kubernetes v1.26 (#10564) 2023-10-30 04:28:44 +01:00
Mohamed Omar Zaian 1f47d5b74f
[kubernetes] Add hashes for kubernetes 1.28.3, 1.27.7, 1.26.10 (#10541) 2023-10-20 05:43:34 +02:00
Marc Brugger 3f1409d87d
Correct cilium metrics port mapping (#10519)
Signed-off-by: Marc Brugger <m.brugger@bison-group.com>
2023-10-19 05:09:13 +02:00
Max Gautier 0b2e5b2f82
Retries ssh connection for Gather node certs (#10515)
This allows this task to work with a forks count > 10 and the default
configuration of sshd, which is to limit sessions to 10. (see
MaxSessions in sshd_config).

Since this is a delegate_to task, it connects to the same host (first
etcd) for each node in the cluster, thus easily going above 10.

Raising the ssh connection attempts allow for more robustness, without
decreasing the forks count or serialising the tasks, which could slow
the task (or the playbook as a whole, if decreasing forks).
2023-10-19 05:04:29 +02:00
Unai Arríen 228efcba0e
Migrate node-role.kubernetes.io/master to node-role.kubernetes.io/con… (#10464)
* Migrate node-role.kubernetes.io/master to node-role.kubernetes.io/control-plane

* Migrate node-role.kubernetes.io/master to node-role.kubernetes.io/control-plane

* Migrate node-role.kubernetes.io/master to node-role.kubernetes.io/control-plane
2023-10-17 21:39:40 +02:00
Max Gautier 401ea552c2
Cleanup a deprecation warning (ipaddr filter) (#10518) 2023-10-17 09:45:11 +02:00
Ugur Can Ozturk 8cce6df80a
[external-lb]: kubelet.conf server address and kube-proxy api-server address fix (#10490)
* [external-lb-kubeconfig]: fix server address in worker kubelet.conf

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [external-lb-kubeconfig]: fix server address in kube-proxy

Signed-off-by: Furkan Pehlivan <furkanpehlivan34@gmail.com>

---------

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>
Signed-off-by: Furkan Pehlivan <furkanpehlivan34@gmail.com>
Co-authored-by: Furkan Pehlivan <furkanpehlivan34@gmail.com>
2023-10-17 09:45:00 +02:00
Mohamed Omar Zaian 3e522a9f59
[calico] Make version 3.26.3 default (#10526) 2023-10-17 08:22:39 +02:00
Mohamed Omar Zaian ae45de3584
[containerd] add hashes for version 1.7.7 (#10525) 2023-10-17 07:32:10 +02:00
Mohamed Omar Zaian 513b6dd6ad
[ingress-nginx] upgrade to 1.9.3 (#10527) 2023-10-17 05:42:13 +02:00
emiran-orange e65050d3f4
Ability to define GPG key path for Docker APT (#10513) 2023-10-13 04:06:04 +02:00
Mohamed Omar Zaian 4a8a47d438
[ingress-nginx] upgrade to 1.9.0 (#10493) 2023-10-11 23:49:16 +02:00
ERIK b2d8ec68a4
Fix restart network task cannot be skipped (#10512)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2023-10-11 05:50:37 +02:00
Elias-elastisys d3101d65aa
Added templating to coredns error to allow for consolidation (#10501) 2023-10-10 14:32:41 +02:00
Ross Kusler acb86c23f9
[kube-router] Add option to disable bgp-graceful-restart (10488) (#10489) 2023-10-07 04:52:45 +02:00
Mohamed Omar Zaian 4846f33136
[etcd] make etcd 3.5.9 default (#10482) 2023-09-29 00:26:42 -07:00
Mohamed Omar Zaian de8d1f1a3b
[kubernetes] Kube-scheduler: remove/update deprecated component config v1beta3 (#10484) 2023-09-29 00:22:45 -07:00
Heather Lapointe ddd7aa844c
[kata-containers] Update configuration to support kata 3.1.3. (#10466)
Namely, the libexec paths have changed since 2.5.
This also makes kata_containers_virtio_fs_cache configurable.
2023-09-28 00:33:33 -07:00
Feruzjon Muyassarov 1fd31ccc28
Refactor NRI activation for containerd and CRI-O (#10470)
Refactor NRI (Node Resource Interface) activation in CRI-O and
containerd. Introduce a shared variable, nri_enabled, to streamline
the process. Currently, enabling NRI requires a separate update of
defaults for each container runtime independently, without any
verification of NRI support for the specific version of containerd
or CRI-O in use.

With this commit, the previous approach is replaced. Now, a single
variable, nri_enabled, handles this functionality. Also, this commit
separates the responsibility of verifying NRI supported versions of
containerd and CRI-O from cluster administrators, and leaves it to
Ansible.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2023-09-26 08:05:25 -07:00
Majid Garoosi 6f520eacf7
Bump nerdctl version 1.5.0 (#10475) 2023-09-26 05:05:36 -07:00
qlijin a0eb7c0d5c
[cri-o] update to v1.28.1 (#10480) 2023-09-26 04:36:57 -07:00
Boris Barnier 94322ef72e
[kube-router] set default version to 1.6.0 (#10478)
Signed-off-by: Boris Barnier <bozzo@users.noreply.github.com>
2023-09-25 02:32:57 -07:00
蔣 航 c6ab6406c2
Add Retry for Applying PriorityClass (#10469)
Signed-off-by: hang.jiang <hang.jiang@daocloud.io>
2023-09-24 19:54:56 -07:00
Romain 2c132dccba
Fix etcdctl.sh TLS file path when not using kubeadm. (#10467) 2023-09-24 19:50:57 -07:00
Christian 7919a47165
[metallb] add config option for IPAddressPool avoidBuggyIPs (#10458)
* Add avoid_buggy_ips as optional
* Revert avoid_buggy_ips default back to false
* Change auto_assign to optional, default true
2023-09-21 20:29:49 -07:00
Jason Witkowski 7b2586943b
Fix: kube-apiserver tag will overwrite secrets-at-rest token if used independently (#10460)
Signed-off-by: Jason Witkowski <jwitko1@gmail.com>
2023-09-21 06:55:29 -07:00
Feruzjon Muyassarov f964b3438d
Add configuration option for NRI in crio & containerd (#10454)
* [containerd] Add Configuration option for Node Resource Interface

Node Resource Interface (NRI) is a common is a common framework for
plugging domain or vendor-specific custom logic into container
runtime like containerd. With this commit, we introduce the
containerd_disable_nri configuration flag, providing cluster
administrators the flexibility to opt in or out (defaulted to 'out')
of this feature in containerd. In line with containerd's default
configuration, NRI is disabled by default in this containerd role
defaults.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>

* [cri-o] Add configuration option for Node Resource Interface

Node Resource Interface (NRI) is a common is a common framework for
plugging domain or vendor-specific custom logic into container
runtimes like containerd/crio. With this commit, we introduce the
crio_enable_nri configuration flag, providing cluster
administrators the flexibility to opt in or out (defaulted to 'out')
of this feature in cri-o runtime. In line with crio's default
configuration, NRI is disabled by default in this cri-o role
defaults.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>

---------

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
2023-09-21 00:30:19 -07:00
Mathieu Parent 09f3caedaa
[download] Don't fail on 304 Not Modified (#10452)
i.e when file was not modified since last download
2023-09-21 00:20:20 -07:00
Mohamed Omar Zaian fe4b1f6dee
[ingress-nginx] upgrade to 1.8.2 (#10455) 2023-09-20 19:17:56 -07:00
Mohamed Omar Zaian bc5e33791f
[vsphere_csi] Update to 3.1.0 (#10451) 2023-09-20 04:56:00 -07:00
Romain a81c6d5448
Add a way to configure reseted networking service name. (#10428) 2023-09-20 02:28:01 -07:00
Mohamed Omar Zaian 6b34e3ef08
[calico] Make version 3.26.1 default (#10416)
* [calico] Make version 3.26.1 default

* [calico] Separate calico-node and calico-cni-plugin service accounts

See: https://github.com/projectcalico/calico/pull/7106
2023-09-19 02:49:06 -07:00
Mohamed Omar Zaian dbdc4d4123
[kubernetes] Add hashes for kubernetes 1.28.2, 1.27.6, 1.26.9 (#10435) 2023-09-18 05:40:32 -07:00
Mohamed Omar Zaian c24c279df7
[containerd] add hashes for version 1.7.6, 1.6.24 (#10439) 2023-09-18 05:28:31 -07:00
Qasim Mehmood 0f243d751f
Use correct env var name for kube-vip per service leader election (#10433) 2023-09-14 02:22:17 -07:00
Toon Albers 31f6d38cd2
[cilium] fix: invalid hubble yaml if cilium_hubble_tls_generate is enabled (#10430) 2023-09-13 04:16:15 -07:00
Takuya Murakami 748b0b294d
[kubernetes] support 1.28.0 / 1.28.1 (#10376) (#10390)
* [kubernetes] support 1.28.0/1.28.1 (#10376)

* [kubernetes] Make 1.28.1 default (#10376)
2023-09-11 19:42:12 -07:00
NierYYDS af8210dfea
fix: add kubelet tag in task of fetch facts to avoid kubelet config inconsistencies (#10423)
when people run playbook with option `--tags=kubelet`, the kubelet config may changed, because some variables used in task populating `kubelet-config.yml`  could be different with running task(`Fetch facts`)
2023-09-11 05:12:11 -07:00
Florian Ruynat 493969588e
Use cluster_name variable instead of hardcoded value in cinder-csi controller plugin (#10422) 2023-09-08 07:18:16 -07:00
Kay Yan 5ffdb7355a
cleanup-for-2.23.0 (#10420) 2023-09-08 04:40:13 -07:00