Commit Graph

7412 Commits (cb848fa7cb4c4e4599e344d746e263a1016a1b42)
 

Author SHA1 Message Date
jandres - moscardo cb848fa7cb
New PR default node selector (#10607) 2023-12-12 14:51:26 +01:00
Max Gautier 8abf49ae13
Disable podCIDR allocation from control-plane when using calico (#10639)
* Disable control plane allocating podCIDR for nodes when using calico

Calico does not use the .spec.podCIDR field for its IP address
management.
Furthermore, it can false positives from the kube controller manager if
kube_network_node_prefix and calico_pool_blocksize are unaligned, which
is the case with the default shipped by kubespray.

If the subnets obtained from using kube_network_node_prefix are bigger,
this would result at some point in the control plane thinking it does
not have subnets left for a new node, while calico will work without
problems.

Explicitely set a default value of false for calico_ipam_host_local to
facilitate its use in templates.

* Don't default to kube_network_node_prefix for calico_pool_blocksize

They have different semantics: kube_network_node_prefix is intended to
be the size of the subnet for all pods on a node, while there can be
more than on calico block of the specified size (they are allocated on
demand).

Besides, this commit does not actually change anything, because the
current code is buggy: we don't ever default to
kube_network_node_prefix, since the variable is defined in the role
defaults.
2023-12-12 14:38:36 +01:00
Louis Tu 8f2390a120
Fix the path of download.yml (#10711)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2023-12-12 13:47:27 +01:00
Max Gautier 81a3f81aa1
Revert "Update etcd-servers for apiserver (#8253)" (#10652)
This reverts commit ee0f1e9d58.

Avoid restarting all api servers at once by changing their config.
2023-12-12 11:22:38 +01:00
Max Gautier 0fb404c775
etcd: use dynamic group for certs generation check (#10610)
We take advantage of group_by to create the list of nodes needing new
certs, instead of manually looping inside a Jinja template.

This should make the role more readable and less susceptible to
white space problems.
2023-12-12 11:22:29 +01:00
Max Gautier 51069223f5
Decouple kubespray-defaults from download (#10626)
* Decouple role kubespray-defaults from download

Avoids doing re-importing the download role on every invocation of
kubespray-defaults (and skipping everything).

This has a measurable effect on playbook performance.

* Update docs refering to moved download defaults
2023-12-11 16:56:17 +01:00
David Leadbeater 17b51240c9
Remove legacy crio packaging cleanup (#10702)
This has now been removed and results in a 404 when trying to remove the
old key, even if it's not present.
2023-12-11 15:41:13 +01:00
Max Gautier 306103ed05
Add VannTen as reviewer (#10661) 2023-12-11 11:45:43 +01:00
piwinkler eb628efbc4
Update 0040-verify-settings.yml (#10699)
remove embedded template
2023-12-11 10:56:13 +01:00
Max Gautier 2c3ea84e6f
Use systemd for disabling swap when it's used (#10587)
* Mask systemd swap.target do disable swap

This is a more generic way to disable swap, since it pulls .swap units
in systemd distributions; fstab is only one way to generate .swap units.

* Unconditionally disable swap

We only care to disable it (the "swapon" registered variable is not used
anywhere else.
This allows to get rid of the ignore_errors, since this was added
because swapon.stdout does not exist in check_mode (see issue #6642).

* Don't explicitly disable swapOnZram

We're already masking the swap.target, which would pull the zram unit,
hence no need to handle zram-generator specifically.
2023-12-07 13:26:21 +01:00
Max Gautier 85f15900a4
Remove unneeded workaround for removing kubeadm DNS (#10695)
Kubeadm dns phase is correctly skipped.
This was a workaround for kubernetes/kubeadm#1557, which was actually
not a bug ; the correct fix was #4867
2023-12-07 12:54:15 +01:00
Kundan Kumar af1f318852
Updated AWS ALB ingress controller version (#10680) 2023-12-07 10:29:16 +01:00
Max Gautier b31afe235f
Final ipaddr deprecation cleanup (#10675)
Followup of #10518
2023-12-06 03:49:25 +01:00
Mohamed Omar Zaian a9321aaf86
[calico] Add version 3.26.4 and make it default (#10669) 2023-12-06 03:05:33 +01:00
Max Gautier d2944d2813
Check jinja templates for syntax error (#10667)
Allow to fail early (pre-commit time) for jinja error, rather than
waiting until executing the playbook and the invalid template.

I could not find a simple jinja pre-commit hook in the wild.
2023-12-06 03:05:24 +01:00
Kay Yan fe02d21d23
update nerdctl to v1.7.1 (#10685)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2023-12-05 19:00:41 +01:00
Kay Yan 5160e7e20b
using ctr pull instead of nerdctl (#10687) 2023-12-05 16:00:55 +01:00
Alexander c440106eff
add dnsPolicy: ClusterFirstWithHostNet to DaemonSets with hostNetwork: true value to avoid DNSConfigFormat events (#10618) 2023-12-05 02:52:17 +01:00
Max Gautier a1c47b1b20
Factorize some identical playbooks steps into their own sub-playbooks (#10633)
* Factorize identical playboooks steps in sub-playbooks

* Copy legacy_groups.yml into its sole user
2023-12-04 23:24:00 +01:00
Max Gautier 93724ed29c
Use non-deprecated stdout_callback (#10647)
Skippy is deprecated as its functionality has been incorporated into
the default callback plugin.
2023-12-04 09:38:20 +01:00
Mohamed Omar Zaian 75fecf1542
Update nodelocaldns version (#10621) 2023-11-29 12:19:36 +01:00
Max Gautier 0d7bdc6cca
pre-upgrade cleanup (#10656)
* Clean up redondant defaulting

drain_{timeout,grace_period}_after_failure don't exist at this point, so
they always default.

* Remove useless facts

The drain_*_after_failure are never used
2023-11-28 22:49:56 +01:00
chansuke c87d70b04b [cert-manager] Upgrade to v1.12.6 2023-11-28 22:42:50 +01:00
Jelmer Vernooij fa7a504fa5
Drop installation notes for Debian Jessie (#10642)
Jessie has not received security updates for at least three years. See https://www.debian.org/releases/jessie/
2023-11-28 22:35:28 +01:00
Max Gautier 612cfdceb1
Check conntrack module presence instead of kernel version (#10662)
* Try both conntrack modules instead of checking kernel version

Depending on kernel distributor, the kernel version might not be a
correct indicator of the conntrack module use.
Instead, we check both (and use the first found).

* Use modproble.persistent rather than manual persistence
2023-11-28 18:31:02 +01:00
ERIK 70bb19dd23
fix copy etcdctl retries (#10634)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2023-11-28 10:52:03 +01:00
Max Gautier 94d3f65f09
ipaddr (deprecated alias) => ansible.utils.ipaddr (#10650) 2023-11-28 09:56:55 +01:00
Valerii Kretinin cf3ac625da
revert env section deletion (#10655) 2023-11-28 09:47:46 +01:00
Max Gautier c2e3071a33
kubespray-defaults: Check for boostrap-os FQDN (#10590)
When installed as an ansible collection, roles in
ansible_play_role_names will be designated by their FQDN (i.e
'kubernetes-sigs.kubespray.<role-name>).

It means we need to check for both when checking for roles in the play.
2023-11-28 09:23:46 +01:00
Max Gautier 21e8b96e22
Drop the drain check for kubectl > v1.10.0 (#10657)
Older versions are unsupported for a long time.
2023-11-28 03:14:51 +01:00
Samuel Liu 3acacc6150
add kube_apiserver_etcd_compaction_interval (#10644) 2023-11-27 05:37:33 +01:00
Max Gautier d583d331b5
Convert exoscale tf provider to new version (#10646)
This is untested. It passes terraform validate to un-broke the CI.
2023-11-24 17:22:55 +01:00
Mohamed Omar Zaian b321ca3e64
[kubernetes] Add hashes for kubernetes 1.28.4, 1.27.8, 1.26.11 (#10624) 2023-11-24 03:22:55 +01:00
AbhishekKr 6b1188e3dc
[fix] modprobe_nf_conntrack for new Linux Kernel, when using ipvs (#10625)
Signed-off-by: AbhishekKr <abhikumar163@gmail.com>
2023-11-20 09:48:06 +01:00
Max Gautier 0d4f57aa22
Validate systemd unit files (#10597)
* Validate systemd unit files

This ensure that we fail early if we have a bad systemd unit file
(syntax error, using a version not available in the local version, etc)

* Hack to check systemd version for service files validation

factory-reset.target was introduced in system 250, same version as the
aliasing feature we need for verifying systemd services with ansible.
So we only actually executes the validation if that target is present.

This is an horrible hack which should be reverted as soon as we drop
support for distributions with systemd<250.
2023-11-17 20:01:23 +01:00
刘旭 bc5b38a771
support CoreDNS use host network and config dns port (#10617) 2023-11-17 14:41:53 +01:00
Lukáš Kubín f46910eac3
Add helm support for custom_cni deployment (#10529)
* Add helm support for custom_cni deployment

* Linting correction

* Ansible linting correction

* Add test packet with values

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Add custom_cni configuration file with comments

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Default values cleanup

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Add details to custom_cni configuration file

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Set correct yaml type of helm values

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Set CNI filesystem ownership to root

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

* Update cilium example parameter name

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>

---------

Signed-off-by: Lukáš Kubín <lukas.kubin@gmail.com>
2023-11-16 00:32:21 +01:00
Khanh Ngo Van Kim adb8ff14b9
fix: invalid version check in containerd jinja-template config (#10620) 2023-11-15 16:06:42 +01:00
Arthur Outhenin-Chalandre 7ba85710ad
Update to ansible 2.15 (#10481)
* ansible: upgrade to version >= 2.15.5

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* tests: update requirements

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* contrib/openstack: fix wrong gitignore pattern

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* tests: add missing tzdata requirement

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* tests: remove some molecules tests

Those doesn't work in Ansible 2.15. Ansible can't load builtin now
apparently and these tests are not worth it.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2023-11-15 09:39:09 +01:00
Noam cbd3a83a06
add option to enable cdi for containerd (#10603) 2023-11-14 17:20:19 +01:00
Eeo Jun eb015c0362
configure cluster-name for hubble relay (#10614) 2023-11-13 19:22:40 +01:00
Patrick O'Brien 17681a7e31
fallback_ips: ignore unreachable hosts (#10601)
Sets ignore_unreachable: true to `Gather ansible_default_ipv4 from all hosts`
task from fallback_ips.yml

Without this scale.yml will fail if a single node in the cluster is down, which
for large clusters happens often.
2023-11-10 21:07:18 +01:00
Mohamed Omar Zaian cca7615456
Update checksums (#10606) 2023-11-09 16:43:04 +01:00
Samuel Mutel a4b15690b8
fix: Same nameservers for resolv.conf and dhcp (#10548) 2023-11-08 16:57:45 +01:00
Louis Tu 32743868c7
Add cri-o criu support (#10479)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2023-11-08 16:57:32 +01:00
yun 7d221be408
Remove crio package configuration (#10584)
* Remove crio package configuration

* Remove crio package config directly without loop
2023-11-08 16:29:42 +01:00
Denis 2d75077d4a
fix: (#10197)
Remove cri-o apt repo job has state present but need absent
Uninstall CRI-O packages job has undefined variable crio_packages
replaced by list of packages
2023-11-08 16:22:39 +01:00
borgiacis 802da0bcb0
Create variables for ipvs kernel modules (#10580)
* Create variables for ipvs kernel modules

* Corrected kubernetes role node task missing name

* Added changes as suggested during review by VannTen
2023-11-08 12:44:02 +01:00
Seal1998 6305dd39e9
Metallb --lb-class cmd arg to support multiple LoadBalancer implementations (#10550)
* metallb --lb-class cmd arg to support multiple load balancer implementations

* removed loadbalancer_class from metallb_config; metallb_loadbalancer_class in role defaults
2023-11-08 12:43:48 +01:00
Max Gautier b3f6d05131
Move control plane certs renewal "spread out" into the systemd timer (#10596)
* Use RandomizedDelaySec to spread out control certificates renewal plane

If the number of control plane node is superior to 6, using (index * 10
minutes) will fail (03:60:00 is not a valid timestamp).

Compared to just fixing the jinja expression (to use a modulo for
example), this should avoid having two control planes certificates
update node being triggered at the same time.

* Make k8s-certs-renew.timer Persistent

If the control plane happens to be offline during the scheduled
certificates renewal (node failure or anything like that), we still want
the renewal to happen.
2023-11-08 12:35:20 +01:00