Commit Graph

25 Commits (5ce530c909a86540807bc294ef832135d92d5e87)

Author SHA1 Message Date
Yuhao Zhang 0e971a37aa
Offline control plane recover (#10660)
* ignore_unreachable for etcd dir cleanup

ignore_errors ignores errors occur within "file" module. However, when
the target node is offline, the playbook will still fail at this task
with node "unreachable" state. Setting "ignore_unreachable: true" allows
the playbook to bypass offline nodes and move on to proceed recovery
tasks on remaining online nodes.

* Re-arrange control plane recovery runbook steps

* Remove suggestion to manually update IP addresses

The suggestion was added in 48a182844c 4
years ago. But a new task added 2 years ago, in
ee0f1e9d58, automatically update API
server arg with updated etcd node ip addresses. This suggestion is no
longer needed.
2024-01-22 17:22:27 +01:00
Florian Ruynat 9696936b59
Fixup recover control plane playbook + add debian12/cilium test (#10411)
* Add debian12 cilium testing

* Fixup recover control plane playbook
2023-09-05 10:42:52 -07:00
Arthur Outhenin-Chalandre 36e5d742dc
Resolve ansible-lint name errors (#10253)
* project: fix ansible-lint name

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: ignore jinja template error in names

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: capitalize ansible name

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: update notify after name capitalization

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-07-26 07:36:22 -07:00
Arthur Outhenin-Chalandre 5d00b851ce
project: fix var-spacing ansible rule (#10266)
* project: fix var-spacing ansible rule

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix spacing on the beginning/end of jinja template

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix spacing of default filter

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix spacing between filter arguments

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix double space at beginning/end of jinja

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix remaining jinja[spacing] ansible-lint warning

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-07-04 20:36:54 -07:00
Arthur Outhenin-Chalandre f8f197e26b
Fix outdated tag and experimental ansible-lint rules (#10254)
* project: fix outdated tag and experimental

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: remove no longer useful noqa 301

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: replace unnamed-task by name[missing]

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix daemon-reload -> daemon_reload

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-06-30 02:51:57 -07:00
Arthur Outhenin-Chalandre 25cb90bc2d
Upgrade ansible (#10190)
* project: update all dependencies including ansible

Upgrade to ansible 7.x and ansible-core 2.14.x. There seems to be issue
with ansible 8/ansible-core 2.15 so we remain on those versions for now.
It's quite a big bump already anyway.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* tests: install aws galaxy collection

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* ansible-lint: disable various rules after ansible upgrade

Temporarily disable a bunch of linting action following ansible upgrade.
Those should be taken care of separately.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve deprecated-module ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve no-free-form ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve schema[meta] ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve schema[playbook] ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve schema[tasks] ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve risky-file-permissions ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve risky-shell-pipe ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: remove deprecated warn args

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: use fqcn for non builtin tasks

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve syntax-check[missing-file] for contrib playbook

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: use arithmetic inside jinja to fix ansible 6 upgrade

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-06-26 03:15:45 -07:00
Max Gautier cb54eb40ce
Use a variable for standardizing kubectl invocation (#8329)
* Add kubectl variable

* Replace kubectl usage by kubectl variable in roles

* Remove redundant --kubeconfig on kubectl usage

* Replace unecessary shell usage with command
2022-01-05 02:26:32 -08:00
Florian Ruynat 9eacde212f
Fix quorum check when recovering broken etcd cluster (#8126) 2021-10-26 15:23:09 -07:00
Cristian Calin 7516fe142f
Move to Ansible 3.4.0 (#7672)
* Ansible: move to Ansible 3.4.0 which uses ansible-base 2.10.10

* Docs: add a note about ansible upgrade post 2.9.x

* CI: ensure ansible is removed before ansible 3.x is installed to avoid pip failures

* Ansible: use newer ansible-lint

* Fix ansible-lint 5.0.11 found issues

* syntax issues
* risky-file-permissions
* var-naming
* role-name
* molecule tests

* Mitogen: use 0.3.0rc1 which adds support for ansible 2.10+

* Pin ansible-base to 2.10.11 to get package fix on RHEL8
2021-07-12 00:00:47 -07:00
Florian Ruynat c1aa755a3c
Fix missing broken_etcd filter in recover control plane task (#7619) 2021-05-18 10:29:04 -07:00
harihud 0071e3c99c
Update main.yml (#7557) 2021-04-27 15:41:27 -07:00
Kenichi Omichi 486b223e01
Replace kube-master with kube_control_plane (#7256)
This replaces kube-master with kube_control_plane because of [1]:

  The Kubernetes project is moving away from wording that is
  considered offensive. A new working group WG Naming was created
  to track this work, and the word "master" was declared as offensive.
  A proposal was formalized for replacing the word "master" with
  "control plane". This means it should be removed from source code,
  documentation, and user-facing configuration from Kubernetes and
  its sub-projects.

NOTE: The reason why this changes it to kube_control_plane not
      kube-control-plane is for valid group names on ansible.

[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
2021-03-23 17:26:05 -07:00
Kenichi Omichi 699fbd64ab
Move recover_control_plane/master to control-plane (#7236)
According to the following recommendation, this moves the directory
to control-plane:

The Kubernetes project is moving away from wording that is considered
offensive. A new working group WG Naming was created to track this work,
and the word "master" was declared as offensive. A proposal was formalized
for replacing the word "master" with "control plane".
2021-02-03 02:06:29 -08:00
Maxime Guyot 214e08f8c9
Fix ansible-lint E305 (#6459) 2020-07-28 01:39:08 -07:00
Maxime Guyot e70f27dd79
Add noqa and disable .ansible-lint global exclusions (#6410) 2020-07-27 06:24:17 -07:00
Florent Monbillard bf8c8976dd
Upgrade etcd to 3.4.3 (#5998) 2020-07-20 07:26:51 -07:00
Maxime Guyot 00fe3d5094
Explicitly set ETCDCTL_API and use ETCDCTL_ENDPOINTS (#6327) 2020-07-01 04:56:16 -07:00
Etienne Champetier 096de82fd9
Fixup recover_control_plane with Ansible 2.9 (#5806)
Tests as filters support is removed as of Ansible 2.9
https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.5.html#jinja-tests-used-as-filters

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-03-20 14:22:06 -07:00
Yujun Zhang ceab27c97a
Add OWNERS file for recover_control_plane (#5505)
Related to #5432
2020-03-16 03:46:35 -07:00
qvicksilver ac2135e450
Fix recover-control-plane to work with etcd 3.3.x and add CI (#5500)
* Fix recover-control-plane to work with etcd 3.3.x and add CI

* Set default values for testcase

* Add actual test jobs

* Attempt to satisty gitlab ci linter

* Fix ansible targets

* Set etcd_member_name as stated in the docs...

* Recovering from 0 masters is not supported yet

* Add other master to broken_kube-master group as well

* Increase number of retries to see if etcd needs more time to heal

* Make number of retries for ETCD loops configurable, increase it for recovery CI and document it
2020-02-11 01:38:01 -08:00
Yujun Zhang 32d80ca438 Add default value for `bin_dir` in recover control plane (#5396) 2019-12-09 02:54:02 -08:00
mervynzhang a8dfcbbfc7 Switch /root references to ansible_env.HOME (#4842)
* kube config dir for current/ansible become user

* remove extra /

* fix default value
2019-06-06 02:06:11 -07:00
MarkusTeufelberger e67f848abc ansible-lint: add spaces around variables [E206] (#4699) 2019-05-02 14:24:21 -07:00
Matthew Mosesohn a5b46bfc8c Run dns_late preinstall tasks on all k8s nodes (#4672)
* Run dns_late preinstall tasks on all k8s nodes

Related issue: #4656

Change-Id: I63f8559ef1a497b7580ab084561e6603fe647834

* Fix ansible-lint

Change-Id: Ia5b33fa63dbc36d8c3e9557ef3f2ea02af2325a5

* Fix recover_control_plane lint issues

Change-Id: I16643a3193c11b6ba704e9698812cac7e4fd19a8
2019-04-29 05:12:21 -07:00
qvicksilver 48a182844c Documentation and playbook for recovering control plane from node failure (#4146) 2019-04-29 01:40:20 -07:00