Commit Graph

374 Commits (release-2.17)

Author SHA1 Message Date
Ray Terrill 1edb7d771f
Modify connection_strings_etcd to only return etcd nodes (#7966)
Modify connection_strings_etcd to only return etcd nodes - not master nodes - since this results in duplicate hosts in the generated Ansible inventory and is unnecessary.
2021-09-15 00:58:40 -07:00
Fredrik Liv aa00c1d91a
Updated UpCloud terraform script to use private network and dynamic (#7779)
additional disks
2021-09-10 13:55:21 -07:00
Vladimir Masarik a5a88e41af
Fix: adding new ips with inventory builder (#7577) (#7583)
* Fix: adding new ips with inventory builder (#7577)

* moved conflig loading logic
to after checking whether the config
should be loaded, and added check for
whether the config should be loaded

* added check for removing nodes from config
if the user wants to remove a node, we
need to load the config

* Fix tox errors
2021-09-10 12:21:22 -07:00
Victor Morales c7d12cddec
Ensure python main function return values (#7860)
The main functions are wrapped by a sys.exit function which expects and
argument. The curent implementation isn't returning values in all cases.
This change ensures main functions return a value in all cases.
2021-08-19 06:51:24 -07:00
Kenichi Omichi 8f44cd35d8
Fix how to get image ID on offline deployment (#7808)
Previously IDs of container images were gotten from tar files of container
images but that way was wrong. If multiple json files are contained in a
tar file, the script got multiple IDs and tried to pass these IDs on
`docker tag` command. Then the command was failed.

This updates the script to get image IDs from `docker image inspect` command
to fix this issue.
In addition, this adds a check a registry container exists already or not
before deploying registry container to avoid a container conflict failure.
2021-07-26 00:56:33 -07:00
Samuel bfebcfa2c5
fix(misc): contrib/terraform/aws (#7818)
* fix(misc): terraform/aws

- handles deployment with a single availability zone
- handles deployment with more than two availability zone
- handles etcd collocation with control-plane nodes (`aws_etcd_num=0`)
- allows to set a bastion instances count (`aws_bastion_num`)
- allows to set bastion/etcd/control-plane/workers rootfs volume size
- removes variables from terraform.tfvars that were not re-used
- adds .terraform.lock.hcl to .gitignore
- changes/updates base image from ubuntu-18.03 to debian-10

tested by a few coworkers of mine, and myself: thanks for the outstanding
work, on both those terraform samples and kubespray playbooks.
I did not test ubuntu deployments, I could still swap from buster to
focal. LMK.

* fix(gitlab-ci)

AFAIU, terraform.tfvars indentation should be fixed for / no diff
returned running `terraform fmt -check -diff`

https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/1445622114
2021-07-23 02:43:16 -07:00
Kenichi Omichi b0fcc1ad1d
Add error handling for registorying images (#7787)
When running the script, I faced the following error but it was
difficult to know the root problem due to lack of error handling.

  docker tag" requires exactly 2 arguments.
  See 'docker tag --help'.

  Usage:  docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]

  Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE

To investigate such errors easily, this adds an error handling.
2021-07-18 17:58:51 -07:00
왕영주 (Youngju Wang) 3b3ccac212
Update README.md (#7784)
Update README for control_plane's external volume type variable
2021-07-13 22:52:26 -07:00
Cristian Calin 7516fe142f
Move to Ansible 3.4.0 (#7672)
* Ansible: move to Ansible 3.4.0 which uses ansible-base 2.10.10

* Docs: add a note about ansible upgrade post 2.9.x

* CI: ensure ansible is removed before ansible 3.x is installed to avoid pip failures

* Ansible: use newer ansible-lint

* Fix ansible-lint 5.0.11 found issues

* syntax issues
* risky-file-permissions
* var-naming
* role-name
* molecule tests

* Mitogen: use 0.3.0rc1 which adds support for ansible 2.10+

* Pin ansible-base to 2.10.11 to get package fix on RHEL8
2021-07-12 00:00:47 -07:00
jayonlau 823bd9118e
Clean up extra spaces of kubespray-aws-inventory.py (#7774)
Clean up extra spaces, although these errors are not important, they affect the code specification.
2021-07-08 01:32:53 -07:00
Kenichi Omichi 4a15994da0
Update link for kubepsray project (#7758)
https://github.com/kubernetes-incubator/kubespray is an old link,
this updates the link.
2021-07-05 01:12:55 -07:00
jayonlau cda88e6770
Clean up extra spaces (#7744)
I recently reviewed the code, although these errors are not important, they affect the code specification.
2021-06-25 01:44:46 -07:00
Simon Kollberg d7039ef707
Openstack cwd (#7643)
* terraform/openstack: Use path.root for ansible_bastion_template.txt

The path.root variable points to the root module path. Using this
instead of a relative path makes less assumptions about the current
working directory.

* terraform/openstack: Add group_vars_path variable

Previously, the group_vars path was assumed to be in CWD. The
default value for the group_vars_path variable is still relative
to CWD and thus should be backwards compatible if unset.
2021-06-25 00:26:45 -07:00
jiriproX 1739b27231
Replace yum module with package module (#7621) 2021-06-05 04:16:39 -07:00
rptaylor b46e751573
protect against TypeError in case of NoneType (#7659) 2021-06-01 08:24:27 -07:00
Marques Johansson 3a37a49690
Packet renamed (#7653)
* Packet->Equinix Metal rename #6901 

Updates throughout to reflect #6901 renaming for Packet to Equinix Metal.

* Rename Packet to Equinix Metal throughout the project #6901

Packet is renamed to Equinix Metal in more contexts including
documentation links. The Terraform provider used is still the Packet
provider. The environment variables and configuration options still
refer to the Packet name.

Signed-off-by: Marques Johansson <mjohansson@equinix.com>

Co-authored-by: Edward Vielmetti <ed@packet.net>
2021-05-27 11:58:24 -07:00
Kenichi Omichi b3d9f2b4a2
Add contrib playbook to disable service firewall (#7431)
Basically we need to make necessary TCP/UDP ports open.
However the necessary ports are so many, and sometimes it is difficult
to figure out that is due to firewall issues or not if facing deployment
issues.
To distinguish a root problem on such situation, this adds contrib
playbook to disable the service firewall for Kubespray development
and test.
2021-05-18 06:45:30 -07:00
tkob b1b407a0b4
Replace map in Terraform scripts with tomap (#7576) (#7578)
* Replace map in Terraform scripts with tomap (#7576)

* Fix Terraform linter warnings (#7576)
2021-05-12 07:34:17 -07:00
muzi502 1d078e1119
Add script for generate download files and images list (#7561)
Fix coredns image repo and tag typo for #7570
2021-05-11 00:39:36 -07:00
Cristian Calin 360aff4a57
Rename ansible groups to use _ instead of - (#7552)
* rename ansible groups to use _ instead of -

k8s-cluster -> k8s_cluster
k8s-node -> k8s_node
calico-rr -> calico_rr
no-floating -> no_floating

Note: kube-node,k8s-cluster groups in upgrade CI
      need clean-up after v2.16 is tagged

* ensure old groups are mapped to the new ones
2021-04-29 05:20:50 -07:00
muzi502 324c95d37f
Fix some docs.ansible.com url typo (#7550) 2021-04-26 08:33:02 -07:00
Florian Ruynat b599f3084f
Fix OpenStack StyleGuide rule H216 (On by default in latest version) (#7535)
ref: b921c4de51
2021-04-21 09:04:11 -07:00
Cristian Klein 3ac92689f0
exoscale: Rework EIP access from workers (#7337)
Context: Load-balancing in Exoscale is performed by associating many
workers with the same EIP. This works, however, the workers cannot access
themselves via the EIP, which is needed at least for cert-managers
"self-test".

Problem: The old iptables based workaround felt fragile and disappointed
me at least once.

New solution: Add the EIP to a loopback interface on each worker.
2021-04-16 03:22:22 -07:00
dsy3502 5377aac936
fix typo (#7436) 2021-04-05 01:20:19 -07:00
Etienne Champetier f0cdf71ccb
Remove vault (#7400)
* Remove contrib/vault

This is marked as broken since 2018 / 3dcb914607
This still reference apiserver.pem, not used since ddffdb63bf

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>

* Finish nuking vault from the codebase

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-24 09:26:08 -07:00
Kenichi Omichi 486b223e01
Replace kube-master with kube_control_plane (#7256)
This replaces kube-master with kube_control_plane because of [1]:

  The Kubernetes project is moving away from wording that is
  considered offensive. A new working group WG Naming was created
  to track this work, and the word "master" was declared as offensive.
  A proposal was formalized for replacing the word "master" with
  "control plane". This means it should be removed from source code,
  documentation, and user-facing configuration from Kubernetes and
  its sub-projects.

NOTE: The reason why this changes it to kube_control_plane not
      kube-control-plane is for valid group names on ansible.

[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
2021-03-23 17:26:05 -07:00
Florian Ruynat 18c0e54e4f
Add most_recent = true while retrieving the latest image (#7376) 2021-03-15 07:05:06 -07:00
Ewnetu Bayuh Lakew 5c5bf41afe
Terraform support for UpCloud (#7360)
* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* Updates to README.md and main.tf files

* formatting and updating readme

* added a .terraform_validate CI job

* fixed format issue

* added sample inventory

* added symbolic link to group_vars

* added missing tf variables and minor fixes

* added text formatting

* minor formatting fixes
2021-03-15 01:41:04 -07:00
Viktor bdd36c2d34
Update default exoscale master with more RAM (#7328)
The default master size for exoscale is 2cpu and 2GB ram.
I have found this to be too low, so this increases it to
2cpu and 4GB ram.
2021-03-01 09:41:25 -08:00
Jakub Krzywda 0a0156c946
Vsphere (#7306)
* Add terraform scripts for vSphere

* Fixup: Add terraform scripts for vSphere

* Add inventory generation

* Use machines var to provide IPs

* Add README file

* Add default.tfvars file

* Fix newlines at the end of files

* Remove master.count and worker.count variables

* Fixup cloud-init formatting

* Fixes after initial review

* Add warning about disabled DHCP

* Fixes after second review

* Add sample-inventory
2021-02-26 04:20:15 -08:00
Kenichi Omichi 2ea5793782
Replace KUBE_MASTERS with KUBE_CONTROL_HOSTS (#7257)
This replaces KUBE_MASTERS with KUBE_CONTROL_HOSTS because of [1]:

```
  The Kubernetes project is moving away from wording that is
  considered offensive. A new working group WG Naming was created
  to track this work, and the word "master" was declared as offensive.
  A proposal was formalized for replacing the word "master" with
  "control plane". This means it should be removed from source code,
  documentation, and user-facing configuration from Kubernetes and
  its sub-projects.
```

[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
2021-02-23 10:00:03 -08:00
Hugo Blom 8682a57ea3
use image id instad of name (#7293) 2021-02-19 09:16:25 -08:00
Hugo Blom f2d10e9465
allow users to set image_uuid instead of name, this allows the use of openstack community images (#7283) 2021-02-16 07:05:06 -08:00
Hugo Blom 1c8bba36db
make sure worker rules is applied on workers (#7279) 2021-02-12 12:43:05 -08:00
Cristian Klein b77460ec34
contrib/terraform/exoscale: Rework SSH public keys (#7242)
* contrib/terraform/exoscale: Rework SSH public keys

Exoscale has a few limitations with `exoscale_ssh_keypair` resources.
Creating several clusters with these scripts may lead to an error like:

```
Error: API error ParamError 431 (InvalidParameterValueException 4350): The key pair "lj-sc-ssh-key" already has this fingerprint
```

This patch reworks handling of SSH public keys. Specifically, we rely on
the more cloud-agnostic way of configuring SSH public keys via
`cloud-init`.

* contrib/terraform/exoscale: terraform fmt

* contrib/terraform/exoscale: Add terraform validate

* contrib/terraform/exoscale: Inline public SSH keys

The Terraform scripts need to install some SSH key, so that Kubespray
(i.e., the "Ansible part") can take over. Initially, we pointed the
Terraform scripts to `~/.ssh/id_rsa.pub`. This proved to be suboptimal:
Operators sharing responbility for a cluster risk unnecessarily replacing resources.

Therefore, it has been determined that it's best to inline the public
SSH keys. The chosen variable `ssh_public_keys` provides some uniformity
with `contrib/azurerm`.

* Fix Terraform Exoscale test

* Fix Terraform 0.14 test
2021-02-03 07:32:28 -08:00
Fredrik Liv 404ea0270e
Added terraform support for Exoscale (#7141)
* Added terraform support for Exoscale

* Fixed markdown lint error on exoscale terraform
2021-01-22 20:37:39 -08:00
Andrea Zonca 24ceee134e
Document the terraform option `master_allowed_ports` (#7196)
Implemented in #6547
2021-01-21 07:55:06 -08:00
Mateusz Piotrowski 5517e62c86
Fix and document environment variable KUBE_MASTERS (#7127)
This variable was added as KUBE_MASTERS_MASTERS. That's probably a typo.
Remove the redundant `_MASTERS` suffix. Also, document the variable in the
help message.
2021-01-11 11:34:24 -08:00
Kenichi Omichi 2585e72a30
Fix mardownlint failures of offline (#7108)
This fixes the following failures:

./contrib/offline/README.md:14:1 MD014/commands-show-output Dollar signs used before commands without showing output [Context: "$ ./manage-offline-container-i..."]
./contrib/offline/README.md:20:1 MD014/commands-show-output Dollar signs used before commands without showing output [Context: "$ ./manage-offline-container-i..."]
2021-01-06 23:45:45 -08:00
Kenichi Omichi ad244ab744
Add manage-offline-container-images.sh (#7024)
One challenge of offline deployment was how to collect necessary
container images as a preparation. This adds a script to solve it.
2021-01-06 08:05:52 -08:00
Kenichi Omichi 398a995798
Fix markdownlint failures under ./roles/ (#7089)
This fixes markdownlint failures under roles/
2020-12-30 05:07:49 -08:00
Kenichi Omichi dc86b2063a
Fix markdown failures on contrib/terraform (#7082)
This fixes markdown failures on contrib/terraform.
2020-12-25 12:10:27 -08:00
Fredrik Liv bbab1013c5
Added gcp terraform support (#6974)
* Added gcp terraform support

* Added http/https firewall rule

* Ignoring lifecycle changes for attached disks on the google_compute_instance
2020-12-24 09:16:26 -08:00
Cristian Klein fd3ebc13f7
Fix terraform0.13 errors (#7077)
* [terraform/aws] Fix Terraform >=0.13 warnings

Terraform >=0.13 gives the following warning:

```
Warning: Interpolation-only expressions are deprecated
```

The fix was tested as follows:
```
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
which gave no errors nor warnings.

* [terraform/openstack] Fixes for Terraform >=0.13

Terraform >=0.13 gives the following error:
```
Error: Failed to install providers
Could not find required providers, but found possible alternatives:
  hashicorp/openstack -> terraform-provider-openstack/openstack
```

This patch fixes these errors.

This fix was tested as follows:
```
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
which gave no errors nor warnings for Terraform 0.13.5 and Terraform
0.14.3. Unfortunately, 0.12.x gives a harmless warning, but
with 0.14.3 out the door, I guess we need to move on.

* [terraform/packet] Fixes for Terraform >=0.13

This fix was tested as follows:
```
export PACKET_AUTH_TOKEN=blah-blah
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```

Errors are gone, but warnings still remain. It is impossible to please
all three versions of Terraform.

* Add tests for Terraform >=0.13
2020-12-23 05:08:26 -08:00
Kenichi Omichi 5b5726bdd4
Improve markdownlint for contrib/network-storage (#7079)
This fixes markdownlint failures under contrib/network-storage and
contrib/vault.
2020-12-23 00:00:26 -08:00
Kenichi Omichi 1347bb2e4b
Improve markdownlint coverage (#7075)
Now markdownlint covers ./README.md and md files under ./docs only.
However we have a lot of md files under different directories also.
This enables markdownlint for other md files also.
2020-12-22 04:44:26 -08:00
Noam 143f9c78be
fix MASSIVE_SCALE_THRESHOLD env paramter (#7054) 2020-12-18 08:50:25 -08:00
Pratik Raj 0982c66051
fix: added boto3 as dependency required by kubespray-aws-inventory.py (#6890)
Added "boto3" as dependency in "requirements.txt" which is required by "kubespray-aws-inventory.py".

Signed-off-by: Pratik raj <rajpratik71@gmail.com>
2020-11-26 15:06:19 -08:00
Hans Feldt ee23b947aa
fix flake8 errors in Kubespray CI - tox-inventory-builder (#6910)
* fix flake8 errors in Kubespray CI - tox-inventory-builder

* Invalidate CRI-O kubic repo's cache

Signed-off-by: Victor Morales <v.morales@samsung.com>

* add support to configure pkg install retries

and use in CI job tf-ovh_ubuntu18-calico (due to it failing often)

* Switch Calico, Cilium and MetalLB image repos to Quay.io

Co-authored-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
2020-11-22 23:47:35 -08:00
Sascha Marcel Schmidt 602b5aaf01
add warning about current state of heketi (#6888) 2020-11-13 00:06:23 -08:00