mirror of https://github.com/easzlab/kubeasz.git
Add script and documentation for removing etcd nodes
parent
7b82688b1f
commit
9df8906e98
@ -55,7 +55,7 @@
<td><strong>Cluster management</strong><a href="docs/op/op-index.md">+</a></td>
<td><a href="docs/op/AddNode.md">Add a node</a></td>
<td><a href="docs/op/AddMaster.md">Add a master node</a></td>
<td><a href="docs/op/AddEtcd.md">Add an etcd node</a></td>
<td><a href="docs/op/op-etcd.md">Manage the etcd cluster</a></td>
<td><a href="docs/op/del_one_node.md">Remove a node</a></td>
<td><a href="docs/op/upgrade.md">Upgrade the cluster</a></td>
<td><a href="docs/op/cluster_restore.md">Backup and restore</a></td>
@ -1,3 +1,21 @@
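# Playbooks for managing the etcd cluster

The main operations on an etcd cluster include `backing up data` and `adding/removing nodes`. This document describes how to carry out these tasks conveniently with `ansible playbook`.

- NOTE: Adding or removing etcd cluster nodes carries some risk. Practice in a test environment first!

## Back up etcd data

You can schedule regular backups as needed (using crontab), or run a backup manually on any healthy etcd node:
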
``` bash
# take a snapshot backup
$ ETCDCTL_API=3 etcdctl snapshot save backup.db
# inspect the backup
$ ETCDCTL_API=3 etcdctl --write-out=table snapshot status backup.db
```
- The `kubeasz` project also lets you simply run `ansible-playbook /etc/ansible/23.backup.yml`; see the [backup and restore](cluster_restore.md) document.
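
For the crontab option mentioned above, a minimal sketch of a backup helper might look like the following. The function names, the `/backup/etcd` directory, and the script location in the crontab comment are assumptions made for illustration; the two `etcdctl` calls are the same ones shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper names and paths; adjust them to your environment.

# Build a timestamped snapshot path from a directory and a timestamp string.
snapshot_path() {
  local dir="$1" stamp="$2"
  echo "${dir%/}/snapshot-${stamp}.db"
}

# Save a snapshot into the given directory, then print its status table.
backup_etcd() {
  local dir="${1:-/backup/etcd}"
  local file
  file="$(snapshot_path "$dir" "$(date +%Y%m%d-%H%M%S)")"
  mkdir -p "$dir"
  ETCDCTL_API=3 etcdctl snapshot save "$file" &&
    ETCDCTL_API=3 etcdctl --write-out=table snapshot status "$file"
}

# Example crontab entry (assumed script location) that runs daily at 02:00:
# 0 2 * * * /usr/local/bin/etcd-backup.sh
```

A script installed for cron would end with a call such as `backup_etcd /backup/etcd`. Timestamped file names keep earlier snapshots from being overwritten.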
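
## Add an etcd cluster node

An etcd cluster supports changing its membership online: members can be added, modified, or removed. Changing the number of members still requires the agreement of a majority of the members (quorum). Also keep in mind how a change in the member count affects the cluster:

@ -7,6 +25,7 @@ An etcd cluster supports changing its membership online: members can be added, modified, removed
- Adding etcd cluster nodes lowers the cluster's write performance (all nodes hold the same data, so every write has to be synchronized across the nodes)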
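
To make the quorum rule concrete, the sketch below computes the majority size and the number of member failures a cluster can tolerate for a few cluster sizes. The function names are illustrative only; the arithmetic is the standard majority rule (`N/2 + 1`, integer division).

```bash
# Majority needed for a cluster of N members: floor(N/2) + 1
quorum() { echo $(( $1 / 2 + 1 )); }

# Members that may fail while a majority is still available: N - quorum(N)
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Going from 3 to 4 members raises the quorum from 2 to 3 without raising the number of tolerated failures, which is why odd cluster sizes are usually preferred.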

The rough procedure for adding a `new-etcd` node is:

- Run the member add command on an existing cluster node
- Pre-process the new node (prepare)
- Install and start the etcd service on the new node
@ -49,18 +68,31 @@ $ journalctl -u etcd -f

- Note: an etcd cluster can only add one node at a time. If you added 2 new nodes to the [new-etcd] group, you need to run `ansible-playbook /etc/ansible/19.addetcd.yml` twice.
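
As a rough illustration of "one run per new node", the sketch below counts the hosts listed under `[new-etcd]` and runs the playbook that many times. It assumes an INI-style inventory at `/etc/ansible/hosts` and that each run handles exactly one pending node; both are assumptions, and the function names are hypothetical.

```bash
# Count host lines under one [group] of an INI-style inventory file.
count_group_hosts() {
  local inventory="$1" group="$2"
  awk -v g="[$group]" '
    /^\[/ { active = ($0 == g); next }        # section header: enter or leave the group
    active && NF && $0 !~ /^[#;]/ { n++ }     # count non-empty, non-comment lines
    END { print n + 0 }
  ' "$inventory"
}

# Run the add-etcd playbook once per host found in [new-etcd]; stop on the first failure.
add_all_new_etcd() {
  local inventory="${1:-/etc/ansible/hosts}" total i
  total="$(count_group_hosts "$inventory" new-etcd)"
  i=1
  while [ "$i" -le "$total" ]; do
    ansible-playbook /etc/ansible/19.addetcd.yml || return 1
    i=$((i + 1))
  done
}
```

Checking the cluster state between runs (for example with `journalctl -u etcd -f` on the new node) is safer than looping blindly.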
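
### [Optional] Follow-up
## Remove an etcd cluster node

Once the steps above have been verified and the new etcd cluster is confirmed to work properly, you can reconfigure and restart apiserver so that the k8s cluster recognizes the new etcd node:
Removing a node is fairly simple: run `ansible-playbook /etc/ansible/tools/remove_etcd_node.yml` and enter the IP address of the node to remove when prompted.

Main removal steps:

- Prompt for the IP of the node to remove and check whether it can be removed
- Get the ID and NAME of the etcd node to remove
- Modify the ansible hosts file to remove the node from the etcd group
- Run the etcdctl member remove command to remove the node
- Delete the etcd data directory on the removed node
- Reconfigure and restart the whole etcd cluster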
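
The first two steps above can be sketched in plain shell as follows. The helper names are hypothetical, and the parsing assumes the default comma-separated `etcdctl member list` output (ID, status, name, peer URLs, client URLs) with peer URLs on port 2380. It is a slightly stricter variant of the idea, not the playbook's exact pipeline.

```bash
# Read `etcdctl member list` output on stdin and print the ID of the member
# whose peer URL uses the given IP. -F matches the IP literally.
member_id() { grep -F "//$1:2380" | cut -d',' -f1; }

# Same lookup, but print the member NAME (third comma-separated field).
member_name() { grep -F "//$1:2380" | cut -d',' -f3 | tr -d ' '; }

# can_remove TARGET_IP MEMBER_IP...
# Succeeds only if the cluster has more than one member and TARGET_IP is one of them.
can_remove() {
  local target="$1"; shift
  [ "$#" -gt 1 ] || return 1
  local ip
  for ip in "$@"; do
    [ "$ip" = "$target" ] && return 0
  done
  return 1
}

# Intended use on a healthy etcd node (not executed here):
#   ETCDCTL_API=3 etcdctl member list | member_id 192.168.1.1
#   ETCDCTL_API=3 etcdctl member remove <ID printed above>
```

Matching on `//IP:2380` rather than on the bare IP avoids confusing `192.168.1.1` with `192.168.1.11`.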
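
## Reset the etcd connection parameters of k8s

Once the steps above have been verified and the new etcd cluster is confirmed to work properly, you can reconfigure and restart apiserver so that the k8s cluster recognizes the new etcd cluster:
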
``` bash
# restart the master node services
$ ansible-playbook /etc/ansible/04.kube-master.yml -t restart_master

# verify that k8s recognizes the new etcd node
# verify that k8s recognizes the new etcd cluster
$ kubectl get cs
```
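
Before restarting the masters, it can help to confirm that the etcd members themselves report healthy. The sketch below builds an endpoint list from node IPs and passes it to `etcdctl endpoint health`. The `https` scheme, client port 2379, the helper names, and the `CA`, `CERT`, and `KEY` variables holding certificate paths are all assumptions to adapt to your cluster.

```bash
# Join etcd node IPs into a comma-separated list of client endpoints.
etcd_endpoints() {
  local out="" ip
  for ip in "$@"; do
    out="${out:+$out,}https://$ip:2379"
  done
  echo "$out"
}

# Query the health of every listed member.
# CA, CERT and KEY must point at your etcd client certificate files.
check_etcd_health() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$(etcd_endpoints "$@")" \
    --cacert="$CA" --cert="$CERT" --key="$KEY" \
    endpoint health
}

# Intended use (not executed here):
#   check_etcd_health 192.168.1.1 192.168.1.2 192.168.1.3
```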

### References
## References

- Official documentation: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/runtime-configuration.md
@ -0,0 +1,52 @@
# remove an etcd member
- hosts: deploy
  vars_prompt:
    - name: "ETCD_TO_DEL"
      prompt: "which etcd node is about to be deleted?(e.g 192.168.1.1)"
      private: no
      confirm: yes
  tasks:
  - name: set warning info
    set_fact: WARN_INFO="CAN NOT DELETE THIS NODE!!!!!!"
    when: "groups['etcd']|length < 2 or ETCD_TO_DEL not in groups['etcd']"

  - name: show warning info
    debug: var="WARN_INFO"
    when: "groups['etcd']|length < 2 or ETCD_TO_DEL not in groups['etcd']"

  - block:
    - name: get ID of etcd node to delete
      shell: "ETCDCTL_API=3 {{ bin_dir }}/etcdctl member list|grep {{ ETCD_TO_DEL }}:2380|cut -d',' -f1"
      register: ETCD_ID
      delegate_to: "{{ groups.etcd[0] }}"

    - name: get NAME of etcd node to delete
      shell: "ETCDCTL_API=3 {{ bin_dir }}/etcdctl member list|grep {{ ETCD_TO_DEL }}:2380|cut -d' ' -f3|cut -d',' -f1"
      register: ETCD_NAME
      delegate_to: "{{ groups.etcd[0] }}"

    - name: rm etcd's node in hosts
      lineinfile:
        dest: "{{ base_dir }}/hosts"
        state: absent
        regexp: '{{ ETCD_NAME.stdout }}'
      connection: local
      when: "ETCD_NAME.stdout != ''"

    - name: delete an etcd member
      shell: "ETCDCTL_API=3 {{ bin_dir }}/etcdctl member remove {{ ETCD_ID.stdout }}"
      delegate_to: "{{ groups.etcd[0] }}"
      when: "ETCD_ID.stdout != ''"

    - name: rm data of the deleted etcd node
      file: name=/var/lib/etcd state=absent
      delegate_to: "{{ ETCD_TO_DEL }}"
      when: "ETCD_ID.stdout != ''"

    - name: reconfig and restart the etcd cluster
      shell: "ansible-playbook /etc/ansible/02.etcd.yml > /tmp/ansible-playbook.log 2>&1"
      connection: local
      when: "ETCD_ID.stdout != ''"
      run_once: true
    # only delete when the conditions are met
    when: "groups['etcd']|length > 1 and ETCD_TO_DEL in groups['etcd']"