kubernetes-handbook/practice/etcd-cluster-installation.md

137 lines
5.2 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

# 创建高可用 etcd 集群
kuberntes 系统使用 etcd 存储所有数据,本文档介绍部署一个三节点高可用 etcd 集群的步骤,这三个节点复用 kubernetes master 机器,分别命名为`test-001.jimmysong.io`、`test-002.jimmysong.io`、`test-003.jimmysong.io`
+ test-001.jimmysong.io172.20.0.113
+ test-002.jimmysong.io172.20.0.114
+ test-003.jimmysong.io172.20.0.115
## TLS 认证文件
需要为 etcd 集群创建加密通信的 TLS 证书,这里复用以前创建的 kubernetes 证书
``` bash
cp ca.pem kubernetes-key.pem kubernetes.pem /etc/kubernetes/ssl
```
+ kubernetes 证书的 `hosts` 字段列表中包含上面三台机器的 IP否则后续证书校验会失败
## 下载二进制文件
到 `https://github.com/coreos/etcd/releases` 页面下载最新版本的二进制文件
``` bash
wget https://github.com/coreos/etcd/releases/download/v3.1.5/etcd-v3.1.5-linux-amd64.tar.gz
tar -xvf etcd-v3.1.5-linux-amd64.tar.gz
mv etcd-v3.1.5-linux-amd64/etcd* /usr/local/bin
```
或者直接使用yum命令安装
```bash
yum install etcd
```
若使用yum安装默认etcd命令将在`/usr/bin`目录下,注意修改下面`的etcd.service`文件中的启动命令地址为`/usr/bin/etcd`。
## 创建 etcd 的 systemd unit 文件
在/usr/lib/systemd/system/目录下创建文件etcd.service内容如下。注意替换IP地址为你自己的etcd集群的主机IP。
``` bash
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
--name ${ETCD_NAME} \
--cert-file=/etc/kubernetes/ssl/kubernetes.pem \
--key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
--peer-cert-file=/etc/kubernetes/ssl/kubernetes.pem \
--peer-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
--listen-peer-urls ${ETCD_LISTEN_PEER_URLS} \
--listen-client-urls ${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
--advertise-client-urls ${ETCD_ADVERTISE_CLIENT_URLS} \
--initial-cluster-token ${ETCD_INITIAL_CLUSTER_TOKEN} \
--initial-cluster infra1=https://172.20.0.113:2380,infra2=https://172.20.0.114:2380,infra3=https://172.20.0.115:2380 \
--initial-cluster-state new \
--data-dir=${ETCD_DATA_DIR}
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
```
+ 指定 `etcd` 的工作目录为 `/var/lib/etcd`,数据目录为 `/var/lib/etcd`需在启动服务前创建这个目录否则启动服务的时候会报错“Failed at step CHDIR spawning /usr/bin/etcd: No such file or directory”
+ 为了保证通信安全,需要指定 etcd 的公私钥(cert-file和key-file)、Peers 通信的公私钥和 CA 证书(peer-cert-file、peer-key-file、peer-trusted-ca-file)、客户端的CA证书trusted-ca-file
+ 创建 `kubernetes.pem` 证书时使用的 `kubernetes-csr.json` 文件的 `hosts` 字段**包含所有 etcd 节点的IP**,否则证书校验会出错;
+ `--initial-cluster-state` 值为 `new` 时,`--name` 的参数值必须位于 `--initial-cluster` 列表中;
完整 unit 文件见:[etcd.service](../systemd/etcd.service)
环境变量配置文件`/etc/etcd/etcd.conf`。
```ini
# [member]
ETCD_NAME=infra1
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://172.20.0.113:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.20.0.113:2379"
#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.20.0.113:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.20.0.113:2379"
```
这是172.20.0.113节点的配置其他两个etcd节点只要将上面的IP地址改成相应节点的IP地址即可。ETCD_NAME换成对应节点的infra1/2/3。
## 启动 etcd 服务
``` bash
mv etcd.service /usr/lib/systemd/system/
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd
```
在所有的 kubernetes master 节点重复上面的步骤,直到所有机器的 etcd 服务都已启动。
## 验证服务
在任一 kubernetes master 机器上执行如下命令:
``` bash
$ etcdctl \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/kubernetes/ssl/kubernetes.pem \
--key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
cluster-health
2017-04-11 15:17:09.082250 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2017-04-11 15:17:09.083681 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
member 9a2ec640d25672e5 is healthy: got healthy result from https://172.20.0.115:2379
member bc6f27ae3be34308 is healthy: got healthy result from https://172.20.0.114:2379
member e5c92ea26c4edba0 is healthy: got healthy result from https://172.20.0.113:2379
cluster is healthy
```
结果最后一行为 `cluster is healthy` 时表示集群服务正常。
## 更多资料
关于如何在etcd中查看kubernetes的数据请参考[使用etcdctl访问kuberentes数据](../guide/using-etcdctl-to-access-kubernetes-data.md)。