[WIP] Monitor a Kubernetes cluster with Prometheus
parent
d6e9358d24
commit
0842edd466
@@ -74,6 +74,7 @@
 - [4.3.5 Continuous build and release with Jenkins](practice/jenkins-ci-cd.md)
 - [4.3.6 Data persistence problems](practice/data-persistence-problem.md)
 - [4.3.7 Managing compute resources for containers](practice/manage-compute-resources-container.md)
+- [4.3.8 Monitoring a Kubernetes cluster with Prometheus](practice/using-prometheus-to-monitor-kuberentes-cluster.md)
 - [4.4 Storage management](practice/storage.md)
 - [4.4.1 GlusterFS](practice/glusterfs.md)
 - [4.4.1.1 Using GlusterFS for persistent storage](practice/using-glusterfs-for-persistent-storage.md)

@@ -0,0 +1,67 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: grafana-import-dashboards
  namespace: monitoring
  labels:
    app: grafana
    component: import-dashboards
spec:
  template:
    metadata:
      name: grafana-import-dashboards
      labels:
        app: grafana
        component: import-dashboards
      annotations:
        pod.beta.kubernetes.io/init-containers: '[
          {
            "name": "wait-for-endpoints",
            "image": "sz-pg-oam-docker-hub-001.tendcloud.com/library/giantswarm-tiny-tools",
            "imagePullPolicy": "IfNotPresent",
            "command": ["fish", "-c", "echo \"waiting for endpoints...\"; while true; set endpoints (curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt --header \"Authorization: Bearer \"(cat /var/run/secrets/kubernetes.io/serviceaccount/token) https://kubernetes.default/api/v1/namespaces/monitoring/endpoints/grafana); echo $endpoints | jq \".\"; if test (echo $endpoints | jq -r \".subsets[]?.addresses // [] | length\") -gt 0; exit 0; end; echo \"waiting...\";sleep 1; end"],
            "args": ["monitoring", "grafana"]
          }
        ]'
    spec:
      serviceAccountName: prometheus-k8s
      containers:
      - name: grafana-import-dashboards
        image: sz-pg-oam-docker-hub-001.tendcloud.com/library/giantswarm-tiny-tools
        command: ["/bin/sh", "-c"]
        workingDir: /opt/grafana-import-dashboards
        args:
          - >
            for file in *-datasource.json ; do
              if [ -e "$file" ] ; then
                echo "importing $file" &&
                curl --silent --fail --show-error \
                  --request POST http://admin:admin@grafana:3000/api/datasources \
                  --header "Content-Type: application/json" \
                  --data-binary "@$file" ;
                echo "" ;
              fi
            done ;
            for file in *-dashboard.json ; do
              if [ -e "$file" ] ; then
                echo "importing $file" &&
                ( echo '{"dashboard":'; \
                  cat "$file"; \
                  echo ',"overwrite":true,"inputs":[{"name":"DS_PROMETHEUS","type":"datasource","pluginId":"prometheus","value":"prometheus"}]}' ) \
                | jq -c '.' \
                | curl --silent --fail --show-error \
                  --request POST http://admin:admin@grafana:3000/api/dashboards/import \
                  --header "Content-Type: application/json" \
                  --data-binary "@-" ;
                echo "" ;
              fi
            done

        volumeMounts:
        - name: config-volume
          mountPath: /opt/grafana-import-dashboards
      restartPolicy: Never
      volumes:
      - name: config-volume
        configMap:
          name: grafana-import-dashboards

@@ -0,0 +1,40 @@

2017-09-25T11:53:14.559200871Z E0925 11:53:14.558983 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:14.560711186Z E0925 11:53:14.560539 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:14.561043368Z E0925 11:53:14.560920 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:14.56211475Z E0925 11:53:14.561906 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:15.560928538Z E0925 11:53:15.560732 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:15.562265859Z E0925 11:53:15.562102 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:15.563239559Z E0925 11:53:15.563067 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:15.564390281Z E0925 11:53:15.564196 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:16.562666898Z E0925 11:53:16.562450 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:16.563807986Z E0925 11:53:16.563638 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:16.564821972Z E0925 11:53:16.564628 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:16.565848893Z E0925 11:53:16.565669 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:17.56438821Z E0925 11:53:17.564155 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:17.565381358Z E0925 11:53:17.565189 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:17.566231354Z E0925 11:53:17.566131 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:17.567286798Z E0925 11:53:17.567131 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:18.570368569Z E0925 11:53:18.570150 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:18.570406501Z E0925 11:53:18.570163 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:18.570413661Z E0925 11:53:18.570184 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:18.57041935Z E0925 11:53:18.570218 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:19.57212411Z E0925 11:53:19.571840 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:19.573109252Z E0925 11:53:19.572911 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:19.574044784Z E0925 11:53:19.573810 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:19.575346655Z E0925 11:53:19.575102 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:20.573827161Z E0925 11:53:20.573560 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:20.574666239Z E0925 11:53:20.574441 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:20.57573655Z E0925 11:53:20.575493 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:20.576839576Z E0925 11:53:20.576603 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:21.575665021Z E0925 11:53:21.575429 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:21.576522006Z E0925 11:53:21.576324 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:21.577614983Z E0925 11:53:21.577404 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:21.578577469Z E0925 11:53:21.578373 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:22.577373226Z E0925 11:53:22.577121 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:22.578267576Z E0925 11:53:22.578043 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)
2017-09-25T11:53:22.579199644Z E0925 11:53:22.579002 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/persistentvolumeclaim.go:60: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
2017-09-25T11:53:22.580366842Z E0925 11:53:22.580177 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/statefulset.go:68: Failed to list *v1beta1.StatefulSet: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list statefulsets.apps at the cluster scope. (get statefulsets.apps)
2017-09-25T11:53:23.578999887Z E0925 11:53:23.578734 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/job.go:106: Failed to list *v1.Job: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list jobs.batch at the cluster scope. (get jobs.batch)
2017-09-25T11:53:23.58002011Z E0925 11:53:23.579820 1 reflector.go:201] k8s.io/kube-state-metrics/collectors/cronjob.go:86: Failed to list *v2alpha1.CronJob: User "system:serviceaccount:monitoring:kube-state-metrics" cannot list cronjobs.batch at the cluster scope. (get cronjobs.batch)

@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring

@@ -0,0 +1,75 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources: ["nodes","pods","services","resourcequotas","replicationcontrollers","limitranges"]
  verbs: ["list", "watch"]
- apiGroups: ["extensions"]
  resources: ["daemonsets","deployments","replicasets"]
  verbs: ["list", "watch"]
- apiGroups: ["batch/v1"]
  resources: ["job"]
  verbs: ["list", "watch"]
- apiGroups: ["v1"]
  resources: ["persistentvolumeclaim"]
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources: ["statefulset"]
  verbs: ["list", "watch"]
- apiGroups: ["batch/v2alpha1"]
  resources: ["cronjob"]
  verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: monitoring

File diff suppressed because it is too large
@@ -0,0 +1,19 @@
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test
  namespace: monitoring
  labels:
    app: test
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: test
    spec:
      serviceAccountName: prometheus-k8s
      containers:
      - image: sz-pg-oam-docker-hub-001.tendcloud.com/library/centos:7.2.1511
        name: test
        imagePullPolicy: IfNotPresent

@@ -0,0 +1,98 @@
# Monitoring a Kubernetes cluster with Prometheus

We use the open-source [kubernetes-prometheus](https://github.com/giantswarm/kubernetes-prometheus) project from Giantswarm to monitor the Kubernetes cluster. All of the YAML files can be found in the [manifests/prometheus](../manifests/prometheus) directory.

The following images are required:

- sz-pg-oam-docker-hub-001.tendcloud.com/library/prometheus-alertmanager:v0.7.1
- sz-pg-oam-docker-hub-001.tendcloud.com/library/grafana:4.2.0
- sz-pg-oam-docker-hub-001.tendcloud.com/library/giantswarm-tiny-tools:latest
- sz-pg-oam-docker-hub-001.tendcloud.com/library/prom-prometheus:v1.7.0
- sz-pg-oam-docker-hub-001.tendcloud.com/library/kube-state-metrics:v1.0.1
- sz-pg-oam-docker-hub-001.tendcloud.com/library/dockermuenster-caddy:0.9.3
- sz-pg-oam-docker-hub-001.tendcloud.com/library/prom-node-exporter:v0.14.0

They are also backed up to TenxCloud (时速云):

- index.tenxcloud.com/jimmy/prometheus-alertmanager:v0.7.1
- index.tenxcloud.com/jimmy/grafana:4.2.0
- index.tenxcloud.com/jimmy/giantswarm-tiny-tools:latest
- index.tenxcloud.com/jimmy/prom-prometheus:v1.7.0
- index.tenxcloud.com/jimmy/kube-state-metrics:v1.0.1
- index.tenxcloud.com/jimmy/dockermuenster-caddy:0.9.3
- index.tenxcloud.com/jimmy/prom-node-exporter:v0.14.0

**Note**: all images were downloaded from the official image registries.

## Deployment

```bash
## Create the monitoring namespace
kubectl create -f prometheus-monitoring-ns.yaml
## Create the RBAC objects
kubectl create -f prometheus-monitoring-rbac.yaml
## Deploy Prometheus
kubectl create -f prometheus-monitoring.yaml
```

Consider replacing the RBAC creation step with the following command:

```bash
kubectl create clusterrolebinding prometheus-monitoring --clusterrole=cluster-admin --serviceaccount=monitoring:default
```

Note that this requires modifying the serviceaccount and clusterrolebinding in the YAML files, which has not been done yet.

## Known issues

The project's code has several problems.

### 1. RBAC role authorization

Two clusterrolebindings are needed:

- `kube-state-metrics`, whose `serviceaccount` is `kube-state-metrics`
- `prometheus`, whose `serviceaccount` is `prometheus-k8s`

The serviceaccount, clusterrole, and clusterrolebinding objects should be created before Prometheus is deployed; otherwise the installation may fail with various permission errors. These objects therefore belong in a separate file rather than alongside the other deployments. If they must share a file, put them at the top: `kubectl` does not resolve dependencies between the resources in a YAML file and simply creates them from the top down, so objects that appear earlier in the file are created first.

Alternatively, you can bypass the detailed RBAC setup and bind the serviceaccount to the `cluster-admin` role with the following command:

```bash
kubectl create clusterrolebinding prometheus-monitoring --clusterrole=cluster-admin --serviceaccount=monitoring:default
```

This requires changing the serviceaccount in the original configuration and removing the original clusterrolebinding.

See [RBAC: role-based access control](../guide/rbac.md).

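For reference, the `kubectl create clusterrolebinding` command in section 1 corresponds roughly to the manifest below. This is a sketch and not part of the upstream project: it reuses the binding name, the `cluster-admin` role, and the `monitoring:default` serviceaccount from that command, and it assumes the `rbac.authorization.k8s.io/v1beta1` API version used by the other RBAC objects in this commit. Because `cluster-admin` grants unrestricted access to the whole cluster, this shortcut is probably only appropriate for a test environment.

```yaml
# Assumed declarative equivalent of:
#   kubectl create clusterrolebinding prometheus-monitoring \
#     --clusterrole=cluster-admin --serviceaccount=monitoring:default
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus-monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin          # built-in role with full cluster access
subjects:
- kind: ServiceAccount
  name: default                # the pods must run as this serviceaccount
  namespace: monitoring
```

Keeping the binding in a manifest means it can be created from the RBAC file before the other objects, in line with the ordering advice above.
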
### 2. API compatibility

The `kube-state-metrics` logs show that the `kube-state-metrics` user is not allowed to list the following resource types:

- `*v1.Job`
- `*v1.PersistentVolumeClaim`
- `*v1beta1.StatefulSet`
- `*v2alpha1.CronJob`

In the Kubernetes 1.6.0 cluster we use, the API paths differ from those used by `kube-state-metrics`, so it cannot list objects of the four resource types above. For details, see https://github.com/giantswarm/kubernetes-prometheus/issues/77

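One possible contributor to these errors, offered as an assumption rather than a confirmed diagnosis: RBAC rules match on the API group name without a version (for example `batch`, or `""` for the core group) and on the plural resource name, whereas the `kube-state-metrics` ClusterRole in this commit uses values such as `batch/v1`, `v1`, and `job`. The sketch below rewrites that ClusterRole with the group and resource names that appear in the log messages (`jobs.batch`, `cronjobs.batch`, `statefulsets.apps`, `persistentvolumeclaims`). It has not been verified against the cluster described here and may not resolve the upstream issue on its own.

```yaml
# Sketch only: kube-state-metrics ClusterRole with group/resource names
# taken from the "cannot list ..." log messages.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]              # core API group (not "v1")
  resources: ["nodes","pods","services","resourcequotas","replicationcontrollers","limitranges","persistentvolumeclaims"]
  verbs: ["list", "watch"]
- apiGroups: ["extensions"]
  resources: ["daemonsets","deployments","replicasets"]
  verbs: ["list", "watch"]
- apiGroups: ["batch"]         # group name only, no version suffix
  resources: ["jobs","cronjobs"]
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]  # plural resource name
  verbs: ["list", "watch"]
```
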
### 3. Authentication in the Job

The `grafana-import-dashboards` Job has an `init-containers` entry whose command fails. It should use:

```bash
curl -sX GET -H "Authorization:bearer `cat /var/run/secrets/kubernetes.io/serviceaccount/token`" -k https://kubernetes.default/api/v1/namespaces/monitoring/endpoints/grafana
```

There is no need to specify a certificate file; the token alone is enough, because `-k` skips certificate verification.

See [wait-for-endpoints init-containers fails to load with k8s 1.6.0 #56](https://github.com/giantswarm/kubernetes-prometheus/issues/56)

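As an illustration of how the token-only request could be wired into the Job, the fragment below sketches the init container as an `initContainers` entry in the pod spec instead of the `pod.beta.kubernetes.io/init-containers` annotation. It assumes the cluster accepts the `initContainers` field (otherwise the same command can stay in the annotation) and that the `giantswarm-tiny-tools` image provides `/bin/sh`, `curl`, and `jq`, as the Job's main container already relies on them. The image name and the endpoint URL come from the Job manifest; the polling loop itself is a hypothetical rewrite, not the upstream fix.

```yaml
# Sketch only: goes under spec.template.spec of the grafana-import-dashboards Job.
initContainers:
- name: wait-for-endpoints
  image: sz-pg-oam-docker-hub-001.tendcloud.com/library/giantswarm-tiny-tools
  imagePullPolicy: IfNotPresent
  command: ["/bin/sh", "-c"]
  args:
  - |
    # Poll the grafana Endpoints object until at least one address is listed.
    TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
    URL=https://kubernetes.default/api/v1/namespaces/monitoring/endpoints/grafana
    # -k skips TLS verification, so no CA certificate file is needed.
    # jq -e exits non-zero while the address list is empty or the response is not an Endpoints object.
    until curl -s -k -H "Authorization: Bearer $TOKEN" "$URL" \
        | jq -e '[.subsets[]?.addresses[]?] | length > 0' > /dev/null; do
      echo "waiting for endpoints..."
      sleep 1
    done
```
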
## References

[Kubernetes Setup for Prometheus and Grafana](https://github.com/giantswarm/kubernetes-prometheus)

[RBAC: role-based access control](../guide/rbac.md)

[wait-for-endpoints init-containers fails to load with k8s 1.6.0 #56](https://github.com/giantswarm/kubernetes-prometheus/issues/56)