Add yaml files for Spark standalone mode

2017-08-30 14:19:17 +08:00 · 2017-08-30 14:19:17 +08:00 · d6f2ee4f87
parent cd499091eb
commit d6f2ee4f87
13 changed files with 548 additions and 7 deletions
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -77,7 +77,7 @@
      - [5.1.2.1 Linkerd User Guide](usecases/linkerd-user-guide.md)
    - [5.1.3 Service Discovery in Microservices](usecases/service-discovery-in-microservices.md)
  - [5.2 Big Data](usecases/big-data.md)
-    - [5.2.1 Spark on Kubernetes](usecases/spark-on-kubernetes.md)
+    - [5.2.1 Spark standalone on Kubernetes](usecases/spark-standalone-on-kubernetes.md)
 - [6. Development Guide](develop/index.md)
  - [6.1 Setting Up the Development Environment](develop/developing-environment.md)
  - [6.2 Unit Tests and Integration Tests](develop/testing.md)
--- a/manifests/spark-standalone/README.md
+++ b/manifests/spark-standalone/README.md
@@ -0,0 +1,373 @@
 # Spark example
 Following this example, you will create a functional [Apache
 Spark](http://spark.apache.org/) cluster using Kubernetes and
 [Docker](http://docker.io).
 You will set up a Spark master service and a set of Spark workers using Spark's [standalone mode](http://spark.apache.org/docs/latest/spark-standalone.html).
 For the impatient expert, jump straight to the [tl;dr](#tldr)
 section.
 ### Sources
 The Docker images are heavily based on https://github.com/mattf/docker-spark
 and are curated in https://github.com/kubernetes/application-images/tree/master/spark
 The Spark UI Proxy is taken from https://github.com/aseigneurin/spark-ui-proxy.
 The PySpark examples are taken from http://stackoverflow.com/questions/4114167/checking-if-a-number-is-a-prime-number-in-python/27946768#27946768
 ## Step Zero: Prerequisites
 This example assumes that:
 - You have a Kubernetes cluster installed and running.
 - You have the `kubectl` command line tool installed in your path and configured to talk to your Kubernetes cluster.
 - Your Kubernetes cluster is running [kube-dns](https://github.com/kubernetes/dns) or an equivalent integration.
 Optionally, your Kubernetes cluster should be configured with a Loadbalancer integration (automatically configured via kube-up or GKE).
 ## Step One: Create namespace
 ```sh
 $ kubectl create -f examples/spark/namespace-spark-cluster.yaml
 ```
 Now list all namespaces:
 ```sh
 $ kubectl get namespaces
 NAME          LABELS             STATUS
 default       <none>             Active
 spark-cluster name=spark-cluster Active
 ```
 To configure kubectl to work with our namespace, we will create a new context using our current context as a base:
 ```sh
 $ CURRENT_CONTEXT=$(kubectl config view -o jsonpath='{.current-context}')
 $ USER_NAME=$(kubectl config view -o jsonpath='{.contexts[?(@.name == "'"${CURRENT_CONTEXT}"'")].context.user}')
 $ CLUSTER_NAME=$(kubectl config view -o jsonpath='{.contexts[?(@.name == "'"${CURRENT_CONTEXT}"'")].context.cluster}')
 $ kubectl config set-context spark --namespace=spark-cluster --cluster=${CLUSTER_NAME} --user=${USER_NAME}
 $ kubectl config use-context spark
 ```
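The two `jsonpath` queries above look up the entry in `contexts` whose `name` equals `current-context` and read its `user` and `cluster` fields. A minimal Python sketch of that lookup follows, assuming the kubeconfig has already been parsed into a plain dict. The context, cluster, and user names in `KUBECONFIG` are made up for illustration, and `lookup_current_context` and `set_context_command` are hypothetical helpers, not part of kubectl.

```python
# Hypothetical, trimmed-down kubeconfig as a plain dict; all names are made up.
KUBECONFIG = {
    "current-context": "dev",
    "contexts": [
        {"name": "dev", "context": {"cluster": "dev-cluster", "user": "dev-admin"}},
        {"name": "prod", "context": {"cluster": "prod-cluster", "user": "prod-admin"}},
    ],
}


def lookup_current_context(config):
    """Return (cluster, user) of the context named by 'current-context'."""
    current = config["current-context"]
    for entry in config["contexts"]:
        if entry["name"] == current:
            return entry["context"]["cluster"], entry["context"]["user"]
    raise KeyError("context %r not found" % current)


def set_context_command(config, name="spark", namespace="spark-cluster"):
    """Build the same `kubectl config set-context` command line used above."""
    cluster, user = lookup_current_context(config)
    return "kubectl config set-context %s --namespace=%s --cluster=%s --user=%s" % (
        name, namespace, cluster, user)


print(set_context_command(KUBECONFIG))
```

The new context reuses the cluster and user of the current one and only changes the default namespace, which is why the two values have to be looked up first.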
 ## Step Two: Start your Master service
 The Master [service](../../docs/user-guide/services.md) is the master service
 for a Spark cluster.
 Use the
 [`examples/spark/spark-master-controller.yaml`](spark-master-controller.yaml)
 file to create a
 [replication controller](../../docs/user-guide/replication-controller.md)
 running the Spark Master service.
 ```console
 $ kubectl create -f examples/spark/spark-master-controller.yaml
 replicationcontroller "spark-master-controller" created
 ```
 Then, use the
 [`examples/spark/spark-master-service.yaml`](spark-master-service.yaml) file to
 create a logical service endpoint that Spark workers can use to access the
 Master pod:
 ```console
 $ kubectl create -f examples/spark/spark-master-service.yaml
 service "spark-master" created
 ```
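Workers reach the Master through the service's DNS name, which is why kube-dns is listed as a prerequisite. The sketch below shows the names under which a service is typically resolvable and the resulting Spark master URL. It assumes the default cluster domain `cluster.local`, which may differ on your cluster; `service_dns_names` and `spark_master_url` are hypothetical helpers written for this illustration.

```python
def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """List the DNS names of a service, shortest first.

    The bare service name only resolves from pods in the same namespace.
    cluster_domain is an assumption (the common default), not read from the cluster.
    """
    return [
        service,
        "%s.%s" % (service, namespace),
        "%s.%s.svc.%s" % (service, namespace, cluster_domain),
    ]


def spark_master_url(host, port=7077):
    """Build the spark:// URL that workers and drivers pass to Spark."""
    return "spark://%s:%d" % (host, port)


for name in service_dns_names("spark-master", "spark-cluster"):
    print(spark_master_url(name))
```

Port 7077 matches the `spark` port declared in `spark-master-service.yaml`.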
 ### Check to see if Master is running and accessible
 ```console
 $ kubectl get pods
 NAME                            READY     STATUS    RESTARTS   AGE
 spark-master-controller-5u0q5   1/1       Running   0          8m
 ```
 Check logs to see the status of the master. (Use the pod retrieved from the previous output.)
 ```sh
 $ kubectl logs spark-master-controller-5u0q5
 starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.5.1-bin-hadoop2.6/sbin/../logs/spark--org.apache.spark.deploy.master.Master-1-spark-master-controller-g0oao.out
 Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /opt/spark-1.5.1-bin-hadoop2.6/sbin/../conf/:/opt/spark-1.5.1-bin-hadoop2.6/lib/spark-assembly-1.5.1-hadoop2.6.0.jar:/opt/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar -Xms1g -Xmx1g org.apache.spark.deploy.master.Master --ip spark-master --port 7077 --webui-port 8080
 ========================================
 15/10/27 21:25:05 INFO Master: Registered signal handlers for [TERM, HUP, INT]
 15/10/27 21:25:05 INFO SecurityManager: Changing view acls to: root
 15/10/27 21:25:05 INFO SecurityManager: Changing modify acls to: root
 15/10/27 21:25:05 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
 15/10/27 21:25:06 INFO Slf4jLogger: Slf4jLogger started
 15/10/27 21:25:06 INFO Remoting: Starting remoting
 15/10/27 21:25:06 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster@spark-master:7077]
 15/10/27 21:25:06 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
 15/10/27 21:25:07 INFO Master: Starting Spark master at spark://spark-master:7077
 15/10/27 21:25:07 INFO Master: Running Spark version 1.5.1
 15/10/27 21:25:07 INFO Utils: Successfully started service 'MasterUI' on port 8080.
 15/10/27 21:25:07 INFO MasterWebUI: Started MasterWebUI at http://spark-master:8080
 15/10/27 21:25:07 INFO Utils: Successfully started service on port 6066.
 15/10/27 21:25:07 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066
 15/10/27 21:25:07 INFO Master: I have been elected leader! New state: ALIVE
 ```
 Once the master is started, we'll want to check the Spark WebUI. In order to access the Spark WebUI, we will deploy a [specialized proxy](https://github.com/aseigneurin/spark-ui-proxy). This proxy is necessary to access worker logs from the Spark UI.
 Deploy the proxy controller with [`examples/spark/spark-ui-proxy-controller.yaml`](spark-ui-proxy-controller.yaml):
 ```console
 $ kubectl create -f examples/spark/spark-ui-proxy-controller.yaml
 replicationcontroller "spark-ui-proxy-controller" created
 ```
 We'll also need a corresponding Loadbalanced service for our Spark Proxy [`examples/spark/spark-ui-proxy-service.yaml`](spark-ui-proxy-service.yaml):
 ```console
 $ kubectl create -f examples/spark/spark-ui-proxy-service.yaml
 service "spark-ui-proxy" created
 ```
 After creating the service, you should eventually get a loadbalanced endpoint:
 ```console
 $ kubectl get svc spark-ui-proxy -o wide
 NAME             CLUSTER-IP    EXTERNAL-IP                                                              PORT(S)   AGE       SELECTOR
 spark-ui-proxy   10.0.51.107   aad59283284d611e6839606c214502b5-833417581.us-east-1.elb.amazonaws.com   80/TCP    9m        component=spark-ui-proxy
 ```
 The Spark UI in the above example output will be available at http://aad59283284d611e6839606c214502b5-833417581.us-east-1.elb.amazonaws.com
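If you want to pull the endpoint out of that output in a script, the table can be split on whitespace. This is a minimal sketch that assumes the column layout shown above and a single port written as `80/TCP`; `parse_table` and `external_url` are hypothetical helpers, and in practice `kubectl get svc -o json` is likely a more robust source than the human-readable table.

```python
SVC_OUTPUT = """\
NAME             CLUSTER-IP    EXTERNAL-IP                                                              PORT(S)   AGE       SELECTOR
spark-ui-proxy   10.0.51.107   aad59283284d611e6839606c214502b5-833417581.us-east-1.elb.amazonaws.com   80/TCP    9m        component=spark-ui-proxy
"""


def parse_table(text):
    """Turn whitespace-aligned kubectl output into a list of dicts keyed by column."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def external_url(row):
    """Return the http URL of a service row, or None if no endpoint is assigned."""
    host = row["EXTERNAL-IP"]
    if host.startswith("<"):  # e.g. <pending> or <none>
        return None
    port = row["PORT(S)"].split("/")[0]
    return "http://%s" % host if port == "80" else "http://%s:%s" % (host, port)


for row in parse_table(SVC_OUTPUT):
    print(row["NAME"], external_url(row))
```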
 If your Kubernetes cluster is not equipped with a Loadbalancer integration, you will need to use the [kubectl proxy](../../docs/user-guide/accessing-the-cluster.md#using-kubectl-proxy) to
 connect to the Spark WebUI:
 ```console
 kubectl proxy --port=8001
 ```
 The UI will then be available at
 [http://localhost:8001/api/v1/proxy/namespaces/spark-cluster/services/spark-master:8080/](http://localhost:8001/api/v1/proxy/namespaces/spark-cluster/services/spark-master:8080/).
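The proxy URL follows a fixed pattern: local proxy port, namespace, then the service name with an optional port. The sketch below only reproduces the path layout used in this README; `proxy_url` is a hypothetical helper, and later Kubernetes releases use a different path layout, so check it against your cluster's version.

```python
def proxy_url(namespace, service, port=None, local_port=8001):
    """Build a kubectl-proxy URL in the path layout used by this README."""
    target = service if port is None else "%s:%s" % (service, port)
    return "http://localhost:%d/api/v1/proxy/namespaces/%s/services/%s/" % (
        local_port, namespace, target)


# Master WebUI through the spark-master service on port 8080.
print(proxy_url("spark-cluster", "spark-master", 8080))
# Spark UI proxy service on its default port.
print(proxy_url("spark-cluster", "spark-ui-proxy"))
```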
 ## Step Three: Start your Spark workers
 The Spark workers do the heavy lifting in a Spark cluster. They
 provide execution resources and data cache capabilities for your
 program.
 The Spark workers need the Master service to be running.
 Use the [`examples/spark/spark-worker-controller.yaml`](spark-worker-controller.yaml) file to create a
 [replication controller](../../docs/user-guide/replication-controller.md) that manages the worker pods.
 ```console
 $ kubectl create -f examples/spark/spark-worker-controller.yaml
 replicationcontroller "spark-worker-controller" created
 ```
 ### Check to see if the workers are running
 If you launched the Spark WebUI, your workers should just appear in the UI when
 they're ready. (It may take a little bit to pull the images and launch the
 pods.) You can also interrogate the status in the following way:
 ```console
 $ kubectl get pods
 NAME                            READY     STATUS    RESTARTS   AGE
 spark-master-controller-5u0q5   1/1       Running   0          25m
 spark-worker-controller-e8otp   1/1       Running   0          6m
 spark-worker-controller-fiivl   1/1       Running   0          6m
 spark-worker-controller-ytc7o   1/1       Running   0          6m
 $ kubectl logs spark-master-controller-5u0q5
 [...]
 15/10/26 18:20:14 INFO Master: Registering worker 10.244.1.13:53567 with 2 cores, 6.3 GB RAM
 15/10/26 18:20:14 INFO Master: Registering worker 10.244.2.7:46195 with 2 cores, 6.3 GB RAM
 15/10/26 18:20:14 INFO Master: Registering worker 10.244.3.8:39926 with 2 cores, 6.3 GB RAM
 ```
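To check from a script that all workers have registered, you can scan the master log for the `Registering worker` lines. This is a minimal sketch that assumes the log format shown above and would need adjusting for other Spark versions; `registered_workers` is a hypothetical helper.

```python
import re

LOG = """\
15/10/26 18:20:14 INFO Master: Registering worker 10.244.1.13:53567 with 2 cores, 6.3 GB RAM
15/10/26 18:20:14 INFO Master: Registering worker 10.244.2.7:46195 with 2 cores, 6.3 GB RAM
15/10/26 18:20:14 INFO Master: Registering worker 10.244.3.8:39926 with 2 cores, 6.3 GB RAM
"""

WORKER_RE = re.compile(r"Registering worker (\S+) with (\d+) cores, ([\d.]+) GB RAM")


def registered_workers(log_text):
    """Return a list of (address, cores, ram_gb) tuples found in the master log."""
    return [
        (m.group(1), int(m.group(2)), float(m.group(3)))
        for m in WORKER_RE.finditer(log_text)
    ]


workers = registered_workers(LOG)
print(len(workers), "workers,", sum(w[1] for w in workers), "cores")
```

With `replicas: 3` in `spark-worker-controller.yaml`, three matching lines are what you would expect to see.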
 ## Step Four: Start the Zeppelin UI to launch jobs on your Spark cluster
 The Zeppelin UI pod can be used to launch jobs into the Spark cluster either via
 a web notebook frontend or the traditional Spark command line. See
 [Zeppelin](https://zeppelin.incubator.apache.org/) and
 [Spark architecture](https://spark.apache.org/docs/latest/cluster-overview.html)
 for more details.
 Deploy Zeppelin:
 ```console
 $ kubectl create -f examples/spark/zeppelin-controller.yaml
 replicationcontroller "zeppelin-controller" created
 ```
 And the corresponding service:
 ```console
 $ kubectl create -f examples/spark/zeppelin-service.yaml
 service "zeppelin" created
 ```
 Zeppelin needs the spark-master service to be running.
 ### Check to see if Zeppelin is running
 ```console
 $ kubectl get pods -l component=zeppelin
 NAME                        READY     STATUS    RESTARTS   AGE
 zeppelin-controller-ja09s   1/1       Running   0          53s
 ```
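The `-l component=zeppelin` flag is an equality-based label selector: a pod is listed only if every key in the selector is present on the pod with the same value. The sketch below illustrates that rule on plain dicts, using pod names from the sample output and the `component` labels from the manifests; `matches` is a hypothetical helper, not a Kubernetes API.

```python
def matches(selector, labels):
    """True if every selector key is present in labels with an equal value."""
    return all(labels.get(key) == value for key, value in selector.items())


# Pod names from the sample output above; labels as set in the controller templates.
PODS = {
    "spark-master-controller-5u0q5": {"component": "spark-master"},
    "spark-worker-controller-e8otp": {"component": "spark-worker"},
    "zeppelin-controller-ja09s": {"component": "zeppelin"},
}

selected = [name for name, labels in PODS.items() if matches({"component": "zeppelin"}, labels)]
print(selected)
```

The replication controllers and services in this example use the same rule through their `selector` fields to find the pods they own or route to.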
 ## Step Five: Do something with the cluster
 Now you have two choices, depending on your predilections. You can do something
 graphical with the Spark cluster, or you can stay in the CLI.
 For both choices, we will be working with this Python snippet:
 ```python
 from math import sqrt; from itertools import count, islice
 def isprime(n):
    return n > 1 and all(n%i for i in islice(count(2), int(sqrt(n)-1)))
 nums = sc.parallelize(xrange(10000000))
 print nums.filter(isprime).count()
 ```
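The snippet above targets the Python 2 interpreter bundled with this Spark image (`xrange`, `print` statement). To sanity-check the `isprime` logic without a cluster, here is a plain Python 3 sketch that applies the same test to a local `range` instead of an RDD. `count_primes` is a hypothetical helper, and the limit is kept small so it runs instantly.

```python
from itertools import count, islice
from math import sqrt


def isprime(n):
    # Trial division by 2 .. floor(sqrt(n)); any zero remainder means composite.
    return n > 1 and all(n % i for i in islice(count(2), int(sqrt(n) - 1)))


def count_primes(limit):
    """Count primes in range(limit), the local equivalent of filter(isprime).count()."""
    return sum(1 for n in range(limit) if isprime(n))


print(count_primes(100))  # 25
```

On the cluster, `sc.parallelize` splits the same range across the workers, so the filter runs in parallel rather than in a single loop.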
 ### Do something fast with pyspark!
 Simply copy and paste the Python snippet into pyspark from within the Zeppelin pod:
 ```console
 $ kubectl exec zeppelin-controller-ja09s -it pyspark
 Python 2.7.9 (default, Mar  1 2015, 12:57:24)
 [GCC 4.9.2] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.5.1
      /_/
 Using Python version 2.7.9 (default, Mar  1 2015 12:57:24)
 SparkContext available as sc, HiveContext available as sqlContext.
 >>> from math import sqrt; from itertools import count, islice
 >>>
 >>> def isprime(n):
 ...     return n > 1 and all(n%i for i in islice(count(2), int(sqrt(n)-1)))
 ...
 >>> nums = sc.parallelize(xrange(10000000))
 >>> print nums.filter(isprime).count()
 664579
 ```
 Congratulations, you now know how many prime numbers there are within the first 10 million numbers!
 ### Do something graphical and shiny!
 Creating the Zeppelin service should have yielded you a Loadbalancer endpoint:
 ```console
 $ kubectl get svc zeppelin -o wide
 NAME       CLUSTER-IP   EXTERNAL-IP                                                              PORT(S)   AGE       SELECTOR
 zeppelin   10.0.154.1   a596f143884da11e6839506c114532b5-121893930.us-east-1.elb.amazonaws.com   80/TCP    3m        component=zeppelin
 ```
 If your Kubernetes cluster does not have a Loadbalancer integration, then we will have to use port forwarding.
 Take the Zeppelin pod from before and port-forward the WebUI port:
 ```console
 $ kubectl port-forward zeppelin-controller-ja09s 8080:8080
 ```
 This forwards `localhost` 8080 to container port 8080. You can then find
 Zeppelin at [http://localhost:8080/](http://localhost:8080/).
 Once you've loaded up the Zeppelin UI, create a "New Notebook". In there we will paste our Python snippet, but we need to add a `%pyspark` hint for Zeppelin to understand it:
 ```
 %pyspark
 from math import sqrt; from itertools import count, islice
 def isprime(n):
    return n > 1 and all(n%i for i in islice(count(2), int(sqrt(n)-1)))
 nums = sc.parallelize(xrange(10000000))
 print nums.filter(isprime).count()
 ```
 After pasting in our code, press shift+enter or click the play icon to the right of our snippet. The Spark job will run and once again we'll have our result!
 ## Result
 You now have services and replication controllers for the Spark master, Spark
 workers and Spark driver.  You can take this example to the next step and start
 using the Apache Spark cluster you just created. See the
 [Spark documentation](https://spark.apache.org/documentation.html) for more
 information.
 ## tl;dr
 ```console
 kubectl create -f examples/spark
 ```
 After it's set up:
 ```console
 kubectl get pods # Make sure everything is running
 kubectl get svc -o wide # Get the Loadbalancer endpoints for spark-ui-proxy and zeppelin
 ```
 The Master UI and Zeppelin will then be available at the URLs under the `EXTERNAL-IP` field.
 You can also interact with the Spark cluster using the traditional `spark-shell` /
 `spark-submit` / `pyspark` commands by using `kubectl exec` against the
 `zeppelin-controller` pod.
 If your Kubernetes cluster does not have a Loadbalancer integration, use `kubectl proxy` and `kubectl port-forward` to access the Spark UI and Zeppelin.
 For Spark UI:
 ```console
 kubectl proxy --port=8001
 ```
 Then visit [http://localhost:8001/api/v1/proxy/namespaces/spark-cluster/services/spark-ui-proxy/](http://localhost:8001/api/v1/proxy/namespaces/spark-cluster/services/spark-ui-proxy/).
 For Zeppelin:
 ```console
 kubectl port-forward zeppelin-controller-abc123 8080:8080 &
 ```
 Then visit [http://localhost:8080/](http://localhost:8080/).
 ## Known Issues With Spark
 * This provides a Spark configuration that is restricted to the cluster network,
  meaning the Spark master is only available as a cluster service. If you need
  to submit jobs using an external client other than Zeppelin or `spark-submit` on
  the `zeppelin` pod, you will need to provide a way for your clients to get to
  the
  [`examples/spark/spark-master-service.yaml`](spark-master-service.yaml). See
  [Services](../../docs/user-guide/services.md) for more information.
 ## Known Issues With Zeppelin
 * The Zeppelin pod is large, so it may take a while to pull depending on your
  network. The size of the Zeppelin pod is something we're working on, see issue #17231.
 * Zeppelin may take some time (about a minute) on this pipeline the first time
  you run it. It seems to take considerable time to load.
 * On GKE, `kubectl port-forward` may not be stable over long periods of time. If
  you see Zeppelin go into `Disconnected` state (there will be a red dot on the
  top right as well), the `port-forward` probably failed and needs to be
  restarted. See #12179.
--- a/manifests/spark-standalone/namespace-spark-cluster.yaml
+++ b/manifests/spark-standalone/namespace-spark-cluster.yaml
@@ -0,0 +1,6 @@
 apiVersion: v1
 kind: Namespace
 metadata:
  name: "spark-cluster"
  labels:
    name: "spark-cluster"
--- a/manifests/spark-standalone/spark-ingress.yaml
+++ b/manifests/spark-standalone/spark-ingress.yaml
@@ -0,0 +1,21 @@
 apiVersion: extensions/v1beta1
 kind: Ingress
 metadata:
  name: traefik-ingress
  namespace: spark-cluster
 spec:
  rules:
    - host: spark.traefik.io
      http:
        paths:
        - path: /
          backend:
            serviceName: spark-ui-proxy
            servicePort: 80
    - host: zeppelin.traefik.io
      http:
        paths:
        - path: /
          backend:
            serviceName: zeppelin
            servicePort: 80
--- a/manifests/spark-standalone/spark-master-controller.yaml
+++ b/manifests/spark-standalone/spark-master-controller.yaml
@@ -0,0 +1,24 @@
 kind: ReplicationController
 apiVersion: v1
 metadata:
  name: spark-master-controller
  namespace: spark-cluster
 spec:
  replicas: 1
  selector:
    component: spark-master
  template:
    metadata:
      labels:
        component: spark-master
    spec:
      containers:
        - name: spark-master
          image: sz-pg-oam-docker-hub-001.tendcloud.com/library/spark:1.5.2_v1
          command: ["/start-master"]
          ports:
            - containerPort: 7077
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
--- a/manifests/spark-standalone/spark-master-service.yaml
+++ b/manifests/spark-standalone/spark-master-service.yaml
@@ -0,0 +1,15 @@
 kind: Service
 apiVersion: v1
 metadata:
  name: spark-master
  namespace: spark-cluster
 spec:
  ports:
    - port: 7077
      targetPort: 7077
      name: spark
    - port: 8080
      targetPort: 8080
      name: http
  selector:
    component: spark-master
--- a/manifests/spark-standalone/spark-ui-proxy-controller.yaml
+++ b/manifests/spark-standalone/spark-ui-proxy-controller.yaml
@@ -0,0 +1,30 @@
 kind: ReplicationController
 apiVersion: v1
 metadata:
  name: spark-ui-proxy-controller
  namespace: spark-cluster
 spec:
  replicas: 1
  selector:
    component: spark-ui-proxy
  template:
    metadata:
      labels:
        component: spark-ui-proxy
    spec:
      containers:
        - name: spark-ui-proxy
          image: sz-pg-oam-docker-hub-001.tendcloud.com/library/spark-ui-proxy:1.0
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
          args:
            - spark-master:8080
          livenessProbe:
              httpGet:
                path: /
                port: 80
              initialDelaySeconds: 120
              timeoutSeconds: 5
--- a/manifests/spark-standalone/spark-ui-proxy-service.yaml
+++ b/manifests/spark-standalone/spark-ui-proxy-service.yaml
@@ -0,0 +1,12 @@
 kind: Service
 apiVersion: v1
 metadata:
  name: spark-ui-proxy
  namespace: spark-cluster
 spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    component: spark-ui-proxy
  type: ClusterIP
--- a/manifests/spark-standalone/spark-worker-controller.yaml
+++ b/manifests/spark-standalone/spark-worker-controller.yaml
@@ -0,0 +1,24 @@
 kind: ReplicationController
 apiVersion: v1
 metadata:
  name: spark-worker-controller
  namespace: spark-cluster
 spec:
  replicas: 3
  selector:
    component: spark-worker
  template:
    metadata:
      labels:
        component: spark-worker
    spec:
      containers:
        - name: spark-worker
          image: sz-pg-oam-docker-hub-001.tendcloud.com/library/spark:1.5.2_v1
          command: ["/start-worker"]
          ports:
            - containerPort: 8081
          resources:
            requests:
              cpu: 100m
--- a/manifests/spark-standalone/zeppelin-controller.yaml
+++ b/manifests/spark-standalone/zeppelin-controller.yaml
@@ -0,0 +1,22 @@
 kind: ReplicationController
 apiVersion: v1
 metadata:
  name: zeppelin-controller
  namespace: spark-cluster
 spec:
  replicas: 1
  selector:
    component: zeppelin
  template:
    metadata:
      labels:
        component: zeppelin
    spec:
      containers:
        - name: zeppelin
          image: sz-pg-oam-docker-hub-001.tendcloud.com/library/zeppelin:0.7.1
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
--- a/manifests/spark-standalone/zeppelin-service.yaml
+++ b/manifests/spark-standalone/zeppelin-service.yaml
@@ -0,0 +1,12 @@
 kind: Service
 apiVersion: v1
 metadata:
  name: zeppelin
  namespace: spark-cluster
 spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    component: zeppelin
  type: ClusterIP
--- a/usecases/big-data.md
+++ b/usecases/big-data.md
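@@ -4,4 +4,4 @@ The Kubernetes community already has a [Big data SIG](https://github.com/kuber
 In fact, among the three popular container orchestration and scheduling architectures, Swarm, Mesos, and Kubernetes, Mesos has the best support for big data applications. Spark originally ran natively on Mesos, although it can of course also run containerized on Kubernetes.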
-[Spark on Kubernetes](spark-on-kubernetes.md)
+[Spark standalone on Kubernetes](spark-standalone-on-kubernetes.md)
--- a/usecases/spark-standalone-on-kubernetes.md
+++ b/usecases/spark-standalone-on-kubernetes.md
@@ -1,10 +1,8 @@
-# Spark on Kubernetes
+# Spark standalone on Kubernetes
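-The image provided on TenxCloud, `docker pull index.tenxcloud.com/google_containers/spark:1.5.2_v1`, cannot be downloaded.
+This project is based on Spark standalone mode, whose resource allocation, scheduling, and job status query capabilities are quite limited. To have Spark use truly native Kubernetes resource scheduling, try https://github.com/apache-spark-on-k8s/
-So I built the Spark image myself.
+I have already built the images used in this article and uploaded them to TenxCloud, so you can download them directly.
 After building, they were uploaded to the TenxCloud image registry: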
 ```
 index.tenxcloud.com/jimmy/spark:1.5.2_v1
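@@ -13,6 +11,10 @@ index.tenxcloud.com/jimmy/zeppelin:0.7.1
 For the code and usage documentation, see the GitHub repository: https://github.com/rootsongjc/spark-on-kubernetes
 The yaml files used in this article can be found in the [../manifests/spark-standalone](../manifests/spark-standalone) directory, or in the manifests directory of the https://github.com/rootsongjc/spark-on-kubernetes/ project above.
 **Note**: TenxCloud already provides the image `index.tenxcloud.com/google_containers/spark:1.5.2_v1`, but that image appears to be broken and the download always fails.
 ## Starting Spark on Kubernetes
 Create a namespace named spark-cluster; all operations are performed in this namespace.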