Ray K8s 集群部署

Github_Macaron

已于 2023-12-27 14:01:50 修改

阅读量6.3k

点赞数 23

CC 4.0 BY-SA版权

分类专栏： Ray 分布式计算系统框架文章标签： kubernetes docker 容器

于 2023-05-23 13:33:15 首次发布

本文链接：https://blog.youkuaiyun.com/Github_Lee/article/details/130825190

Ray 分布式计算系统框架专栏收录该内容

1 篇文章

订阅专栏

Ray K8s集群部署

1. Introduction

A ray cluster is a set of worker nodes connected to a common Ray head node.
- 一个 Ray 集群是与同一个主节点相连接的工作节点。Ray集群可以保持固定大小，也可以自动伸缩
Ray集群支持通过 Kuberay 项目搭建在 k8s 集群上面
Key Concept
- Ray Cluster
  - 一个ray集群包括一个主节点，以及任意数量的工作节点
  - 在k8s集群上，一个 ray node 相当于一个 pod
  - 用户可以通过提交 Jobs 的方式在Ray集群上执行任务，或者通过 ray.init 的方式与主节点连接并进行交互式响应
- Head Node
  - 主节点除了还独立运行了 Autoscaler 与 Ray Driver Process 进程以外，与其他工作节点没有区别
- Worker Node
  - 工作节点不运行任何类似主节点似的管理进程，只用来执行用户 task、actor等任务的代码
  - 它们也参与分布式调度，同时存储与分发Ray Objects在集群内存中
- Autoscaling
  - 它是一个运行在主节点的独立进程，或是在k8s head pod中的一个 sidecar container
  - 负责根据某些条件增加或减少集群中工作节点的数量
- Ray Jobs
  - A Ray job is a single application: it is the collection of Ray tasks, objects, and actors that originate from the same script. The worker that runs the Python script is known as the driver of the job.
  - 一个Ray Job 类似于一个单独的应用程序：它是来自同一个项目代码中的 Ray tasks、objects、actors的集合。最开始执行这个项目代码的 worker process 也被标记为 driver process（驱动进程）
  - 在Ray 集群中执行一个Ray Job 的方法有3种：
    - 1）推荐的方式：通过 Ray Jobs API 的方式向集群提交 Job
    - 2）直接在Ray集群中的其中一个节点上直接运行
    - 3）For experts only：使用 Ray Client 以及一份 Ray Script 远程连接Ray集群

2. Ray on Kubernetes

2.1 Get Started

使用Kuberay Operator是官方推荐的方式，operator 提供了一种k8s原生方式的管理ray集群的方法
在这个部分，我们可以学习到：
- 1）如何在k8s集群上安装以及配置 Ray 集群
- 2）部署以及监控 Ray 应用
- 3）Integrate Ray applications with Kubernetes networking

1）安装以及部署

注意事项：我之前部署Ray 的时候有可能用的就是较早版本的 Kuberay operator
1）前置工作：在本地机器上安装对应版本的 Ray release，安装kubectl，并能访问到k8s集群
- pip install -U “ray[default]” ，目的是为了能够通过 Ray Job 提交的方式与 Ray集群进行交互
- 安装 kubectl，为了能够与 k8s集群进行交互
- 安装或配置k8s集群：为了后续部署 ray集群
- 注意：本地 ray library 最好与最后集群中要用的 ray 版本一致
2）首先部署 kuberay operator：一个用来创建以及管理ray集群的特殊pod
- ~~a）根据配置文件部署 kuberay operator~~ 请查看下面的通过 Helm方式的部署
  - ```
  # 注意 create 不能用 apply 代替，issue问题已提出
  # 也可以通过下载 kuberay git 仓库，找到 kuberay operator 的配置文件来部署
  # default 文件夹下的 Namespace.yaml 与 kustomization 文件可以自定义 K8s 中 命名空间的名称，默认为 ray-system
  ! kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.4.0&timeout=90s"
  
  # 检查 kuberay-operator 已经成功运行
  ! kubectl -n ray-system get pod --selector=app.kubernetes.io/component=kuberay-operator
```
- a）使用 Helm 部署
  - Helm简介：一款开源的k8s包管理器，类似于 apt、yum、homebrew等。使用helm可以从 Chart 仓库中快速查找、下载安装软件包并通过k8s api server交互构建应用
  - 基础概念：
    - Chart：一个chart就是一个 helm package，它包括了在k8s集群上运行一个应用、工具、服务所需要的全部资源定义，类似于 homebrew formula、apt dpkg、yum rpm file等
    - Repository：仓库，用于存储和分享charts
    - Release：一个release就是一个成功在k8s集群上创建成功的 chart 实例，有点类似于镜像-容器之间的类比关系，
    - 总结：helm 在k8s集群上下载 charts后，通过 release 的方式可以多次部署安装在k8s集群中，并且 charts 可以通过 repository仓库进行存储与分享
  - 部署 kuberay operator，我使用到的是 version 1.0.0版本的
```
$ helm repo add kuberay https://ray-project.github.io/kuberay-helm/
$ helm repo update

# 记得先创建好自己所需要的命名空间：namespace
$ helm install kuberay-operator kuberay/kuberay-operator --version 1.0.0 --namespace ray-system
```

3）部署 Ray 集群

1）kuberay operator已成功执行，现在可以通过它来部署一个简单的 ray cluster

# Deploy a sample Ray Cluster CR from the KubeRay repo:
# 这个配置文件同样可以在 kuberay git 仓库中找到
$ kubectl -n ray-system apply -f https://raw.githubusercontent.com/ray-project/kuberay/master/ray-operator/config/samples/ray-cluster.autoscaler.yaml

# This Ray cluster is named `raycluster-autoscaler` because it has optional Ray Autoscaler support enabled.

2）自定义YAML配置文件部署ray集群

# 下载 kuberay 对应 git 仓库
$ git clone https://github.com/ray-project/kuberay.git
# checkout 到你想使用的 kuberay release 版本，或者不切换使用最新版本，这里我切换到 release1.0
$ cd ./kuberay
$ git checkout dcb5d3d
# 切换到官方给出的 sample YAML配置文件夹
$ cd ./ray-operator/config/samples/

# 通过kubectl apply 方式部署，可以自定义 YAML 配置文件的配置项
$ kubectl apply -f ray-cluster.autoscaler.yaml --namespace ray-system

# 等待一段时间，查看部署状态
$ kubectl get pods -o wide -n ray-system
$ kubectl get svc -o wide -n ray-system

4）在Ray集群上运行 Ray应用：

a）直接使用 kubect exec 命令在 pod 内执行

# Substitute your output from the last cell in place of "raycluster-autoscaler-head-xxxxx"

$ kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -n ray-system -- python -c "import ray; ray.init()"
# 2022-08-10 11:23:17,093 INFO worker.py:1312 -- Connecting to existing Ray cluster at address: <IP address>:6379...
# 2022-08-10 11:23:17,097 INFO worker.py:1490 -- Connected to Ray cluster.

ray.init() 初始化包含的环节以后详细分析

b）通过 Ray Job Submission的方式

# 首先对主节点的端口进行转发，端口转发时记得不要退出
# Execute this in a separate shell. 
$ kubectl port-forward service/raycluster-autoscaler-head-svc 8265:8265 -n ray-system

# 然后在本地拥有 ray release 的地方提交 ray 代码
# The following job's logs will show the Ray cluster's total resource capacity, including 3 CPUs.

$ ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
# 或者提交你自己的本地 ray_example.py 脚本
$ ray job submit --address http://localhost:8265 --working-dir . -- python remote_function.py

c）连接到 ray head 或 worker 节点内部，查看节点内部情况

# 记得 xxxx 处改为你自己的 head pod实际名称对应
# worker pod一样，只是worker pod中只用一个container，因此不需要再指定container了，去掉 --container ray-head 即可
$ sudo kubectl exec -it raycluster-autoscaler-head-xxxxx --container ray-head -n ray-system -- /bin/bash

5）清理工作：卸载Ray集群、kubera operator以及相关资源

清理工作如下：

# 先清理 Ray 集群
# Delete by reference to the RayCluster custom resource
$ kubectl delete raycluster raycluster-autoscaler -n ray-system
# 或者使用 创建Ray集群时所使用的 YAML 文件清楚
$ kubectl delete -f samples/xxx.yaml -n ray-system

# 然后清理删除 kuberay operator
# 使用 helm uninstall 即可
$ helm uninstall [your-release-name]

2.2 User Guides

2.2.1 Managed K8s services

这一步是教如何在谷歌云、亚马逊云、微软云等平台上快速建立k8s集群

2.2.2 RayCluster Configuration

这部分指导讲解了在k8s上部署 ray 集群包含的
Pod configuration: headGroupSpec and workerGroupSpecs
- Pod templates
  - resources
    - 注意：推荐使用更少但资源更大的Ray Pods，而不是更多但更小的 Ray Pods，优势在于：
      - more efficient use of each Ray pod’s shared memory object store
      - reduced communication overhead between Ray pods
      - reduced redundancy of per-pod Ray control structures such as Raylets
    - ？如果用于Serverless领域，如何解决 Ray Pods 扩容速度的问题？

2.2.3 Kuberay Autoscaling

pod 扩容过程：
$[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-2dY6UlEc-1684819698231)(C:\Users\14112\AppData\Roaming\Typora\typora-user-images\image-20230404105020184.png)]$
- Worker pod upscaling occurs through the following sequence of events:
  1. The user submits a Ray workload.
  2. Workload resource requirements are aggregated by the Ray head container and communicated to the Ray autoscaler sidecar.
  3. The autoscaler determines that a Ray worker pod must be added to satisfy the workload’s resource requirement.
  4. The autoscaler requests an addtional worker pod by incrementing the RayCluster CR’s replicas field.
  5. The KubeRay operator creates a Ray worker pod to match the new replicas specification.
  6. The Ray scheduler places the user’s workload on the new worker pod.
建议：
- 在实际应用中，Ray开发者建议将 Ray Pod资源大小设置为与 k8s node 一样大为好
- ？但是这样Ray Pod是否导致k8s资源使用率较低？是否会有安全性、隔离性的问题？
扩缩容速度：
- 可以通过 RayCluster CR 中的 autoscalerOptions field来控制扩缩容的速度，它包含有两个子fields
  - upscalingMode：
    - Conservative： upscaling是 rate-limited 的，the number of pending worker pods is at most the number of worker pods connected to the Ray cluster
    - Default： upscaling 不是 rate-limited 的
    - Aggressive： Default 的别名， upscaling 不是 rate-limited 的
  - idleTimeoutSeconds：default 60s，60s 没有任何 tasks、actors、objects任务就可以缩容
Ray Autoscaler vs. Horizontal Pod Autoscaler
- RAY 自带的伸缩器与 K8s提供的 HPA是有一定的区别的
Ray Autoscaler 与 K8s Cluster Autoscaler的关系
与 Vertical Pod Autoscaler 的关系 VPA

2.2.4 Logging

暂时看到这里

2.2.5 Using GPUs

2.2.6 Experimental Features

2.2.7 (Advanced) Deploying a static Ray Cluster without Kuberay

2.3 Examples

2.3.1 Ray AIR XGBoostTrainer on K8s

2.3.2 ML training with GPUs on k8s

3. Ray Job

可以在本地通过 Ray job submit 的方式将本地的 scripts 项目提交至 Ray 集群中执行
```
$ ray job submit --address http://localhost:8265 --working-dir your_working_directory -- python script.py
```
- This command will run the script on the Ray Cluster and wait until the job has finished. Note that it also streams the stdout of the job back to the client (hello world in this case). Ray will also make the contents of the directory passed as --working-dir available to the Ray job by downloading the directory to all nodes in your cluster.
Warning:
- When using the Ray Jobs API, the runtime environment should be specified only in the Jobs API (e.g. in ray job submit --runtime-env=... or JobSubmissionClient.submit_job(runtime_env=...)), not via ray.init(runtime_env=...) in the driver script.