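Learning Azkaban 3.x (Part 1): Installation and Deployment
Background
My company recently needed to set up a workflow job scheduling system and eventually chose the open-source framework Azkaban. These are my study notes so far; if I have misunderstood anything, corrections are welcome.
The Azkaban website describes it as follows: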
Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs. Azkaban resolves the ordering through job dependencies and provides an easy to use web user interface to maintain and track your workflows.
It also lists these features:
- Compatible with any version of Hadoop
- Easy to use web UI
- Simple web and http workflow uploads
- Project workspaces
- Scheduling of workflows
- Modular and pluginable
- Authentication and Authorization
- Tracking of user actions
- Email alerts on failure and successes
- SLA alerting and auto killing
- Retrying of failed jobs
Together, these features largely cover workflow job scheduling for the Hadoop ecosystem.
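To make "resolves the ordering through job dependencies" concrete, here is a minimal sketch of dependency-based ordering in Python. It is not Azkaban's implementation: the job names and the `execution_order` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever Azkaban does internally.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies):
    """Return job names so that every job comes after the jobs it depends on.

    `dependencies` maps a job name to the list of jobs it depends on.
    This is an illustrative helper, not part of Azkaban.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical flow: each job lists the jobs that must finish before it.
flow = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "report": ["load"],
}

print(execution_order(flow))
```

A scheduler built on this idea would run a job only once everything it depends on has succeeded; `TopologicalSorter` also raises `CycleError` when the dependencies are circular.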
Installation
Azkaban 3.0 and later supports two deployment modes: standalone server ("solo-server") mode and distributed multiple-executor mode.
Solo-server mode is suitable for small-scale use cases; it uses an embedded H