1. Concept
Elastic-Job is a distributed scheduling solution developed by Dangdang on top of Quartz. It consists of two relatively independent sub-projects, Elastic-Job-Lite and Elastic-Job-Cloud.
Its central design idea is a decentralized distributed scheduling framework. The approach grew out of Quartz's database-backed high-availability scheme; because a database provides no distributed coordination, Elastic-Job adds elastic scaling and data sharding on top of that high-availability model so that the resources of a distributed cluster can be used to the fullest.
2. Usage with Spring Boot 2.x
2.1. Preface
The job types of Elastic-Job and the sharding principles behind them are well documented elsewhere, so I will not repeat them here; this article focuses on how to use the framework in an actual project.
2.2. Key Parameters
1. ZooKeeper-based job registry center: ZookeeperConfiguration
serverLists: list of ZooKeeper servers to connect to, i.e. the ZooKeeper cluster addresses
namespace: ZooKeeper namespace; must be unique per application
2. Job configuration
jobName: name of the scheduled job
cron: cron expression for the job
shardingTotalCount: total number of shards
shardingItemParameters: shard parameters; shard index and parameter are separated by an equals sign, and multiple key-value pairs are separated by commas
Shard indexes start at 0 and must be smaller than the total shard count, e.g. 0=a,1=b,2=c; the left side of the equals sign is the shard index and the right side is the custom shard parameter. Used together with shardingTotalCount.
failover: whether to enable failover. When enabled, if a node crashes in the middle of an execution, the unfinished work of that run can be compensated on another job node. Defaults to false.
2.3. Common Job Types
Configuration:
elastic:
  job:
    testJob:
      failover: 'false'
      shardingTotalCount: 4
      jobName: testJob
      cron: '0 0 0 1 * ? '
      shardingItemParameters: 0=0
    zookeeper:
      server-lists: 'IP:port,IP:port,IP:port...'
      sessiontimeout: 30000
      namespace: test-job
Dependencies:
<dependency>
    <groupId>com.dangdang</groupId>
    <artifactId>elastic-job-lite-spring</artifactId>
</dependency>
<dependency>
    <groupId>com.dangdang</groupId>
    <artifactId>elastic-job-lite-core</artifactId>
    <exclusions>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.curator</groupId>
            <artifactId>curator-client</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.curator</groupId>
            <artifactId>curator-framework</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.curator</groupId>
            <artifactId>curator-recipes</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- The following dependencies resolve version conflicts; add them only as needed. -->
<dependency>
    <groupId>org.apache.geronimo.components</groupId>
    <artifactId>geronimo-jaspi</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>2.12.0</version>
</dependency>
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-framework</artifactId>
    <version>2.12.0</version>
</dependency>
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-client</artifactId>
    <version>2.12.0</version>
    <exclusions>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- The following dependency is for job event tracing. Enabling it requires a database (the current project's data source) and two tables, job_execution_log and job_status_trace_log, whose DDL is given below. -->
<!--<dependency>
    <groupId>com.dangdang</groupId>
    <artifactId>elastic-job-lite-lifecycle</artifactId>
    <version>2.1.5</version>
    <exclusions>
        <exclusion>
            <groupId>javax.xml.bind</groupId>
            <artifactId>jaxb-api</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.eclipse.jetty.orbit</groupId>
            <artifactId>javax.activation</artifactId>
        </exclusion>
        <exclusion>
            <groupId>com.sun.xml.bind</groupId>
            <artifactId>jaxb-impl</artifactId>
        </exclusion>
        <exclusion>
            <artifactId>javax.servlet</artifactId>
            <groupId>org.eclipse.jetty.orbit</groupId>
        </exclusion>
        <exclusion>
            <artifactId>jettison</artifactId>
            <groupId>org.codehaus.jettison</groupId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.geronimo.specs</groupId>
            <artifactId>geronimo-jms_1.1_spec</artifactId>
        </exclusion>
    </exclusions>
</dependency>-->
Event tracing table DDL:
DROP TABLE IF EXISTS `job_execution_log`;
CREATE TABLE `job_execution_log` (
  `id` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
  `job_name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
  `task_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `hostname` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `ip` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
  `sharding_item` int(11) NOT NULL,
  `execution_source` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  `failure_cause` varchar(4000) COLLATE utf8_unicode_ci DEFAULT NULL,
  `is_success` int(11) NOT NULL,
  `start_time` timestamp NULL DEFAULT NULL,
  `complete_time` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

DROP TABLE IF EXISTS `job_status_trace_log`;
CREATE TABLE `job_status_trace_log` (
  `id` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
  `job_name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
  `original_task_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `task_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `slave_id` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
  `source` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
  `execution_type` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  `sharding_item` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
  `state` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  `message` varchar(4000) COLLATE utf8_unicode_ci DEFAULT NULL,
  `creation_time` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `TASK_ID_STATE_INDEX` (`task_id`,`state`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
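Once the two tables exist and the project's data source can reach them, the tracing configuration still has to be handed to the scheduler. Below is a minimal sketch of one way to do that, assuming any SimpleJob bean, the ZookeeperRegistryCenter bean defined in section 2.3.1 below, and the application's own DataSource bean; the class name, bean names, job name and cron here are illustrative only, not part of the original project.
import javax.sql.DataSource;

import com.dangdang.ddframe.job.api.simple.SimpleJob;
import com.dangdang.ddframe.job.config.JobCoreConfiguration;
import com.dangdang.ddframe.job.config.simple.SimpleJobConfiguration;
import com.dangdang.ddframe.job.event.rdb.JobEventRdbConfiguration;
import com.dangdang.ddframe.job.lite.config.LiteJobConfiguration;
import com.dangdang.ddframe.job.lite.spring.api.SpringJobScheduler;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperRegistryCenter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class JobEventTraceConfig {

    @Bean(initMethod = "init")
    public SpringJobScheduler tracedScheduler(SimpleJob anySimpleJob,
                                              ZookeeperRegistryCenter regCenter,
                                              DataSource dataSource) {
        LiteJobConfiguration jobConfig = LiteJobConfiguration.newBuilder(
                new SimpleJobConfiguration(
                        JobCoreConfiguration.newBuilder("tracedJob", "0 0 0 1 * ?", 1).build(),
                        anySimpleJob.getClass().getCanonicalName()))
                .overwrite(true)
                .build();
        // JobEventRdbConfiguration records each execution into job_execution_log and
        // job_status_trace_log through the given DataSource (the tables match the DDL above).
        return new SpringJobScheduler(anySimpleJob, regCenter, jobConfig,
                new JobEventRdbConfiguration(dataSource));
    }
}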
2.3.1 SimpleJob: the simple job type, used for most jobs
Initialize the ZooKeeper registry center:
@Configuration
public class ZookeeperAutoConfiguration {

    @Value("${elastic.job.zookeeper.server-lists}")
    private String serverLists;

    @Value("${elastic.job.zookeeper.namespace}")
    private String namespace;

    @Value("${elastic.job.zookeeper.sessiontimeout}")
    private Integer zkSessionTimeout;

    /**
     * Registers the ZooKeeper registry center; init() connects to the cluster.
     */
    @Bean(initMethod = "init")
    public ZookeeperRegistryCenter zookeeperRegistryCenter() {
        ZookeeperConfiguration config = new ZookeeperConfiguration(serverLists, namespace);
        config.setSessionTimeoutMilliseconds(zkSessionTimeout);
        ZookeeperRegistryCenter regCenter = new ZookeeperRegistryCenter(config);
        return regCenter;
    }
}
Job exception handler (implements the JobExceptionHandler interface):
@Slf4j
public class CustomJobExceptionHandler implements JobExceptionHandler {

    @Override
    public void handleException(String jobName, Throwable cause) {
        // Passing the Throwable as the last argument lets SLF4J log the full stack trace.
        log.error("Job [{}] execution failed", jobName, cause);
    }
}
Abstract base class for simple jobs:
public abstract class AbstractSimpleJob implements SimpleJob, CommandLineRunner {

    @Resource
    protected ZookeeperRegistryCenter regCenter;

    protected void registerSimpleJob(String jobName, String cron, int shardingTotalCount,
                                     String shardingItemParameters, boolean failover) {
        new SpringJobScheduler(this, regCenter, getLiteJobConfiguration(this.getClass(), jobName, cron,
                shardingTotalCount, shardingItemParameters, failover)).init();
    }

    /**
     * @param jobClass               class implementing the SimpleJob interface
     * @param jobName                name of the scheduled job
     * @param cron                   cron expression, e.g. "0/20 * * * * ?"
     * @param shardingTotalCount     total number of shards
     * @param shardingItemParameters shard parameters, e.g. 0=A,1=B
     * @param failover               whether to enable failover
     * @return the Lite job configuration
     */
    private static LiteJobConfiguration getLiteJobConfiguration(Class<? extends SimpleJob> jobClass, String jobName,
            String cron, int shardingTotalCount, String shardingItemParameters, boolean failover) {
        return LiteJobConfiguration.newBuilder(new SimpleJobConfiguration(
                JobCoreConfiguration.newBuilder(jobName, cron, shardingTotalCount)
                        .jobProperties(JobProperties.JobPropertiesEnum.JOB_EXCEPTION_HANDLER.getKey(),
                                CustomJobExceptionHandler.class.getCanonicalName())
                        .failover(failover)
                        .shardingItemParameters(shardingItemParameters)
                        .jobParameter(shardingItemParameters)
                        .build(),
                jobClass.getCanonicalName()))
                .overwrite(true)
                .build();
    }
}
Job implementation class:
@Slf4j
@Component
public class JobTask extends AbstractSimpleJob {

    @Value("${elastic.job.testJob.jobName}")
    private String jobName;

    @Value("${elastic.job.testJob.cron}")
    private String cron;

    /**
     * Total number of shards.
     */
    @Value("${elastic.job.testJob.shardingTotalCount}")
    private Integer shardingTotalCount;

    /**
     * Mapping of shard index to custom parameter; the parameters can act as shard aliases, which helps when debugging.
     */
    @Value("${elastic.job.testJob.shardingItemParameters}")
    private String shardingItemParameters;

    /**
     * Whether to enable failover. When enabled, if a node crashes in the middle of an execution,
     * the unfinished work of that run can be compensated on another job node.
     */
    @Value("${elastic.job.testJob.failover}")
    private Boolean failover;

    @Resource
    private TestJobTaskService testJobTaskService;

    @Override
    public void execute(ShardingContext shardingContext) {
        // Job execution entry point
        log.info("Test job ------ thread id: {}, total shards: {}, current shard item: {}, task id: {}",
                // id of the current thread
                Thread.currentThread().getId(),
                // total number of shards
                shardingContext.getShardingTotalCount(),
                // shard item assigned to this execution
                shardingContext.getShardingItem(),
                // id of the task
                shardingContext.getTaskId()
        );
        testJobTaskService.testJob(shardingContext);
    }

    @Override
    public void run(String... args) {
        // Register the job with the registry center at application startup.
        registerSimpleJob(jobName, cron, shardingTotalCount, shardingItemParameters, failover);
    }
}
At this point the job runs without problems, but while using SimpleJob I hit an issue:
When shardingTotalCount is larger than the number of nodes in the service cluster, Elastic-Job's default sharding strategy assigns more than one shard to one or several machines, so those nodes run the job on multiple threads at the same time. In a distributed application this easily leads to duplicate data. My workaround was to override Elastic-Job's default sharding strategy (the strategy class and the registration step are shown below).
public class MyJobShardingStrategy implements JobShardingStrategy {

    @Override
    public Map<JobInstance, List<Integer>> sharding(List<JobInstance> jobInstances, String jobName, int shardingTotalCount) {
        if (jobInstances.isEmpty()) {
            return Collections.emptyMap();
        }
        // A simple rewrite of the default strategy: if the configured shard total exceeds the number of
        // running instances, assign exactly one shard (one thread) per machine; otherwise fall back to
        // Elastic-Job's default average-allocation behaviour.
        Map<JobInstance, List<Integer>> result = new LinkedHashMap<>();
        if (shardingTotalCount < jobInstances.size()) {
            result = shardingAliquot(jobInstances, shardingTotalCount);
            addAliquant(jobInstances, shardingTotalCount, result);
        } else {
            for (int i = 0; i < jobInstances.size(); i++) {
                result.put(jobInstances.get(i), Lists.newArrayList(i));
            }
        }
        return result;
    }

    private Map<JobInstance, List<Integer>> shardingAliquot(final List<JobInstance> shardingUnits, final int shardingTotalCount) {
        Map<JobInstance, List<Integer>> result = new LinkedHashMap<>(shardingTotalCount, 1);
        int itemCountPerSharding = shardingTotalCount / shardingUnits.size();
        int count = 0;
        for (JobInstance each : shardingUnits) {
            List<Integer> shardingItems = new ArrayList<>(itemCountPerSharding + 1);
            for (int i = count * itemCountPerSharding; i < (count + 1) * itemCountPerSharding; i++) {
                shardingItems.add(i);
            }
            result.put(each, shardingItems);
            count++;
        }
        return result;
    }

    private void addAliquant(final List<JobInstance> shardingUnits, final int shardingTotalCount, final Map<JobInstance, List<Integer>> shardingResults) {
        int aliquant = shardingTotalCount % shardingUnits.size();
        int count = 0;
        for (Map.Entry<JobInstance, List<Integer>> entry : shardingResults.entrySet()) {
            if (count < aliquant) {
                entry.getValue().add(shardingTotalCount / shardingUnits.size() * shardingUnits.size() + count);
            }
            count++;
        }
    }
}
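For the custom strategy to take effect it still has to be registered on the job configuration. Below is a minimal sketch of that step, based on the getLiteJobConfiguration method from the abstract class above; the jobShardingStrategyClass call is the only addition, everything else is unchanged.
private static LiteJobConfiguration getLiteJobConfiguration(Class<? extends SimpleJob> jobClass, String jobName,
        String cron, int shardingTotalCount, String shardingItemParameters, boolean failover) {
    return LiteJobConfiguration.newBuilder(new SimpleJobConfiguration(
            JobCoreConfiguration.newBuilder(jobName, cron, shardingTotalCount)
                    .jobProperties(JobProperties.JobPropertiesEnum.JOB_EXCEPTION_HANDLER.getKey(),
                            CustomJobExceptionHandler.class.getCanonicalName())
                    .failover(failover)
                    .shardingItemParameters(shardingItemParameters)
                    .jobParameter(shardingItemParameters)
                    .build(),
            jobClass.getCanonicalName()))
            // Register the custom strategy by its fully qualified class name.
            .jobShardingStrategyClass(MyJobShardingStrategy.class.getCanonicalName())
            .overwrite(true)
            .build();
}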
An even simpler way to address the problem above is to use DataflowJob: its fetchData and processData methods let you pre-process the data when the job fires.
2.3.2 DataFlowJob
public abstract class AbstractDataFlowJob implements DataflowJob, CommandLineRunner {

    @Resource
    protected ZookeeperRegistryCenter regCenter;

    protected void registerDataFlowJob(String jobName, String cron, int shardingTotalCount,
                                       String shardingItemParameters, boolean failover) {
        new SpringJobScheduler(this, regCenter, getLiteJobConfiguration(this.getClass(), jobName, cron,
                shardingTotalCount, shardingItemParameters, failover)).init();
    }

    /**
     * @param jobClass               class implementing the DataflowJob interface
     * @param jobName                name of the scheduled job
     * @param cron                   cron expression, e.g. "0/20 * * * * ?"
     * @param shardingTotalCount     total number of shards
     * @param shardingItemParameters shard parameters, e.g. 0=A,1=B
     * @param failover               whether to enable failover
     * @return the Lite job configuration
     */
    private static LiteJobConfiguration getLiteJobConfiguration(Class<? extends DataflowJob> jobClass, String jobName,
            String cron, int shardingTotalCount, String shardingItemParameters, boolean failover) {
        // The last constructor argument (false) disables streaming, so fetchData/processData run once per trigger.
        return LiteJobConfiguration.newBuilder(new DataflowJobConfiguration(
                JobCoreConfiguration.newBuilder(jobName, cron, shardingTotalCount)
                        .jobProperties(JobProperties.JobPropertiesEnum.JOB_EXCEPTION_HANDLER.getKey(),
                                CustomJobExceptionHandler.class.getCanonicalName())
                        .failover(failover)
                        .shardingItemParameters(shardingItemParameters)
                        .jobParameter(shardingItemParameters)
                        .build(),
                jobClass.getCanonicalName(), false))
                .overwrite(true)
                .build();
    }
}
DataflowJob implementation class:
@Slf4j
@Component
public class DataFlowJobTask extends AbstractDataFlowJob {

    @Value("${elastic.job.cardJob.jobName}")
    private String jobName;

    @Value("${elastic.job.cardJob.cron}")
    private String cron;

    /**
     * Total number of shards.
     */
    @Value("${elastic.job.cardJob.shardingTotalCount}")
    private Integer shardingTotalCount;

    /**
     * Mapping of shard index to custom parameter; the parameters can act as shard aliases, which helps when debugging.
     */
    @Value("${elastic.job.cardJob.shardingItemParameters}")
    private String shardingItemParameters;

    /**
     * Whether to enable failover. When enabled, if a node crashes in the middle of an execution,
     * the unfinished work of that run can be compensated on another job node.
     */
    @Value("${elastic.job.cardJob.failover}")
    private Boolean failover;

    @Override
    public void run(String... args) {
        registerDataFlowJob(jobName, cron, shardingTotalCount, shardingItemParameters, failover);
    }

    @Override
    public List fetchData(ShardingContext shardingContext) {
        // Read the shard parameter configured via shardingItemParameters.
        String shardingParameter = shardingContext.getShardingParameter();
        // Load this shard's page of data from the database based on the shard parameter (placeholder).
        List list = new ArrayList();
        return list;
    }

    @Override
    public void processData(ShardingContext shardingContext, List data) {
        // Process the page of data returned by fetchData here.
    }
}
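To make the fetchData/processData split concrete, here is a minimal sketch of a typed DataflowJob; Order and OrderRepository are hypothetical placeholder names, not classes from this project:
// A minimal sketch of a typed DataflowJob (hypothetical Order/OrderRepository types),
// showing how the sharding parameter drives paged loading and processing.
import java.util.List;

import javax.annotation.Resource;

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.dataflow.DataflowJob;
import org.springframework.stereotype.Component;

@Component
public class OrderDataflowJob implements DataflowJob<Order> {

    @Resource
    private OrderRepository orderRepository;

    @Override
    public List<Order> fetchData(ShardingContext shardingContext) {
        // shardingItemParameters such as "0=A,1=B" make getShardingParameter() return this
        // node's slice key, so each instance only loads its own portion of the data.
        String shardKey = shardingContext.getShardingParameter();
        return orderRepository.findUnprocessedPage(shardKey, 100);
    }

    @Override
    public void processData(ShardingContext shardingContext, List<Order> data) {
        // Items returned by fetchData are handed to processData; marking them as processed
        // here prevents the next fetch from picking them up again.
        data.forEach(order -> orderRepository.markProcessed(order.getId()));
    }
}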
The above covers the problems I ran into and the solutions I used in a real project; it is offered for reference only, and questions are welcome in the comments!
Please credit the source when reposting. Thanks!
