Why spring batch repeats steps several times if I

Posted 2019-08-20 12:33

Initially I had the following configuration:

@Configuration
public class ParallelFlowConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Tasklet tasklet() {
        return new CountingTasklet();
    }

    @Bean
    public Flow syncFlow() {
        return new FlowBuilder<Flow>("sync_flow")
                .start(stepBuilderFactory.get("sync_flow_step1")
                        .tasklet(tasklet())
                        .build())
                .next(stepBuilderFactory.get("sync_flow_step2")
                        .tasklet(tasklet())
                        //.taskExecutor(new SimpleAsyncTaskExecutor()) I am going to uncomment this line
                        .build())
                .build();
    }

    @Bean
    public Flow parallelFlow1() {
        return new FlowBuilder<Flow>("async_flow_1")
                .start(stepBuilderFactory.get("async_flow_1_step_1")
                        .tasklet(tasklet()).build())
                .build();
    }

    @Bean
    public Flow parallelFlow2() {
        return new FlowBuilder<Flow>("async_flow_2")
                .start(stepBuilderFactory.get("async_flow_2_step_1")
                        .tasklet(tasklet())
                        .build())
                .next(stepBuilderFactory.get("async_flow_2_step_2")
                        .tasklet(tasklet())
                        .build())
                .build();
    }

    @Bean
    public Flow wrapperFlow(TaskExecutor jobTaskExecutor) {
        return new FlowBuilder<Flow>("wrapperFlow")
                .start(parallelFlow1())
                .split(jobTaskExecutor)
                .add(parallelFlow2())
                .build();
    }

    @Bean
    public Job parallelJob(TaskExecutor jobTaskExecutor) {
        return jobBuilderFactory.get("sync_async_investigation_test")
                .incrementer(new RunIdIncrementer())
                .start(syncFlow())
                .next(wrapperFlow(jobTaskExecutor))
                .end()
                .build();
    }

    public static class CountingTasklet implements Tasklet {
        private final Logger logger = LoggerFactory.getLogger(CountingTasklet.class);

        @Override
        public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
            logger.info("BEFORE {} has been executed on thread {}", chunkContext.getStepContext().getStepName(), Thread.currentThread().getName());
            Thread.sleep(5000);
            logger.info("AFTER {} has been executed on thread {}", chunkContext.getStepContext().getStepName(), Thread.currentThread().getName());
            return RepeatStatus.FINISHED;
        }
    }
}

But in this case sync_flow_step1 and sync_flow_step2 are executed in the main thread. So I looked for a place to provide the taskExecutor and made this small change:

@Bean
public Flow syncFlow() {
    return new FlowBuilder<Flow>("sync_flow")
            .start(stepBuilderFactory.get("sync_flow_step1")
                    .tasklet(tasklet())
                    .build())
            .next(stepBuilderFactory.get("sync_flow_step2")
                    .tasklet(tasklet())
                    .taskExecutor(new SimpleAsyncTaskExecutor())  // <-- this line was added
                    .build())
            .build();
}

After this addition I see that sync_flow_step2 is executed 4 times (search for "BEFORE sync_flow_step2"):

2019-08-08 16:15:23.659  INFO 12436 --- [           main] h.w.c.ParallelFlowConfig$CountingTasklet : BEFORE sync_flow_step1 has been executed on thread main
2019-08-08 16:15:28.659  INFO 12436 --- [           main] h.w.c.ParallelFlowConfig$CountingTasklet : AFTER sync_flow_step1 has been executed on thread main
2019-08-08 16:15:28.659 DEBUG 12436 --- [           main] o.s.batch.core.step.tasklet.TaskletStep  : Applying contribution: [StepContribution: read=0, written=0, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
2019-08-08 16:15:28.661 DEBUG 12436 --- [           main] o.s.batch.core.step.tasklet.TaskletStep  : Saving step execution before commit: StepExecution: id=6925, version=1, name=sync_flow_step1, status=STARTED, exitStatus=EXECUTING, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0, exitDescription=
2019-08-08 16:15:28.663 DEBUG 12436 --- [           main] o.s.batch.core.step.AbstractStep         : Step execution success: id=6925
2019-08-08 16:15:28.667 DEBUG 12436 --- [           main] o.s.batch.core.step.AbstractStep         : Step execution complete: StepExecution: id=6925, version=3, name=sync_flow_step1, status=COMPLETED, exitStatus=COMPLETED, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0
2019-08-08 16:15:28.669 DEBUG 12436 --- [           main] o.s.b.core.job.flow.support.SimpleFlow   : Completed state=sync_flow.sync_flow_step1 with status=COMPLETED
2019-08-08 16:15:28.669 DEBUG 12436 --- [           main] o.s.b.core.job.flow.support.SimpleFlow   : Handling state=sync_flow.sync_flow_step2
2019-08-08 16:15:28.684  INFO 12436 --- [           main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [sync_flow_step2]
2019-08-08 16:15:28.684 DEBUG 12436 --- [           main] o.s.batch.core.step.AbstractStep         : Executing: id=6926
2019-08-08 16:15:28.687 DEBUG 12436 --- [           main] o.s.b.r.s.TaskExecutorRepeatTemplate     : Starting repeat context.
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-2] o.s.b.r.s.TaskExecutorRepeatTemplate     : Repeat operation about to start at count=4
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-1] o.s.b.r.s.TaskExecutorRepeatTemplate     : Repeat operation about to start at count=4
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-2] o.s.b.c.s.c.StepContextRepeatCallback    : Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@6b7075ed
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-2] o.s.b.c.s.c.StepContextRepeatCallback    : Chunk execution starting: queue size=0
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-1] o.s.b.c.s.c.StepContextRepeatCallback    : Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@6b7075ed
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-1] o.s.b.c.s.c.StepContextRepeatCallback    : Chunk execution starting: queue size=0
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-4] o.s.b.r.s.TaskExecutorRepeatTemplate     : Repeat operation about to start at count=4
2019-08-08 16:15:28.692  INFO 12436 --- [cTaskExecutor-2] h.w.c.ParallelFlowConfig$CountingTasklet : BEFORE sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-2
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-4] o.s.b.c.s.c.StepContextRepeatCallback    : Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@6b7075ed
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-4] o.s.b.c.s.c.StepContextRepeatCallback    : Chunk execution starting: queue size=0
2019-08-08 16:15:28.692 DEBUG 12436 --- [cTaskExecutor-3] o.s.b.r.s.TaskExecutorRepeatTemplate     : Repeat operation about to start at count=4
2019-08-08 16:15:28.693  INFO 12436 --- [cTaskExecutor-1] h.w.c.ParallelFlowConfig$CountingTasklet : BEFORE sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-1
2019-08-08 16:15:28.693  INFO 12436 --- [cTaskExecutor-4] h.w.c.ParallelFlowConfig$CountingTasklet : BEFORE sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-4
2019-08-08 16:15:28.693 DEBUG 12436 --- [cTaskExecutor-3] o.s.b.c.s.c.StepContextRepeatCallback    : Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@6b7075ed
2019-08-08 16:15:28.693 DEBUG 12436 --- [cTaskExecutor-3] o.s.b.c.s.c.StepContextRepeatCallback    : Chunk execution starting: queue size=0
2019-08-08 16:15:28.694  INFO 12436 --- [cTaskExecutor-3] h.w.c.ParallelFlowConfig$CountingTasklet : BEFORE sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-3
2019-08-08 16:15:33.692  INFO 12436 --- [cTaskExecutor-2] h.w.c.ParallelFlowConfig$CountingTasklet : AFTER sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-2
2019-08-08 16:15:33.692 DEBUG 12436 --- [cTaskExecutor-2] o.s.batch.core.step.tasklet.TaskletStep  : Applying contribution: [StepContribution: read=0, written=0, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
2019-08-08 16:15:33.693  INFO 12436 --- [cTaskExecutor-1] h.w.c.ParallelFlowConfig$CountingTasklet : AFTER sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-1
2019-08-08 16:15:33.693  INFO 12436 --- [cTaskExecutor-4] h.w.c.ParallelFlowConfig$CountingTasklet : AFTER sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-4
2019-08-08 16:15:33.694  INFO 12436 --- [cTaskExecutor-3] h.w.c.ParallelFlowConfig$CountingTasklet : AFTER sync_flow_step2 has been executed on thread SimpleAsyncTaskExecutor-3

Why does this happen? Why 4?

How can I run sync_flow on a separate thread pool without duplicating the step execution?

1 answer
爷、活的狠高调
#2 · 2019-08-20 12:47

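According to your log, TaskExecutorRepeatTemplate got involved: setting a taskExecutor on the step makes the step run its tasklet through this class, concurrently on several threads. And this class has a default throttle limit, which is 4.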

public class org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate extends org.springframework.batch.repeat.support.RepeatTemplate {

  // Field descriptor #47 I
  public static final int DEFAULT_THROTTLE_LIMIT = 4;
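
To get a feeling for why a tasklet that returns FINISHED can still be started several times, here is a toy model in plain Java. It is only a sketch under my own assumptions: ThrottledRepeatDemo, Status and repeat are hypothetical names, and the loop does not reproduce the actual code of TaskExecutorRepeatTemplate. It only shows the general idea that a loop which hands iterations to an executor cannot know an iteration will say "finished" until that iteration has returned, so in the meantime it keeps dispatching more, up to the throttle limit.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledRepeatDemo {

    /** Toy stand-in for a repeat status; this is NOT Spring Batch's RepeatStatus. */
    enum Status { FINISHED }

    /**
     * Dispatches the callback to an executor again and again until the first
     * result has come back, with at most throttleLimit callbacks in flight.
     * Returns how many times the callback was started.
     */
    static int repeat(Callable<Status> callback, int throttleLimit) throws InterruptedException {
        ExecutorService executor = Executors.newCachedThreadPool();
        Semaphore inFlight = new Semaphore(throttleLimit);
        BlockingQueue<Status> results = new LinkedBlockingQueue<>();
        AtomicInteger started = new AtomicInteger();
        try {
            do {
                inFlight.acquire(); // blocks once throttleLimit callbacks are running
                executor.execute(() -> {
                    started.incrementAndGet();
                    try {
                        results.add(callback.call());
                    } catch (Exception e) {
                        results.add(Status.FINISHED); // toy model: treat a failure as "stop"
                    } finally {
                        inFlight.release();
                    }
                });
            } while (results.isEmpty()); // no result seen yet, so dispatch another one
        } finally {
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        }
        return started.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Like CountingTasklet in the question: slow, and finished after one call.
        Callable<Status> slowTasklet = () -> {
            Thread.sleep(300);
            return Status.FINISHED;
        };
        System.out.println("callback started " + repeat(slowTasklet, 4) + " times");
    }
}
```

The exact count depends on timing, but with a slow callback it is more than one, because several callbacks are already running before the first FINISHED arrives.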
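
As for running sync_flow off the main thread without the duplication, one option might be to leave the steps without a taskExecutor and instead wrap the whole flow in a split, the same way wrapperFlow in the question does for the parallel flows. This is an untested sketch, assuming Spring Batch 4 as in the question; SyncFlowOffMainThreadConfig and syncFlowOnExecutor are hypothetical names I made up, while the syncFlow bean and the TaskExecutor bean come from the question's configuration.

```java
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;

@Configuration
public class SyncFlowOffMainThreadConfig {

    // Hypothetical wrapper bean. "syncFlow" is the flow from the question,
    // built WITHOUT .taskExecutor(...) on any of its steps.
    @Bean
    public Flow syncFlowOnExecutor(@Qualifier("syncFlow") Flow syncFlow,
                                   TaskExecutor jobTaskExecutor) {
        return new FlowBuilder<Flow>("sync_flow_on_executor")
                // A split with a single flow: the flow is handed to the executor,
                // and its steps still run one after another inside that flow.
                .split(jobTaskExecutor)
                .add(syncFlow)
                .build();
    }
}
```

The job would then start with this wrapper instead of calling syncFlow() directly. The point of the design is that the executor is applied at the flow level, so each step is executed once, rather than at the step level, where it turns the step into a multi-threaded one.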