2

We've set up a Spring Batch batch email-sender job that does the following:

  1. Read incoming email details from a JMS Queue.
  2. Don't do any processing, just pass them through...
  3. Turn the details into emails and "write" them to SMTP
  4. Repeat ad infinitum

This generally works fine, but we've discovered a massive memory leak. After 9 hours or so, can see a single FlowJob, but 748 JobExecution (all 'STARTED'), each holding 778 (!) StepExecution instances. All in all, 900 MB of stuff.

Here's the Spring config (Spring 3.1, Spring Batch 2.1.9):

<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />

<bean id="jobLauncher"
    class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor">
        <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
    </property>
</bean>

<bean id="jmsEmailFetcher" class="org.springframework.batch.item.jms.JmsItemReader">
    <property name="jmsTemplate" ref="batchEmailJmsTemplate" />
</bean>

<bean id="passthroughProcessor" class="org.springframework.batch.item.support.PassThroughItemProcessor" />

<bean id="transactionManager"
    class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

<!-- The Spring Batch *Limiter/Decider* -->

<bean id="ourLimitDecider" class="our.special.InfiniteThrottledExecutionDecider" />

<!-- The Spring Batch *Job* -->

<batch:job id="fetchEmailsJobBatch" restartable="true">
    <batch:step id="fetchEmailsStep" next="limitDecision">
        <batch:tasklet throttle-limit="10">
            <batch:chunk reader="jmsEmailFetcher" processor="passthroughProcessor"
                         writer="batchEmailService.javaMailItemWriter" commit-interval="100" skip-limit="999999999">  <!-- Might gets *lots* of MQ-not-up-yet fails -->
                <batch:skippable-exception-classes>
                    <batch:include class="org.springframework.jms.JmsException" />       <!-- Generally MQ-not-up-yet ConnectionException, or Session-closed (in tests) -->
                    <batch:include class="java.lang.IllegalStateException" />            <!-- Yuk, usually/presumably a test SMTP server isn't yet connected -->
                    <batch:include class="org.springframework.mail.MailSendException" /> <!-- SMTP down... -->
                </batch:skippable-exception-classes>
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
    <batch:decision id="limitDecision" decider="ourLimitDecider">
        <batch:next on="CONTINUE" to="fetchEmailsStep" />
        <batch:end on="COMPLETED" />
    </batch:decision>
</batch:job>

Our InfiniteThrottledExecutionDecider basically returns new FlowExecutionStatus("CONTINUE") every time, to ensure that the fetchEmailsStep gets executed at the end of the flow, and that the Job never completes - at least not until we're ready to stop it ourselves.

We're not using a database, so we expect some stuff to be held in memory, but not a complete record of everything that's ever been run...

Is there something wrong with our configuration? Or our approach?


Here's a bit from our log file for what we're told is Step #778 and what should be the one and only Job instance.

23:58:18,782 - INFO  (org.springframework.batch.core.job.SimpleStepHandler) - Duplicate step [fetchEmailsStep] detected in execution of job=[fetchEmailsJobBatch]. If either step fails, both will be executed again on restart.
23:59:52,257 - INFO  (org.springframework.batch.core.job.SimpleStepHandler) - Executing step: [fetchEmailsStep]
23:59:52,257 - DEBUG (org.springframework.batch.core.step.AbstractStep) - Executing: id=778
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Starting repeat context.
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat operation about to start at count=1
23:59:52,259 - DEBUG (org.springframework.batch.core.scope.context.StepContextRepeatCallback) - Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@1be1ee
23:59:52,259 - DEBUG (org.springframework.batch.core.scope.context.StepContextRepeatCallback) - Chunk execution starting: queue size=0
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Starting repeat context.
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat operation about to start at count=1
... 5 second JMS timeout ...
23:59:57,716 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat is complete according to policy and result value.
23:59:57,716 - DEBUG (org.springframework.batch.core.step.item.ChunkOrientedTasklet) - Inputs not busy, ended: true
23:59:57,716 - DEBUG (org.springframework.batch.core.step.tasklet.TaskletStep) - Applying contribution: [StepContribution: read=0, written=0, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
23:59:57,719 - DEBUG (org.springframework.batch.core.step.tasklet.TaskletStep) - Saving step execution before commit: StepExecution: id=778, version=1, name=fetchEmailsStep, status=STARTED, exitStatus=EXECUTING, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0, exitDescription=
23:59:57,721 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat is complete according to policy and result value.
23:59:57,721 - DEBUG (org.springframework.batch.core.step.AbstractStep) - Step execution success: id=778
23:59:57,722 - DEBUG (org.springframework.batch.core.step.AbstractStep) - Step execution complete: StepExecution: id=778, version=3, name=fetchEmailsStep, status=COMPLETED, exitStatus=COMPLETED, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Completed state=fetchEmailsJobBatch.fetchEmailsStep with status=COMPLETED
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Handling state=fetchEmailsJobBatch.limitDecision100
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Completed state=fetchEmailsJobBatch.limitDecision100 with status=CONTINUE
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Handling state=fetchEmailsJobBatch.fetchEmailsStep

Problem is, the heap dump shows 748 JobExecutions, and 581,911 StepExecutions. Where are they all coming from?

4

1 回答 1

1

好的,在回答我的问题时,解决方案是放弃 Step Flow 并使“fetchEmailsS​​tep”永远运行。

我们通过删除JmsTemplate("batchEmailJmsTemplate") 的 "readTimeout" 和子类org.springframework.batch.item.jms.JmsItemReader化以删除它为阻止您使用无限JmsTemplates 所做的检查来实现这一点 - 并添加额外的日志记录。如果您不想要覆盖,您可以将“readTimeout”设置为一个巨大的数字。

更多讨论在这里:http: //forum.springsource.org/showthread.php?132468-Massive-memory-use-with-FlowJob-repeating-Steps &p=431279

我仍然不知道为什么内存使用量如此之大......


[编辑]我也在使用JobExecutionDecider("our.special.InfiniteThrottledExecutionDecider") 来实现节流(基于<StepExecution>.writeCount),但这不得不去。我们现在使用扩展StepExecutionListenerSupport(或者您可以实现StepExecutionListener)的新类来实现相同的结果。这个 bean 的一个实例作为 a 添加<batch:listener>到我们的单<batch:step>

于 2012-11-22T09:25:51.633 回答