Commit 41f0fc8

Reinstate FAQ
Reinstated from https://docs.spring.io/spring-batch/docs/2.2.x/faq.html and will be updated incrementally over time. Resolves #3878
1 parent c6497b6 commit 41f0fc8

3 files changed

+68
-0
lines changed

spring-batch-docs/modules/ROOT/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -60,3 +60,4 @@
6060
** xref:schema-appendix.adoc[]
6161
** xref:transaction-appendix.adoc[]
6262
** xref:glossary.adoc[]
63+
** xref:faq.adoc[]
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
[[faq]]
2+
= Frequently Asked Questions
3+
4+
== Is it possible to execute jobs in multiple threads or multiple processes?
5+
6+
There are three ways to approach this - but we recommend exercising caution in the analysis of such requirements (is it really necessary?).
7+
8+
* Add a `TaskExecutor` to the step. The `StepBuilder`s provided for configuring steps have a `taskExecutor` property you can set. This works as long as the step is intrinsically restartable (effectively idempotent). The parallel job sample shows how it might work in practice: it uses a "process indicator" pattern to mark input records as complete, inside the business transaction.
9+
* Use the `PartitionStep` to split your step execution explicitly amongst several Step instances. Spring Batch has a local multi-threaded implementation of the main strategy for this (`PartitionHandler`), which makes it a great choice for I/O-intensive jobs. Remember to use `scope="step"` for the stateful components in a step executing in this fashion, so that separate instances are created per step execution and there is no cross-talk between threads.
10+
* Use the Remote Chunking approach as implemented in the `spring-batch-integration` module. This requires some durable middleware (e.g. JMS) for reliable communication between the driving step and the remote workers. The basic idea is to use a special `ItemWriter` on the driving process, and a listener pattern on the worker processes (via a `ChunkProcessor`).
11+
12+
== How can I make an item reader thread safe?
13+
14+
You can synchronize the `read()` method (e.g. by wrapping it in a delegator that does the synchronization).
15+
Remember that you will lose restartability, so the best practice is to mark the step as not restartable. To be safe (and efficient), you can also set `saveState=false` on the reader.
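
The delegator idea can be sketched in plain Java as follows. This is a minimal illustration, not framework code: `SimpleItemReader` is a hypothetical stand-in for the Spring Batch `ItemReader` contract (return the next item, or `null` when the input is exhausted), and `ListReader`, `SynchronizedReader` and `SynchronizedReaderDemo` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the ItemReader contract:
// read() returns the next item, or null when the input is exhausted.
interface SimpleItemReader<T> {
    T read() throws Exception;
}

// A reader that is NOT thread safe: a plain Iterator gives no guarantees
// when several threads call hasNext()/next() at the same time.
class ListReader<T> implements SimpleItemReader<T> {
    private final Iterator<T> iterator;

    ListReader(List<T> items) {
        this.iterator = items.iterator();
    }

    @Override
    public T read() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

// Delegator that serializes every call to read() on a single lock,
// so the wrapped reader only ever sees one caller at a time.
class SynchronizedReader<T> implements SimpleItemReader<T> {
    private final SimpleItemReader<T> delegate;

    SynchronizedReader(SimpleItemReader<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T read() throws Exception {
        return delegate.read();
    }
}

class SynchronizedReaderDemo {
    // Drains a synchronized reader from several threads and returns every item read.
    static List<Integer> drainConcurrently(int itemCount, int threadCount) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < itemCount; i++) {
            items.add(i);
        }
        SimpleItemReader<Integer> reader = new SynchronizedReader<>(new ListReader<>(items));
        List<Integer> read = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread thread = new Thread(() -> {
                try {
                    Integer item;
                    while ((item = reader.read()) != null) {
                        read.add(item);
                    }
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return read;
    }
}
```

Each item is handed to exactly one thread, but the order in which threads receive items is not deterministic. That is why the position of the reader is no longer a meaningful restart point.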
16+
17+
== What is the Spring Batch philosophy on the use of flexible strategies and default implementations? Can you add a public getter for this or that property?
18+
19+
There are many extension points in Spring Batch for the framework developer (as opposed to the implementor of business logic).
20+
We expect clients to create their own more specific strategies that can be plugged in to control things like commit intervals (`CompletionPolicy`),
21+
rules about how to deal with exceptions (`ExceptionHandler`), and many others.
22+
23+
In general we try to dissuade users from extending framework classes. The Java language doesn't give us as much flexibility as we would like to mark classes and interfaces as internal.
24+
Generally you can expect anything at the top level of the source tree in packages `org.springframework.batch.*` to be public, but not necessarily sub-classable.
25+
Extending our concrete implementations of most strategies is discouraged in favour of a composition or forking approach.
26+
If your code can use only the interfaces from Spring Batch, that gives you the greatest possible portability.
27+
28+
== How does Spring Batch differ from Quartz? Is there a place for them both in a solution?
29+
30+
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data and Quartz provides functionality for scheduling tasks.
31+
So Quartz can complement Spring Batch; the two are not mutually exclusive technologies. A common combination would be to use Quartz as a trigger for a Spring Batch job using a Cron expression
32+
and the Spring Core convenience `SchedulerFactoryBean`.
33+
34+
== How do I schedule a job with Spring Batch?
35+
36+
Use a scheduling tool. There are plenty of them out there. Examples: Quartz, Control-M, Autosys.
37+
Quartz doesn't have all the features of Control-M or Autosys - it is supposed to be lightweight.
38+
If you want something even more lightweight you can just use the OS (`cron`, `at`, etc.).
39+
40+
Simple sequential dependencies can be implemented using the job-steps model of Spring Batch, together with its non-sequential features.
41+
We think this is quite common. In fact, it makes it easier to correct a common misuse of schedulers: having hundreds of jobs configured,
42+
many of which are not independent, but depend on only one other job.
43+
44+
== How does Spring Batch allow a project to optimize for performance and scalability (through parallel processing or other means)?
45+
46+
We see this as one of the roles of the `Job` or `Step`. A specific implementation of the Step deals with the concern of breaking apart the business logic
47+
and sharing it efficiently between parallel processes or processors (see `PartitionStep`). There are a number of technologies that could play a role here.
48+
The essence is just a set of concurrent remote calls to distributed agents that can handle some business processing.
49+
Since the business processing is already typically modularised - e.g. input an item, process it - Spring Batch can strategise the distribution in a number of ways.
50+
One implementation that we have had some experience with is a set of remote web services handling the business processing.
51+
We send a specific range of primary keys for the inputs to each of a number of remote calls.
52+
The same basic strategy would work with any of the Spring Remoting protocols (plain RMI, HttpInvoker, JMS, Hessian etc.) with little more than a change of a couple of lines
53+
in the execution layer configuration.
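
The key-range idea can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the framework's partitioning API: `KeyRangePartitioner`, `KeyRange`, `partition` and `processConcurrently` are hypothetical names, the primary keys are assumed to be numeric and reasonably dense, and the "remote call" is represented by a function argument so that any transport could sit behind it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

class KeyRangePartitioner {

    // Inclusive range of primary keys handed to one remote call.
    static final class KeyRange {
        final long minId;
        final long maxId;

        KeyRange(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }
    }

    // Splits [minId, maxId] into at most gridSize contiguous ranges of near-equal size.
    static List<KeyRange> partition(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("gridSize must be >= 1 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long base = total / gridSize;
        long remainder = total % gridSize;
        List<KeyRange> ranges = new ArrayList<>();
        long start = minId;
        for (int i = 0; i < gridSize && start <= maxId; i++) {
            // The first 'remainder' ranges take one extra key each.
            long size = base + (i < remainder ? 1 : 0);
            long end = start + size - 1;
            ranges.add(new KeyRange(start, end));
            start = end + 1;
        }
        return ranges;
    }

    // Issues one call per range concurrently and sums the per-range results
    // (for example, the number of items each remote agent processed).
    static long processConcurrently(List<KeyRange> ranges, ToLongFunction<KeyRange> remoteCall) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, ranges.size()));
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (KeyRange range : ranges) {
                Callable<Long> task = () -> remoteCall.applyAsLong(range);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `KeyRangePartitioner.partition(1, 10, 3)` yields the ranges 1 to 4, 5 to 7 and 8 to 10. Swapping the transport only changes what is passed as `remoteCall`, which mirrors the point above about the execution layer configuration.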
54+
55+
== How can messaging be used to scale batch architectures?
56+
57+
There is a good deal of practical evidence from existing projects that a pipeline approach to batch processing is highly beneficial, leading to resilience and high throughput.
58+
We are often faced with mission-critical applications where audit trails are essential, and guaranteed processing is demanded, but where there are extremely tight limits
59+
on performance under load, or where high throughput gives a competitive advantage.
60+
61+
Matt Welsh's work shows that a Staged Event Driven Architecture (SEDA) has enormous benefits over more rigid processing architectures,
62+
and message-oriented middleware (JMS, AQ, MQ, Tibco etc.) gives us a lot of resilience out of the box. There are particular benefits in
63+
a system where there is feedback between downstream and upstream stages, so the number of consumers can be adjusted to account for the amount of demand.
64+
So how does this fit into Spring Batch? The spring-batch-integration project has this pattern implemented in Spring Integration,
65+
and can be used to scale up the remote processing of any step with many items to process.
66+
See in particular the "chunk" package, and the `ItemWriter` and `ChunkHandler` implementations in there.
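
The shape of such a pipeline can be sketched in plain Java as follows. This is only an illustration of the pattern, not the `spring-batch-integration` API: `ChunkPipelineSketch` and `run` are hypothetical names, and the in-memory `BlockingQueue` is an assumption standing in for the middleware, so it offers none of the durability that JMS or similar would give. The bounded queue does show the feedback effect: when the workers fall behind, the driving side blocks instead of flooding them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class ChunkPipelineSketch {

    // Marker chunk telling a worker to stop; compared by identity, never by content.
    private static final List<Integer> POISON_PILL = new ArrayList<>();

    // Sends the items to the workers in chunks and returns the sum the workers computed.
    static long run(List<Integer> items, int chunkSize, int workerCount) {
        if (chunkSize < 1 || workerCount < 1) {
            throw new IllegalArgumentException("chunkSize and workerCount must be >= 1");
        }
        // Bounded queue: put() blocks when the workers cannot keep up.
        BlockingQueue<List<Integer>> requests = new ArrayBlockingQueue<>(workerCount * 2);
        AtomicLong sum = new AtomicLong();

        // Worker side: take a chunk, run the business processing, repeat.
        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        List<Integer> chunk = requests.take();
                        if (chunk == POISON_PILL) {
                            return;
                        }
                        for (int item : chunk) {
                            sum.addAndGet(item); // placeholder for the business processing
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(worker);
            worker.start();
        }

        // Driving side: plays the role of the special writer that ships chunks out.
        try {
            for (int from = 0; from < items.size(); from += chunkSize) {
                int to = Math.min(from + chunkSize, items.size());
                requests.put(new ArrayList<>(items.subList(from, to)));
            }
            for (int w = 0; w < workerCount; w++) {
                requests.put(POISON_PILL); // one stop marker per worker
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum.get();
    }
}
```

Changing `workerCount` adjusts the number of consumers without touching the driving side, which is the property that makes this style of pipeline easy to scale.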

spring-batch-docs/modules/ROOT/pages/index.adoc

Lines changed: 1 addition & 0 deletions
@@ -43,4 +43,5 @@ xref:transaction-appendix.adoc#transactions[Batch Processing and Transactions] :
4343
boundaries, propagation, and isolation levels used in Spring Batch.
4444
<<glossary.adoc#glossary,Glossary>> :: Glossary of common terms, concepts, and vocabulary of
4545
the Batch domain.
46+
<<faq.adoc#faq,Frequently Asked Questions>> :: Frequently Asked Questions about Spring Batch.
4647
