TT#15: "Tech Talk on Spring Batch"
Satyam Barsainya
Staff Engineer @ Altimetrik | Ex-Publicis Sapient | Ex-Wipro | Java | Spring Boot | Flutter | HTML | CSS | Angular Specialist | API & Microservices Expert | Technical Blogger
Unlocking the Power of Spring Batch!
Analogy: Picture yourself as a chef in a bustling restaurant, tasked with preparing a feast for hundreds of guests. You wouldn’t attempt to cook all the dishes simultaneously. Instead, you’d divide the work into smaller parts, or “batches”: chop all the vegetables first, then marinate the meat, and finally bake the bread. Each task is manageable on its own, and by tackling them sequentially you can whip up a meal for hundreds without breaking a sweat.
Spring Batch operates on the same principle. It’s a framework within the Spring ecosystem designed to handle data in large volumes. It’s used for batch processing - that is, processing data in “batches” or chunks, rather than all at once.
Here are the technical details. A Spring Batch application is built from a few core components:
- Job: the top-level batch process, made up of one or more steps.
- Step: a single phase of a job; a chunk-oriented step reads, processes, and writes data.
- ItemReader / ItemProcessor / ItemWriter: the interfaces that read input, apply business logic, and persist results.
- JobRepository: stores metadata about job and step executions, which enables restartability.
- JobLauncher: starts a job with a given set of JobParameters.
By breaking the work into manageable chunks and handling one chunk at a time, Spring Batch enables efficient and reliable data processing.
Unlocking the Power of Spring Batch: Pros and Cons!
Pros:
- Chunk-oriented processing keeps memory usage predictable, even for very large datasets.
- Restartability: execution state is tracked in the JobRepository, so a failed job can resume from where it stopped.
- Declarative retry and skip policies make it easy to tolerate bad records.
- Ready-made readers and writers for flat files, databases, JMS, and more.
- Scales up via multi-threaded steps and scales out via partitioning and remote chunking.
Cons:
- A noticeable learning curve and configuration overhead, which can be overkill for simple one-off tasks.
- Batch-oriented by design: not a fit for real-time or streaming workloads.
- Requires a metadata schema in your database, adding operational overhead.
Unleashing the Power of Spring Batch: Real-world Marvels!
Typical use cases include:
- ETL jobs that migrate or synchronize large datasets between systems.
- End-of-day processing in banking, such as interest calculation, statement generation, and reconciliation.
- Periodic billing and invoice generation for subscription businesses.
- Scheduled bulk report generation and data archival.
In these real-world scenarios, Spring Batch proves beneficial by providing a structured and scalable approach to handling large datasets, ensuring reliability, and facilitating the automation of complex business processes.
Below is the code snippet with a detailed explanation.
@Configuration
@AllArgsConstructor
public class CustomerBatchJobConfig {

    private final CustomerRepository customerRepository;

    @Bean
    public FlatFileItemReader<Customer> itemReader() {
        FlatFileItemReader<Customer> itemReader = new FlatFileItemReader<>();
        itemReader.setResource(new FileSystemResource("src/main/resources/customers.csv"));
        itemReader.setName("csv-reader");
        itemReader.setLinesToSkip(1); // skip the CSV header row
        itemReader.setLineMapper(lineMapper());
        return itemReader;
    }

    private LineMapper<Customer> lineMapper() {
        DefaultLineMapper<Customer> lineMapper = new DefaultLineMapper<>();
        // Split each line on commas and name the resulting fields
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setDelimiter(",");
        tokenizer.setNames("id", "name", "age", "country", "email");
        tokenizer.setStrict(false);
        // Map the named fields onto Customer properties by name
        BeanWrapperFieldSetMapper<Customer> mapper = new BeanWrapperFieldSetMapper<>();
        mapper.setTargetType(Customer.class);
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(mapper);
        return lineMapper;
    }

    @Bean
    public CustomerItemProcessor processor() {
        return new CustomerItemProcessor();
    }

    @Bean
    public RepositoryItemWriter<Customer> itemWriter() {
        RepositoryItemWriter<Customer> writer = new RepositoryItemWriter<>();
        writer.setRepository(customerRepository);
        writer.setMethodName("save");
        return writer;
    }

    @Bean
    public Step step(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
        return new StepBuilder("csv-step", jobRepository)
                .<Customer, Customer>chunk(2, transactionManager)
                .reader(itemReader())
                .processor(processor())
                .writer(itemWriter())
                // .taskExecutor(taskExecutor()) // uncomment to process chunks concurrently
                .build();
    }

    // Can be wired into the step above if you want to run the batch concurrently
    private TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor asyncTaskExecutor = new SimpleAsyncTaskExecutor();
        asyncTaskExecutor.setConcurrencyLimit(10);
        return asyncTaskExecutor;
    }

    @Bean
    public Job job(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
        return new JobBuilder("csv-job", jobRepository)
                .flow(step(jobRepository, transactionManager))
                .end()
                .build();
    }
}
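The configuration above wires in a CustomerItemProcessor, which is not shown in the snippet. A minimal sketch, assuming the filtering rule described in the explanation (drop customers from Canada) and a Customer class with a getCountry() accessor:

```java
import org.springframework.batch.item.ItemProcessor;

// Hypothetical sketch of the processor referenced by processor() above.
public class CustomerItemProcessor implements ItemProcessor<Customer, Customer> {

    @Override
    public Customer process(Customer customer) {
        // Returning null tells Spring Batch to filter the item out,
        // so it never reaches the writer.
        if ("Canada".equalsIgnoreCase(customer.getCountry())) {
            return null;
        }
        return customer;
    }
}
```

Because null means "filter", the skipped items still count in the step's filterCount metric rather than being treated as errors.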
@RestController
@AllArgsConstructor
public class CustomerBatchController {

    // Both fields are injected via the Lombok-generated constructor,
    // so no @Autowired annotation is needed.
    private final JobLauncher jobLauncher;
    private final Job job;

    @GetMapping(value = "/startBatch")
    public BatchStatus startBatch() throws JobInstanceAlreadyCompleteException,
            JobExecutionAlreadyRunningException, JobParametersInvalidException, JobRestartException {
        // A unique parameter (the current timestamp) makes every launch a new job instance
        JobParameters jobParameters = new JobParametersBuilder()
                .addLong("startAt", System.currentTimeMillis())
                .toJobParameters();
        JobExecution run = jobLauncher.run(job, jobParameters);
        return run.getStatus();
    }
}
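The snippet also assumes a Customer entity and a CustomerRepository, neither of which is shown. A minimal sketch, assuming JPA plus Lombok (matching the @AllArgsConstructor usage above) and fields matching the tokenizer column names (id, name, age, country, email):

```java
// Customer.java
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import lombok.Data;

@Entity
@Data // generates the getters/setters that BeanWrapperFieldSetMapper relies on
public class Customer {
    @Id
    private Long id;
    private String name;
    private Integer age;
    private String country;
    private String email;
}

// CustomerRepository.java
import org.springframework.data.jpa.repository.JpaRepository;

// Spring Data JPA supplies the save() method used by RepositoryItemWriter.
public interface CustomerRepository extends JpaRepository<Customer, Long> {
}
```

The id comes from the CSV itself here, so no generation strategy is configured; a real project might use @GeneratedValue instead.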
# H2 Database Configuration
spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=admin
spring.datasource.password=admin
spring.h2.console.enabled=true
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
# Create the Spring Batch metadata tables on startup
spring.batch.jdbc.initialize-schema=always
# Prevent the job from running automatically at application startup
spring.batch.job.enabled=false
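The reader is configured to skip one header line and map five delimited columns, so the customers.csv under src/main/resources would look something like this (the rows are purely illustrative):

```
id,name,age,country,email
1,Alice,34,India,alice@example.com
2,Bob,41,Canada,bob@example.com
3,Charlie,29,Germany,charlie@example.com
```

With a chunk size of 2, these three rows would be processed as two chunks, and the Canadian customer would be filtered out by the processor before the write.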
Explanation:
- The job reads customer data from a CSV file, filters out customers from Canada, and writes the remaining customers to the database, reading and writing in chunks of 2.
- The CustomerItemProcessor class implements the filtering business logic; the RepositoryItemWriter persists each surviving item via the repository's save method.
- The job can be configured to run concurrently by wiring the TaskExecutor shown above into the step.
- The CustomerBatchController class exposes a GET /startBatch endpoint; its startBatch() method uses JobLauncher to launch the job with the current timestamp as a parameter, so every request starts a fresh job instance.
- The in-memory H2 database stores both the customer data and the Spring Batch job metadata, which is convenient for testing.
- The batch properties disable automatic job execution on application startup, so the job runs only when triggered through the endpoint.
I’m thrilled to share that I’ve created a ‘Hello World’ project that demonstrates the workings of Spring Batch! This project is a great starting point for anyone looking to understand the basics of Spring Batch.
You can find the repository for this project on GitLab. Feel free to clone, explore, and let the learning begin! Here’s the link to the repo: Spring Batch Example
Happy Coding!
Join the Conversation: Share this post with your friends and colleagues who are passionate about web development and tech innovation. Let's learn and grow together. Your network will thank you!