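Free-Meal Order Export: Fixing OutOfMemoryError When Reading Millions of Rows with Spring Batch Paging
Reproducing the Problem
The operations back office of the “吃喝不愁” app must support exporting all free-meal (霸王餐) orders for a given campaign, which is more than 1,000,000 rows. The first version loaded everything in a single call to JpaRepository.findAll(), which quickly exhausted the heap: after a burst of frequent Full GCs the JVM threw java.lang.OutOfMemoryError: Java heap space.
Solution: Spring Batch with Paged Reading
Spring Batch's JpaPagingItemReader reads the data page by page, so each query loads a fixed number of records and the full result set is never held in memory at once.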
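To make the paging idea concrete, here is a minimal plain-Java sketch of what a paging reader does conceptually. It is not Spring Batch code: `PagingSketch`, `fetchPage`, and the simulated table are hypothetical stand-ins for a database queried with a limit and an offset. The point it illustrates is that only one page of rows is held at any moment, regardless of the table size.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {

    /** Hypothetical stand-in for "SELECT ... ORDER BY id LIMIT :limit OFFSET :offset". */
    public static List<Long> fetchPage(long totalRows, long offset, int limit) {
        List<Long> page = new ArrayList<>(limit);
        long lastId = Math.min(offset + limit, totalRows);
        for (long id = offset + 1; id <= lastId; id++) {
            page.add(id);
        }
        return page;
    }

    /** Walks the whole table page by page; returns {rows processed, largest page held}. */
    public static long[] exportAll(long totalRows, int pageSize) {
        long processed = 0;
        long maxHeld = 0;
        long offset = 0;
        while (true) {
            List<Long> page = fetchPage(totalRows, offset, pageSize);
            if (page.isEmpty()) {
                break; // no more rows
            }
            maxHeld = Math.max(maxHeld, page.size());
            processed += page.size(); // a real job would write the page out here
            offset += page.size();
        }
        return new long[] {processed, maxHeld};
    }

    public static void main(String[] args) {
        long[] stats = exportAll(1_200_000L, 500);
        // prints processed=1200000, maxRowsInMemory=500
        System.out.println("processed=" + stats[0] + ", maxRowsInMemory=" + stats[1]);
    }
}
```

The row count and page size mirror the numbers used in this article; the memory bound comes from the loop structure, not from any Spring feature.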
Entity and Repository Definitions
// FreeMealOrder.java
package juwatech.cn.entity;

import javax.persistence.*;

@Entity
@Table(name = "free_meal_order")
public class FreeMealOrder {
    @Id
    private Long id;
    private String orderId;
    private String userId;
    private String restaurantName;
    private String status;
    // getters/setters omitted
}

// FreeMealOrderRepository.java
package juwatech.cn.repository;

import juwatech.cn.entity.FreeMealOrder;
import org.springframework.data.jpa.repository.JpaRepository;

public interface FreeMealOrderRepository extends JpaRepository<FreeMealOrder, Long> {
}

Configuring JpaPagingItemReader
package juwatech.cn.config;

import juwatech.cn.entity.FreeMealOrder;
import org.springframework.batch.item.database.JpaPagingItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import javax.persistence.EntityManagerFactory;

@Configuration
public class BatchReaderConfig {
    @Bean
    public JpaPagingItemReader<FreeMealOrder> orderReader(EntityManagerFactory entityManagerFactory) {
        JpaPagingItemReader<FreeMealOrder> reader = new JpaPagingItemReader<>();
        reader.setEntityManagerFactory(entityManagerFactory);
        // ORDER BY on the primary key keeps page boundaries stable between queries
        reader.setQueryString("SELECT o FROM FreeMealOrder o WHERE o.status = 'SUCCESS' ORDER BY o.id");
        reader.setPageSize(500); // 500 rows per page, balancing I/O against memory
        // No explicit afterPropertiesSet() call: Spring invokes it on the bean,
        // and it declares a checked exception that this method does not throw
        return reader;
    }
}
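Key points:
- setPageSize(500) controls how many records each database query loads;
- use JPQL rather than native SQL so the query stays compatible with JPA;
- never use ORDER BY RAND(), and always order by a unique key such as the primary key; otherwise consecutive pages can overlap or skip rows.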
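The ordering rule is easier to see with a small simulation. The sketch below is plain Java, not JPA: `StableOrderDemo` and its simulated table are hypothetical, and a seeded shuffle stands in for ORDER BY RAND(). Each page is treated as a fresh query, so the ordering is recomputed every time, which is what happens with offset-based paging.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class StableOrderDemo {

    /** Reads every page of a simulated table and returns the distinct ids seen. */
    public static Set<Long> readAllPages(int totalRows, int pageSize, boolean stableOrder) {
        List<Long> table = new ArrayList<>();
        for (long id = 1; id <= totalRows; id++) {
            table.add(id);
        }

        Random random = new Random(42); // fixed seed so the run is repeatable
        Set<Long> seen = new HashSet<>();
        for (int offset = 0; offset < totalRows; offset += pageSize) {
            // Each page is a separate query, so the ordering is evaluated again.
            List<Long> ordered = new ArrayList<>(table);
            if (stableOrder) {
                Collections.sort(ordered);            // like ORDER BY id
            } else {
                Collections.shuffle(ordered, random); // like ORDER BY RAND()
            }
            int end = Math.min(offset + pageSize, totalRows);
            seen.addAll(ordered.subList(offset, end));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("stable:   " + readAllPages(1000, 100, true).size() + " distinct rows");
        System.out.println("unstable: " + readAllPages(1000, 100, false).size() + " distinct rows");
    }
}
```

With a stable order, every one of the 1000 rows is seen exactly once. With a reshuffled order, the same number of rows is fetched, but some ids repeat across pages and others never appear, so the distinct count typically falls short of 1000.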
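Custom ItemWriter (Writing CSV)
package juwatech.cn.writer;

import juwatech.cn.entity.FreeMealOrder;
import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;
import javax.annotation.PreDestroy;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

@Component
public class CsvOrderWriter implements ItemWriter<FreeMealOrder> {
    private final PrintWriter writer;

    public CsvOrderWriter() throws IOException {
        Path file = Paths.get("exports", "free_meal_orders.csv");
        Files.createDirectories(file.getParent());
        // Explicit UTF-8 so the Chinese header does not depend on the platform default charset;
        // newBufferedWriter truncates an existing file, so each run overwrites the previous export
        this.writer = new PrintWriter(Files.newBufferedWriter(file, StandardCharsets.UTF_8));
        // Header columns: order ID, user ID, merchant name, status
        writer.println("订单ID,用户ID,商户名称,状态");
    }

    @Override
    public void write(List<? extends FreeMealOrder> items) {
        for (FreeMealOrder order : items) {
            writer.printf("%s,%s,%s,%s%n",
                order.getOrderId(),
                order.getUserId(),
                order.getRestaurantName(),
                order.getStatus()
            );
        }
        // Flush once per chunk so completed chunks reach the file even if the job fails later
        writer.flush();
    }

    @PreDestroy // close the file when the Spring context shuts down
    public void close() {
        writer.close();
    }
}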
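One thing the writer above does not handle is a field that itself contains a comma or a quote, for example a merchant name such as "Noodle House, East Gate", which would shift the columns of that row. A minimal sketch of the usual CSV quoting rules (as described in RFC 4180) follows. `CsvEscaper` is a hypothetical helper, not part of the original project or of Spring Batch.

```java
public class CsvEscaper {

    /** Quotes a field when it contains a comma, quote, CR, or LF; doubles embedded quotes. */
    public static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuoting = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Escapes each field and joins them into one CSV line (without the line terminator). */
    public static String toLine(String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escape(fields[i]));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // prints O1001,U42,"Noodle House, East Gate",SUCCESS
        System.out.println(toLine("O1001", "U42", "Noodle House, East Gate", "SUCCESS"));
    }
}
```

In the writer, the printf call could then be replaced by something like writer.println(CsvEscaper.toLine(...)) with the four getters as arguments.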
Job and Step Configuration
package juwatech.cn.config;

import juwatech.cn.entity.FreeMealOrder;
import juwatech.cn.writer.CsvOrderWriter;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.database.JpaPagingItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class BatchJobConfig {
    @Bean
    public Step exportOrderStep(
        StepBuilderFactory stepBuilderFactory,
        JpaPagingItemReader<FreeMealOrder> orderReader,
        CsvOrderWriter orderWriter
    ) {
        return stepBuilderFactory.get("exportOrderStep")
            .<FreeMealOrder, FreeMealOrder>chunk(100) // commit the transaction every 100 items
            .reader(orderReader)
            .writer(orderWriter)
            .build();
    }

    @Bean
    public Job exportOrderJob(JobBuilderFactory jobBuilderFactory, Step exportOrderStep) {
        return jobBuilderFactory.get("exportOrderJob")
            .start(exportOrderStep)
            .build();
    }
}
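Note: chunk(100) means that after every 100 records read, the step calls writer.write() once with those records and commits the transaction, which keeps each transaction small.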
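The chunk behaviour can be sketched in plain Java. This is a simplified model of a chunk-oriented step, not the Spring Batch implementation: `ChunkLoopSketch` and `process` are hypothetical names, an Iterator stands in for the reader, and a Consumer stands in for the writer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class ChunkLoopSketch {

    /** Reads items one at a time and hands them to the writer in chunks; returns the chunk sizes. */
    public static <T> List<Integer> process(Iterator<T> reader, int chunkSize, Consumer<List<T>> writer) {
        List<Integer> chunkSizes = new ArrayList<>();
        List<T> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            buffer.add(reader.next());
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // one write call per full chunk
                chunkSizes.add(buffer.size());          // a real step would commit here
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {                        // final, partially filled chunk
            writer.accept(new ArrayList<>(buffer));
            chunkSizes.add(buffer.size());
        }
        return chunkSizes;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            items.add(i);
        }
        List<Integer> sizes = process(items.iterator(), 100, chunk -> { /* write the chunk */ });
        System.out.println(sizes); // prints [100, 100, 50]
    }
}
```

With a chunk size of 100, 250 items produce three write calls, the last one with the remaining 50 items.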
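Clearing the First-Level Cache (Critical!)
By default, Hibernate keeps every entity it loads in the persistence context (the first-level cache). If that context is never cleared, entities from earlier pages stay reachable and memory keeps growing even though the data is read page by page. The reader's EntityManager therefore has to be cleared between pages. JpaPagingItemReader does this itself in transacted mode, which is its default in Spring Batch 4.x, so the orderReader bean only needs to keep that mode switched on and no doReadPage() override is required:
@Bean
public JpaPagingItemReader<FreeMealOrder> orderReader(EntityManagerFactory entityManagerFactory) {
    JpaPagingItemReader<FreeMealOrder> reader = new JpaPagingItemReader<>();
    reader.setEntityManagerFactory(entityManagerFactory);
    reader.setQueryString("SELECT o FROM FreeMealOrder o WHERE o.status = 'SUCCESS' ORDER BY o.id");
    reader.setPageSize(500);
    // Transacted mode (the default) flushes and clears the reader's EntityManager
    // before each page is queried, so earlier pages do not pile up in the first-level cache
    reader.setTransacted(true);
    return reader;
}
In transacted mode the reader calls entityManager.flush() and entityManager.clear() at the start of every page read, so entities from the previous page leave the persistence context and can be reclaimed by the GC.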
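JVM Parameters and Performance Verification
Startup parameters:
-Xms2g -Xmx2g -XX:+UseG1GC
Measured results:
- Data volume: 1,200,000 rows;
- Memory usage: stable at about 1.3 GB;
- Export time: 8 min 12 s;
- No Full GC.
For higher throughput, consider:
- pageSize=1000 (test the load on the database first);
- replacing the custom writer with FlatFileItemWriter for better performance.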
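As a rough illustration of the FlatFileItemWriter suggestion, the sketch below configures one with FlatFileItemWriterBuilder from Spring Batch 4.x. It was not part of the measured setup above: the class name FlatFileWriterConfig and the bean name flatFileOrderWriter are assumptions, and the field names passed to names(...) assume that FreeMealOrder has the matching getters.

```java
import juwatech.cn.entity.FreeMealOrder;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class FlatFileWriterConfig {

    @Bean
    public FlatFileItemWriter<FreeMealOrder> flatFileOrderWriter() {
        return new FlatFileItemWriterBuilder<FreeMealOrder>()
                .name("flatFileOrderWriter") // hypothetical bean/stream name
                .resource(new FileSystemResource("exports/free_meal_orders.csv"))
                .encoding("UTF-8")
                // Same header as the custom writer: order ID, user ID, merchant name, status
                .headerCallback(w -> w.write("订单ID,用户ID,商户名称,状态"))
                .delimited()
                .delimiter(",")
                // Bean property names read from each FreeMealOrder via its getters
                .names(new String[] {"orderId", "userId", "restaurantName", "status"})
                .build();
    }
}
```

To use it, the step would take this bean in .writer(...) instead of CsvOrderWriter. Because FlatFileItemWriter implements ItemStream, the step opens and closes the file around the run, so no manual close() is needed.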
Copyright of this article belongs to the 吃喝不愁 app developer team. Please credit the source when reposting!




