Preface
This article looks at how to use JDBC batching and how to configure batching for JPA.
batch
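A JDBC statement batch groups multiple insert or update operations and sends them to the database together. This improves performance, especially for large-volume inserts or updates.
Usage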
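@Test
public void testSqlInjectSafeBatch() {
    // Placeholders (?) keep the statement safe from SQL injection.
    String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
    Connection conn = null;
    PreparedStatement pstmt = null;
    try {
        conn = dataSource.getConnection();
        // Run the whole batch inside a single transaction.
        conn.setAutoCommit(false);
        pstmt = conn.prepareStatement(sql);
        for (int i = 0; i < 3; i++) {
            pstmt.setString(1, "name" + i);
            pstmt.setString(2, "city" + i);
            pstmt.setString(3, "iphone" + i);
            // Queue this row instead of executing it immediately.
            pstmt.addBatch();
        }
        // Send all queued rows to the database, then commit.
        pstmt.executeBatch();
        conn.commit();
    } catch (SQLException e) {
        e.printStackTrace();
        try {
            // conn is still null if getConnection() itself failed.
            if (conn != null) {
                conn.rollback();
            }
        } catch (SQLException e1) {
            e1.printStackTrace();
        }
    } finally {
        DbUtils.closeQuietly(pstmt);
        DbUtils.closeQuietly(conn);
    }
}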
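The test above discards the `int[]` that `executeBatch()` returns. As a minimal sketch of how that array could be inspected, the hypothetical helper below (`BatchResultSummary` is not part of the original code) counts entries by kind using the constants defined on `java.sql.Statement`. Per the JDBC Javadoc, a non-negative entry is an update count and `SUCCESS_NO_INFO` means the command succeeded without a known row count. `EXECUTE_FAILED` normally shows up in the array from `BatchUpdateException.getUpdateCounts()` when a driver keeps processing after a failure.

```java
import java.sql.Statement;

// Hypothetical helper for illustration: summarizes the update counts of a batch.
class BatchResultSummary {
    final int succeeded;      // entries with an update count >= 0
    final int successNoInfo;  // entries equal to Statement.SUCCESS_NO_INFO
    final int failed;         // entries equal to Statement.EXECUTE_FAILED

    BatchResultSummary(int succeeded, int successNoInfo, int failed) {
        this.succeeded = succeeded;
        this.successNoInfo = successNoInfo;
        this.failed = failed;
    }

    static BatchResultSummary of(int[] updateCounts) {
        int ok = 0;
        int noInfo = 0;
        int failed = 0;
        for (int count : updateCounts) {
            if (count >= 0) {
                ok++;
            } else if (count == Statement.SUCCESS_NO_INFO) {
                noInfo++;
            } else if (count == Statement.EXECUTE_FAILED) {
                failed++;
            }
        }
        return new BatchResultSummary(ok, noInfo, failed);
    }
}
```

In the test it could be called as `BatchResultSummary.of(pstmt.executeBatch())`, for example to assert that no entry failed before committing.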
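The pattern is simple: after setting the parameters for each row, call addBatch(); once every row has been added, call pstmt.executeBatch(). The drawback is that a large data set can consume a lot of memory, because the queued rows accumulate until the batch is executed. It is therefore better to execute the batch in smaller chunks: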
@Test
public void testSqlInjectSafeAndOOMSafeBatch() {
    String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
    Connection conn = null;
    PreparedStatement pstmt = null;
    final int batchSize = 1000;
    int count = 0;
    try {
        // Auto-commit is left at its default here, so there is no explicit commit.
        conn = dataSource.getConnection();
        pstmt = conn.prepareStatement(sql);
        for (int i = 0; i < 10000; i++) {
            pstmt.setString(1, "name" + i);
            pstmt.setString(2, "city" + i);
            pstmt.setString(3, "iphone" + i);
            pstmt.addBatch();
            // Execute in small chunks to avoid running out of memory.
            if (++count % batchSize == 0) {
                pstmt.executeBatch();
            }
        }
        // Execute whatever rows are left over from the last partial chunk.
        pstmt.executeBatch();
    } catch (SQLException e) {
        e.printStackTrace();
    } finally {
        DbUtils.closeQuietly(pstmt);
        DbUtils.closeQuietly(conn);
    }
}
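The chunking idea can also be looked at on its own, without a database. The sketch below is an illustration only: `BatchChunker` is a hypothetical name, and the code splits an in-memory list rather than talking to JDBC. Each resulting chunk corresponds to one round of `addBatch()` calls followed by a single `executeBatch()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: splits rows into batch-sized chunks.
class BatchChunker {
    /** Splits rows into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> chunk(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rows.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(rows.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        for (List<Integer> c : chunk(rows, 1000)) {
            // In JDBC code: addBatch() once per row here, then one executeBatch().
            System.out.println("chunk of " + c.size() + " rows");
        }
    }
}
```

With 2500 rows and a chunk size of 1000 this yields chunks of 1000, 1000 and 500 rows. With exactly 10000 rows and a chunk size of 1000 there is no partial chunk at all, so in the test above the final `executeBatch()` has nothing left to send.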
JPA batch configuration
spring:
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update
      naming:
        implicit-strategy: org.springframework.boot.orm.jpa.hibernate.SpringImplicitNamingStrategy
        physical-strategy: org.springframework.boot.orm.jpa.hibernate.SpringPhysicalNamingStrategy
    show-sql: true
    properties:
      hibernate:
        format_sql: true
        jdbc:
          batch_size: 5000
          batch_versioned_data: true
        order_inserts: true
        order_updates: true
The batch size is set through spring.jpa.properties.hibernate.jdbc.batch_size.
Example test
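Everything under `spring.jpa.properties` is handed to the JPA provider as a native property, so the setting above reaches Hibernate as `hibernate.jdbc.batch_size`. The sketch below only illustrates that key mapping; `JpaPropertyKeys` is a hypothetical class and not how Spring Boot implements it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration: maps Spring Boot keys to native provider keys.
class JpaPropertyKeys {
    static final String PREFIX = "spring.jpa.properties.";

    /** Keeps only the spring.jpa.properties.* entries and strips that prefix. */
    static Map<String, String> toProviderProperties(Map<String, String> springProps) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : springProps.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                out.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spring.jpa.show-sql", "true");
        props.put("spring.jpa.properties.hibernate.jdbc.batch_size", "5000");
        props.put("spring.jpa.properties.hibernate.order_inserts", "true");
        // Only the two spring.jpa.properties.* entries survive, without the prefix.
        System.out.println(toProviderProperties(props));
    }
}
```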
@Test
public void testJpaBatch() {
    // Build the entities to insert.
    List<DemoUser> demoUsers = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        DemoUser demoUser = new DemoUser();
        demoUser.setPrincipal("demo");
        demoUser.setAccessToken(UUID.randomUUID().toString());
        demoUser.setAuthType(UUID.randomUUID().toString());
        demoUser.setDeptName(UUID.randomUUID().toString());
        demoUser.setOrgName(UUID.randomUUID().toString());
        demoUsers.add(demoUser);
    }
    // Time only the save call.
    StopWatch stopWatch = new StopWatch("jpa batch");
    stopWatch.start();
    // save(Iterable) is the Spring Data 1.x name; 2.x renamed it to saveAll.
    demoUserDao.save(demoUsers);
    stopWatch.stop();
    System.out.println(stopWatch.prettyPrint());
}
Test results for different batch_size values
No batching configured
StopWatch 'jpa batch': running time (millis) = 21383
-----------------------------------------
ms % Task name
-----------------------------------------
21383 100%
Batch size 500
StopWatch 'jpa batch': running time (millis) = 16790
-----------------------------------------
ms % Task name
-----------------------------------------
16790 100%
Batch size 1000
StopWatch 'jpa batch': running time (millis) = 12317
-----------------------------------------
ms % Task name
-----------------------------------------
12317 100%
Batch size 5000
StopWatch 'jpa batch': running time (millis) = 13190
-----------------------------------------
ms % Task name
-----------------------------------------
13190 100%
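To make the timings easier to compare, the sketch below expresses each run as a percentage reduction relative to the run without batching. `BatchTimings` is a hypothetical helper; the numbers are the StopWatch values reported above. Each is a single measurement, so the results are indicative only.

```java
// Hypothetical helper for illustration: compares the reported StopWatch timings.
class BatchTimings {
    /** Percentage by which elapsedMillis is lower than baselineMillis. */
    static double reductionPercent(long baselineMillis, long elapsedMillis) {
        return 100.0 * (baselineMillis - elapsedMillis) / baselineMillis;
    }

    public static void main(String[] args) {
        long baseline = 21383; // run without batching
        long[][] runs = { {500, 16790}, {1000, 12317}, {5000, 13190} }; // {batch_size, millis}
        for (long[] run : runs) {
            System.out.printf("batch_size=%d: %.1f%% less time than no batching%n",
                    run[0], reductionPercent(baseline, run[1]));
        }
    }
}
```

The reductions come out at roughly 21%, 42% and 38%. In these runs, going from 1000 to 5000 did not help further.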
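Summary
For large-volume insert and update workloads, the JDBC batch setting is very useful and can noticeably improve the efficiency of bulk operations.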