An In-Depth Look at Retrieving Auto-Increment Primary Keys with MyBatis useGeneratedKeys

Posted by 方丈的寺院 on 2019-09-07

Original URL: https://juejin.im/post/5d733232f265da03d60f23db

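Abstract

We often use useGeneratedKeys to get the auto-increment primary key back and save an extra query. We also often use ON DUPLICATE KEY UPDATE to do an insertOrUpdate and avoid querying before the insert/update. Both are convenient, but they are easy to get wrong without understanding why. This post digs into how retrieving auto-increment primary keys actually works.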

The Problem

First, here are two snippets of buggy legacy code from my company.

Batch-inserting user collections

for (tries = 0; tries < MAX_RETRY; tries++) {
    final int result = collectionMapper.insertCollections(collections);
    if (result == collections.size()) {
        break;
    }
}
if (tries == MAX_RETRY) {
    throw new RuntimeSqlException("Insert collections error");
}
// The caller relies on the database-generated collectionId
return collections;

The collectionMapper.insertCollections method

<insert id="insertCollections" parameterType="list" useGeneratedKeys="true"
        keyProperty="collectionId">
    INSERT INTO collection(
    userid, item
    )
    VALUES
    <foreach collection="list" item="collection" separator=",">
        (#{collection.userId}, #{collection.item})
    </foreach>
    ON DUPLICATE KEY UPDATE
    status = 0
</insert>

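See whether you can spot the problems.

Analysis

There are two problems.

The check on the return value result is wrong

When ON DUPLICATE KEY UPDATE is used in a batch statement, the number of affected rows it returns is not the same as the number of rows submitted. This mistake comes from assuming instead of reading the documentation, which states it clearly: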

With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to the mysql_real_connect() C API function when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.

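Each row contributes one of three values: 0 (nothing changed), 1 (inserted), or 2 (updated). There is one special case: updating a row to the values it already has yields either 0 or 1, depending on the client configuration.

So comparing result with collections.size() is clearly wrong.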
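To make the arithmetic concrete, here is a minimal sketch that sums the per-row values from the quoted MySQL documentation. The class, enum, and method names are hypothetical and only model the rule; they are not part of MySQL, JDBC, or MyBatis.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the affected-rows rule for
// INSERT ... ON DUPLICATE KEY UPDATE; not a real API.
class AffectedRowsSketch {
    enum Outcome { INSERTED, UPDATED, UNCHANGED }

    // Sums the per-row values: 1 for an insert, 2 for an update,
    // and 0 (or 1 with CLIENT_FOUND_ROWS) for an unchanged row.
    static int affectedRows(List<Outcome> outcomes, boolean clientFoundRows) {
        int total = 0;
        for (Outcome outcome : outcomes) {
            switch (outcome) {
                case INSERTED:
                    total += 1;
                    break;
                case UPDATED:
                    total += 2;
                    break;
                case UNCHANGED:
                    total += clientFoundRows ? 1 : 0;
                    break;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Outcome> batch = Arrays.asList(Outcome.INSERTED, Outcome.UPDATED, Outcome.UPDATED);
        int result = affectedRows(batch, false);
        // The buggy retry loop expects result == batch.size()
        System.out.println("rows=" + batch.size() + ", affected=" + result); // rows=3, affected=5
    }
}
```

With three submitted rows the statement can legitimately report 5 affected rows, so the retry loop would run MAX_RETRY times and then throw even though every statement succeeded.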

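Using useGeneratedKeys with a batch insertOrUpdate to return auto-increment primary keys

The problem shows up as soon as a batch insert also updates existing rows: the auto-increment primary keys that come back are all wrong. Why?

1. First, let's look at how MyBatis describes useGeneratedKeys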

>This tells MyBatis to use the JDBC getGeneratedKeys method to retrieve keys generated internally by the database (e.g. auto increment fields in RDBMS like MySQL or SQL Server). Default: false.

In other words, the keys are obtained through JDBC's getGeneratedKeys method.

2. Next, let's look at the JDBC documentation (the passage below is from the MySQL Connector/J manual)

Before version 3.0 of the JDBC API, there was no standard way of retrieving key values from databases that supported auto increment or identity columns. With older JDBC drivers for MySQL, you could always use a MySQL-specific method on the Statement interface, or issue the query SELECT LAST_INSERT_ID() after issuing an INSERT to a table that had an AUTO_INCREMENT key. Using the MySQL-specific method call isn't portable, and issuing a SELECT to get the AUTO_INCREMENT key's value requires another round-trip to the database, which isn't as efficient as possible. The following code snippets demonstrate the three different ways to retrieve AUTO_INCREMENT values. First, we demonstrate the use of the new JDBC 3.0 method getGeneratedKeys() which is now the preferred method to use if you need to retrieve AUTO_INCREMENT keys and have access to JDBC 3.0. The second example shows how you can retrieve the same value using a standard SELECT LAST_INSERT_ID() query. The final example shows how updatable result sets can retrieve the AUTO_INCREMENT value when using the insertRow() method.

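In short, before JDBC 3.0 there were various ad hoc approaches and no standard; afterwards everything was unified under getGeneratedKeys(). This matches the MyBatis description. The mechanism mainly relies on the database returning a LAST_INSERT_ID, which is closely tied to the AUTO_INCREMENT value.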
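As a rough illustration of the JDBC 3.0 approach, the sketch below runs an INSERT and reads back whatever keys the driver reports. It uses only standard java.sql calls; the class name, method names, and the idea of passing the SQL as a string are assumptions made for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper, not taken from the article's code base.
class JdbcGeneratedKeysSketch {
    // Runs an INSERT and returns the keys reported by the driver.
    static List<Long> insertAndGetKeys(Connection conn, String insertSql) throws SQLException {
        // RETURN_GENERATED_KEYS asks the driver to make the keys retrievable
        try (PreparedStatement ps = conn.prepareStatement(insertSql, Statement.RETURN_GENERATED_KEYS)) {
            ps.executeUpdate();
            try (ResultSet rs = ps.getGeneratedKeys()) {
                return readKeys(rs);
            }
        }
    }

    // Collects the first column of every row in the generated-keys result set.
    static List<Long> readKeys(ResultSet rs) throws SQLException {
        List<Long> keys = new ArrayList<>();
        while (rs.next()) {
            keys.add(rs.getLong(1));
        }
        return keys;
    }
}
```

How many rows that result set contains, and which values they hold, is decided by the driver; that is what the rest of the analysis looks at.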

Let's look at how this ID is defined for AUTO_INCREMENT columns, paying particular attention to batch inserts.

For a multiple-row insert, LAST_INSERT_ID() and mysql_insert_id() actually return the AUTO_INCREMENT key from the first of the inserted rows. This enables multiple-row inserts to be reproduced correctly on other servers in a replication setup.

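For a batch insert only one ID is returned: the AUTO_INCREMENT value of the first inserted row. As for why it works this way, and how it lets MySQL keep IDs consistent in a master-slave setup, see the article linked in the original post; this post does not go into it.

So if the MySQL server returns only one ID, how does the client obtain all the IDs for a batch insert?

3. The client-side implementation

Let's look at how the client implements getGeneratedKeys.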

JDBC com.mysql.jdbc.StatementImpl

public synchronized ResultSet getGeneratedKeys() throws SQLException {
       if (!this.retrieveGeneratedKeys) {
           throw SQLError.createSQLException(Messages.getString("Statement.GeneratedKeysNotRequested"), "S1009", this.getExceptionInterceptor());
       } else if (this.batchedGeneratedKeys == null) {
           // a batch (multi-row) insert takes this branch
           return this.lastQueryIsOnDupKeyUpdate ? this.getGeneratedKeysInternal(1) : this.getGeneratedKeysInternal();
       } else {
           Field[] fields = new Field[]{new Field("", "GENERATED_KEY", -5, 17)};
           fields[0].setConnection(this.connection);
           return ResultSetImpl.getInstance(this.currentCatalog, fields, new RowDataStatic(this.batchedGeneratedKeys), this.connection, this, false);
       }
   }

Now look at the method it calls, this.getGeneratedKeysInternal():

protected ResultSet getGeneratedKeysInternal() throws SQLException {
        // get the number of affected rows
        int numKeys = this.getUpdateCount();
        return this.getGeneratedKeysInternal(numKeys);
    }

This is the key point: the driver first fetches the number of rows affected by the batch insert, and only then derives the IDs.

The getGeneratedKeysInternal method


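protected synchronized ResultSet getGeneratedKeysInternal(int numKeys) throws SQLException {
    // -5 is java.sql.Types.BIGINT
    Field[] fields = new Field[]{new Field("", "GENERATED_KEY", -5, 17)};
    fields[0].setConnection(this.connection);
    fields[0].setUseOldNameMetadata(true);
    ArrayList rowSet = new ArrayList();
    long beginAt = this.getLastInsertID();
    // one key per affected row, stepping by the auto-increment increment
    for (int i = 0; i < numKeys; ++i) {
        // (abridged excerpt: the declaration of row is omitted here)
        if (beginAt > 0L) {
            // store the value
            row[0] = StringUtils.getBytes(Long.toString(beginAt));
        }
        // (abridged excerpt: adding row to rowSet is omitted here)
        beginAt += (long) this.connection.getAutoIncrementIncrement();
    }
    // (abridged excerpt: building and returning the ResultSet from rowSet is omitted here)
}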

It loops once per affected row, generating one ID each time.
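The loop can be reproduced outside the driver with a small stand-in. This is a simplified sketch of the logic in the excerpt, not Connector/J code: the class and method names are made up, and the three inputs stand for getLastInsertID(), the update count, and getAutoIncrementIncrement().

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the driver loop; names are illustrative only.
class GeneratedKeysSketch {
    // Derives numKeys IDs starting at lastInsertId, stepping by the increment.
    static List<Long> deriveKeys(long lastInsertId, int numKeys, int autoIncrementIncrement) {
        List<Long> keys = new ArrayList<>();
        long beginAt = lastInsertId;
        for (int i = 0; i < numKeys; i++) {
            if (beginAt > 0L) {
                keys.add(beginAt);
            }
            beginAt += autoIncrementIncrement;
        }
        return keys;
    }

    public static void main(String[] args) {
        // Plain 3-row insert: the server reports 10 as the first new ID, step 1
        System.out.println(deriveKeys(10L, 3, 1)); // [10, 11, 12]
    }
}
```

The IDs are computed on the client from a single starting value, so they are only as good as the row count fed into the loop.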

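So a plain batch insert returns the correct IDs. A batch insertOrUpdate is another matter: its affected-row count is not the number of rows submitted, because each row may contribute 0, 1, or 2, and that throws the generated IDs off.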

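For example, insert 3 rows of which 2 are updates and 1 is an insert. The updateCount is then 5 (2 + 2 + 1), so 5 IDs are generated; MyBatis takes the first 3 and sets them on the objects, which is obviously wrong.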
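The scenario can be sketched as follows. Everything here is a model under stated assumptions: the Row class, the starting ID of 100, an increment of 1, and positional assignment that ignores surplus keys (the behavior described above; real MyBatis versions may differ). Note also that the getGeneratedKeys() excerpt has a branch that requests a single key when lastQueryIsOnDupKeyUpdate is set, so the exact key count may depend on the driver path; in either case the keys cannot be matched to the rows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the mismatch; not MyBatis or Connector/J code.
class KeyMismatchSketch {
    // Stand-in for a collection entity.
    static class Row {
        final String item;
        long collectionId;
        Row(String item) { this.item = item; }
    }

    // Positional assignment: the i-th key goes to the i-th row, extra keys are ignored.
    static void assignKeys(List<Row> rows, List<Long> keys) {
        for (int i = 0; i < rows.size() && i < keys.size(); i++) {
            rows.get(i).collectionId = keys.get(i);
        }
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("a")); // assumed to exist already, so it is updated
        rows.add(new Row("b")); // assumed to exist already, so it is updated
        rows.add(new Row("c")); // assumed to be new, so it is inserted

        int updateCount = 2 + 2 + 1; // 5 affected rows for 3 submitted rows
        long firstKey = 100L;        // hypothetical ID reported by the server
        List<Long> keys = new ArrayList<>();
        for (int i = 0; i < updateCount; i++) {
            keys.add(firstKey + i);  // assumes an increment step of 1
        }

        assignKeys(rows, keys);
        for (Row row : rows) {
            System.out.println(row.item + " -> " + row.collectionId);
        }
    }
}
```

Rows a and b end up with 100 and 101 although they already existed before this statement ran, which shows why the returned objects cannot be trusted.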

That covers the mechanism. For more detailed experimental results, see the experiment linked in the original post.

Summary

Batch insert

<insert id="insertAuthor" useGeneratedKeys="true"
    keyProperty="id">
  insert into Author (username, password, email, bio) values
  <foreach item="item" collection="list" separator=",">
    (#{item.username}, #{item.password}, #{item.email}, #{item.bio})
  </foreach>
</insert>

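This example is from the official documentation. Do not put an @Param annotation on the mapper method's parameter, or it will not work correctly.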