librdkafka: How to set the starting offset for a Kafka consumer

Posted by 劉近光 on 2019-02-28

Default configuration

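By default, a Kafka consumer starts consuming from the last committed offset. If no offset has been committed for the Topic+Partition and Group combination, the starting position is determined by the topic configuration property auto.offset.reset. Its default is latest, which means consumption starts at the end of the partition, so only new messages are consumed. See the official documentation for the related settings: https://kafka.apache.org/documentation/#topicconfigs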
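The rule described above can be modelled in a few lines of C. This is only an illustrative sketch, not librdkafka code: the names reset_policy_t, NO_COMMITTED_OFFSET and resolve_start_offset() are hypothetical, and the low and high arguments stand in for the oldest available offset and the offset the next produced message will receive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker meaning "the group has no committed offset". */
#define NO_COMMITTED_OFFSET (-1)

/* Hypothetical model of the two common auto.offset.reset policies. */
typedef enum { RESET_EARLIEST, RESET_LATEST } reset_policy_t;

/*
 * Returns the offset at which consumption would start.
 *   committed: last committed offset, or NO_COMMITTED_OFFSET
 *   low:       oldest offset still available in the partition
 *   high:      offset that the next produced message will get
 */
int64_t resolve_start_offset(int64_t committed, reset_policy_t policy,
                             int64_t low, int64_t high) {
        assert(low >= 0 && low <= high);
        if (committed != NO_COMMITTED_OFFSET)
                return committed;   /* resume where the group left off */
        /* No committed offset: fall back to the reset policy. */
        return policy == RESET_EARLIEST ? low : high;
}
```

With a committed offset of 42 the function returns 42 regardless of the policy; with no committed offset and the latest policy it returns the high value, which corresponds to consuming only new messages.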

Relevant API information

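librdkafka provides the assign() API: by setting the .offset field of rd_kafka_topic_partition_t, you can specify the starting offset for each partition. The offset can be an absolute offset (>= 0) or a logical offset (BEGINNING, END, STORED, TAIL(…)).
The rdkafka.h header defines rd_kafka_topic_partition_t, the structure used to manage a partition, which includes the offset field; it also defines the logical offsets as RD_KAFKA_OFFSET_XXX.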

/**
 * @brief Generic place holder for a specific Topic+Partition.
 *
 * @sa rd_kafka_topic_partition_list_new()
 */
typedef struct rd_kafka_topic_partition_s {
        char        *topic;             /**< Topic name */
        int32_t      partition;         /**< Partition */
        int64_t      offset;            /**< Offset */
        void        *metadata;          /**< Metadata */
        size_t       metadata_size;     /**< Metadata size */
        void        *opaque;            /**< Application opaque */
        rd_kafka_resp_err_t err;        /**< Error code, depending on use. */
        void       *_private;           /**< INTERNAL USE ONLY,
                                         *   INITIALIZE TO ZERO, DO NOT TOUCH */
} rd_kafka_topic_partition_t;

////////////////////////////////////////////////////////////
#define RD_KAFKA_OFFSET_BEGINNING -2  /**< Start consuming from beginning of
				       *   kafka partition queue: oldest msg */
#define RD_KAFKA_OFFSET_END       -1  /**< Start consuming from end of kafka
				       *   partition queue: next msg */
#define RD_KAFKA_OFFSET_STORED -1000  /**< Start consuming from offset retrieved
				       *   from offset store */
#define RD_KAFKA_OFFSET_INVALID -1001 /**< Invalid offset */


/** @cond NO_DOC */
#define RD_KAFKA_OFFSET_TAIL_BASE -2000 /* internal: do not use */
/** @endcond */

/**
 * @brief Start consuming \p CNT messages from topic's current end offset.
 *
 * That is, if current end offset is 12345 and \p CNT is 200, it will start
 * consuming from offset \c 12345-200 = \c 12145. */
#define RD_KAFKA_OFFSET_TAIL(CNT)  (RD_KAFKA_OFFSET_TAIL_BASE - (CNT))
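
To make the encoding of the logical offsets concrete, the sketch below classifies a value stored in .offset and works out where a TAIL(CNT) offset would start. The macro values are copied from the rdkafka.h excerpt so that the block compiles on its own (drop them if you include rdkafka.h); offset_kind_t, offset_kind() and tail_start() are hypothetical helpers, not librdkafka functions, and clamping the result at 0 is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the rdkafka.h excerpt. */
#define RD_KAFKA_OFFSET_BEGINNING -2
#define RD_KAFKA_OFFSET_END       -1
#define RD_KAFKA_OFFSET_STORED    -1000
#define RD_KAFKA_OFFSET_INVALID   -1001
#define RD_KAFKA_OFFSET_TAIL_BASE -2000
#define RD_KAFKA_OFFSET_TAIL(CNT) (RD_KAFKA_OFFSET_TAIL_BASE - (CNT))

/* Hypothetical classification of an offset value. */
typedef enum {
        KIND_ABSOLUTE, KIND_BEGINNING, KIND_END,
        KIND_STORED, KIND_TAIL, KIND_INVALID
} offset_kind_t;

/* Hypothetical helper: tell which kind of offset a value encodes. */
offset_kind_t offset_kind(int64_t offset) {
        if (offset >= 0)                         return KIND_ABSOLUTE;
        if (offset == RD_KAFKA_OFFSET_BEGINNING) return KIND_BEGINNING;
        if (offset == RD_KAFKA_OFFSET_END)       return KIND_END;
        if (offset == RD_KAFKA_OFFSET_STORED)    return KIND_STORED;
        if (offset <= RD_KAFKA_OFFSET_TAIL_BASE) return KIND_TAIL;
        return KIND_INVALID;
}

/* Hypothetical helper: first offset consumed for a TAIL(cnt) value,
 * following the header comment's arithmetic (end offset minus cnt).
 * The result is clamped at 0 (an assumption of this sketch). */
int64_t tail_start(int64_t tail_offset, int64_t end_offset) {
        int64_t cnt = RD_KAFKA_OFFSET_TAIL_BASE - tail_offset;
        assert(cnt >= 0 && end_offset >= 0);
        return cnt > end_offset ? 0 : end_offset - cnt;
}
```

For the example in the header comment, tail_start(RD_KAFKA_OFFSET_TAIL(200), 12345) gives 12145.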

The rd_kafka_assign() function sets the partitions to consume.

/**
 * @brief Atomic assignment of partitions to consume.
 *
 * The new \p partitions will replace the existing assignment.
 *
 * When used from a rebalance callback the application shall pass the
 * partition list passed to the callback (or a copy of it) (even if the list
 * is empty) rather than NULL to maintain internal join state.

 * A zero-length \p partitions will treat the partitions as a valid,
 * albeit empty, assignment, and maintain internal state, while a \c NULL
 * value for \p partitions will reset and clear the internal state.
 */
RD_EXPORT rd_kafka_resp_err_t
rd_kafka_assign (rd_kafka_t *rk,
                 const rd_kafka_topic_partition_list_t *partitions);

How to set the offset

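A consumer can change the offset of its subscribed partitions in two situations: by specifying the offset directly at initialization, or in the consumer group rebalance callback. Each is described below.
Specifying the offset at initialization: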

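rd_kafka_topic_partition_list_t *partitions;
partitions = rd_kafka_topic_partition_list_new(0);
/* Start consuming partition 3 of "mytopic" at absolute offset 1234. */
rd_kafka_topic_partition_list_add(partitions, "mytopic", 3)->offset = 1234;
rd_kafka_resp_err_t err = rd_kafka_assign(rk, partitions);
if (err)
        fprintf(stderr, "assign failed: %s\n", rd_kafka_err2str(err));
rd_kafka_topic_partition_list_destroy(partitions);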

Specifying the offset in the rebalance_cb() callback:

void my_rebalance_cb (rd_kafka_t *rk, rd_kafka_resp_err_t err,
                      rd_kafka_topic_partition_list_t *partitions, void *opaque) {
   if (err == RD_KAFKA_RESP_ERR__ASSIGN_PARTITIONS) {
       rd_kafka_topic_partition_t *part;
       /* Override the start offset only if mytopic [3] was assigned to us. */
       if ((part = rd_kafka_topic_partition_list_find(partitions, "mytopic", 3)))
           part->offset = 1234;
       rd_kafka_assign(rk, partitions);
   }  else {
       /* Partitions revoked, or a rebalance error: clear the assignment. */
       rd_kafka_assign(rk, NULL);
   }
}
