1 Entry point
/* start log manager */
// start the log management module
logManager = LogManager(config, zkUtils, brokerState, kafkaScheduler, time, brokerTopicStats)
logManager.startup()
2 Opening up the code
/**
* Start the background threads to flush logs and do log cleanup
* Starts background threads to flush logs and perform log cleanup; relies on multiple scheduled threads
*/
def startup() {
/* Schedule the cleanup task to delete old logs */
if(scheduler != null) {
info("Starting log cleanup with a period of %d ms.".format(retentionCheckMs))
scheduler.schedule("kafka-log-retention",
cleanupLogs _,
delay = InitialTaskDelayMs,
period = retentionCheckMs,
TimeUnit.MILLISECONDS)
info("Starting log flusher with a default period of %d ms.".format(flushCheckMs))
scheduler.schedule("kafka-log-flusher",
flushDirtyLogs _,
delay = InitialTaskDelayMs,
period = flushCheckMs,
TimeUnit.MILLISECONDS)
scheduler.schedule("kafka-recovery-point-checkpoint",
checkpointRecoveryPointOffsets _,
delay = InitialTaskDelayMs,
period = flushRecoveryOffsetCheckpointMs,
TimeUnit.MILLISECONDS)
scheduler.schedule("kafka-log-start-offset-checkpoint",
checkpointLogStartOffsets _,
delay = InitialTaskDelayMs,
period = flushStartOffsetCheckpointMs,
TimeUnit.MILLISECONDS)
scheduler.schedule("kafka-delete-logs",
deleteLogs _,
delay = InitialTaskDelayMs,
period = defaultConfig.fileDeleteDelayMs,
TimeUnit.MILLISECONDS)
}
if(cleanerConfig.enableCleaner)
cleaner.startup()
}
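All of startup() amounts to registering periodic jobs on a shared scheduler (KafkaScheduler wraps a ScheduledThreadPoolExecutor underneath). A minimal plain-Java analogy, with toy task bodies and delays, not Kafka's actual scheduler:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StartupSketch {
    // Schedule one recurring job and report whether it fired `times` times.
    static boolean fired(int times, long periodMs) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        CountDownLatch latch = new CountDownLatch(times);
        // analogous to scheduler.schedule("kafka-log-retention", cleanupLogs _, delay, period, unit)
        scheduler.scheduleAtFixedRate(latch::countDown, periodMs, periodMs, TimeUnit.MILLISECONDS);
        try {
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // in LogManager the initial delay (InitialTaskDelayMs) is 30s; here it is tiny
        System.out.println("fired: " + fired(3, 5));
    }
}
```

Each of the five tasks above (retention, flusher, two checkpointers, delete-logs) is just such a recurring job with its own period taken from the configuration.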
3 Core code
3.1 Related configuration
- log.cleaner.threads, default 1: the number of threads used to clean up old logs (used for log compaction).
- log.cleaner.dedupe.buffer.size, default 128MB: the memory buffer used when cleaning up old data; when compaction is the chosen cleanup policy, it holds the deduplication map used to sort out repeated keys (used for log compaction).
- log.cleaner.io.buffer.load.factor, default 0.9: the load factor of that deduplication buffer, which is hash-based; the smaller the factor, the lower the chance of bucket collisions, but the higher the memory overhead (used for log compaction).
- log.cleaner.io.buffer.size, default 512KB: the I/O buffer size used when cleaning up old data (used for log compaction).
- message.max.bytes, default 1000012 bytes: the maximum size of a single message.
- log.cleaner.io.max.bytes.per.second: caps the I/O rate of old-data cleanup; unlimited by default (used for log compaction).
- log.cleaner.backoff.ms, default 15 seconds: the interval at which to check whether any log needs cleaning (mainly used during log compaction).
- log.cleaner.enable: whether periodic log cleaning is enabled; enabled by default.
- num.recovery.threads.per.data.dir, default 1: the number of threads used per data directory for log recovery at startup.
- log.flush.scheduler.interval.ms: how often to check whether logs need to be flushed to disk; by default no check is performed.
- log.flush.offset.checkpoint.interval.ms, default 60000ms: the interval at which each partition's offset is checkpointed.
- log.retention.check.interval.ms, default 5 minutes: the interval at which log retention is checked.
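Two of the compaction settings above interact directly: log.cleaner.dedupe.buffer.size and log.cleaner.io.buffer.load.factor together bound how many distinct keys one cleaning pass can deduplicate. A back-of-the-envelope sketch; the 24-byte entry size (a 16-byte key hash plus an 8-byte offset) is an assumption about the cleaner's offset map, not something stated above:

```java
public class DedupeBufferMath {
    // assumed layout of one dedupe-map entry: 16-byte hash + 8-byte offset
    static final int BYTES_PER_ENTRY = 16 + 8;

    // maximum number of distinct keys one cleaning pass can track
    static long maxEntries(long dedupeBufferSize, double loadFactor) {
        return (long) (dedupeBufferSize * loadFactor) / BYTES_PER_ENTRY;
    }

    public static void main(String[] args) {
        // with the defaults: 128MB buffer at a 0.9 load factor
        System.out.println("~" + maxEntries(128L * 1024 * 1024, 0.9) + " keys per pass");
    }
}
```

Lowering the load factor leaves more of the buffer unused (fewer collisions, as the list notes), so fewer keys fit per pass.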
3.2 Startup steps (ZK module)
// topic configs are first read from ZooKeeper; that part is not covered here
val cleanerConfig = CleanerConfig(numThreads = config.logCleanerThreads,
dedupeBufferSize = config.logCleanerDedupeBufferSize,
dedupeBufferLoadFactor = config.logCleanerDedupeBufferLoadFactor,
ioBufferSize = config.logCleanerIoBufferSize,
maxMessageSize = config.messageMaxBytes,
maxIoBytesPerSecond = config.logCleanerIoMaxBytesPerSecond,
backOffMs = config.logCleanerBackoffMs,
enableCleaner = config.logCleanerEnable)
new LogManager(logDirs = config.logDirs.map(new File(_)).toArray,
topicConfigs = topicConfigs,
defaultConfig = defaultLogConfig,
cleanerConfig = cleanerConfig,
ioThreads = config.numRecoveryThreadsPerDataDir,
flushCheckMs = config.logFlushSchedulerIntervalMs,
flushRecoveryOffsetCheckpointMs = config.logFlushOffsetCheckpointIntervalMs,
flushStartOffsetCheckpointMs = config.logFlushStartOffsetCheckpointIntervalMs,
retentionCheckMs = config.logCleanupIntervalMs,
maxPidExpirationMs = config.transactionIdExpirationMs,
scheduler = kafkaScheduler,
brokerState = brokerState,
time = time,
brokerTopicStats = brokerTopicStats)
}
3.3 Startup execution flow
@threadsafe
class LogManager(val logDirs: Array[File],
val topicConfigs: Map[String, LogConfig], // note that this doesn't get updated after creation
val defaultConfig: LogConfig,
val cleanerConfig: CleanerConfig,
ioThreads: Int,
val flushCheckMs: Long,
val flushRecoveryOffsetCheckpointMs: Long,
val flushStartOffsetCheckpointMs: Long,
val retentionCheckMs: Long,
val maxPidExpirationMs: Int,
scheduler: Scheduler,
val brokerState: BrokerState,
brokerTopicStats: BrokerTopicStats,
time: Time) extends Logging {
val RecoveryPointCheckpointFile = "recovery-point-offset-checkpoint"
val LogStartOffsetCheckpointFile = "log-start-offset-checkpoint"
val LockFile = ".lock"
val InitialTaskDelayMs = 30*1000
private val logCreationOrDeletionLock = new Object
private val logs = new Pool[TopicPartition, Log]()
private val logsToBeDeleted = new LinkedBlockingQueue[Log]()
// Check that each log directory exists (creating it if missing) and that it is readable and writable.
createAndValidateLogDirs(logDirs)
// Create a .lock file in each directory and lock the directory through it.
private val dirLocks = lockLogDirs(logDirs)
// Build the checkpoint map from each directory's recovery-point-offset-checkpoint file; it is used to periodically record each partition's offset.
private val recoveryPointCheckpoints = logDirs.map(dir => (dir, new OffsetCheckpointFile(new File(dir, RecoveryPointCheckpointFile)))).toMap
private val logStartOffsetCheckpoints = logDirs.map(dir => (dir, new OffsetCheckpointFile(new File(dir, LogStartOffsetCheckpointFile)))).toMap
// For each directory, create a thread pool whose size is the num.recovery.threads.per.data.dir setting,
// then read every topic-partition subdirectory under it and, using the topic's config from ZK (or the default config) and the offset recorded for that partition in the offset checkpoint, build a Log instance. The pool then loads the Log instances, i.e. performs log recovery.
loadLogs()
// public, so we can access this from kafka.admin.DeleteTopicTest
val cleaner: LogCleaner =
if(cleanerConfig.enableCleaner)
new LogCleaner(cleanerConfig, logDirs, logs, time = time)
else
null
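The loadLogs() call has roughly this shape: one fixed-size pool per data directory, one recovery job per topic-partition subdirectory, and the constructor blocks until all jobs finish. The directory names and the empty job body below are placeholders, not Kafka's actual recovery logic:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LoadLogsSketch {
    // Recover every partition directory; returns how many recovered successfully.
    static int recoverAll(Map<String, List<String>> partitionDirsByDataDir, int ioThreads) {
        int recovered = 0;
        for (Map.Entry<String, List<String>> dataDir : partitionDirsByDataDir.entrySet()) {
            // one pool of num.recovery.threads.per.data.dir threads per data directory
            ExecutorService pool = Executors.newFixedThreadPool(ioThreads);
            List<Future<?>> jobs = new ArrayList<>();
            for (String partitionDir : dataDir.getValue())
                jobs.add(pool.submit(() -> { /* build the Log, replay from the recovery point */ }));
            for (Future<?> job : jobs) {
                try {
                    job.get();
                    recovered++;
                } catch (InterruptedException | ExecutionException e) {
                    // a failed recovery job; the real code handles this differently
                }
            }
            pool.shutdown();
        }
        return recovered;
    }
}
```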
3.4 Cleaning up expired logs
/**
* Runs through the log removing segments older than a certain age
*/
private def cleanupExpiredSegments(log: Log): Int = {
if (log.config.retentionMs < 0)
return 0
val startMs = time.milliseconds
log.deleteOldSegments(startMs - _.lastModified > log.config.retentionMs)
}
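The age check in cleanupExpiredSegments can be restated on its own: a segment is deletable once now - lastModified exceeds retention.ms, and a negative retention.ms disables deletion entirely. A hypothetical helper, not Kafka's API:

```java
public class AgeRetention {
    static boolean deletable(long nowMs, long lastModifiedMs, long retentionMs) {
        if (retentionMs < 0) return false;          // retention.ms < 0: keep forever
        return nowMs - lastModifiedMs > retentionMs; // older than the retention window
    }

    public static void main(String[] args) {
        System.out.println(deletable(100_000, 10_000, 60_000)); // 90s old vs 60s retention
    }
}
```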
This involves another configuration, retention.ms, which is how long a log is retained. If it is negative, segments never expire, so there is nothing to delete.
Otherwise, when the gap between a segment file's last-modified time and the current time exceeds the configured retention, the segment is deleted. The actual delete method is:
/**
* Delete any log segments matching the given predicate function,
* starting with the oldest segment and moving forward until a segment doesn't match.
* @param predicate A function that takes in a single log segment and returns true iff it is deletable
* @return The number of segments deleted
*/
def deleteOldSegments(predicate: LogSegment => Boolean): Int = {
lock synchronized {
//find any segments that match the user-supplied predicate UNLESS it is the final segment
//and it is empty (since we would just end up re-creating it)
val lastEntry = segments.lastEntry
val deletable =
if (lastEntry == null) Seq.empty
else logSegments.takeWhile(s => predicate(s) && (s.baseOffset != lastEntry.getValue.baseOffset || s.size > 0))
val numToDelete = deletable.size
if (numToDelete > 0) {
// we must always have at least one segment, so if we are going to delete all the segments, create a new one first
if (segments.size == numToDelete)
roll()
// remove the segments for lookups
deletable.foreach(deleteSegment(_))
}
numToDelete
}
}
The logic here: use the supplied predicate to decide which segments qualify for deletion, collect them into deletable, and finally iterate over deletable and delete each one.
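That selection step can be sketched separately: walk the segments from the oldest, collect while the predicate holds, but never take the final segment when it is empty (it would just be re-created). Segment below is a stand-in for Kafka's LogSegment:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class DeletableSelection {
    record Segment(long baseOffset, int size) {}

    static List<Segment> deletable(List<Segment> segments, Predicate<Segment> predicate) {
        if (segments.isEmpty()) return List.of();
        Segment last = segments.get(segments.size() - 1);
        List<Segment> out = new ArrayList<>();
        for (Segment s : segments) {  // segments are ordered oldest first
            boolean notEmptyLast = s.baseOffset() != last.baseOffset() || s.size() > 0;
            if (predicate.test(s) && notEmptyLast) out.add(s);
            else break;               // takeWhile: stop at the first non-match
        }
        return out;
    }
}
```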
private def deleteSegment(segment: LogSegment) {
info("Scheduling log segment %d for log %s for deletion.".format(segment.baseOffset, name))
lock synchronized {
segments.remove(segment.baseOffset)
asyncDeleteSegment(segment)
}
}
private def asyncDeleteSegment(segment: LogSegment) {
segment.changeFileSuffixes("", Log.DeletedFileSuffix)
def deleteSeg() {
info("Deleting segment %d from log %s.".format(segment.baseOffset, name))
segment.delete()
}
scheduler.schedule("delete-file", deleteSeg, delay = config.fileDeleteDelayMs)
}
This is an asynchronous file-deletion path governed by the file.delete.delay.ms configuration: the segment's files are first renamed with a .delete suffix, and the physical deletion is scheduled to run after that delay.
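The two-phase delete can be sketched as: rename immediately so the segment drops out of lookups, then remove the file on the scheduler after the delay. The file names and the 50 ms delay below are illustrative only:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AsyncDeleteSketch {
    // phase 1: rename the segment file with a ".deleted" suffix
    static File markDeleted(File segment) {
        File renamed = new File(segment.getPath() + ".deleted");
        if (!segment.renameTo(renamed))
            throw new IllegalStateException("rename failed: " + segment);
        return renamed;
    }

    static File tempSegment() {
        try {
            return File.createTempFile("00000000000000000000", ".log");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        File marked = markDeleted(tempSegment());
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        // phase 2: stand-in for scheduler.schedule("delete-file", deleteSeg, delay = fileDeleteDelayMs)
        Runnable deleteSeg = () -> marked.delete();
        scheduler.schedule(deleteSeg, 50, TimeUnit.MILLISECONDS);
        scheduler.shutdown();
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("still on disk: " + marked.exists());
    }
}
```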
3.5 Cleaning up oversized logs
/**
* Runs through the log removing segments until the size of the log
* is at least logRetentionSize bytes in size
*/
private def cleanupSegmentsToMaintainSize(log: Log): Int = {
if(log.config.retentionSize < 0 || log.size < log.config.retentionSize)
return 0
var diff = log.size - log.config.retentionSize
def shouldDelete(segment: LogSegment) = {
if(diff - segment.size >= 0) {
diff -= segment.size
true
} else {
false
}
}
log.deleteOldSegments(shouldDelete)
}
This code is fairly clear: once the log grows beyond retention.bytes, the oldest segments are marked for deletion, and the same deleteOldSegments path as above is used, so it needs no further explanation.
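The greedy size check can be restated on its own: compute how far the log exceeds retention.bytes, then mark the oldest segments deletable while removing them still keeps the log at or above the limit. A hypothetical helper, not Kafka's API:

```java
public class SizeRetention {
    // returns how many of the oldest segments (given oldest-first) to delete
    static int segmentsToDelete(long[] segmentSizes, long retentionSize) {
        long logSize = 0;
        for (long s : segmentSizes) logSize += s;
        if (retentionSize < 0 || logSize < retentionSize) return 0; // within the limit
        long diff = logSize - retentionSize;
        int n = 0;
        for (long s : segmentSizes) {
            if (diff - s >= 0) { diff -= s; n++; } // removing s still leaves >= retentionSize
            else break;
        }
        return n;
    }
}
```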
3.6 Periodically flushing logs' disk buffers
A background scheduler periodically runs LogManager's flushDirtyLogs function.
It iterates over every partition's log and flushes it: using the current last offset, it finds the segments between the last checkpointed offset and the current offset, then flushes each segment's log and index. For the log file this calls force on its file channel; for the index files it calls force on their memory-mapped buffers.
private def flushDirtyLogs() = {
debug("Checking for dirty logs to flush...")
for ((topicAndPartition, log) <- logs) {
try {
val timeSinceLastFlush = time.milliseconds - log.lastFlushTime
debug("Checking if flush is needed on " + topicAndPartition.topic
+ " flush interval " + log.config.flushMs +
" last flushed " + log.lastFlushTime + " time since last flush: "
+ timeSinceLastFlush)
if(timeSinceLastFlush >= log.config.flushMs)
log.flush
} catch {
case e: Throwable =>
error("Error flushing topic " + topicAndPartition.topic, e)
}
}
}
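The per-log decision inside flushDirtyLogs reduces to one comparison, restated here as a hypothetical helper (the actual flush then calls force on the log's file channel and the indexes' memory-mapped buffers):

```java
public class FlushCheck {
    // flush when the time since the last flush reaches the log's flush.ms setting
    static boolean needsFlush(long nowMs, long lastFlushMs, long flushMs) {
        long timeSinceLastFlush = nowMs - lastFlushMs;
        return timeSinceLastFlush >= flushMs;
    }

    public static void main(String[] args) {
        System.out.println(needsFlush(10_000, 1_000, 5_000)); // 9s since flush, 5s interval
    }
}
```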
3.7 Periodically checkpointing partition offsets
A background scheduler periodically runs
LogManager's checkpointRecoveryPointOffsets function:
def checkpointRecoveryPointOffsets() {
this.logDirs.foreach(checkpointLogsInDir)
}
This checkpoints the last recovery offset of every partition stored in each dir.
The function iterates over the dirs and writes each dir's partition offsets into the checkpoint file under that directory:
- The first line is a 0, the version of the checkpoint file.
- The second line is the number of partitions, i.e. how many partitions hold data in this dir at checkpoint time.
- Then follow that many lines, each recording a topic, a partition, and an offset.
private def checkpointLogsInDir(dir: File): Unit = {
val recoveryPoints = this.logsByDir.get(dir.toString)
if (recoveryPoints.isDefined) {
this.recoveryPointCheckpoints(dir).write(recoveryPoints.get.mapValues(
_.recoveryPoint))
}
}
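The file layout described above can be exercised with a small round-trip sketch. The write/read helpers are hypothetical, and the exact field order within each line follows the description in the text, so verify it against a real checkpoint file:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CheckpointSketch {
    // keys are "topic partition" strings, values are offsets
    static void write(Path file, Map<String, Long> offsets) {
        List<String> lines = new ArrayList<>();
        lines.add("0");                                          // line 1: version
        lines.add(String.valueOf(offsets.size()));               // line 2: entry count
        offsets.forEach((tp, off) -> lines.add(tp + " " + off)); // "topic partition offset"
        try {
            Files.write(file, lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Map<String, Long> read(Path file) {
        try {
            List<String> lines = Files.readAllLines(file);
            int count = Integer.parseInt(lines.get(1));
            Map<String, Long> out = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String line = lines.get(2 + i);
                int cut = line.lastIndexOf(' ');                 // split off the trailing offset
                out.put(line.substring(0, cut), Long.parseLong(line.substring(cut + 1)));
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Path tempFile() {
        try {
            return Files.createTempFile("recovery-point-offset-checkpoint", "");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```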
Periodic log compaction in the LogCleaner instance is handled by CleanerThread threads:
- log.cleaner.io.max.bytes.per.second caps the I/O rate of these threads; unlimited by default.
- log.cleaner.dedupe.buffer.size, default 128MB: the memory buffer used when cleaning up old data; with compaction it holds the deduplication map used to sort out repeated keys.
- log.cleaner.threads, default 1: the number of cleaner threads.
- log.cleaner.backoff.ms, default 15 seconds: the interval at which the threads check whether any log needs cleaning.
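What the dedupe buffer ultimately enables is log compaction: keeping only the latest record for each key. Stripped of segments and offset maps, the effect can be sketched over in-memory (key, value) pairs:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch {
    // later records overwrite earlier ones with the same key
    static List<Map.Entry<String, String>> compact(List<Map.Entry<String, String>> records) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Map.Entry<String, String> r : records)
            latest.put(r.getKey(), r.getValue());
        return new ArrayList<>(latest.entrySet());
    }
}
```

The real cleaner achieves the same result over segment files, using the offset map to remember, for each key, the offset of its newest record.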