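A Brief Analysis of the HMaster Startup Flow

Posted by weixin_34205076 on 2017-11-21

Many details are deliberately left out of this post; they will be analyzed in follow-up posts.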

0. HMasterCommandLine first initializes the HMaster

0.1 Check whether an IP binding is configured (https://issues.apache.org/jira/browse/HBASE-8148) and obtain the address

0.2 Create an RPCServer through HBaseRPC

0.2.1 First obtain the RPCEngine (WritableRPCEngine) and use it to initialize the RPCServer (Server -> HBaseServer -> RPCServer)

0.2.1.1 Initialize the CallQueue (ipc.server.max.queue.size, kept for backward compatibility; ipc.server.max.callqueue.length, default handler count × DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER) and the ReplicationQueue (ipc.server.max.callqueue.size, default 1024×1024×1024), as well as the SizeBasedThrottler (threshold = ipc.server.max.callqueue.size), the Listener, and the Responder
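The way the call queue limits in 0.2.1.1 fall back from one key to the next can be sketched as follows. This is a minimal illustration, not HBase code: the class name CallQueueSizing is hypothetical, the configuration is modeled as a plain Map, the per-handler constant of 10 is an assumed value, and the precedence of the legacy key over the newer one is also an assumption.

```java
import java.util.Map;

/** Hypothetical helper that resolves the call queue limits from a key/value config. */
final class CallQueueSizing {
    // Assumed value for illustration only; the article does not state it.
    static final int DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER = 10;
    // Default stated in the article: 1024 x 1024 x 1024.
    static final int DEFAULT_MAX_CALLQUEUE_SIZE = 1024 * 1024 * 1024;

    /** Maximum number of queued calls. */
    static int maxQueueLength(Map<String, String> conf, int handlerCount) {
        // Legacy key, honored for backward compatibility (precedence is an assumption).
        String legacy = conf.get("ipc.server.max.queue.size");
        if (legacy != null) {
            return Integer.parseInt(legacy);
        }
        String current = conf.get("ipc.server.max.callqueue.length");
        if (current != null) {
            return Integer.parseInt(current);
        }
        // Neither key is set: scale with the number of handlers.
        return handlerCount * DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER;
    }

    /** Size limit that also serves as the throttler threshold in the outline above. */
    static int maxQueueSize(Map<String, String> conf) {
        String value = conf.get("ipc.server.max.callqueue.size");
        return value != null ? Integer.parseInt(value) : DEFAULT_MAX_CALLQUEUE_SIZE;
    }
}
```

With an empty configuration and 30 handlers, maxQueueLength would return 30 times the assumed per-handler constant.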

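0.2.1.1.1 Initialize the Responder: create a Selector, with a purge frequency of purgeTimeout (default 2×DEFAULT_HBASE_RPC_TIMEOUT)

0.2.1.1.2 Initialize the Listener: obtain the listen address and bind it to a ServerSocket with backlog length = ipc.server.listen.queue.size, create a thread pool of size ipc.server.read.threadpool.size, initialize and start ipc.server.read.threadpool.size Readers, and finally register for connection (accept) events;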
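The Listener setup in 0.2.1.1.2 maps fairly directly onto standard Java NIO calls. The sketch below uses only JDK classes; ListenerSketch is a hypothetical name, and the Reader loops themselves are omitted, so the pool is only sized, not populated.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the Listener: bind, size the reader pool, register for accepts. */
final class ListenerSketch implements AutoCloseable {
    final ServerSocketChannel acceptChannel;
    final Selector selector;
    final ExecutorService readPool;
    final SelectionKey acceptKey;

    ListenerSketch(InetSocketAddress bindAddress, int backlog, int readThreads) throws IOException {
        acceptChannel = ServerSocketChannel.open();
        acceptChannel.configureBlocking(false);            // required before registering with a Selector
        acceptChannel.socket().bind(bindAddress, backlog); // backlog plays the role of ipc.server.listen.queue.size
        readPool = Executors.newFixedThreadPool(readThreads); // plays the role of ipc.server.read.threadpool.size
        selector = Selector.open();
        acceptKey = acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
    }

    /** The port actually bound, useful when binding to port 0. */
    int port() {
        return acceptChannel.socket().getLocalPort();
    }

    @Override
    public void close() throws IOException {
        readPool.shutdownNow();
        selector.close();
        acceptChannel.close();
    }
}
```

Binding to port 0 lets the operating system pick a free port, which is convenient when trying the sketch locally.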

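0.3 Start the RPCServer once it has been initialized

0.3.1 Start the Responder: it takes responses off the responseQueue and writes them back. Like Hadoop RPC, the Responder has one optimization: when the responseQueue holds only a single entry, that response is written immediately.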
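The single-entry optimization in 0.3.1 can be illustrated with a small in-memory model. This is a sketch under assumptions: ResponderSketch is a hypothetical class, responses are plain strings, the "wire" list stands in for the client socket, and the channelWritable flag models a socket that cannot take data right now.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of the "write immediately if the queue holds one response" fast path. */
final class ResponderSketch {
    private final Deque<String> responseQueue = new ArrayDeque<>();
    final List<String> wire = new ArrayList<>(); // stand-in for bytes written to the client

    /** Called by a handler. Returns true if the response was written inline. */
    synchronized boolean doRespond(String response, boolean channelWritable) {
        responseQueue.addLast(response);
        // Fast path: nothing else is pending, so skip the hand-off to the Responder thread.
        if (responseQueue.size() == 1 && channelWritable) {
            wire.add(responseQueue.pollFirst());
            return true;
        }
        return false; // left queued for the Responder thread, preserving order
    }

    /** Called by the Responder thread. Returns the number of responses written. */
    synchronized int drain() {
        int written = 0;
        while (!responseQueue.isEmpty()) {
            wire.add(responseQueue.pollFirst());
            written++;
        }
        return written;
    }
}
```

Checking the queue size before writing inline is what keeps responses on one connection in order: a later response never overtakes one that is still queued.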

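0.3.2 Start the Listener: every 10 seconds, if the number of connections exceeds ipc.client.idlethreshold, it scans the ConnectionList and closes connections that have been idle for longer than 2×ipc.client.connection.maxidletime, closing at most ipc.client.kill.max (default 10) per pass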
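The cleanup rule in 0.3.2 combines three settings, which is easier to see in code. The following is a minimal sketch, not the HBase implementation: IdleConnectionSweeper and Conn are hypothetical types, and time is passed in explicitly so the logic stays deterministic.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sweeper applying the idle threshold, max idle time, and max kill count. */
final class IdleConnectionSweeper {
    static final class Conn {
        final String id;
        final long lastContactMillis;

        Conn(String id, long lastContactMillis) {
            this.id = id;
            this.lastContactMillis = lastContactMillis;
        }
    }

    private final int idleThreshold;      // role of ipc.client.idlethreshold
    private final long maxIdleTimeMillis; // role of ipc.client.connection.maxidletime
    private final int maxKill;            // role of ipc.client.kill.max

    IdleConnectionSweeper(int idleThreshold, long maxIdleTimeMillis, int maxKill) {
        this.idleThreshold = idleThreshold;
        this.maxIdleTimeMillis = maxIdleTimeMillis;
        this.maxKill = maxKill;
    }

    /** Removes idle connections from the list and returns the ones that were closed. */
    List<Conn> sweep(List<Conn> connections, long nowMillis) {
        List<Conn> closed = new ArrayList<>();
        if (connections.size() <= idleThreshold) {
            return closed; // too few connections to bother scanning
        }
        Iterator<Conn> it = connections.iterator();
        while (it.hasNext() && closed.size() < maxKill) {
            Conn c = it.next();
            if (nowMillis - c.lastContactMillis > 2 * maxIdleTimeMillis) {
                it.remove();
                closed.add(c);
            }
        }
        return closed;
    }
}
```

In the flow described above, sweep would be invoked roughly every 10 seconds by the Listener thread.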

0.3.3 Start the Handlers: each takes a Call off the CallQueue, invokes it through the RPCServer, and passes the return value to the Responder for processing
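The Handler loop in 0.3.3 is a classic blocking-queue consumer. A minimal sketch, assuming calls and responses are plain strings and the RPC invocation is an arbitrary function; HandlerSketch is a hypothetical name and none of this is HBase API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

/** Hypothetical handler: take a call, invoke it, hand the result to the responder queue. */
final class HandlerSketch implements Runnable {
    private final BlockingQueue<String> callQueue;
    private final BlockingQueue<String> responseQueue;
    private final Function<String, String> rpcServer; // stand-in for the actual method invocation

    HandlerSketch(BlockingQueue<String> callQueue,
                  BlockingQueue<String> responseQueue,
                  Function<String, String> rpcServer) {
        this.callQueue = callQueue;
        this.responseQueue = responseQueue;
        this.rpcServer = rpcServer;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String call = callQueue.take();           // blocks until a call arrives
                responseQueue.put(rpcServer.apply(call)); // pass the return value on
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();           // restore the flag and exit
        }
    }
}
```

Several such handlers can share one call queue; the queue's blocking take is what distributes the calls among them.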


0.4 Initialize the ZookeeperWatcher, passing in the HMaster

0.4.1 Initialize the ZookeeperWatcher: obtain a ZooKeeper client object through ZKUtil, with sessionTimeout = zookeeper.session.timeout (default 180s), maxretry = zookeeper.recovery.retry, and retryIntervalMillis = zookeeper.recovery.retry.intervalmill

0.4.1.1 Enter the ZooKeeper part (to be covered in a later post)

0.5 Initialize the Health Check Thread; the check frequency is hbase.node.health.script.frequency (default 10)


1. HMaster runs startup to begin the startup flow

1.1 Call becomeActiveMaster, which blocks until this master becomes active

1.1.1 Initialize the ActiveMasterManager (a ZookeeperListener)

1.1.2 The zookeeperWatcher registers the ActiveMasterManager as a listener

1.1.3 stallIfBackupMaster (skipped here)

1.1.4 Initialize and start the ClusterStatusTracker (a ZookeeperNodeTracker), and register the HMaster with the ClusterStatusTracker

1.1.5 blockUntilBecomingActiveMaster: Add a ZNode for ourselves in the backup master directory since we are not the active master. If we become the active master later, ActiveMasterManager will delete this node explicitly. If we crash before then, ZooKeeper will delete this node for us since it is ephemeral.
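The backup and active znode handling in 1.1.5 can be modeled without ZooKeeper. The sketch below is an in-memory stand-in only: ActiveMasterElectionSketch is a hypothetical class, the two paths are assumed names chosen for illustration, a map's putIfAbsent replaces the atomic creation of an ephemeral znode, and a single non-blocking attempt replaces the real blocking wait.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory model of the master and backup-master znodes. */
final class ActiveMasterElectionSketch {
    // Assumed paths, for illustration only.
    static final String MASTER_ZNODE = "/hbase/master";
    static final String BACKUP_DIR = "/hbase/backup-masters/";

    // Maps znode path to the server that owns it (all nodes here are "ephemeral").
    private final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    /** One election attempt. Returns true if serverName is now the active master. */
    boolean tryBecomeActive(String serverName) {
        if (znodes.putIfAbsent(MASTER_ZNODE, serverName) == null) {
            // We won: explicitly delete our own backup node, if we had one.
            znodes.remove(BACKUP_DIR + serverName);
            return true;
        }
        // Someone else is active: advertise ourselves as a backup master.
        znodes.put(BACKUP_DIR + serverName, serverName);
        return false;
    }

    /** Simulates a crash: every ephemeral node owned by serverName disappears. */
    void crash(String serverName) {
        znodes.values().removeIf(serverName::equals);
    }

    boolean isBackup(String serverName) {
        return znodes.containsKey(BACKUP_DIR + serverName);
    }

    String activeMaster() {
        return znodes.get(MASTER_ZNODE);
    }
}
```

In this model a backup master would call tryBecomeActive again after noticing that the active master's node is gone; the real flow waits on a ZooKeeper watch rather than polling.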

1.2 Call finishInitialization to enter the final initialization phase

1.2.1 Initialize the fileSystemManager: MasterFileSystem

1.2.1.1 If hbase.master.distributed.log.splitting is enabled, initialize the SplitLogManager (a ZookeeperListener)

1.2.1.1.1

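1.2.1.2 Create the initial directories: check that rootdir exists, check whether tempdir exists and clean it up, and create oldlogdir

1.2.2 Initialize FSTableDescriptors->tableDescriptor

1.2.3 Initialize the ExecutorService

1.2.4 Initialize the ServerManager: it obtains an HConnection through HConnectionManager, where the connection pool size is hbase.zookeeper.property.maxClientCnxns (default 300) + 1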

1.2.5 Initialize all ZK-based trackers:

1.2.5.1 Initialize the CatalogTracker

1.2.5.1.1 Obtain an HConnection

1.2.5.1.2 Initialize the RootRegionTracker (ZookeeperNodeTracker, rootServerZnode)

1.2.5.1.3 Initialize the MetaRegionTracker (ZookeeperNodeTracker, assignmentnode/first_meta_region)

1.2.5.2 Start the CatalogTracker

1.2.5.2.1 Start the RootRegionTracker: begin tracking the root region

1.2.5.2.2 Start the MetaRegionTracker: begin tracking the meta region

1.2.5.3 Obtain a balancer instance through LoadBalancerFactory

1.2.5.4 Initialize the AssignmentManager, which manages region assignment. This includes initializing the timeoutMonitor (hbase.master.assignment.timeoutmonitor.period, default 10s; hbase.master.assignment.timeoutmonitor.timeout, default 30min) and the timerUpdater (hbase.master.assignment.timerupdater.period, default 10s)

1.2.5.5 The zookeeperWatcher registers the AssignmentManager as a listener and puts it first in the ListenerList

1.2.5.6 Initialize the RegionServerTracker...

1.2.5.7 Start the RegionServerTracker...

1.2.5.8 Initialize the DrainingServerTracker...

1.2.5.9 Start the DrainingServerTracker...

1.2.5.10 Initialize the SnapshotManager...

1.2.7 Initialize the MasterCoprocessorHost

1.2.8 Start the service threads: MASTER_OPEN_REGION (hbase.master.executor.openregion.threads, 5), MASTER_CLOSE_REGION (hbase.master.executor.closeregion.threads, 5), MASTER_SERVER_OPERATIONS (hbase.master.executor.serverops.threads, 3), MASTER_META_SERVER_OPERATIONS (hbase.master.executor.serverops.threads, 5), and MASTER_TABLE_OPERATIONS; then initialize and run the LogCleaner and HFileCleaner, and finally start the HealthCheckChore, after which the RPCServer begins accepting requests

1.2.9 Wait for the RSs to report in, until any one of the following three conditions is met:

a. the master is stopped

b. the 'hbase.master.wait.on.regionservers.maxtostart' number of region servers is reached

c. the 'hbase.master.wait.on.regionservers.mintostart' is reached AND

no new region server has checked in for 'hbase.master.wait.on.regionservers.interval' (default 1.5s) time AND

the 'hbase.master.wait.on.regionservers.timeout' (default 4.5s) is reached
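The three exit conditions in 1.2.9 reduce to one boolean expression. A minimal sketch, assuming the caller tracks the elapsed times itself; RegionServerWaitPolicy and its method are hypothetical names, and the treatment of exact boundary values (greater than or equal) is an assumption.

```java
/** Hypothetical policy object for "how long does the master wait for region servers". */
final class RegionServerWaitPolicy {
    private final int minToStart;      // role of hbase.master.wait.on.regionservers.mintostart
    private final int maxToStart;      // role of hbase.master.wait.on.regionservers.maxtostart
    private final long intervalMillis; // role of ...regionservers.interval (1.5s per the article)
    private final long timeoutMillis;  // role of ...regionservers.timeout (4.5s per the article)

    RegionServerWaitPolicy(int minToStart, int maxToStart, long intervalMillis, long timeoutMillis) {
        this.minToStart = minToStart;
        this.maxToStart = maxToStart;
        this.intervalMillis = intervalMillis;
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns true as soon as any one of conditions a, b, or c holds. */
    boolean shouldStopWaiting(boolean masterStopped, int serverCount,
                              long millisSinceLastCheckIn, long millisWaited) {
        if (masterStopped) {
            return true;                                  // condition a
        }
        if (serverCount >= maxToStart) {
            return true;                                  // condition b
        }
        return serverCount >= minToStart                  // condition c: all three parts must hold
                && millisSinceLastCheckIn >= intervalMillis
                && millisWaited >= timeoutMillis;
    }
}
```

Condition c is what makes the wait settle: the master moves on only once the count has been stable for a full interval and the overall timeout has elapsed.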

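1.2.10 Check which RSs have not registered with ZK: register the RSs that have started and record them in the serverManager

1.2.11 Start the AssignmentManager: start the TimeoutMonitor

1.2.12 Perform one split-log pass, carried out by MasterFileSystem: scan the hlog directories and check whether the regionserver each one belongs to is online. If it is not online, add it to the SplitLogManager's deadWorkers list and create one znode under the splitlog path in ZK for each of its hlogs, then wait for the SplitLogWorker on the other RegionServers to pick up the tasks and process them (for details see the next post, on the RegionServer startup flow). If hbase.master.distributed.log.splitting is disabled, the HMaster does the splitting itself, which is not covered here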
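The selection step in 1.2.12, deciding which hlogs become split tasks, can be sketched with plain collections. This is an illustration under assumptions: SplitLogPlanner is a hypothetical class, server and log names are plain strings, and the actual znode creation is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical planner: pick the hlogs whose owning region server is not online. */
final class SplitLogPlanner {
    /** Maps each offline server to the logs that need splitting. */
    static Map<String, List<String>> logsToSplit(Map<String, List<String>> hlogsByServer,
                                                 Set<String> onlineServers) {
        Map<String, List<String>> tasks = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : hlogsByServer.entrySet()) {
            if (!onlineServers.contains(entry.getKey())) {
                tasks.put(entry.getKey(), new ArrayList<>(entry.getValue()));
            }
        }
        return tasks;
    }

    /** One task (one znode in the flow above) per log file. */
    static int taskCount(Map<String, List<String>> tasks) {
        int count = 0;
        for (List<String> logs : tasks.values()) {
            count += logs.size();
        }
        return count;
    }
}
```

Each entry in the result corresponds to one server that would go onto the deadWorkers list, and each log file to one task for a SplitLogWorker to claim.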

1.2.13 Assign the ROOT and META regions: check whether -ROOT- and .META. are already assigned, and if not, have the AssignmentManager assign them:

1.2.13.1

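1.2.14 Enable the shutdownHandler: the ServerManager runs an expiry check over deadNotExpiredServers, processes each expiredServer, and submits the shutdown flow to the ExecutorService

1.2.15 The AssignmentManager performs joinCluster: tables in the Disabling or Enabling state are recovered

1.2.16 Fix region

1.2.17 Start the Balancer and hand it to a Chore that runs once every 300s on a single thread; no balancing is done while regions are in transition or an RS is going offline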
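The balancer chore in 1.2.17 is essentially a single-threaded periodic task with a guard. The sketch below uses the JDK scheduler rather than the HBase Chore class; BalancerChoreSketch, its method names, and the supplier parameters are all hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical periodic balancer that skips a run while the cluster is unsettled. */
final class BalancerChoreSketch {
    /** Balance only when no region is in transition and no server is going offline. */
    static boolean shouldBalance(boolean regionsInTransition, boolean serversGoingOffline) {
        return !regionsInTransition && !serversGoingOffline;
    }

    /** Schedules the balance action on one thread; the caller shuts the returned executor down. */
    static ScheduledExecutorService start(long periodSeconds,
                                          BooleanSupplier regionsInTransition,
                                          BooleanSupplier serversGoingOffline,
                                          Runnable balance) {
        ScheduledExecutorService chore = Executors.newSingleThreadScheduledExecutor();
        chore.scheduleAtFixedRate(() -> {
            if (shouldBalance(regionsInTransition.getAsBoolean(),
                              serversGoingOffline.getAsBoolean())) {
                balance.run();
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return chore;
    }
}
```

A call such as start(300, ...) would match the 300s period mentioned above; because the executor has a single thread, two balance runs can never overlap.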

1.2.18 Start the CatalogJanitor chore via startCatalogJanitorChore

1.2.19 Run the coprocessor post hook (postCP) for master startup

1.3 Start the Stop Check Thread, which checks once per second

Done.



This article is reposted from MIKE老畢's 51CTO blog. Original link: http://blog.51cto.com/boylook/1312912. To repost it, please contact the original author.

