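[Source Code Reading] Alibaba SOFA Service Registry MetaServer (3)
0x00 Abstract
SOFARegistry is a production-grade, low-latency, highly available service registry open-sourced by Ant Financial. This series analyzes how its MetaServer is implemented. This third article describes how MetaServer achieves data consistency on top of Raft.
Because of space constraints, this article does not cover the principles or internals of Raft and JRaft; it only explains how MetaServer builds on JRaft.
0x01 Concepts
1.1 Distributed Consensus
Distributed consensus is one of the most fundamental problems in distributed systems; solving it is what gives a distributed system its reliability and fault tolerance.
Put simply, the problem is how multiple machines agree on a single value such that, once agreement is reached, the value stays unchanged no matter what failures later occur among those machines. More abstractly, all processes in a distributed system need to decide on a value v, and the system is considered to have solved distributed consensus if it satisfies the following properties: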
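- Termination: every correct process eventually decides on a value v; no process keeps looping forever.
- Validity: if a correct process decides on a value v', then v' must have been proposed by some process. A random number generator, for example, does not satisfy this property.
- Agreement: all correct processes decide on the same value.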
1.2 SOFAJRaft
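SOFAJRaft is a production-grade, high-performance Java implementation of the Raft consensus algorithm. It supports MULTI-RAFT-GROUP and targets high-load, low-latency scenarios.
Because the SOFARegistry cluster node list is small, MetaServer does not need to shard it. The node list is stored in a Repository, and Bolt requests such as node registration, renewal, and list queries are served through the strongly consistent Raft protocol, so the data that the cluster obtains is strongly consistent.
0x02 Infrastructure
"Infrastructure" here means what SOFARegistry builds on top of JRaft: StateMachine, Handler, RaftServer, RaftClient, and so on.
2.1 RaftExchanger
Exchange is the abstraction of a Client / Server connection and is responsible for connections between nodes. RaftExchanger is the corresponding abstraction for Raft connections; it holds the configuration, the Registry, and the Raft components.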
public class RaftExchanger {
@Autowired
private MetaServerConfig metaServerConfig;
@Autowired
private NodeConfig nodeConfig;
@Autowired
private Registry metaServerRegistry;
private RaftServer raftServer;
private RaftClient raftClient;
private CliService cliService;
}
These Raft components are started when the system starts.
private void initRaft() {
raftExchanger.startRaftServer(executorManager);
raftExchanger.startRaftClient();
raftExchanger.startCliService();
}
2.2 RaftServer
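RaftServer is the server side of the Raft protocol. Its main members and behavior are:
- It starts a Raft node to provide the distributed service.
- Internally it uses the RaftGroupService framework provided by JRaft.
- fsm is the business state machine, implemented by ServiceStateMachine; its handlers for the leader and follower roles are leaderProcessListener and followerProcessListener respectively.
- boltServer is the Bolt server. Because JRaft is built on Bolt, RaftServerHandler and RaftServerConnectionHandler are registered on it.
The class is implemented as follows: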
public class RaftServer {
    private RaftGroupService raftGroupService; // JRaft server-side service framework
    private Node node; // Raft node
    private ServiceStateMachine fsm; // business state machine
    private PeerId serverId;
    private Configuration initConf;
    private String groupId;
    private String dataPath;
    private List<ChannelHandler> serverHandlers = new ArrayList<>();
    private LeaderProcessListener leaderProcessListener;
    private FollowerProcessListener followerProcessListener;
    private BoltServer boltServer;

    public void start(RaftServerConfig raftServerConfig) throws IOException {
        FileUtils.forceMkdir(new File(dataPath));
        // Build the server and register its handlers
        serverHandlers.add(new RaftServerHandler(this));
        serverHandlers.add(new RaftServerConnectionHandler());
        boltServer = new BoltServer(new URL(NetUtil.getLocalAddress().getHostAddress(),
            serverId.getPort()), serverHandlers);
        // Start the server
        boltServer.initServer();
        RpcServer rpcServer = boltServer.getRpcServer();
        RaftRpcServerFactory.addRaftRequestProcessors(rpcServer);
        // Wire the leader/follower listeners into the state machine
        this.fsm = ServiceStateMachine.getInstance();
        this.fsm.setLeaderProcessListener(leaderProcessListener);
        this.fsm.setFollowerProcessListener(followerProcessListener);
        NodeOptions nodeOptions = initNodeOptions(raftServerConfig);
        this.raftGroupService = new RaftGroupService(groupId, serverId, nodeOptions, rpcServer);
        // Start the Raft node
        this.node = this.raftGroupService.start();
        // Take the node's RPC client and register the leader-change notification handler on it
        RpcClient raftClient = ((AbstractBoltClientService) (((NodeImpl) node).getRpcService()))
            .getRpcClient();
        NotifyLeaderChangeHandler notifyLeaderChangeHandler = new NotifyLeaderChangeHandler(
            groupId, null);
        raftClient.registerUserProcessor(new SyncUserProcessorAdapter(notifyLeaderChangeHandler));
    }
}
2.2.1 RaftServerHandler
RaftServerHandler is the server-side handler. It receives a Bolt message, casts it to a ProcessRequest, and submits it to the node.
received:84, RaftServerHandler (com.alipay.sofa.registry.jraft.handler)
handleRequest:55, AsyncUserProcessorAdapter (com.alipay.sofa.registry.remoting.bolt)
dispatchToUserProcessor:224, RpcRequestProcessor (com.alipay.remoting.rpc.protocol)
doProcess:145, RpcRequestProcessor (com.alipay.remoting.rpc.protocol)
run:366, RpcRequestProcessor$ProcessTask (com.alipay.remoting.rpc.protocol)
runWorker:1149, ThreadPoolExecutor (java.util.concurrent)
run:624, ThreadPoolExecutor$Worker (java.util.concurrent)
run:748, Thread (java.lang)
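RaftServerHandler behaves differently depending on whether the local node is the leader or a follower.
createTask then serializes the request with Hessian, after which raftServer.getNode().apply(task) is called.
The rough logic is:
- Parse the request from the message;
- Resolve the handler method for the request;
- If it is a simple read, process it directly and return;
- If it has to go through a task, create a closure;
- Create a task that carries the closure;
- Apply the task.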
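The decision order above can be sketched without Bolt or JRaft. The following is only a minimal illustration: `ToyRequest`, `ToyRaftHandler`, and the `pending` list that stands in for `node.apply(task)` are hypothetical names, not SOFARegistry classes, and the response strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for ProcessRequest: a method name plus a read-only flag.
class ToyRequest {
    final String method;
    final boolean readOnly;

    ToyRequest(String method, boolean readOnly) {
        this.method = method;
        this.readOnly = readOnly;
    }
}

// Hypothetical handler that mirrors the order: redirect, direct read, or task with closure.
class ToyRaftHandler {
    private final boolean leader;
    private final String leaderAddress;
    // Stands in for tasks handed to the Raft node; each entry is the closure's callback.
    private final List<Runnable> pending = new ArrayList<>();

    ToyRaftHandler(boolean leader, String leaderAddress) {
        this.leader = leader;
        this.leaderAddress = leaderAddress;
    }

    void received(ToyRequest request, Consumer<String> respond) {
        if (!leader) {
            // A follower does not process the request; it points the caller at the leader.
            respond.accept("REDIRECT:" + leaderAddress);
            return;
        }
        if (request.readOnly) {
            // Simple reads are answered immediately on the leader.
            respond.accept("READ:" + request.method);
            return;
        }
        // Writes are queued; the response is sent only when the entry is committed.
        pending.add(() -> respond.accept("APPLIED:" + request.method));
    }

    // Simulates the state machine applying all committed entries.
    void commitAll() {
        for (Runnable callback : pending) {
            callback.run();
        }
        pending.clear();
    }
}
```

The point of the sketch is the asynchrony of the write path: the caller of a write gets no response until `commitAll()` runs, which is the role the closure plays in the real handler.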
The code is as follows:
public class RaftServerHandler implements ChannelHandler {
    protected RaftServer raftServer;

    @Override
    public void received(Channel channel, Object message) throws RemotingException {
        BoltChannel boltChannel = (BoltChannel) channel;
        AsyncContext asyncContext = boltChannel.getAsyncContext();
        // A non-leader node does not process the request; it redirects the caller to the leader
        if (!raftServer.getFsm().isLeader()) {
            asyncContext.sendResponse(ProcessResponse.redirect(raftServer.redirect()).build());
            return;
        }
        // Parse the request from the message
        ProcessRequest processRequest = (ProcessRequest) message;
        long start = System.currentTimeMillis();
        // Resolve the handler method for the request
        Method method = Processor.getInstance().getWorkMethod(processRequest);
        if (Processor.getInstance().isLeaderReadMethod(method)) {
            // A simple read is processed directly and returned
            Object obj = Processor.getInstance().process(method, processRequest);
            long cost = System.currentTimeMillis() - start;
            asyncContext.sendResponse(obj);
        } else {
            // Otherwise create a closure that replies once the task has been applied
            LeaderTaskClosure closure = new LeaderTaskClosure();
            closure.setRequest(processRequest);
            closure.setDone(status -> {
                long cost = System.currentTimeMillis() - start;
                if (status.isOk()) {
                    asyncContext.sendResponse(closure.getResponse());
                } else {
                    asyncContext.sendResponse(ProcessResponse.fail(status.getErrorMsg()).build());
                }
            });
            // Create the task that carries the closure
            Task task = createTask(closure, processRequest);
            // Submit the task to the Raft node
            raftServer.getNode().apply(task);
        }
    }
}
2.2.2 ServiceStateMachine
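ServiceStateMachine is the server-side state machine. In MetaServer it mainly implements the core onApply(iterator) method, which hands the requests submitted by users to Processor.
Snapshots are covered later in this article.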
public class ServiceStateMachine extends StateMachineAdapter {
    private LeaderProcessListener leaderProcessListener;
    private FollowerProcessListener followerProcessListener;
    private static volatile ServiceStateMachine instance;

    @Override
    public void onApply(Iterator iter) {
        while (iter.hasNext()) {
            Closure done = iter.done();
            ByteBuffer data = iter.getData();
            ProcessRequest request;
            LeaderTaskClosure closure = null;
            if (done != null) {
                // The task was submitted on this node, so the request is already in the closure
                closure = (LeaderTaskClosure) done;
                request = closure.getRequest();
            } else {
                // No closure: deserialize the request from the log entry with Hessian
                Hessian2Input input = new Hessian2Input(new ByteArrayInputStream(data.array()));
                SerializerFactory serializerFactory = new SerializerFactory();
                input.setSerializerFactory(serializerFactory);
                request = (ProcessRequest) input.readObject();
                input.close();
            }
            // Execute the request against the registered service
            ProcessResponse response = Processor.getInstance().process(request);
            if (closure != null) {
                closure.setResponse(response);
                closure.run(Status.OK());
            }
            iter.next();
        }
    }
}
2.3 RaftClient
The client is fairly simple: it uses the RouteTable provided by JRaft to refresh and obtain the latest leader, then sends requests to that leader.
public class RaftClient {
private BoltCliClientService cliClientService;
private RpcClient rpcClient;
private CliOptions cliOptions;
private String groupId;
private Configuration conf;
}
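0x03 Configuration
JRaft-related configuration is mostly done in MetaServerRepositoryConfiguration.
Because the various node lists are stored in Repositories, and JRaft guarantees the consistency of Repository data, the configuration is mostly Repository-related. First come the three RepositoryService beans: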
- dataRepositoryService
- metaRepositoryService
- sessionRepositoryService
Next come the session version service and the two confirm services:
- SessionVersionRepositoryService
- DataConfirmStatusService
- SessionConfirmStatusService
Then comes RaftExchanger, an abstraction of network interaction.
Finally there is RaftAnnotationBeanPostProcessor, which processes beans at runtime.
The code is as follows:
@Configuration
public static class MetaServerRepositoryConfiguration {
@Bean
public RepositoryService dataRepositoryService() {
return new DataRepositoryService();
}
@Bean
public RepositoryService metaRepositoryService() {
return new MetaRepositoryService();
}
@Bean
public RepositoryService sessionRepositoryService() {
return new SessionRepositoryService();
}
@Bean
public VersionRepositoryService sessionVersionRepositoryService() {
return new SessionVersionRepositoryService();
}
@Bean
public NodeConfirmStatusService dataConfirmStatusService() {
return new DataConfirmStatusService();
}
@Bean
public NodeConfirmStatusService sessionConfirmStatusService() {
return new SessionConfirmStatusService();
}
@Bean
public RaftExchanger raftExchanger() {
return new RaftExchanger();
}
@Bean
public RaftAnnotationBeanPostProcessor raftAnnotationBeanPostProcessor() {
return new RaftAnnotationBeanPostProcessor();
}
}
In addition, MetaDBConfiguration also defines a bean.
@Configuration
public static class MetaDBConfiguration {
@Bean
public DBService persistenceDataDBService() {
return new PersistenceDataDBService();
}
}
3.1 The RepositoryService Interface
Because Raft mainly acts on the RepositoryService interface, we start with it.
All operations on a Repository go directly through interfaces such as RepositoryService, which classes such as DataRepositoryService implement.
@RaftService(uniqueId = "dataServer")
public class DataRepositoryService extends AbstractSnapshotProcess
implements
RepositoryService<String, RenewDecorate<DataNode>> {
}
For example, DataStoreService calls dataRepositoryService directly for its operations.
public class DataStoreService implements StoreService<DataNode> {
@RaftReference(uniqueId = "dataServer")
private RepositoryService<String, RenewDecorate<DataNode>> dataRepositoryService;
......
dataRepositoryService.replaceAll(dataCenter, dataCenterNodesMapTemp, version);
......
}
3.2 RaftReference & RaftService
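These two annotations can be seen as the interface through which the Raft machinery is presented to Registry. RaftReference corresponds to the client-side proxy, and RaftService to the server-side implementation.
Why do this? To keep data consistent, a plain local call has to be turned into an asynchronous network call so that the Raft protocol can guarantee consistency.
3.2.1 Annotation Definitions
RaftService is defined as follows: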
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface RaftService {
Class<?> interfaceType() default void.class;
String uniqueId() default "";
}
RaftReference is defined as follows:
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
public @interface RaftReference {
Class<?> interfaceType() default void.class;
String uniqueId() default "";
}
3.2.2 Using the Annotations
Every service that must be controlled by Raft carries the RaftService annotation:
- dataRepositoryService
- metaRepositoryService
- sessionRepositoryService
- SessionVersionRepositoryService
- DataConfirmStatusService
- SessionConfirmStatusService
- PersistenceDataDBService
Every field that references one of these RaftService implementations is annotated with @RaftReference. Services are distinguished by id, so some of them set a uniqueId.
@RaftReference
private DBService persistenceDataDBService;
@RaftReference(uniqueId = "dataServer")
private RepositoryService<String, RenewDecorate<DataNode>> dataRepositoryService;
@RaftReference(uniqueId = "dataServer")
private NodeConfirmStatusService<DataNode> dataConfirmStatusService;
@RaftReference(uniqueId = "metaServer")
private RepositoryService<String, RenewDecorate<MetaNode>> metaRepositoryService;
@RaftReference(uniqueId = "sessionServer")
private RepositoryService<String, RenewDecorate<SessionNode>> sessionRepositoryService;
@RaftReference(uniqueId = "sessionServer")
private VersionRepositoryService<String> sessionVersionRepositoryService;
@RaftReference(uniqueId = "sessionServer")
private NodeConfirmStatusService<SessionNode> sessionConfirmStatusService;
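3.2.3 Annotation Processing
RaftAnnotationBeanPostProcessor is a BeanPostProcessor implementation; it is where the RaftReference and RaftService annotations are processed.
BeanPostProcessor works as follows: to add custom logic before or after the Spring container instantiates, configures, and initializes a bean, we define one or more BeanPostProcessor implementations and register them with the Spring IoC container.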
public class RaftAnnotationBeanPostProcessor implements BeanPostProcessor, Ordered {
@Autowired
private RaftExchanger raftExchanger;
@Override
public Object postProcessBeforeInitialization(Object bean, String beanName) {
processRaftReference(bean);
return bean;
}
@Override
public Object postProcessAfterInitialization(Object bean, String beanName) {
processRaftService(bean, beanName);
return bean;
}
}
The two annotations are handled differently.
3.2.3.1 Client Side: processRaftReference
processRaftReference replaces every field annotated with @RaftReference with a dynamic proxy, which turns calls on that field into client calls. The proxy logic lives in the ProxyHandler class: it wraps each method call in a ProcessRequest and sends it to RaftServer through RaftClient.
private void processRaftReference(Object bean) {
    final Class<?> beanClass = bean.getClass();
    ReflectionUtils.doWithFields(beanClass, field -> {
        RaftReference referenceAnnotation = field.getAnnotation(RaftReference.class);
        // Note: when interfaceType is left at its default (void), the debugger output
        // shows it resolved to the field's declared type; that step is not in this excerpt
        Class<?> interfaceType = referenceAnnotation.interfaceType();
        String serviceId = getServiceId(interfaceType, referenceAnnotation.uniqueId());
        Object proxy = getProxy(interfaceType, serviceId);
        ReflectionUtils.makeAccessible(field);
        ReflectionUtils.setField(field, bean, proxy); // inject the proxy into the field
    }, field -> !Modifier.isStatic(field.getModifiers())
        && field.isAnnotationPresent(RaftReference.class));
}

private Object getProxy(Class<?> interfaceType, String serviceId) {
    RaftClient client = raftExchanger.getRaftClient();
    return Proxy.newProxyInstance(Thread.currentThread().getContextClassLoader(),
        new Class<?>[] { interfaceType }, new ProxyHandler(interfaceType, serviceId, client));
}
field = {Field@3824} "private com.alipay.sofa.registry.store.api.DBService com.alipay.sofa.registry.server.meta.remoting.handler.FetchProvideDataRequestHandler.persistenceDataDBService"
referenceAnnotation = {$Proxy42@3825} "@com.alipay.sofa.registry.store.api.annotation.RaftReference(interfaceType=void, uniqueId=)"
interfaceType = {Class@3150} "interface com.alipay.sofa.registry.store.api.DBService"
serviceId = "com.alipay.sofa.registry.store.api.DBService"
Take DataStoreService as an example. Before annotation processing it looks like this:
bean = {DataStoreService@4053}
dataRepositoryService = null
That is:
+-----------------------------+
| DataStoreService |
| |
| +-----------------------+ |
| | dataRepositoryService +---------> Null
| +-----------------------+ |
+-----------------------------+
After annotation processing it looks like this:
bean = {DataStoreService@4053}
dataRepositoryService = {$Proxy46@4057} Method threw 'java.lang.RuntimeException' exception. Cannot evaluate com.sun.proxy.$Proxy46.toString()
proxy = {$Proxy46@4057} Method threw 'java.lang.RuntimeException' exception. Cannot evaluate com.sun.proxy.$Proxy46.toString()
h = {ProxyHandler@4098}
interfaceType = {Class@4018} "interface com.alipay.sofa.registry.server.meta.repository.RepositoryService"
serviceId = "com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer"
client = {RaftClient@4077}
cliClientService = {BoltCliClientService@4144}
rpcClient = {RpcClient@4145}
cliOptions = {CliOptions@4146} "RpcOptions{rpcConnectTimeoutMs=1000, rpcDefaultTimeout=5000, rpcInstallSnapshotTimeout=300000, rpcProcessorThreadPoolSize=80, enableRpcChecksum=false, metricRegistry=null}"
groupId = "RegistryGroup_DefaultDataCenter"
conf = {Configuration@4148} "192.168.1.2:9614"
started = {AtomicBoolean@4149} "true"
As a diagram:
+-----------------------------+
| DataStoreService | +-------------------+
| | | ProxyHandler |
| +-----------------------+ | +-----+ | +---------------+ |
| | dataRepositoryService +--------->+Proxy+--->+ | interfaceType | |
| +-----------------------+ | +-----+ | | | |
+-----------------------------+ | | serviceId | |
| | | |
| | RpcClient | |
| +---------------+ |
+-------------------+
The field now points at a proxy backed by ProxyHandler, so calling one of its methods results in an RPC call.
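The field replacement can be reproduced with nothing but the JDK. The sketch below is an illustration under assumptions: `ToyRepository`, `ToyBean`, `RecordingHandler`, and `ProxyInjector` are hypothetical names, and the handler records a string instead of building a ProcessRequest and sending it through RaftClient.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical service interface standing in for RepositoryService.
interface ToyRepository {
    String put(String key, String value);
}

// Hypothetical bean whose field starts out null, like dataRepositoryService.
class ToyBean {
    ToyRepository repository;
}

// Records each call as "serviceId#method[args]" instead of sending it over RPC.
class RecordingHandler implements InvocationHandler {
    private final String serviceId;
    final List<String> sent = new ArrayList<>();

    RecordingHandler(String serviceId) {
        this.serviceId = serviceId;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        sent.add(serviceId + "#" + method.getName() + Arrays.toString(args));
        return "sent";
    }
}

class ProxyInjector {
    // Creates a JDK dynamic proxy that implements the given interface.
    static <T> T proxyFor(Class<T> type, InvocationHandler handler) {
        Object proxy = Proxy.newProxyInstance(type.getClassLoader(),
            new Class<?>[] { type }, handler);
        return type.cast(proxy);
    }

    // Sets the named field on the bean via reflection, as a bean post-processor would.
    static void inject(Object bean, String fieldName, Object proxy) {
        try {
            Field field = bean.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(bean, proxy);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After `ProxyInjector.inject(bean, "repository", ProxyInjector.proxyFor(ToyRepository.class, handler))`, a call such as `bean.repository.put("k", "v")` never reaches a local implementation; it lands in `RecordingHandler.invoke`, which is where the real ProxyHandler would hand the call to the Raft client.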
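3.2.3.2 Server Side: processRaftService
processRaftService registers every class annotated with @RaftService in the Processor class, which the SOFAJRaft state machine ServiceStateMachine then drives. Registered classes are distinguished by serviceId (interfaceName + uniqueId).
When RaftServer receives a request, it applies it to the SOFAJRaft state machine, implemented by ServiceStateMachine. The state machine calls Processor, which finds the implementation by serviceId and invokes the corresponding method.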
private void processRaftService(Object bean, String beanName) {
final Class<?> beanClass = AopProxyUtils.ultimateTargetClass(bean);
RaftService raftServiceAnnotation = beanClass.getAnnotation(RaftService.class);
Class<?> interfaceType = raftServiceAnnotation.interfaceType();
String serviceUniqueId = getServiceId(interfaceType, raftServiceAnnotation.uniqueId());
Processor.getInstance().addWorker(serviceUniqueId, interfaceType, bean);
}
Some of the variables at this point:
bean = {DataRepositoryService@3805}
beanName = "dataRepositoryService"
beanClass = {Class@3796} "class com.alipay.sofa.registry.server.meta.repository.service.DataRepositoryService"
raftServiceAnnotation = {$Proxy41@3807} "@com.alipay.sofa.registry.store.api.annotation.RaftService(interfaceType=void, uniqueId=dataServer)"
interfaceType = {Class@3795} "interface com.alipay.sofa.registry.server.meta.repository.RepositoryService"
serviceUniqueId = "com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer"
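When the annotation is processed, addWorker stores the service instance and its methods in maps. Note that workerMethods is a two-level HashMap: the first level is keyed by service id and its value is another HashMap; the second level is keyed by method signature (the method name followed by its parameter type names), with the Method object as value.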
public void addWorker(String serviceId, Class<?> interfaceClazz, Object target) {
    Map<String, Method> publicMethods = new HashMap<>();
    for (Method m : interfaceClazz.getMethods()) {
        // Key = method name followed by the fully qualified parameter type names,
        // so overloaded methods get distinct keys
        StringBuilder mSigs = new StringBuilder();
        mSigs.append(m.getName());
        for (Class<?> paramType : m.getParameterTypes()) {
            mSigs.append(paramType.getName());
        }
        publicMethods.put(mSigs.toString(), m);
    }
    workerMethods.put(serviceId, publicMethods);
    workers.put(serviceId, target);
}
serviceId = "com.alipay.sofa.registry.store.api.DBService"
interfaceClazz = {Class@3118} "interface com.alipay.sofa.registry.store.api.DBService"
target = {PersistenceDataDBService@3124}
this = {Processor@3812}
workerMethods = {HashMap@3815} size = 2
"com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer" -> {HashMap@3813} size = 13
key = "com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer"
value = {HashMap@3813} size = 13
"getNodeRepositories" -> {Method@3856} "public abstract java.util.Map com.alipay.sofa.registry.server.meta.repository.RepositoryService.getNodeRepositories()"
"replaceAlljava.lang.Stringjava.util.Mapjava.lang.Long" -> {Method@3858} "public abstract java.util.Map com.alipay.sofa.registry.server.meta.repository.RepositoryService.replaceAll(java.lang.String,java.util.Map,java.lang.Long)"
"checkVersionjava.lang.Objectjava.lang.Long" -> {Method@3860} "public abstract boolean com.alipay.sofa.registry.server.meta.repository.RepositoryService.checkVersion(java.lang.Object,java.lang.Long)"
"replacejava.lang.Objectjava.lang.Object" -> {Method@3862} "public default java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.replace(java.lang.Object,java.lang.Object)"
"removejava.lang.Object" -> {Method@3864} "public default java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.remove(java.lang.Object)"
"putjava.lang.Objectjava.lang.Objectjava.lang.Long" -> {Method@3866} "public abstract java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.put(java.lang.Object,java.lang.Object,java.lang.Long)"
"getVersionjava.lang.Object" -> {Method@3868} "public abstract java.lang.Long com.alipay.sofa.registry.server.meta.repository.RepositoryService.getVersion(java.lang.Object)"
"putjava.lang.Objectjava.lang.Object" -> {Method@3870} "public default java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.put(java.lang.Object,java.lang.Object)"
"getAllData" -> {Method@3872} "public abstract java.util.Map com.alipay.sofa.registry.server.meta.repository.RepositoryService.getAllData()"
"removejava.lang.Objectjava.lang.Long" -> {Method@3874} "public abstract java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.remove(java.lang.Object,java.lang.Long)"
"getjava.lang.Object" -> {Method@3876} "public abstract java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.get(java.lang.Object)"
"getAllDataMap" -> {Method@3878} "public abstract java.util.Map com.alipay.sofa.registry.server.meta.repository.RepositoryService.getAllDataMap()"
"replacejava.lang.Objectjava.lang.Objectjava.lang.Long" -> {Method@3880} "public abstract java.lang.Object com.alipay.sofa.registry.server.meta.repository.RepositoryService.replace(java.lang.Object,java.lang.Object,java.lang.Long)"
"com.alipay.sofa.registry.store.api.DBService" -> {HashMap@3828} size = 5
workers = {HashMap@3814} size = 2
"com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer" -> {DataRepositoryService@3805}
key = "com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer"
value = {DataRepositoryService@3805}
"com.alipay.sofa.registry.store.api.DBService" -> {PersistenceDataDBService@3117}
key = "com.alipay.sofa.registry.store.api.DBService"
value = {PersistenceDataDBService@3117}
methodHandleMap = {ConcurrentHashMap@3817} size = 0
As a diagram:
+-------------------+
| Processor | +> DataRepositoryService +-> getNodeRepositories +-> getNodeRepositories()
| +---------------+ | | |
| | workers+--------------->+> PersistenceDataDBService +-> replaceAll+--> replaceAll
| | | | |
+-----------------------+ addWorker | | | | {HashMap} +-> checkVersion +--> ...
| DataRepositoryService +-------------->+ | {HashMap} | | +---> +----------------------------+ |
+-----------------------+ | | workerMethods +-------->+ |RepositoryService:dataServer+-->-->-replace +-----> ...
| +---------------+ | | +----------------------------+ |
+-------------------+ | +-> remove +----> ...
| |
+---> ...... +-> put +-----> ...
| |
| +-> getAllDataMap +----> ...
| |
| {HashMap} +---> ... +-> ......
| +---------+ |
+---> |DBService+-----------> ...
+---------+ |
|
+---> ...
The call stack is:
addWorker:69, Processor (com.alipay.sofa.registry.jraft.processor)
processRaftService:123, RaftAnnotationBeanPostProcessor (com.alipay.sofa.registry.server.meta.repository.annotation)
postProcessAfterInitialization:60, RaftAnnotationBeanPostProcessor (com.alipay.sofa.registry.server.meta.repository.annotation)
applyBeanPostProcessorsAfterInitialization:421, AbstractAutowireCapableBeanFactory (org.springframework.beans.factory.support)
initializeBean:1635, AbstractAutowireCapableBeanFactory (org.springframework.beans.factory.support)
doCreateBean:553, AbstractAutowireCapableBeanFactory
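The whole process is very similar to an RPC call: a method call made on the referencing side does not actually execute the method. It is wrapped in a request and sent to the Raft service, and the Raft state machine performs the real invocation, for example storing node information in a Map. Consistency of the data across all nodes is guaranteed by the Raft protocol. If the local node is the leader, however, some query requests skip the Raft protocol and call the local implementation directly.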
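The server half of this round trip, finding the target by serviceId and method signature and then invoking it reflectively, can be sketched as follows. This is a simplified illustration, not the actual Processor: `ToyStore`, `ToyStoreImpl`, and `ToyProcessor` are hypothetical, and the `:` separator in `serviceId` is inferred from the debugger output shown earlier rather than taken from the source.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical service interface with an overloaded method.
interface ToyStore {
    String put(String key, String value);

    String put(String key, String value, Long version);
}

class ToyStoreImpl implements ToyStore {
    final Map<String, String> data = new HashMap<>();

    public String put(String key, String value) {
        data.put(key, value);
        return key + "=" + value;
    }

    public String put(String key, String value, Long version) {
        data.put(key, value);
        return key + "=" + value + "@" + version;
    }
}

class ToyProcessor {
    // serviceId -> (signature key -> Method), the two-level map described in the text.
    private final Map<String, Map<String, Method>> workerMethods = new HashMap<>();
    private final Map<String, Object> workers = new HashMap<>();

    // Assumed format: interface name, plus ":" + uniqueId when a uniqueId is set.
    static String serviceId(Class<?> iface, String uniqueId) {
        return uniqueId.isEmpty() ? iface.getName() : iface.getName() + ":" + uniqueId;
    }

    // Method name followed by parameter type names, e.g. "putjava.lang.Stringjava.lang.String".
    static String signatureKey(String name, Class<?>[] paramTypes) {
        StringBuilder sb = new StringBuilder(name);
        for (Class<?> p : paramTypes) {
            sb.append(p.getName());
        }
        return sb.toString();
    }

    void addWorker(String serviceId, Class<?> iface, Object target) {
        Map<String, Method> methods = new HashMap<>();
        for (Method m : iface.getMethods()) {
            methods.put(signatureKey(m.getName(), m.getParameterTypes()), m);
        }
        workerMethods.put(serviceId, methods);
        workers.put(serviceId, target);
    }

    // Looks up the worker and method for a request and invokes it reflectively.
    Object process(String serviceId, String name, Class<?>[] paramTypes, Object[] args) {
        Map<String, Method> methods = workerMethods.get(serviceId);
        if (methods == null) {
            throw new IllegalArgumentException("unknown service " + serviceId);
        }
        Method m = methods.get(signatureKey(name, paramTypes));
        if (m == null) {
            throw new IllegalArgumentException("unknown method " + name);
        }
        try {
            m.setAccessible(true);
            return m.invoke(workers.get(serviceId), args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Including the parameter types in the key is what lets the two `put` overloads coexist in one map; keying by method name alone would silently keep only one of them.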
0x04 Network Interaction
When Registry makes a business call, Raft is used implicitly.
For example, DataStoreService makes the following call:
Map<String/*dataCenter*/, NodeRepository> dataNodeRepositoryMap = dataRepositoryService
.getNodeRepositories();
getNodeRepositories goes through the Proxy to ProxyHandler#invoke.
public class ProxyHandler implements InvocationHandler {
    private final Class<?> interfaceType;
    private final String serviceId;
    private final RaftClient client;

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        try {
            // Wrap the method call in a ProcessRequest
            ProcessRequest request = new ProcessRequest();
            request.setMethodArgSigs(createParamSignature(method.getParameterTypes()));
            request.setMethodName(method.getName());
            request.setMethodArgs(args);
            request.setServiceName(serviceId);
            if (Processor.getInstance().isLeaderReadMethod(method)) {
                return doInvokeMethod(request); // leader-side read: invoke the local method directly
            }
            return client.sendRequest(request); // otherwise send the request through the client
        } catch (Throwable e) {
            // Error handling is abbreviated in this excerpt
            throw new RuntimeException(e);
        }
    }
}
The call stack is:
invoke:69, ProxyHandler (com.alipay.sofa.registry.jraft.processor)
getNodeRepositories:-1, $Proxy46 (com.sun.proxy)
getNodeChangeResult:238, DataStoreService (com.alipay.sofa.registry.server.meta.store)
getAllNodes:96, MetaServerRegistry (com.alipay.sofa.registry.server.meta.registry)
getRegisterNodeByType:81, MetaDigestResource (com.alipay.sofa.registry.server.meta.resource)
lambda$init$1:70, MetaDigestResource (com.alipay.sofa.registry.server.meta.resource)
RaftClient#sendRequest then sends the request on to the Raft leader:
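public Object sendRequest(ProcessRequest request) {
    try {
        // Find the current leader and send the request to it synchronously
        PeerId peer = getLeader();
        Object response = this.rpcClient.invokeSync(peer.getEndpoint().toString(), request,
            cliOptions.getRpcDefaultTimeout());
        ProcessResponse cmd = (ProcessResponse) response;
        if (cmd.getSuccess()) {
            return cmd.getEntity();
        }
    } catch (Exception e) {
        // Error handling is abbreviated in this excerpt
        throw new RuntimeException(e);
    }
    // Redirect and failure handling are omitted from this excerpt
    throw new IllegalStateException("Raft request failed");
}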
On the server side, the call stack is:
process:123, Processor (com.alipay.sofa.registry.jraft.processor)
onApply:133, ServiceStateMachine (com.alipay.sofa.registry.jraft.bootstrap)
doApplyTasks:534, FSMCallerImpl (com.alipay.sofa.jraft.core)
doCommitted:503, FSMCallerImpl (com.alipay.sofa.jraft.core)
runApplyTask:431, FSMCallerImpl (com.alipay.sofa.jraft.core)
access$100:72, FSMCallerImpl (com.alipay.sofa.jraft.core)
onEvent:147, FSMCallerImpl$ApplyTaskHandler (com.alipay.sofa.jraft.core)
onEvent:141, FSMCallerImpl$ApplyTaskHandler (com.alipay.sofa.jraft.core)
run:137, BatchEventProcessor (com.lmax.disruptor)
run:748, Thread (java.lang)
The overall picture is:
+---------------------------------------+ +---------------------------------------------+
| +------------------------------+ | | +----------------------------------+ |
| | +----------------+ registry |Client| | Server| +----------------------+registry | |
| | |DataStoreService| | | | | | DataRepositoryService| | |
| | +-----+----------+ | | | | +---------+------------+ | |
| | | getNodeRepositories | | | | ^ getNodeRepositories | |
| | | | | | | | | |
| | v | | | | +------+----+ | |
| | +-----+-----------------+ | | | | | Processor | | |
| | |DataRepositoryService | | | | | +------+----+ | |
| | +-----+-----------------+ | | | | ^ onApply | |
| | | | | | | | | |
| | v | | | | +-------+------+ | |
| | +-+---+ | | | | | StateMachine | | |
| | |Proxy| | | | | +-------+------+ | |
| | +-+---+ | | | | ^ process | |
| | | invoke | | | | | | |
| | v | | | | | | |
| | +----+-------+ | | | | +------+------+ | |
| | |ProxyHandler| | | | | |FSMCallerImpl| | |
| | +----+-------+ | | | | +------+------+ | |
| | | sendRequest | | | | ^ | |
| | v | | | | | received | |
| | +---+------+ | | | | | | |
| | |RaftClient| | | | | +-----------------+ | |
| | +---+------+ | | | | |RaftServerHandler| | |
| | | invokeSync | | | | +-----------------+ | |
| +------------------------------+ | | +----------------------------------+ |
| | | | | |
| | | | | |
| +------------------------------+ | | +--------------------------+ |
| | | remoting.rpc | | | | | remoting.rpc| |
| | +----v------+ | bolt | Network| | +-------+-----------+ | |
| | | RpcClient | | +---------------------> | |RpcRequestProcessor| | |
| | +-----------+ | | | | +-------------------+ | |
| +------------------------------+ | | +--------------------------+ |
+---------------------------------------+ +---------------------------------------------+
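0x05 Snapshot Storage
First, let us look at why a snapshot mechanism is needed.
5.1 Storage Module
The SOFAJRaft storage module consists of:
- Log storage, which records Raft configuration changes and the log of user-submitted tasks;
- Meta storage, i.e. metadata storage, which records the internal state of the Raft implementation;
- Snapshot storage, which holds the user state machine's snapshots and their metadata. A snapshot is a record of the current value of the data.
5.2 The Problem
When a Raft Node restarts, the state held in memory by the state machine is lost, and startup has to replay every log entry in LogStorage to rebuild the state machine. This causes three problems:
- If tasks are submitted very frequently, as in a message-middleware scenario, the rebuild takes a long time and startup is slow;
- If there are many log entries and a node must keep all of them, storage usage grows unsustainably;
- If a new Node is added, it has to fetch all log entries from the Leader and replay them into its state machine, which is a considerable burden on both the Leader and network bandwidth.
5.3 The Snapshot Mechanism
The Snapshot mechanism is introduced to solve these three problems.
A Snapshot is a record of the current value of the data: an "image" of the state machine's latest state, stored separately. Once it has been saved successfully, the log entries before that point are deleted, which reduces log storage. On startup the latest Snapshot is loaded directly and only the log entries after it are replayed; with a reasonable Snapshot interval the replay is fast, so startup is quicker. Finally, a newly added node first copies the latest Snapshot from the Leader and installs it into its local state machine, then copies only the subsequent log entries, so it quickly catches up with the Raft Group.
Snapshots generated by the Leader serve several purposes:
- When a new Node joins the cluster, it does not have to rely solely on log replication and replay to become consistent with the Leader; installing the Leader's snapshot skips the replay of a large number of early log entries;
- The Leader can send a snapshot instead of replicating the Log, reducing the amount of data on the network;
- Replacing early Log entries with a snapshot saves storage space.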
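The interplay between snapshot and log can be shown with a toy key-value state machine. This is a sketch of the general technique only; `ToySnapshotStore` and its methods are hypothetical and have nothing to do with the JRaft storage classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical key-value state machine with an in-memory log and a single snapshot.
class ToySnapshotStore {
    // Each log entry is a {key, value} pair; logStartIndex is the index of log.get(0).
    private final List<String[]> log = new ArrayList<>();
    private long logStartIndex = 0;
    private Map<String, String> snapshot = new HashMap<>();
    private final Map<String, String> state = new HashMap<>();

    // Appends an entry to the log and applies it to the in-memory state.
    void apply(String key, String value) {
        log.add(new String[] { key, value });
        state.put(key, value);
    }

    // Saves a copy of the current state and drops the log entries it already covers.
    void saveSnapshot() {
        snapshot = new HashMap<>(state);
        logStartIndex += log.size();
        log.clear();
    }

    // Simulates a restart: rebuilds the state from the snapshot plus the remaining log.
    Map<String, String> recover() {
        Map<String, String> rebuilt = new HashMap<>(snapshot);
        for (String[] entry : log) {
            rebuilt.put(entry[0], entry[1]);
        }
        return rebuilt;
    }

    int retainedLogEntries() {
        return log.size();
    }

    long firstRetainedIndex() {
        return logStartIndex;
    }
}
```

After three writes, a snapshot, and one more write, recovery only replays a single entry while still reproducing the full state, which is the startup and storage saving described above.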
5.4 ServiceStateMachine
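At the state machine level, using the snapshot mechanism means taking a checkpoint of the state machine: saving its state at that moment and deleting all earlier log entries. The core is implementing two StateMachine methods:
- onSnapshotLoad, which loads a snapshot at startup or after a snapshot has been installed;
- onSnapshotSave, which saves a snapshot periodically.
At the level of the services that use Raft, each service provides its own implementation.
- When Registry processes the annotations, addWorker stores the Raft-backed service classes in a map;
- So when the state machine runs a snapshot function, it iterates over Processor.getInstance().getWorkers() and calls each class's own handler.
The workers variable looks like this: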
workers = {HashMap@6176} size = 7
"com.alipay.sofa.registry.server.meta.repository.RepositoryService:metaServer" -> {MetaRepositoryService@6201}
"com.alipay.sofa.registry.server.meta.repository.RepositoryService:dataServer" -> {DataRepositoryService@6162}
"com.alipay.sofa.registry.store.api.DBService" -> {PersistenceDataDBService@6203}
"com.alipay.sofa.registry.server.meta.repository.RepositoryService:sessionServer" -> {SessionRepositoryService@6205}
"com.alipay.sofa.registry.server.meta.repository.VersionRepositoryService:sessionServer" -> {SessionVersionRepositoryService@6207}
"com.alipay.sofa.registry.server.meta.repository.NodeConfirmStatusService:sessionServer" -> {SessionConfirmStatusService@6209}
"com.alipay.sofa.registry.server.meta.repository.NodeConfirmStatusService:dataServer" -> {DataConfirmStatusService@6211}
The state machine code is as follows:
public class ServiceStateMachine extends StateMachineAdapter {
    @Override
    public void onSnapshotSave(final SnapshotWriter writer, final Closure done) {
        Map<String, Object> workers = Processor.getInstance().getWorkers();
        Map<String, SnapshotProcess> snapshotProcessors = new HashMap<>();
        if (workers != null) {
            // Copy every snapshot-capable worker so it can be saved on another thread
            workers.forEach((serviceId, worker) -> {
                if (worker instanceof SnapshotProcess) {
                    SnapshotProcess snapshotProcessor = (SnapshotProcess) worker;
                    snapshotProcessors.put(serviceId, snapshotProcessor.copy());
                }
            });
        }
        Utils.runInThread(() -> {
            // The excerpt omits the error messages; the ones below are illustrative
            String errors = null;
            // Save each service's snapshot files and stop at the first failure
            outer:
            for (Map.Entry<String, SnapshotProcess> entry : snapshotProcessors.entrySet()) {
                String serviceId = entry.getKey();
                SnapshotProcess snapshotProcessor = entry.getValue();
                Set<String> fileNames = snapshotProcessor.getSnapshotFileNames();
                for (String fileName : fileNames) {
                    String savePath = writer.getPath() + File.separator + fileName;
                    boolean ret = snapshotProcessor.save(savePath); // service-specific implementation
                    if (ret) {
                        if (!writer.addFile(fileName)) {
                            errors = "Failed to add file " + fileName + " to the snapshot writer";
                            break outer;
                        }
                    } else {
                        errors = "Failed to save snapshot of " + serviceId + " to " + savePath;
                        break outer;
                    }
                }
            }
            if (errors != null) {
                done.run(new Status(RaftError.EIO, errors));
            } else {
                done.run(Status.OK());
            }
        });
    }

    @Override
    public boolean onSnapshotLoad(SnapshotReader reader) {
        List<String> failServices = new ArrayList<>();
        Map<String, Object> workers = Processor.getInstance().getWorkers();
        if (workers != null) {
            // Load each service's snapshot files and stop at the first failure
            outer: for (Map.Entry<String, Object> entry : workers.entrySet()) {
                String serviceId = entry.getKey();
                Object worker = entry.getValue();
                if (worker instanceof SnapshotProcess) {
                    SnapshotProcess snapshotProcess = (SnapshotProcess) worker;
                    Set<String> fileNames = snapshotProcess.getSnapshotFileNames();
                    for (String fileName : fileNames) {
                        if (reader.getFileMeta(fileName) == null) {
                            failServices.add(serviceId);
                            break outer;
                        }
                        String savePath = reader.getPath() + File.separator + fileName;
                        boolean ret = snapshotProcess.load(savePath); // service-specific implementation
                        if (!ret) {
                            failServices.add(serviceId);
                            break outer;
                        }
                    }
                }
            }
        }
        // Report failure if any service could not load its snapshot
        return failServices.isEmpty();
    }
}
5.5 XXXRepositoryService
For the concrete services, see the XXXRepositoryService classes.
ServiceStateMachine calls snapshotProcess.load(savePath), which dispatches to the service's own implementation; the call stack shows this clearly.
load:317, DataRepositoryService (com.alipay.sofa.registry.server.meta.repository.service)
onSnapshotLoad:212, ServiceStateMachine (com.alipay.sofa.registry.jraft.bootstrap)
doSnapshotLoad:641, FSMCallerImpl (com.alipay.sofa.jraft.core)
runApplyTask:389, FSMCallerImpl (com.alipay.sofa.jraft.core)
access$100:72, FSMCallerImpl (com.alipay.sofa.jraft.core)
onEvent:147, FSMCallerImpl$ApplyTaskHandler (com.alipay.sofa.jraft.core)
onEvent:141, FSMCallerImpl$ApplyTaskHandler (com.alipay.sofa.jraft.core)
run:137, BatchEventProcessor (com.lmax.disruptor)
run:748, Thread (java.lang)
Take DataRepositoryService as an example. Its base class AbstractSnapshotProcess provides some basic implementation.
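public abstract class AbstractSnapshotProcess implements SnapshotProcess {
    // Encode the values and write them to the snapshot file
    // (exception handling is omitted from this excerpt)
    public boolean save(String path, Object values) {
        FileUtils.writeByteArrayToFile(new File(path), CommandCodec.encodeCommand(values),
            false);
        return true;
    }

    // Read the snapshot file and decode it into the given type
    public <T> T load(String path, Class<T> clazz) throws IOException {
        byte[] bs = FileUtils.readFileToByteArray(new File(path));
        if (bs != null && bs.length > 0) {
            return CommandCodec.decodeCommand(bs, clazz);
        }
        // The excerpt omits the empty-file branch; throwing here is an assumption
        throw new IOException("Snapshot file is empty: " + path);
    }
}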
DataRepositoryService then adapts save and load on top of it.
@RaftService(uniqueId = "dataServer")
public class DataRepositoryService extends AbstractSnapshotProcess
implements
RepositoryService<String, RenewDecorate<DataNode>> {
@Override
public boolean save(String path) {
return save(path, registry);
}
@Override
public synchronized boolean load(String path) {
Map<String, NodeRepository> map = load(path, registry.getClass());
registry.clear();
registry.putAll(map);
return true;
}
}
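The save/load pair can be tried out in isolation. The sketch below is an approximation under assumptions: `ToySnapshotService` is hypothetical, and JDK object serialization stands in for the Hessian-based CommandCodec used by the real classes.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service that snapshots an in-memory registry map to a file.
class ToySnapshotService {
    final Map<String, String> registry = new ConcurrentHashMap<>();

    // Writes a copy of the registry to the file; returns false on any I/O error.
    boolean save(Path path) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
            out.writeObject(new HashMap<>(registry));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Replaces the registry content with the snapshot; returns false if it cannot be read.
    synchronized boolean load(Path path) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            @SuppressWarnings("unchecked")
            Map<String, String> loaded = (Map<String, String>) in.readObject();
            registry.clear();
            registry.putAll(loaded);
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    // Helper for trying the sketch out: creates a temporary snapshot file.
    static Path tempFile() {
        try {
            return Files.createTempFile("toy-snapshot", ".bin");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

As in DataRepositoryService, load clears the map before calling putAll, so entries that exist locally but not in the snapshot do not survive; the snapshot is treated as the whole truth at that point in the log.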