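Syncing Heterogeneous Data Sources → Analyzing DataX's Encryption and Decryption of Sensitive Information from the Source Code

Posted by 青石路 on 2024-07-15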

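A Moment of Fun

I went out to take out the trash and saw an old man who had fallen on the ground

I walked over and asked him: my account balance is 0.8, can I afford to help you up?

The old man shifted over a little

and said to me: kid, quick, lie down too, the money comes fast this way!

I ignored him and went straight to throw out the trash

Then I lay down next to him as fast as I could and said: I can feel it, you're going to carry me, sir!

[Image: run over the accident-faking scammers]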

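Picking Up Where We Left Off

After the earlier post, Heterogeneous Data Source Sync: Data Synchronization → DataX Usage Details, I trust you all know how to use DataX

But have you noticed a problem: in job.json, the usernames and passwords of the reader and writer are all in plaintext

[Screenshot: plaintext credentials in job.json]

That is like streaking, and it is a security risk!

It's not just you who enjoy watching streaking; honestly, so do I

But whether from a legal or a moral standpoint, streaking is not allowed!

So what should we do, put some clothes on her?

[Image: Mind your own business]

Then let's dress her, and in thick layers, so she stays nice and safe!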

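Encryption

First, let's be clear that DataX already supports encryption and decryption. dataxPluginDev.md contains this passage

The DataX framework supports RSA encryption of specific configuration items; in the example, the items that start with * hold the encrypted values. The encryption and decryption of configuration items is transparent to plugins, which still look up and operate on configuration items using the key without the *

From this passage we can extract the following points

  1. Encryption uses the RSA algorithm, and for now that is the only one supported!

  2. The key of a sensitive configuration item starts with *, for example

    {
      "job": {
        "content": [
          {
            "reader": {
              "name": "oraclereader",
              "parameter": {
                "*username": "<encrypted ciphertext>",
                "*password": "<encrypted ciphertext>",
                ...
              }
            },
            "writer": {
              "name": "oraclewriter",
              "parameter": {
                "*username": "<encrypted ciphertext>",
                "*password": "<encrypted ciphertext>",
                ...
              }
            }
          }
        ]
      }
    }
    
  3. Plugins take no part in encryption or decryption, which implies that the Framework is responsible for decryption. As for encryption, think about that for a moment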


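Beyond the 3 points above, what else can you work out?

  1. How to obtain the ciphertext for a plaintext value
  2. Once ciphertext is configured, whether an extra setting is needed to tell DataX to decrypt it

Can these two points be worked out?

On point 1, I went through all of the DataX documentation and found no instructions for encrypting plaintext. Could it be that any general-purpose RSA tool will do?

On point 2, we don't know yet, but we can experiment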

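Obtaining the Ciphertext

DataX only says that it supports RSA encryption, not how to obtain the ciphertext. But if we think it through, there is a way in. DataX must have a decryption step, and decryption and encryption usually come in pairs, so finding the decryption method means finding the encryption method. Where do we look for the decryption method? The source code is the most fundamental place!

[Image: With the source code in hand, nothing stays secret]

As mentioned above, the Framework is responsible for decryption. The corresponding module is datax-core, and we start from its Engine.java

For why we start from Engine.java, see Heterogeneous Data Source Sync: Data Synchronization → Modifying DataX, Kind of Interesting

Engine.java's own description says as much

Engine is the DataX entry class. It initializes the execution container for a Job or Task and runs the plugin's Job or Task logic

Starting from main, we follow the calls down step by step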

Engine#main > Engine#entry > ConfigParser#parse > ConfigParser#parseJobConfig > SecretUtil#decryptSecretKey

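Everyone knows what decrypt means, so SecretUtil.java must contain the encryption method we are looking for

[Screenshot: multiple encryption methods in SecretUtil.java]

But we find there are several. Which one should we use? Going by instinct it should be encryptRSA, but as developers we can't rely on instinct alone; we need an accurate answer. How do we find it?

Look for the answer at the decryption site: whichever method is used for decryption tells us exactly which method to use for encryption

So let's keep following SecretUtil#decryptSecretKey

[Screenshot: decryptSecretKey]

The code isn't short, but for now we only need to focus on the 2 points marked in the screenshot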

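  1. Whether decryption is needed

    Remember the question raised earlier?

    Once ciphertext is configured, whether an extra setting is needed to tell DataX to decrypt it

    DataX uses the configuration item job.setting.keyVersion to decide whether decryption is needed. With a definite answer, we no longer need to experiment

  2. Decrypting keys that contain *

    We follow SecretUtil.decrypt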

    public static String decrypt(String data, String key, String method) {
    	if (SecretUtil.KEY_ALGORITHM_RSA.equals(method)) {
    		return SecretUtil.decryptRSA(data, key);
    	} else if (SecretUtil.KEY_ALGORITHM_3DES.equals(method)) {
    		return SecretUtil.decrypt3DES(data, key);
    	} else {
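    		throw DataXException.asDataXException(
    				FrameworkErrorCode.SECRET_ERROR,
    				// Note: the format string has no %s placeholder, so method never appears in the message
    				String.format("System programming error: unsupported encryption type", method));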
    	}
    }
    

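    The code isn't long, but we notice that besides RSA decryption it also supports 3DES decryption, which differs a little from what the official documentation says

    The DataX framework supports RSA encryption of specific configuration items

    Why isn't 3DES mentioned there? We'll analyze that later; for now, let's stay with RSA

So the RSA decryption method is SecretUtil.decryptRSA, and the matching encryption method must be SecretUtil.encryptRSA. To be rigorous, though, we should verify it. How? It's simple: encrypt a plaintext with SecretUtil.encryptRSA to get the ciphertext, then decrypt that ciphertext with SecretUtil.decryptRSA and see whether we get the original plaintext back

But here comes another problem: encryptRSA needs a public key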

/**
 * Encrypt<br>
 * Encrypts with the public key (encryptByPublicKey)
 *
 * @param data the raw original data
 * @param key  the Base64-encoded public key
 * @return the result, also Base64-encoded
 * @throws Exception
 */
public static String encryptRSA(String data, String key) {
	try {
		// Decode the public key, which was Base64-encoded
		byte[] keyBytes = decryptBASE64(key);

		// Build the public key
		X509EncodedKeySpec x509KeySpec = new X509EncodedKeySpec(keyBytes);
		KeyFactory keyFactory = KeyFactory.getInstance(KEY_ALGORITHM_RSA);
		Key publicKey = keyFactory.generatePublic(x509KeySpec);

		// Encrypt the data
		Cipher cipher = Cipher.getInstance(keyFactory.getAlgorithm());
		cipher.init(Cipher.ENCRYPT_MODE, publicKey);

		return encryptBASE64(cipher.doFinal(data.getBytes(ENCODING)));
	} catch (Exception e) {
		throw DataXException.asDataXException(
				FrameworkErrorCode.SECRET_ERROR, "rsa encryption error", e);
	}
}

decryptRSA needs a private key

/**
 * Decrypt<br>
 * Decrypts with the private key
 *
 * @param data the Base64-encoded ciphertext
 * @param key  the Base64-encoded private key
 * @return
 * @throws Exception
 */
public static String decryptRSA(String data, String key) {
	try {
		// Decode the key
		byte[] keyBytes = decryptBASE64(key);

		// Build the private key
		PKCS8EncodedKeySpec pkcs8KeySpec = new PKCS8EncodedKeySpec(keyBytes);
		KeyFactory keyFactory = KeyFactory.getInstance(KEY_ALGORITHM_RSA);
		Key privateKey = keyFactory.generatePrivate(pkcs8KeySpec);

		// Decrypt the data
		Cipher cipher = Cipher.getInstance(keyFactory.getAlgorithm());
		cipher.init(Cipher.DECRYPT_MODE, privateKey);

		return new String(cipher.doFinal(decryptBASE64(data)), ENCODING);
	} catch (Exception e) {
		throw DataXException.asDataXException(
				FrameworkErrorCode.SECRET_ERROR, "rsa decryption error", e);
	}
}

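Where do we get the public key and the private key?

[Image: What a headache]

Suppose we were the ones implementing the SecretUtil utility class. Would we provide a method for obtaining the public and private keys? Obviously yes: encryption and decryption need the public key and the private key respectively, so for completeness we would certainly provide one

[Image: Do it yourself and you'll want for nothing]

Sure enough, SecretUtil also provides a method for obtaining the public key and the private key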

/**
 * Initialize the keys for the RSA algorithm
 *
 * @return
 * @throws Exception
 */
public static String[] initKey() throws Exception {
	KeyPairGenerator keyPairGen = KeyPairGenerator
			.getInstance(KEY_ALGORITHM_RSA);
	keyPairGen.initialize(1024);

	KeyPair keyPair = keyPairGen.generateKeyPair();

	// Public key
	RSAPublicKey publicKey = (RSAPublicKey) keyPair.getPublic();

	// Private key
	RSAPrivateKey privateKey = (RSAPrivateKey) keyPair.getPrivate();

	String[] publicAndPrivateKey = {
			encryptBASE64(publicKey.getEncoded()),
			encryptBASE64(privateKey.getEncoded())};

	return publicAndPrivateKey;
}

The test code is as follows

public static void main(String[] args) throws Exception {
	// Obtain the public key and private key
	String[] keys = SecretUtil.initKey();
	String publicKey = keys[0];
	String privateKey = keys[1];
	System.out.println("publicKey = " + publicKey);
	System.out.println("privateKey = " + privateKey);

	// Encrypt with the public key
	String encryptData = SecretUtil.encryptRSA("hello_qsl", publicKey);
	System.out.println("encryptData = " + encryptData);

	// Decrypt with the private key
	String decryptData = SecretUtil.decryptRSA(encryptData, privateKey);
	System.out.println("decryptData = " + decryptData);
}

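As for whether the result is correct, run it yourselves

I've already spoon-fed you; do you expect me to chew it for you as well?

[Image: Don't push your luck]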
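
For readers who want to try the round trip without pulling in datax-core, here is a minimal JDK-only sketch of the same idea: generate a key pair, Base64-encode both keys, encrypt with the public key, decrypt with the private key. The class name RsaBase64RoundTrip and its method names are hypothetical and not part of DataX. It also assumes that SecretUtil's encryptBASE64/decryptBASE64 behave like standard Base64 and that ENCODING is UTF-8, neither of which the excerpts above show.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;
import javax.crypto.Cipher;

// JDK-only sketch of an RSA + Base64 round trip. Hypothetical helper, not DataX code.
final class RsaBase64RoundTrip {

    // Returns {publicKey, privateKey}, each as Base64 text of the encoded key.
    static String[] generateKeys(int keySizeBits) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(keySizeBits);
            KeyPair pair = generator.generateKeyPair();
            Base64.Encoder encoder = Base64.getEncoder();
            return new String[] {
                encoder.encodeToString(pair.getPublic().getEncoded()),  // X.509 encoding
                encoder.encodeToString(pair.getPrivate().getEncoded())  // PKCS#8 encoding
            };
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA key generation failed", e);
        }
    }

    // Encrypts UTF-8 text (assumed encoding) with the public key; returns Base64 ciphertext.
    static String encrypt(String plainText, String base64PublicKey) {
        try {
            byte[] keyBytes = Base64.getDecoder().decode(base64PublicKey);
            PublicKey publicKey = KeyFactory.getInstance("RSA")
                    .generatePublic(new X509EncodedKeySpec(keyBytes));
            Cipher cipher = Cipher.getInstance("RSA");
            cipher.init(Cipher.ENCRYPT_MODE, publicKey);
            byte[] cipherBytes = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(cipherBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA encryption failed", e);
        }
    }

    // Decrypts Base64 ciphertext with the private key; returns the original text.
    static String decrypt(String base64CipherText, String base64PrivateKey) {
        try {
            byte[] keyBytes = Base64.getDecoder().decode(base64PrivateKey);
            PrivateKey privateKey = KeyFactory.getInstance("RSA")
                    .generatePrivate(new PKCS8EncodedKeySpec(keyBytes));
            Cipher cipher = Cipher.getInstance("RSA");
            cipher.init(Cipher.DECRYPT_MODE, privateKey);
            byte[] plainBytes = cipher.doFinal(Base64.getDecoder().decode(base64CipherText));
            return new String(plainBytes, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA decryption failed", e);
        }
    }

    public static void main(String[] args) {
        // 1024 bits mirrors the initKey excerpt above; larger sizes are generally preferred today.
        String[] keys = generateKeys(1024);
        String cipherText = encrypt("hello_qsl", keys[0]);
        System.out.println("cipherText = " + cipherText);
        System.out.println("plainText  = " + decrypt(cipherText, keys[1]));
    }
}
```

The ciphertext differs on every run because the JDK's default RSA transformation uses randomized padding, so comparing ciphertexts tells you nothing; only the decrypted value is worth checking.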

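Using the Ciphertext

We now have the ciphertext; the next step is to use it in DataX

  1. Configure the public key and private key in .secret.properties

    The file is in the conf directory under the DataX home directory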

    #ds basicAuth config
    auth.user=
    auth.pass=
    current.keyVersion=v1
    current.publicKey=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCeBjq0zrij7A6la1y9gONHrC3dV1r8U4HCJ0expJ6K9xeW1/RYUc+s4b4pEQjbeSX2BlmOzCXPcc2s26+UpHLHl9Cy1alix/HGf3tOubuAKsbF+MKOd/sLGtLoFr4iMoCHj+KNVRBHlQN5WsrxehRwQaqWycl2Rd2wY6orL0xZ0QIDAQAB
    current.privateKey=MIICeAIBADANBgkqhkiG9w0BAQEFAASCAmIwggJeAgEAAoGBAJ4GOrTOuKPsDqVrXL2A40esLd1XWvxTgcInR7Gknor3F5bX9FhRz6zhvikRCNt5JfYGWY7MJc9xzazbr5SkcseX0LLVqWLH8cZ/e065u4AqxsX4wo53+wsa0ugWviIygIeP4o1VEEeVA3layvF6FHBBqpbJyXZF3bBjqisvTFnRAgMBAAECgYEAhtcl7PagUy+wZ7KvFf0O8y+Wi1JpDvpqtLMz1/9yUX36oPpxQ5O7s/eEfiJM/onnvIE6lkDY2qRvLlre/eU9En4f964p6Fl0yWMalDmCv8MYGNEBu8rzn+GKH55xzm+Z5shs7mvzFWYJNeHIHCI8fmHnscFURB8VYEgAvbvtHAECQQDePgz/j4rTyqYeFjzuwZe7wUlQzex2NNdJ/aP4RY2+v+N5lZbT0SomIJZhIf5uqY+Z3lmEEyLWEikiDD6GkAihAkEAtgcLQJ6D4XOujJwD8KWm9m78yKXTrEgk57Qpy0bQq9tF2ygd6m2u8oEo9x+3YpN2J9RaTykjyOP8YwoSW97TMQJBAIgWkRkRCd7E8dHspiVBsKtNIZr0bf64PrjVM0n9NV3/3Mh//Fr6cwfj3pHeIhIbjI6ZJFGG8kcJ2dw6iTMXEeECQD6N6SYJ05SU5rVXoFsA8oHZ3nEt27JnEJe36Gz9JxUIQ9duz+kSTH72OBfFBIaR2pcReP+fSbbt8nwup+R+jOECQQCciE2p6iQcTnJyMuSQFLoTB47qSx0EmdQNIcLdHuAxagWrfphlPJMFPJilWWgaqyoP0GkzmPxak5Jd9T7bv7yR
    current.service.username=
    current.service.password=
    

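    current.keyVersion must be configured as well, and the job.setting.keyVersion setting in job.json later must have the same value

  2. Configure job.json

    Two things need to be configured. First, set job.setting.keyVersion to the same value as current.keyVersion in .secret.properties. Second, the username and password keys of the reader and writer must start with *, and their values must be the encrypted ciphertext. The complete mysql2Mysql.json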

    {
        "job": {
            "setting": {
                "speed": {
                    "channel": 5
                },
                "errorLimit": {
                    "record": 0,
                    "percentage": 0.02
                },
                "keyVersion": "v1"
            },
            "content": [
                {
                    "reader": {
                        "name": "mysqlreader",
                        "parameter": {
                            "*username": "HisZeJWc51c+8B54AbJ9wQDTJ49C1kBlc1hKUnDgi1NaTdqsgHwRc3Y4PdM5xf0fCLRoYlLSO/KRZJcy9CGIQt9uvJy3bkbG01RwO4qMoS+nQJ28S8p/I3rVUlAEkI/eE/PFWBnAU2U4xF2XjlMFrCG2yetAlZuwsN4paQaBmj4=",
                            "*password": "Ebh0U200enVevXaJs6M0t4yvPo5upcL8RUBN2j1Xi59a8UF8iSPbCl/m5YcX4N9JcJH6VPdsA9kfDJHv6tArnCsH3f5JDWwapOv03lW6B3Nte89e+7Ex7tE6J5+IkFIxaxeYOGoTFr+NBf5t4DWzK0tvH2xAVTgiPHyL/gisiZI=",
                            "column": [
                                "id",
                                "username",
                                "password",
                                "birth_day",
                                "remark"
                            ],
                            "connection": [
                                {
                                    "jdbcUrl": [
                                        "jdbc:mysql://192.168.2.118:3307/qsl_datax?useUnicode=true&characterEncoding=utf-8"
                                    ],
                                    "table": [
                                        "qsl_datax_source"
                                    ]
                                }
                            ]
                        }
                    },
                    "writer": {
                        "name": "mysqlwriter",
                        "parameter": {
                            "writeMode": "insert",
                            "*username": "HisZeJWc51c+8B54AbJ9wQDTJ49C1kBlc1hKUnDgi1NaTdqsgHwRc3Y4PdM5xf0fCLRoYlLSO/KRZJcy9CGIQt9uvJy3bkbG01RwO4qMoS+nQJ28S8p/I3rVUlAEkI/eE/PFWBnAU2U4xF2XjlMFrCG2yetAlZuwsN4paQaBmj4=",
                            "*password": "Ebh0U200enVevXaJs6M0t4yvPo5upcL8RUBN2j1Xi59a8UF8iSPbCl/m5YcX4N9JcJH6VPdsA9kfDJHv6tArnCsH3f5JDWwapOv03lW6B3Nte89e+7Ex7tE6J5+IkFIxaxeYOGoTFr+NBf5t4DWzK0tvH2xAVTgiPHyL/gisiZI=",
                            "column": [
                                "id",
                                "username",
                                "pw",
                                "birth_day",
                                "note"
                            ],
                            "connection": [
                                {
                                    "jdbcUrl": "jdbc:mysql://192.168.2.118:3306/qsl_datax_sync?useUnicode=true&characterEncoding=utf-8",
                                    "table": [
                                        "qsl_datax_target"
                                    ]
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
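
Before running the job, the pairing between the two files can be sanity-checked with a few lines of code. The sketch below is only an illustration: KeyVersionCheck, parse and canDecrypt are hypothetical names, and treating a blank job.setting.keyVersion as "no decryption requested" is an assumption based on the description above. What DataX itself does when the versions differ is not shown in this article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical pre-flight check; not part of DataX.
final class KeyVersionCheck {

    // Parses text in the .secret.properties format.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("could not read properties text", e);
        }
        return properties;
    }

    // True when the job names a keyVersion that matches current.keyVersion
    // and a private key is configured for it.
    static boolean canDecrypt(String jobKeyVersion, Properties secret) {
        if (jobKeyVersion == null || jobKeyVersion.trim().isEmpty()) {
            return false; // assumption: blank means the job does not ask for decryption
        }
        String currentVersion = secret.getProperty("current.keyVersion", "").trim();
        String privateKey = secret.getProperty("current.privateKey", "").trim();
        return jobKeyVersion.equals(currentVersion) && !privateKey.isEmpty();
    }

    public static void main(String[] args) {
        // Shortened made-up key value; a real file holds the full Base64 private key.
        Properties secret = parse(
                "#ds basicAuth config\n"
              + "current.keyVersion=v1\n"
              + "current.privateKey=MIICeAIBADANBgkq\n");
        System.out.println("v1 usable: " + canDecrypt("v1", secret));
        System.out.println("v2 usable: " + canDecrypt("v2", secret));
    }
}
```

In practice the text would come from conf/.secret.properties and the job's keyVersion from job.json; both are inlined here to keep the example self-contained.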
    

Then run the data sync

datax.py ../job/mysql2Mysql.json

The output log is as follows

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


2024-07-13 23:49:17.313 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2024-07-13 23:49:17.315 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2024-07-13 23:49:17.321 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2024-07-13 23:49:17.323 [main] INFO  Engine - the machine info  =>

        osInfo: Windows 10 amd64 10.0
        jvmInfo:        Oracle Corporation 1.8 25.251-b08
        cpu num:        8

        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:     -0.00G
        maxFileDescriptorCount: -1
        currentOpenFileDescriptorCount: -1

        GC Names        [PS MarkSweep, PS Scavenge]

        MEMORY_NAME                    | allocation_size                | init_size
        PS Eden Space                  | 256.00MB                       | 256.00MB
        Code Cache                     | 240.00MB                       | 2.44MB
        Compressed Class Space         | 1,024.00MB                     | 0.00MB
        PS Survivor Space              | 42.50MB                        | 42.50MB
        PS Old Gen                     | 683.00MB                       | 683.00MB
        Metaspace                      | -0.00MB                        | 0.00MB


2024-07-13 23:49:17.331 [main] INFO  Engine -
{
        "setting":{
                "speed":{
                        "channel":5
                },
                "errorLimit":{
                        "record":0,
                        "percentage":0.02
                },
                "keyVersion":"v1"
        },
        "content":[
                {
                        "reader":{
                                "name":"mysqlreader",
                                "parameter":{
                                        "column":[
                                                "id",
                                                "username",
                                                "password",
                                                "birth_day",
                                                "remark"
                                        ],
                                        "connection":[
                                                {
                                                        "jdbcUrl":[
                                                                "jdbc:mysql://192.168.2.118:3307/qsl_datax?useUnicode=true&characterEncoding=utf-8"
                                                        ],
                                                        "table":[
                                                                "qsl_datax_source"
                                                        ]
                                                }
                                        ],
                                        "username":"root",
                                        "password":"******"
                                }
                        },
                        "writer":{
                                "name":"mysqlwriter",
                                "parameter":{
                                        "writeMode":"insert",
                                        "column":[
                                                "id",
                                                "username",
                                                "pw",
                                                "birth_day",
                                                "note"
                                        ],
                                        "connection":[
                                                {
                                                        "jdbcUrl":"jdbc:mysql://192.168.2.118:3306/qsl_datax_sync?useUnicode=true&characterEncoding=utf-8",
                                                        "table":[
                                                                "qsl_datax_target"
                                                        ]
                                                }
                                        ],
                                        "username":"root",
                                        "password":"******"
                                }
                        }
                }
        ]
}

2024-07-13 23:49:17.342 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false
2024-07-13 23:49:17.342 [main] INFO  JobContainer - DataX jobContainer starts job.
2024-07-13 23:49:17.343 [main] INFO  JobContainer - Set jobId = 0
Sat Jul 13 23:49:17 GMT+08:00 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2024-07-13 23:49:17.614 [job-0] INFO  OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:mysql://192.168.2.118:3307/qsl_datax?useUnicode=true&characterEncoding=utf-8&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.
Sat Jul 13 23:49:17 GMT+08:00 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2024-07-13 23:49:17.635 [job-0] INFO  OriginalConfPretreatmentUtil - table:[qsl_datax_source] has columns:[id,username,password,birth_day,remark].
2024-07-13 23:49:17.796 [job-0] INFO  OriginalConfPretreatmentUtil - table:[qsl_datax_target] all columns:[
id,username,pw,birth_day,note
].
2024-07-13 23:49:17.801 [job-0] INFO  OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id,username,pw,birth_day,note) VALUES(?,?,?,?,?)
], which jdbcUrl like:[jdbc:mysql://192.168.2.118:3306/qsl_datax_sync?useUnicode=true&characterEncoding=utf-8&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&rewriteBatchedStatements=true&tinyInt1isBit=false]
2024-07-13 23:49:17.801 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2024-07-13 23:49:17.802 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do prepare work .
2024-07-13 23:49:17.802 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2024-07-13 23:49:17.803 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2024-07-13 23:49:17.803 [job-0] INFO  JobContainer - Job set Channel-Number to 5 channels.
2024-07-13 23:49:17.806 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] splits to [1] tasks.
2024-07-13 23:49:17.807 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2024-07-13 23:49:17.825 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2024-07-13 23:49:17.826 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2024-07-13 23:49:17.828 [job-0] INFO  JobContainer - Running by standalone Mode.
2024-07-13 23:49:17.834 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2024-07-13 23:49:17.836 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2024-07-13 23:49:17.836 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2024-07-13 23:49:17.844 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2024-07-13 23:49:17.848 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Begin to read record by Sql: [select id,username,password,birth_day,remark from qsl_datax_source
] jdbcUrl:[jdbc:mysql://192.168.2.118:3307/qsl_datax?useUnicode=true&characterEncoding=utf-8&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
Sat Jul 13 23:49:17 GMT+08:00 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2024-07-13 23:49:17.869 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Finished read record by Sql: [select id,username,password,birth_day,remark from qsl_datax_source
] jdbcUrl:[jdbc:mysql://192.168.2.118:3307/qsl_datax?useUnicode=true&characterEncoding=utf-8&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2024-07-13 23:49:18.247 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[403]ms
2024-07-13 23:49:18.247 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2024-07-13 23:49:27.842 [job-0] INFO  StandAloneJobContainerCommunicator - Total 4 records, 80 bytes | Speed 8B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.012s | Percentage 100.00%
2024-07-13 23:49:27.842 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2024-07-13 23:49:27.843 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2024-07-13 23:49:27.843 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do post work.
2024-07-13 23:49:27.843 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2024-07-13 23:49:27.844 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: F:\datax\hook
2024-07-13 23:49:27.845 [job-0] INFO  JobContainer -
         [total cpu info] =>
                averageCpu                     | maxDeltaCpu                    | minDeltaCpu
                -1.00%                         | -1.00%                         | -1.00%


         [total gc info] =>
                 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
                 PS MarkSweep         | 1                  | 1                  | 1                  | 0.014s             | 0.014s             | 0.014s
                 PS Scavenge          | 1                  | 1                  | 1                  | 0.006s             | 0.006s             | 0.006s

2024-07-13 23:49:27.845 [job-0] INFO  JobContainer - PerfTrace not enable!
2024-07-13 23:49:27.846 [job-0] INFO  StandAloneJobContainerCommunicator - Total 4 records, 80 bytes | Speed 8B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.012s | Percentage 100.00%
2024-07-13 23:49:27.846 [job-0] INFO  JobContainer -
任務啟動時刻                    : 2024-07-13 23:49:17
任務結束時刻                    : 2024-07-13 23:49:27
任務總計耗時                    :                 10s
任務平均流量                    :                8B/s
記錄寫入速度                    :              0rec/s
讀出記錄總數                    :                   4
讀寫失敗總數                    :                   0

The data sync succeeded. Look closely at the job.json in the log

{
    "setting": {
        "speed": {
            "channel": 5
        },
        "errorLimit": {
            "record": 0,
            "percentage": 0.02
        },
        "keyVersion": "v1"
    },
    "content": [
        {
            "reader": {
                "name": "mysqlreader",
                "parameter": {
                    "column": [
                        "id",
                        "username",
                        "password",
                        "birth_day",
                        "remark"
                    ],
                    "connection": [
                        {
                            "jdbcUrl": [
                                "jdbc:mysql://192.168.2.118:3307/qsl_datax?useUnicode=true&characterEncoding=utf-8"
                            ],
                            "table": [
                                "qsl_datax_source"
                            ]
                        }
                    ],
                    "username": "root",
                    "password": "******"
                }
            },
            "writer": {
                "name": "mysqlwriter",
                "parameter": {
                    "writeMode": "insert",
                    "column": [
                        "id",
                        "username",
                        "pw",
                        "birth_day",
                        "note"
                    ],
                    "connection": [
                        {
                            "jdbcUrl": "jdbc:mysql://192.168.2.118:3306/qsl_datax_sync?useUnicode=true&characterEncoding=utf-8",
                            "table": [
                                "qsl_datax_target"
                            ]
                        }
                    ],
                    "username": "root",
                    "password": "******"
                }
            }
        }
    ]
}

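We can see that once the Framework has finished decrypting, the keys that carried a * (username and password) no longer have the *, and their values have been decrypted to plaintext

"password": "******" is only how the log prints it: each plaintext character is replaced by one *, while what is actually passed to the plugin is the plaintext password

So the plugin is unaware of the encryption and decryption, which is exactly what the official documentation means by

The encryption and decryption of configuration items is transparent to plugins, which still look up and operate on configuration items using the key without the *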
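
The log masking described above, one * per plaintext character applied only to the printed copy, can be pictured with the small sketch below. It is not DataX's implementation: LogMasking and its methods are hypothetical, the rule that a key is sensitive when it ends with "password" is an assumption made for the demo, and the password value is made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of log masking; not DataX code.
final class LogMasking {

    // One '*' per character, so the masked value keeps the original length.
    static String mask(String secret) {
        char[] stars = new char[secret.length()];
        Arrays.fill(stars, '*');
        return new String(stars);
    }

    // Returns a copy meant for logging; the input map (what a plugin would see) is untouched.
    // Assumption: a key counts as sensitive when it ends with "password".
    static Map<String, String> maskForLog(Map<String, String> parameters) {
        Map<String, String> printable = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : parameters.entrySet()) {
            boolean sensitive = entry.getKey().toLowerCase(Locale.ROOT).endsWith("password");
            printable.put(entry.getKey(), sensitive ? mask(entry.getValue()) : entry.getValue());
        }
        return printable;
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new LinkedHashMap<>();
        parameters.put("username", "root");
        parameters.put("password", "123456"); // made-up value for the demo
        System.out.println("logged: " + maskForLog(parameters));
        System.out.println("plugin: " + parameters);
    }
}
```

Because the mask preserves length, the log still reveals how long the password is; that matches the behaviour seen in the log above.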

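Decryption

The official documentation mentions only RSA, yet the code also provides 3DES. Why doesn't the documentation mention 3DES? We have to look to the decryption code for the answer

Remember SecretUtil#decryptSecretKey from earlier? We need to keep reading that method, though only part of its code

[Screenshot: obtaining the key configuration]

It breaks down into two parts

  1. getPrivateKeyMap obtains the keys

    The method is fairly long, so I won't show the code; I'll just walk you through the flow

    The local cache versionKeyMap has type Map<String, Triple<String, String, String>>. The key is the value of current.keyVersion in .secret.properties, which is v1 in our example. The value is a Triple with three fields, left, middle and right, holding the privateKey, the encryption algorithm and the publicKey respectively

    If versionKeyMap is null, the contents of .secret.properties are read and put into versionKeyMap; if it is not null, it is returned directly. Here comes the key code, so pay attention

    [Screenshot: getPrivateKeyMap]

    The code in the red box should be easy for everyone to follow: keyVersion is the value of current.keyVersion, privateKey is the value of current.privateKey, and publicKey is the value of current.publicKey

    Pay attention here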

    versionKeyMap.put(keyVersion, ImmutableTriple.of(
    	privateKey, SecretUtil.KEY_ALGORITHM_RSA,
    	publicKey))
    

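    The encryption algorithm is hard-coded to RSA here; there is no if branch at all that could select the 3DES algorithm. So what does that tell us?

    For now DataX really does not support 3DES encryption and decryption; it supports only RSA

    Put another way, 3DES is only partially implemented, so in the end it is still unsupported. The official documentation is therefore right to mention only RSA and not 3DES!

  2. Obtaining the private key and the encryption algorithm

    decryptKey is the privateKey, and method is the encryption algorithm, whose value is RSA. Then the values of the keys in job.json that start with * are decrypted

    // Decrypt keys that contain *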
    for (String key : config.getKeys()) {
    	int lastPathIndex = key.lastIndexOf(".") + 1;
    	String lastPathKey = key.substring(lastPathIndex);
    	if (lastPathKey.length() > 1 && lastPathKey.charAt(0) == '*'
    			&& lastPathKey.charAt(1) != '*') {
    		Object value = config.get(key);
    		if (value instanceof String) {
    			String newKey = key.substring(0, lastPathIndex)
    					+ lastPathKey.substring(1);
    			config.set(newKey,
    					SecretUtil.decrypt((String) value, decryptKey, method));
    			config.addSecretKeyPath(newKey);
    			config.remove(key);
    		}
    	}
    }
    

    This is why the key of an encrypted item must start with *
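
To make the key rewriting in the loop above easier to experiment with, here is a self-contained sketch that applies the same rule to a flat Map of path → value. It is not DataX code: StarKeyRewrite and its methods are hypothetical, the flat map stands in for DataX's Configuration, and the decryptor is passed in as a function so the demo can use a stand-in (string reversal) instead of real RSA.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration of the "*key" rewriting rule; not DataX code.
final class StarKeyRewrite {

    // True when the last path segment starts with exactly one '*'.
    static boolean isEncryptedKey(String path) {
        String lastSegment = path.substring(path.lastIndexOf('.') + 1);
        return lastSegment.length() > 1
                && lastSegment.charAt(0) == '*'
                && lastSegment.charAt(1) != '*';
    }

    // Removes the leading '*' from the last path segment.
    static String stripStar(String path) {
        int lastSegmentStart = path.lastIndexOf('.') + 1;
        return path.substring(0, lastSegmentStart) + path.substring(lastSegmentStart + 1);
    }

    // Returns a new map in which every encrypted key is renamed and its value decrypted.
    static Map<String, String> decryptStarKeys(Map<String, String> flatConfig,
                                               UnaryOperator<String> decryptor) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : flatConfig.entrySet()) {
            if (isEncryptedKey(entry.getKey())) {
                result.put(stripStar(entry.getKey()), decryptor.apply(entry.getValue()));
            } else {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> flatConfig = new LinkedHashMap<>();
        flatConfig.put("job.setting.keyVersion", "v1");
        flatConfig.put("job.content[0].reader.parameter.*username", "toor");
        // Stand-in decryptor for the demo: reverses the string instead of doing RSA.
        UnaryOperator<String> reverse = s -> new StringBuilder(s).reverse().toString();
        System.out.println(decryptStarKeys(flatConfig, reverse));
    }
}
```

Note the second-character check: a key whose last segment starts with two asterisks is left alone, just as in the loop quoted above.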

At this point, have all the open questions been answered? Do you now fully understand how DataX encrypts and decrypts sensitive information?


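Summary

  1. DataX currently supports only RSA encryption and decryption; it supports neither 3DES nor any other algorithm

    DataX's scheme combines RSA with BASE64 rather than using RSA alone, so ciphertext produced by a general-purpose RSA tool cannot be used with DataX directly unless its output is Base64-encoded in the same way

  2. The Framework handles decryption, but users must generate the keys and the ciphertext themselves. The configuration takes several steps, so don't miss any

    2.1 Obtain the public key and private key, choose a keyVersion, and configure them in .secret.properties

    2.2 In job.json, set job.setting.keyVersion to the same value as the keyVersion in 2.1

    2.3 For sensitive configuration items in job.json, the key starts with * and the value is the ciphertext obtained by passing the plaintext through SecretUtil#encryptRSA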
