Privacy Computing with FATE: Testing the Multi-class Neural Network Algorithm

Posted by 小程式開發 on 2022-07-12

1. Overview

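This article shows how to use the horizontal (homo) federated neural network algorithm in FATE to train a model on multi-class data, and then use that model to make multi-class predictions on new data.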

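  • Binary classification: the label to be predicted can take only two values. Put simply, each instance belongs to one of two possible classes (0 or 1), for example a gender field with only two values. The classifier in effect builds a decision boundary that splits the data into two classes.
  • Multi-class classification: the label to be predicted can take more than two values, for example a personal hobby field with values such as basketball, football, movies and so on. Common algorithms: Softmax, SVM, KNN, decision trees.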
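To make the multi-class case concrete, here is a minimal sketch of Softmax, one of the algorithms listed above. It turns one raw score per class into probabilities and picks the most likely class. The scores are made-up values for illustration, not output from FATE.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the four classes 0, 1, 2, 3.
probs = softmax([1.0, 2.0, 0.5, 3.0])

# The predicted class is the index with the highest probability.
predicted_class = max(range(len(probs)), key=lambda i: probs[i])
print(probs, predicted_class)
```

With two classes this reduces to choosing between 0 and 1; with more classes the same idea simply compares more scores.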

2. Preparing the training data

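Data uploaded to FATE must contain two fields with fixed names: the primary key field id and the class field y. The y field is the label to be predicted. The remaining feature fields (attributes) can be named freely, for example x0 through x9 in the example below.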

For example, take one user record: income: 10000, debt: 5000, able to repay: 1. Here income and debt are feature fields, and able to repay is the class field.
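The layout described above (an id column, a y column, then feature columns) can be sketched as a small CSV generator. This is only an illustration: the random values, the seed and the helper name make_rows are assumptions, and real data would come from your own source.

```python
import csv
import io
import random

def make_rows(n_rows, n_features=10, n_classes=4, seed=0):
    """Build a header plus n_rows records in the id, y, x0..xN layout."""
    rng = random.Random(seed)
    header = ["id", "y"] + [f"x{i}" for i in range(n_features)]
    rows = [header]
    for row_id in range(n_rows):
        label = rng.randrange(n_classes)  # class value in 0..n_classes-1
        features = [round(rng.uniform(-1.0, 1.0), 4) for _ in range(n_features)]
        rows.append([row_id, label] + features)
    return rows

rows = make_rows(10)

# Write to an in-memory buffer; use open("some_file.csv", "w", newline="") for a real file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])
```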

2.1. Guest side

10 rows of data, containing 1 class field y and 10 feature fields x0 through x9.

y takes four class values: 0, 1, 2 and 3.

Upload the data to FATE with table name muti_breast_homo_guest and namespace experiment.

 

2.2. Host side

10 rows of data, with the same fields as on the guest side but different contents.

Upload the data to FATE with table name muti_breast_homo_host and namespace experiment.
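The upload step itself is not shown in this article. One common route in FATE 1.x is a small JSON file passed to flow data upload -c. The sketch below builds such a file for both parties. The field names (file, head, partition, table_name, namespace) follow the upload configuration as commonly documented for FATE Flow and should be checked against your installed version; the CSV paths and the partition count are hypothetical.

```python
import json

def build_upload_conf(csv_path, table_name, namespace="experiment"):
    """Return an upload configuration dict (field names are assumptions, see above)."""
    return {
        "file": csv_path,        # hypothetical path to the CSV on the FATE host
        "head": 1,               # the first line of the CSV is a header
        "partition": 4,          # hypothetical partition count
        "table_name": table_name,
        "namespace": namespace,
    }

guest_conf = build_upload_conf("/data/guest.csv", "muti_breast_homo_guest")
host_conf = build_upload_conf("/data/host.csv", "muti_breast_homo_host")

print(json.dumps(guest_conf, indent=4))
print(json.dumps(host_conf, indent=4))
```

Each party would save its own dict to a file and upload it on its own side, so that the table name and namespace match what reader_0 refers to later.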

 

3. Running the training job

3.1. Preparing the dsl file

Create the file homo_nn_dsl.json with the following content:

{
    "components": {
        "reader_0": {
            "module": "Reader",
            "output": {
                "data": ["data"]
            }
        },
        "data_transform_0": {
            "module": "DataTransform",
            "input": {
                "data": {
                    "data": ["reader_0.data"]
                }
            },
            "output": {
                "data": ["data"],
                "model": ["model"]
            }
        },
        "homo_nn_0": {
            "module": "HomoNN",
            "input": {
                "data": {
                    "train_data": ["data_transform_0.data"]
                }
            },
            "output": {
                "data": ["data"],
                "model": ["model"]
            }
        }
    }
}
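The dsl chains three components: reader_0 feeds data_transform_0, which feeds homo_nn_0. A quick way to catch a mistyped reference before submitting is to check that every input points at a declared data output. The following is a minimal sketch of such a check; the helper name dangling_inputs is hypothetical and the check is not part of FATE.

```python
# The same dsl as above, written as a Python dict.
dsl = {
    "components": {
        "reader_0": {"module": "Reader", "output": {"data": ["data"]}},
        "data_transform_0": {
            "module": "DataTransform",
            "input": {"data": {"data": ["reader_0.data"]}},
            "output": {"data": ["data"], "model": ["model"]},
        },
        "homo_nn_0": {
            "module": "HomoNN",
            "input": {"data": {"train_data": ["data_transform_0.data"]}},
            "output": {"data": ["data"], "model": ["model"]},
        },
    }
}

def dangling_inputs(dsl):
    """Return (component, reference) pairs whose reference matches no declared data output."""
    components = dsl["components"]
    declared = {
        f"{name}.{out}"
        for name, comp in components.items()
        for out in comp.get("output", {}).get("data", [])
    }
    missing = []
    for name, comp in components.items():
        for refs in comp.get("input", {}).get("data", {}).values():
            missing.extend((name, ref) for ref in refs if ref not in declared)
    return missing

problems = dangling_inputs(dsl)
print(problems or "all data inputs resolve")
```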

 

3.2. Preparing the conf file

Create the file homo_nn_multi_label_conf.json with the following content:

{
    "dsl_version": 2,
    "initiator": {
        "role": "guest",
        "party_id": 9999
    },
    "role": {
        "arbiter": [10000],
        "host": [10000],
        "guest": [9999]
    },
    "component_parameters": {
        "common": {
            "data_transform_0": {
                "with_label": true
            },
            "homo_nn_0": {
                "encode_label": true,
                "max_iter": 15,
                "batch_size": -1,
                "early_stop": {
                    "early_stop": "diff",
                    "eps": 0.0001
                },
                "optimizer": {
                    "learning_rate": 0.05,
                    "decay": 0.0,
                    "beta_1": 0.9,
                    "beta_2": 0.999,
                    "epsilon": 1e-07,
                    "amsgrad": false,
                    "optimizer": "Adam"
                },
                "loss": "categorical_crossentropy",
                "metrics": ["accuracy"],
                "nn_define": {
                    "class_name": "Sequential",
                    "config": {
                        "name": "sequential",
                        "layers": [
                            {
                                "class_name": "Dense",
                                "config": {
                                    "name": "dense",
                                    "trainable": true,
                                    "batch_input_shape": [null, 18],
                                    "dtype": "float32",
                                    "units": 5,
                                    "activation": "relu",
                                    "use_bias": true,
                                    "kernel_initializer": {
                                        "class_name": "GlorotUniform",
                                        "config": {
                                            "seed": null,
                                            "dtype": "float32"
                                        }
                                    },
                                    "bias_initializer": {
                                        "class_name": "Zeros",
                                        "config": {
                                            "dtype": "float32"
                                        }
                                    },
                                    "kernel_regularizer": null,
                                    "bias_regularizer": null,
                                    "activity_regularizer": null,
                                    "kernel_constraint": null,
                                    "bias_constraint": null
                                }
                            },
                            {
                                "class_name": "Dense",
                                "config": {
                                    "name": "dense_1",
                                    "trainable": true,
                                    "dtype": "float32",
                                    "units": 4,
                                    "activation": "sigmoid",
                                    "use_bias": true,
                                    "kernel_initializer": {
                                        "class_name": "GlorotUniform",
                                        "config": {
                                            "seed": null,
                                            "dtype": "float32"
                                        }
                                    },
                                    "bias_initializer": {
                                        "class_name": "Zeros",
                                        "config": {
                                            "dtype": "float32"
                                        }
                                    },
                                    "kernel_regularizer": null,
                                    "bias_regularizer": null,
                                    "activity_regularizer": null,
                                    "kernel_constraint": null,
                                    "bias_constraint": null
                                }
                            }
                        ]
                    },
                    "keras_version": "2.2.4-tf",
                    "backend": "tensorflow"
                },
                "config_type": "keras"
            }
        },
        "role": {
            "host": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "muti_breast_homo_host",
                            "namespace": "experiment"
                        }
                    }
                }
            },
            "guest": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "muti_breast_homo_guest",
                            "namespace": "experiment"
                        }
                    }
                }
            }
        }
    }
}

Note that the table name and namespace of the reader_0 component must match the ones used when the data was uploaded.
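Two numbers in nn_define are also tied to the data: the last entry of batch_input_shape in the first layer is the number of feature columns the Keras model expects, and units in the last layer is the number of class scores it emits. The sketch below reads both values so they can be compared with the uploaded table. The helper name layer_summary is hypothetical, and the dict is a trimmed copy of the conf that keeps only the fields being read.

```python
def layer_summary(nn_define):
    """Return (expected feature count, number of output units) for a Sequential definition."""
    layers = nn_define["config"]["layers"]
    input_dim = layers[0]["config"]["batch_input_shape"][-1]
    output_units = layers[-1]["config"]["units"]
    return input_dim, output_units

# Trimmed copy of nn_define from the conf above (JSON null becomes None in Python).
nn_define = {
    "class_name": "Sequential",
    "config": {
        "layers": [
            {"class_name": "Dense",
             "config": {"batch_input_shape": [None, 18], "units": 5}},
            {"class_name": "Dense",
             "config": {"units": 4}},
        ]
    },
}

input_dim, output_units = layer_summary(nn_define)
n_classes = 4  # y takes the values 0, 1, 2, 3

print(f"model expects {input_dim} feature columns and emits {output_units} class scores")
print("output units match class count:", output_units == n_classes)
```

If the uploaded table has a different number of feature columns than batch_input_shape declares, that value would likely need to be adjusted to match the data.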

 

3.3. Submitting the job

Run the following command:

flow job submit -d homo_nn_dsl.json -c homo_nn_multi_label_conf.json

After the job finishes successfully, the result can be viewed on the dashboard.

 

4. Preparing the prediction data

The fields are the same as in the training data above, but the contents differ, and y is 0 in every row.

4.1. Guest side

Upload the data to FATE with table name predict_muti_breast_homo_guest and namespace experiment.

 

4.2. Host side

Upload the data to FATE with table name predict_muti_breast_homo_host and namespace experiment.

 

5. Preparing the prediction configuration

Create the file homo_nn_multi_label_predict.json with the following content:

{
    "dsl_version": 2,
    "initiator": {
        "role": "guest",
        "party_id": 9999
    },
    "role": {
        "arbiter": [10000],
        "host": [10000],
        "guest": [9999]
    },
    "job_parameters": {
        "common": {
            "model_id": "arbiter-10000#guest-9999#host-10000#model",
            "model_version": "202207061504081543620",
            "job_type": "predict"
        }
    },
    "component_parameters": {
        "role": {
            "guest": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "predict_muti_breast_homo_guest",
                            "namespace": "experiment"
                        }
                    }
                }
            },
            "host": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "predict_muti_breast_homo_host",
                            "namespace": "experiment"
                        }
                    }
                }
            }
        }
    }
}

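Note the following two points:

  1. model_id and model_version must be changed to the values returned after the model is deployed.

  2. The table name and namespace of the reader_0 component must match the ones used when the data was uploaded.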
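Since model_id and model_version change with every deployment, it can help to fill them into a template rather than edit the file by hand. The following is a minimal sketch; the helper name fill_model_info is hypothetical, the template keeps only the job_parameters section, and the values passed in are the ones shown in this article, which you would replace with your own.

```python
import copy
import json

# Only the part of the predict conf that changes between deployments.
PREDICT_TEMPLATE = {
    "dsl_version": 2,
    "job_parameters": {
        "common": {
            "model_id": None,
            "model_version": None,
            "job_type": "predict",
        }
    },
}

def fill_model_info(template, model_id, model_version):
    """Return a copy of the template with the deployed model's identifiers filled in."""
    conf = copy.deepcopy(template)  # leave the template itself untouched
    common = conf["job_parameters"]["common"]
    common["model_id"] = model_id
    common["model_version"] = model_version
    return conf

predict_conf = fill_model_info(
    PREDICT_TEMPLATE,
    model_id="arbiter-10000#guest-9999#host-10000#model",
    model_version="202207061504081543620",
)
print(json.dumps(predict_conf["job_parameters"], indent=4))
```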

 

6. Running the prediction job

Run the following command:

flow job submit -c homo_nn_multi_label_predict.json

After the job succeeds, check the data output of the homo_nn_0 component, which contains the prediction results produced by the algorithm.
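Once the output is exported, the predicted classes can be summarized with a few lines of Python. The sketch below assumes each row has the columns id, label, predict_result, predict_score and predict_detail, with predict_detail holding a JSON map from class to score. That layout is an assumption to verify against your own output, and the sample rows are invented for illustration.

```python
import json
from collections import Counter

# Hypothetical rows in the assumed layout:
# id, label, predict_result, predict_score, predict_detail
sample_rows = [
    ["0", "0", "2", "0.61", '{"0": 0.10, "1": 0.09, "2": 0.61, "3": 0.20}'],
    ["1", "0", "0", "0.55", '{"0": 0.55, "1": 0.15, "2": 0.20, "3": 0.10}'],
    ["2", "0", "2", "0.48", '{"0": 0.12, "1": 0.30, "2": 0.48, "3": 0.10}'],
]

def top_class(predict_detail):
    """Return the class with the highest score in a predict_detail JSON string."""
    scores = json.loads(predict_detail)
    return max(scores, key=scores.get)

# How often each class was predicted.
class_counts = Counter(row[2] for row in sample_rows)

# Sanity check: predict_result should be the highest-scoring class in predict_detail.
consistent = all(top_class(row[4]) == row[2] for row in sample_rows)

print(class_counts, consistent)
```

Because y is 0 in every row of the prediction data, the label column carries no information here, so the sketch looks only at the predicted classes.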

Source: ITPUB blog, link: http://blog.itpub.net/70019616/viewspace-2905330/. Please credit the source when reposting; otherwise legal action may be taken.
