Implementing a BERT Model Application with Python and TensorFlow

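Published by Huawei Cloud Developer Alliance (华为云开发者联盟) on 2024-06-26

Original URL: https://www.cnblogs.com/huaweiyun/p/18268255

This article is shared from the Huawei Cloud community post "Implementing Deep Learning Models in Python: A BERT Model Tutorial" (《使用Python實現深度學習模型：BERT模型教程》), by Echo_Wish.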

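BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained model for natural language processing (NLP) proposed by Google. By training a Transformer bidirectionally, BERT captures the context of each word from both its left and its right, and it is widely regarded as a milestone in NLP.

In this article, we introduce the basic principles of BERT and use Python and TensorFlow to build a simple BERT application.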

1. Introduction to the BERT Model

1.1 Transformer Recap

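BERT is based on the Transformer architecture. A Transformer consists of an encoder and a decoder, but BERT uses only the encoder. The main components of the encoder are:

Multi-Head Self-Attention: computes attention scores from each position in the sequence to every position, so each token's representation can draw on the whole sequence.
Feed-Forward Neural Network: applies the same non-linear transformation to the representation at each position independently.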
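
The self-attention component above can be illustrated with a minimal single-head sketch in pure Python. This is only an illustration under stated assumptions: the toy input and the identity projection matrices are made up, and a real encoder uses learned weights, several heads in parallel, residual connections, and layer normalization.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # transpose of K
    scores = matmul(q, k_t)                        # scores[i][j]: position i vs position j
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights             # weighted sum of values, attention weights

# Toy input: 3 positions, hidden size 2; identity projections are an assumption for illustration
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and describes how much one position attends to every position in the sequence, including itself.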

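1.2 Pre-training and Fine-tuning BERT

BERT is trained in two steps:

Pre-training: unsupervised training on a large-scale corpus, using two tasks:

Masked Language Model (MLM): randomly mask some of the tokens in the input text and require the model to predict the masked tokens.
Next Sentence Prediction (NSP): given a pair of sentences, predict whether the second sentence actually follows the first.

Fine-tuning: supervised training on a specific task, such as classification or question answering.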
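
To make the two pre-training tasks more concrete, here is a rough pure-Python sketch of how training examples could be built. The helper names `mask_tokens` and `make_nsp_pair` are hypothetical, as are the toy tokens. The 15% selection rate with an 80/10/10 split (replace with `[MASK]`, replace with a random token, keep unchanged) and the 50/50 split for NSP follow the original BERT paper; real pipelines work on subword IDs rather than word strings.

```python
import random

MASK, CLS, SEP = "[MASK]", "[CLS]", "[SEP]"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (masked_tokens, labels); labels[i] is None where nothing is predicted."""
    rng = rng or random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Never mask special tokens; select other tokens with probability mask_prob
        if tok in (CLS, SEP) or rng.random() >= mask_prob:
            continue
        labels[i] = tok                      # the model must recover the original token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK                 # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)    # 10%: replace with a random token
        # remaining 10%: keep the original token
    return masked, labels

def make_nsp_pair(sentences, i, rng=None):
    """Build one NSP example from sentence i: (sentence_a, sentence_b, is_next)."""
    rng = rng or random.Random()
    if rng.random() < 0.5:
        return sentences[i], sentences[i + 1], True
    # Negative example: any sentence other than i itself and its true successor
    candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
    return sentences[i], rng.choice(candidates), False

rng = random.Random(0)
tokens = [CLS, "the", "cat", "sat", "on", "the", "mat", SEP]
vocab = ["the", "cat", "sat", "on", "mat", "dog"]
masked, labels = mask_tokens(tokens, vocab, rng=rng)
pair = make_nsp_pair(["It rained.", "We stayed home.", "BERT is a model."], 0, rng=rng)
```

The sketch assumes `make_nsp_pair` is called with at least three sentences and an index that has a successor.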

2. Implementing a BERT Model with Python and TensorFlow

2.1 Installing Dependencies

First, install the required Python packages: TensorFlow and Transformers (Hugging Face's library).

pip install tensorflow transformers

2.2 Loading the Pre-trained BERT Model

We use Hugging Face's Transformers library to load the pre-trained BERT model and its matching tokenizer.

import tensorflow as tf
from transformers import BertTokenizer, TFBertModel

# Load the pre-trained BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertModel.from_pretrained('bert-base-uncased')

2.3 Data Preprocessing

We use a simple sentence classification task as an example. Suppose we have the following data:

sentences = ["I love machine learning.", "BERT is a powerful model.", "I enjoy studying AI."]
labels = [1, 1, 1]  # 1 = positive, 0 = negative (all three toy examples happen to be positive)

We need to convert the sentences into BERT's input format, which includes the input IDs and the attention mask.

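# Convert the sentences into BERT's input format
input_ids = []
attention_masks = []

for sentence in sentences:
    encoded_dict = tokenizer.encode_plus(
                        sentence,                      # input text
                        add_special_tokens = True,     # add the special [CLS] and [SEP] tokens
                        max_length = 64,               # length to pad and truncate to
                        padding = 'max_length',        # pad shorter sentences up to max_length
                        truncation = True,             # cut longer sentences down to max_length
                        return_attention_mask = True,  # return the attention mask
                        return_tensors = 'tf'          # return TensorFlow tensors
                   )

    input_ids.append(encoded_dict['input_ids'])
    attention_masks.append(encoded_dict['attention_mask'])

# Stack the per-sentence tensors of shape (1, 64) into batches of shape (num_sentences, 64)
input_ids = tf.concat(input_ids, axis=0)
attention_masks = tf.concat(attention_masks, axis=0)
labels = tf.convert_to_tensor(labels)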
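
To show what the input IDs and attention mask look like, here is a toy pure-Python sketch of the same steps: adding `[CLS]` and `[SEP]`, mapping tokens to IDs, truncating, padding, and building the mask. The `encode` helper and `toy_vocab` are hypothetical, and the IDs are made up; the real tokenizer uses WordPiece subwords and its own vocabulary.

```python
def encode(tokens, vocab, max_length):
    """Toy encoder: add [CLS]/[SEP], map tokens to IDs, truncate, pad, and build the mask."""
    body = tokens[: max_length - 2]                      # leave room for [CLS] and [SEP]
    ids = [vocab["[CLS]"]]
    ids += [vocab.get(t, vocab["[UNK]"]) for t in body]  # unknown words map to [UNK]
    ids += [vocab["[SEP]"]]
    attention_mask = [1] * len(ids)                      # 1 = real token
    padding = max_length - len(ids)
    ids += [vocab["[PAD]"]] * padding
    attention_mask += [0] * padding                      # 0 = padding, ignored by attention
    return ids, attention_mask

# Made-up vocabulary for illustration only
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "i": 4, "love": 5, "bert": 6}
ids, mask = encode(["i", "love", "bert"], toy_vocab, max_length=8)
print(ids)   # [2, 4, 5, 6, 3, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```

The attention mask tells the model which positions hold real tokens, so the padded positions do not influence the attention scores.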

2.4 Building the BERT Classification Model

We add a classification layer on top of the pre-trained BERT model.

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

class BertClassifier(Model):
    def __init__(self, bert):
        super().__init__()
        self.bert = bert
        self.dropout = tf.keras.layers.Dropout(0.3)
        self.classifier = Dense(1, activation='sigmoid')  # one probability for binary classification

    def call(self, inputs, training=False):
        # Keras passes the [input_ids, attention_mask] list given to fit()/predict() as a single argument
        input_ids, attention_mask = inputs
        outputs = self.bert(input_ids, attention_mask=attention_mask, training=training)
        pooled_output = outputs.pooler_output  # pooled representation of the [CLS] token
        pooled_output = self.dropout(pooled_output, training=training)
        return self.classifier(pooled_output)

# Instantiate the BERT classification model
bert_classifier = BertClassifier(model)

2.5 Compiling and Training the Model

Compile the model and train it.

# Compile the model
optimizer = tf.keras.optimizers.Adam(learning_rate=2e-5)
loss = tf.keras.losses.BinaryCrossentropy()
metric = tf.keras.metrics.BinaryAccuracy()

bert_classifier.compile(optimizer=optimizer, loss=loss, metrics=[metric])

# Train the model
bert_classifier.fit([input_ids, attention_masks], labels, epochs=3, batch_size=2)
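
For intuition about the loss used above, binary cross-entropy can be written out by hand. This is a minimal pure-Python sketch, not Keras's implementation; the function name and the clipping constant `eps` are assumptions chosen for illustration.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch; predictions are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# A confident correct prediction costs little, a confident wrong one costs a lot
low = binary_cross_entropy([1], [0.9])
high = binary_cross_entropy([1], [0.1])
```

Training pushes the sigmoid output toward 1 for positive sentences and toward 0 for negative ones, because that is what minimizes this quantity.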

2.6 Evaluating the Model

After training, we can run predictions on new data.

# Predict on new sentences
new_sentences = ["AI is fascinating.", "I dislike machine learning."]
new_input_ids = []
new_attention_masks = []

for sentence in new_sentences:
    encoded_dict = tokenizer.encode_plus(
                        sentence,
                        add_special_tokens = True,
                        max_length = 64,
                        padding = 'max_length',
                        truncation = True,
                        return_attention_mask = True,
                        return_tensors = 'tf'
                   )

    new_input_ids.append(encoded_dict['input_ids'])
    new_attention_masks.append(encoded_dict['attention_mask'])

new_input_ids = tf.concat(new_input_ids, axis=0)
new_attention_masks = tf.concat(new_attention_masks, axis=0)

# Run prediction; each row is the predicted probability of the positive class
predictions = bert_classifier.predict([new_input_ids, new_attention_masks])
print(predictions)
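
The printed predictions are sigmoid probabilities of shape (batch, 1). One possible way to turn them into class labels is to apply a threshold, sketched below. The `to_labels` helper, the 0.5 threshold, and the example values are assumptions for illustration, not real model output.

```python
def to_labels(probabilities, threshold=0.5):
    """Map sigmoid outputs of shape (batch, 1) to 0/1 labels."""
    return [int(row[0] >= threshold) for row in probabilities]

example_predictions = [[0.91], [0.12]]   # made-up values, not real model output
print(to_labels(example_predictions))    # [1, 0]
```

With only three training sentences, all labeled positive, the actual model's outputs should not be expected to separate positive from negative sentences reliably.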

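3. Summary

In this article, we introduced the basic principles of BERT and used Python and TensorFlow to build a simple BERT classification model. We hope this tutorial helps you understand how BERT works and how to implement it, so that you can apply it to your own tasks. As your understanding of BERT deepens, you can try more complex tasks such as question answering systems and named entity recognition.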

