Google BERT, the Strongest NLP Model of 2018: Resource Roundup

Posted by 望江小汽車 on 2019-03-03

This article introduces a new language representation model, BERT: Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by conditioning on both left and right context in all layers. BERT is the first fine-tuning-based representation model to achieve state-of-the-art performance on a large suite of sentence-level and token-level tasks, outperforming many systems that use task-specific architectures and setting new state-of-the-art results on 11 NLP tasks.

BERT-Related Resources

Update: added material on applying BERT to Chinese text and to small datasets, along with experimental results.

| Title | Description | Date |
| --- | --- | --- |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Original paper | 2018-10-11 |
| Reddit 討論 | Discussion with the authors | |
| BERT-pytorch | Google AI 2018 BERT PyTorch implementation | |
| 論文解讀:BERT模型及fine-tuning | Paper walkthrough by 習翔宇 | |
| 最強NLP預訓練模型!谷歌BERT橫掃11項NLP任務記錄 | Brief analysis of the paper | |
| 【NLP】Google BERT詳解 | Explanation by 李入魔 | |
| 如何評價 BERT 模型? | Discussion of the paper's key ideas | |
| NLP突破性成果 BERT 模型詳細解讀 | Detailed walkthrough by 章魚小丸子 | |
| 谷歌最強 NLP 模型 BERT 解讀 | Explanation by AI科技評論 | |
| 預訓練BERT,官方程式碼釋出前他們是這樣用TensorFlow解決的 | Notes on reproducing the paper before the official code release | 2018-10-30 |
| 谷歌終於開源BERT程式碼:3 億引數量,機器之心全面解讀 | Full walkthrough by 機器之心 | 2018-11-01 |
| 為什麼說 Bert 大力出奇跡? | | 2018-11-21 |
| BERT fine-tune 實踐終極教程 | Complete guide to fine-tuning BERT on Chinese datasets | 2018-11-23 |
| BERT在極小資料下帶來顯著提升的開源實現 | Open-source implementation by 張俊 showing large gains on very small datasets | 2018-11-27 |

Highlights of the BERT Paper

Model Architecture

Its core building block is the Transformer from Attention Is All You Need; a minimal sketch of one encoder layer is given below.

[Figure from the paper: overall model architecture]
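The sketch below shows one such encoder layer in PyTorch: multi-head self-attention followed by a position-wise feed-forward network, each wrapped in a residual connection and layer normalization. Sizes follow BERT-Base (hidden 768, 12 heads, feed-forward 3072); this is an illustrative sketch, not the official implementation.

```python
import torch
import torch.nn as nn

class TransformerEncoderLayer(nn.Module):
    """One encoder block as in "Attention Is All You Need": self-attention + FFN,
    each followed by a residual connection and layer norm. Illustrative only."""

    def __init__(self, hidden=768, heads=12, ffn=3072, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, heads, dropout=dropout)
        self.ffn = nn.Sequential(nn.Linear(hidden, ffn), nn.GELU(), nn.Linear(ffn, hidden))
        self.norm1, self.norm2 = nn.LayerNorm(hidden), nn.LayerNorm(hidden)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, pad_mask=None):
        # x: (seq_len, batch, hidden); pad_mask: (batch, seq_len), True at padding positions
        attn_out, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
        x = self.norm1(x + self.drop(attn_out))      # residual + layer norm around attention
        x = self.norm2(x + self.drop(self.ffn(x)))   # residual + layer norm around the FFN
        return x

# BERT-Base stacks 12 such layers (BERT-Large: 24 layers, hidden 1024, 16 heads).
x = torch.randn(128, 2, 768)                         # (seq_len, batch, hidden)
for layer in [TransformerEncoderLayer() for _ in range(12)]:
    x = layer(x)
```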

Model Input

[Figure from the paper: input representation]
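In the paper, each input token's representation is the sum of a token (WordPiece) embedding, a segment embedding (sentence A or B), and a position embedding, with a [CLS] token at the start and [SEP] tokens separating the segments. A rough sketch of how such an input might be assembled (toy whitespace tokenization, not the official WordPiece tokenizer):

```python
# Illustrative only: assemble BERT-style inputs for a sentence pair.
def build_inputs(sent_a, sent_b, vocab):
    tokens = ["[CLS]"] + sent_a.split() + ["[SEP]"] + sent_b.split() + ["[SEP]"]
    segment_ids = [0] * (len(sent_a.split()) + 2) + [1] * (len(sent_b.split()) + 1)
    input_ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens]
    return tokens, input_ids, segment_ids

# The model's input embedding for position i is then the sum of three lookups:
#   token_emb[input_ids[i]] + segment_emb[segment_ids[i]] + position_emb[i]
```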

Pre-training Methods

Two pre-training tasks: a masked language model (essentially a cloze task) and next-sentence prediction.
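In the paper, 15% of the WordPiece tokens are selected at random; of these, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged, and the model is trained to predict the original tokens. A simplified sketch of that corruption step (not the official data pipeline):

```python
import random

def mask_tokens(tokens, vocab_words, mask_prob=0.15):
    """Masked-LM corruption: select ~15% of positions; 80% -> [MASK],
    10% -> random token, 10% -> unchanged. Returns the corrupted tokens
    and (position, original_token) prediction targets."""
    output, targets = list(tokens), []
    for i, tok in enumerate(tokens):
        if tok in ("[CLS]", "[SEP]") or random.random() > mask_prob:
            continue
        targets.append((i, tok))
        r = random.random()
        if r < 0.8:
            output[i] = "[MASK]"
        elif r < 0.9:
            output[i] = random.choice(vocab_words)
        # else: keep the original token
    return output, targets

# Next-sentence prediction: in 50% of training pairs, sentence B really follows
# sentence A (label IsNext); in the other 50%, B is a random sentence (NotNext).
# The [CLS] representation is trained to predict this label.
```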

Experiments

[Tables from the paper: results on the downstream benchmarks (GLUE, SQuAD v1.1, etc.)]

Model Analysis

Effect of Pre-training Tasks

[Table from the paper: effect of pre-training tasks]

Effect of Model Size

[Table from the paper: effect of model size]

Effect of Number of Training Steps

[Figure from the paper: effect of number of training steps]

Feature-based Approach with BERT

[Table from the paper: feature-based approach results]

Conclusion

Recent empirical improvements due to transfer learning with language models have demonstrated that rich, unsupervised pre-training is an integral part of many language understanding systems. In particular, these results enable even low-resource tasks to benefit from very deep unidirectional architectures. Our major contribution is further generalizing these findings to deep bidirectional architectures, allowing the same pre-trained model to successfully tackle a broad set of NLP tasks. While the empirical results are strong, in some cases surpassing human performance, important future work is to investigate the linguistic phenomena that may or may not be captured by BERT.


BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (Submitted on 11 Oct 2018)

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.

Abstract: This paper introduces a new language representation model, BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations by conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to build state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.

BERT is conceptually simple yet empirically powerful. It sets new state-of-the-art results on 11 NLP tasks, including pushing the GLUE benchmark to 80.4% (a 7.6% absolute improvement), raising MultiNLI accuracy to 86.7% (a 5.6% absolute improvement), and improving the SQuAD v1.1 question answering test F1 to 93.2 (a 1.5-point improvement), 2.0 points above human performance.
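Because fine-tuning only adds a single output layer, a downstream sentence-level classifier is little more than the pre-trained encoder plus a linear layer on the final hidden state of the [CLS] token. A minimal PyTorch sketch of that idea (the `bert_encoder` argument is a placeholder for any module returning per-token hidden states, not a specific library API):

```python
import torch
import torch.nn as nn

class BertForSequenceClassification(nn.Module):
    """Pre-trained encoder + one task-specific output layer, as in the paper's
    fine-tuning setup. `bert_encoder` is any module returning hidden states of
    shape (batch, seq_len, hidden); it is a placeholder, not a library API."""

    def __init__(self, bert_encoder, hidden_size=768, num_labels=2):
        super().__init__()
        self.bert = bert_encoder
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, num_labels)  # the one new layer

    def forward(self, input_ids, segment_ids, attention_mask, labels=None):
        hidden_states = self.bert(input_ids, segment_ids, attention_mask)
        cls_vec = hidden_states[:, 0]                  # final hidden state of [CLS]
        logits = self.classifier(self.dropout(cls_vec))
        if labels is not None:
            return nn.functional.cross_entropy(logits, labels)
        return logits

# Fine-tuning trains all parameters end-to-end for a few epochs with a small
# learning rate (the paper uses values around 2e-5 to 5e-5).
```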

Comments: 13 pages. Subjects: Computation and Language (cs.CL). Cite as: arXiv:1810.04805 [cs.CL] (arXiv:1810.04805v1 for this version). Submission history: from Jacob Devlin, [v1] Thu, 11 Oct 2018 00:50:01 GMT.

Reddit Discussion

[Screenshots of the Reddit discussion thread]

Official implementation: google-research/bert

Google recently released a large-scale pre-trained language model based on a bidirectional Transformer. The pre-trained model extracts textual information efficiently and can be applied to a wide range of NLP tasks, and with it the authors set new state-of-the-art records on 11 NLP tasks. If this pre-training approach stands up in practice, many NLP tasks will need only a small amount of data for fine-tuning to achieve very good results, and BERT will become a backbone network in the truest sense.

Introduction

BERT, or Bidirectional Encoder Representations from Transformers, is a new method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks.

Our academic paper which describes BERT in detail and provides full results on a number of tasks can be found here: arxiv.org/abs/1810.04….

To give a few numbers, here are the results on the SQuAD v1.1 question answering task:

| SQuAD v1.1 Leaderboard (Oct 8th 2018) | Test EM | Test F1 |
| --- | --- | --- |
| 1st Place Ensemble - BERT | 87.4 | 93.2 |
| 2nd Place Ensemble - nlnet | 86.0 | 91.7 |
| 1st Place Single Model - BERT | 85.1 | 91.8 |
| 2nd Place Single Model - nlnet | 83.5 | 90.1 |

And several natural language inference tasks:

| System | MultiNLI | Question NLI | SWAG |
| --- | --- | --- | --- |
| BERT | 86.7 | 91.1 | 86.3 |
| OpenAI GPT (Prev. SOTA) | 82.2 | 88.1 | 75.0 |

Plus many other tasks.

Moreover, these results were all obtained with almost no task-specific neural network architecture design.

If you already know what BERT is and you just want to get started, you can download the pre-trained models and run a state-of-the-art fine-tuning in only a few minutes.

Reproduction: bert_language_understanding

Pre-training of Deep Bidirectional Transformers for Language Understanding

Reproduction: BERT-keras

Keras implementation of BERT(Bidirectional Encoder Representations from Transformers)

Reproduction: pytorch-pretrained-BERT

PyTorch version of Google AI's BERT model with script to load Google's pre-trained models.
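As a quick orientation, a typical usage pattern with this package looks roughly like the following; class and method names are taken from the project's README of that era, so check the current documentation (the library later evolved into pytorch-transformers / transformers):

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel  # pip install pytorch-pretrained-bert

# Load the pre-trained tokenizer and weights (downloaded and cached on first use)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

text = "[CLS] BERT is a bidirectional Transformer encoder [SEP]"
tokens = tokenizer.tokenize(text)
input_ids = tokenizer.convert_tokens_to_ids(tokens)
segment_ids = [0] * len(tokens)          # single sentence -> all segment A

tokens_tensor = torch.tensor([input_ids])
segments_tensor = torch.tensor([segment_ids])

with torch.no_grad():
    # encoded_layers: one (batch, seq_len, hidden) tensor per encoder layer
    encoded_layers, pooled_output = model(tokens_tensor, segments_tensor)

print(len(encoded_layers), encoded_layers[-1].shape)   # 12 layers for bert-base
```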

BERT's Evaluation Dataset: GLUE

GLUE comes from the paper GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding.

Abstract

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems.
