The Stanford Natural Language Inference (SNLI) and Multi-Genre NLI Corpus (MultiNLI) datasets

Posted by CopperDong on 2018-03-12

Original URL: https://blog.csdn.net/qfire/article/details/79529844

https://nlp.stanford.edu/projects/snli/
https://www.nyu.edu/projects/bowman/multinli/
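MultiNLI is an upgraded version of SNLI. It uses the same format and is of comparable size, but it covers a wider variety of text and also includes an auxiliary test set for evaluating cross-genre transfer.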

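SNLI 1.0 contains 570,000 human-written English sentence pairs, manually annotated with balanced classification labels: entailment, contradiction, and neutral.
It supports the NLI (natural language inference) task, also known as RTE (recognizing textual entailment).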

For a detailed description, see:
Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP).

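Besides the gold label, each record also contains the judgments of five annotators. In addition, each sentence is given in two parse representations (a binary bracketing and a full constituency parse):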

gold_label sentence1_binary_parse sentence2_binary_parse sentence1_parse sentence2_parse sentence1 sentence2 captionID pairID label1 label2 label3 label4 label5

Example record, one field per line:

gold_label: neutral
sentence1_binary_parse: ( ( ( A person ) ( on ( a horse ) ) ) ( ( jumps ( over ( a ( broken ( down airplane ) ) ) ) ) . ) )
sentence2_binary_parse: ( ( A person ) ( ( is ( ( training ( his horse ) ) ( for ( a competition ) ) ) ) . ) )
sentence1_parse: (ROOT (S (NP (NP (DT A) (NN person)) (PP (IN on) (NP (DT a) (NN horse)))) (VP (VBZ jumps) (PP (IN over) (NP (DT a) (JJ broken) (JJ down) (NN airplane)))) (. .)))
sentence2_parse: (ROOT (S (NP (DT A) (NN person)) (VP (VBZ is) (VP (VBG training) (NP (PRP$ his) (NN horse)) (PP (IN for) (NP (DT a) (NN competition))))) (. .)))
sentence1: A person on a horse jumps over a broken down airplane.
sentence2: A person is training his horse for a competition.
captionID: 3416050480.jpg#4
pairID: 3416050480.jpg#4r1n
label1: neutral
label2 to label5: (empty in this record)
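The record layout above can be read with a few lines of Python. This is a minimal sketch, not an official loader: it assumes the columns are separated by tab characters in the data file, and the helper names `parse_record`, `tokens_from_binary_parse`, and `annotator_labels` are made up for illustration. The sample line is rebuilt from the example record shown above.

```python
# Column names copied from the header line of the record layout.
FIELDS = [
    "gold_label", "sentence1_binary_parse", "sentence2_binary_parse",
    "sentence1_parse", "sentence2_parse", "sentence1", "sentence2",
    "captionID", "pairID", "label1", "label2", "label3", "label4", "label5",
]


def parse_record(line):
    """Split one line into a dict keyed by column name (assumes tab separators)."""
    values = line.rstrip("\n").split("\t")
    # Pad with empty strings in case trailing empty label columns are missing.
    values += [""] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def tokens_from_binary_parse(parse):
    """Recover the token sequence by dropping the bracket symbols."""
    return [tok for tok in parse.split() if tok not in ("(", ")")]


def annotator_labels(record):
    """Return the non-empty individual annotator labels (label1..label5)."""
    return [record[f"label{i}"] for i in range(1, 6) if record[f"label{i}"]]


# Sample line rebuilt from the example record, joined with tabs.
SAMPLE = "\t".join([
    "neutral",
    "( ( ( A person ) ( on ( a horse ) ) ) ( ( jumps ( over ( a ( broken ( down airplane ) ) ) ) ) . ) )",
    "( ( A person ) ( ( is ( ( training ( his horse ) ) ( for ( a competition ) ) ) ) . ) )",
    "(ROOT (S (NP (NP (DT A) (NN person)) (PP (IN on) (NP (DT a) (NN horse)))) (VP (VBZ jumps) (PP (IN over) (NP (DT a) (JJ broken) (JJ down) (NN airplane)))) (. .)))",
    "(ROOT (S (NP (DT A) (NN person)) (VP (VBZ is) (VP (VBG training) (NP (PRP$ his) (NN horse)) (PP (IN for) (NP (DT a) (NN competition))))) (. .)))",
    "A person on a horse jumps over a broken down airplane.",
    "A person is training his horse for a competition.",
    "3416050480.jpg#4",
    "3416050480.jpg#4r1n",
    "neutral",
])

record = parse_record(SAMPLE)
print(record["gold_label"], "|", record["sentence1"], "|", record["sentence2"])
print(tokens_from_binary_parse(record["sentence1_binary_parse"]))
print(annotator_labels(record))
```

Padding the value list keeps the dictionary keys stable for records like this one, where only `label1` is filled in.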
