Competition Background
Competition Introduction
A Yes/No Opinion Polarity Analysis Method Based on Robustness Optimization and Multi-Model Fusion
The task uses three opinion-polarity labels:

Yes: affirmative opinion. An affirmative opinion means that the answer gives a fairly clear affirmative attitude. For questions about objective facts, the judgment is made from the facts themselves; for subjective questions, it is made from the overall attitude of the answer.

No: negative opinion. A negative opinion usually means that the answer gives a fairly clear attitude contrary to the question.

Depends: undetermined / case-dependent. This mainly covers cases where the matter itself involves multiple situations with different corresponding opinions, or where the answer itself expresses uncertainty about the question, so the polarity can only be judged case by case.
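To make the label scheme concrete, the minimal sketch below maps the three labels to class indices the way a softmax classifier would consume them. All identifiers and field names (LABELS, LABEL_TO_ID, the sample dict) are illustrative and not taken from the competition code or dataset schema.

```python
# Minimal sketch of the three-way polarity label scheme.
# All identifiers and field names here are illustrative.
LABELS = ["Yes", "No", "Depends"]
LABEL_TO_ID = {label: i for i, label in enumerate(LABELS)}
ID_TO_LABEL = {i: label for label, i in LABEL_TO_ID.items()}

# Example (question, answer) pair annotated with its polarity label.
sample = {
    "question": "Is regular exercise good for your health?",
    "answer": "Yes, moderate exercise improves cardiovascular fitness.",
    "label": LABEL_TO_ID["Yes"],  # -> 0
}
```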
BERT [7]: uses the Transformer [10] as its main framework, capturing semantic relations more thoroughly. It is pre-trained with a multi-task objective combining the Masked Language Model (MLM) [11] and Next Sentence Prediction (NSP). Compared with earlier pre-trained models, BERT was trained on much larger data with considerably more compute.

RoBERTa [1]: compared with BERT, RoBERTa drops the Next Sentence Prediction (NSP) task and uses larger, more diverse data, with each input sequence drawn contiguously from a single document. For masking, it uses dynamic masking: a new mask pattern is generated every time a sequence is fed to the model. As large amounts of data keep flowing in, the model gradually adapts to the different masking patterns and learns varied language representations.

ERNIE [6]: an optimization built on BERT. Its main improvements are that external knowledge is injected at the pre-training stage through three levels of masking, namely basic-level masking (word piece), phrase-level masking (WWM style), and entity-level masking; the DLM (Dialogue Language Model) task is introduced; and the Chinese ERNIE additionally uses a variety of heterogeneous corpora.
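The main mechanical difference highlighted above between BERT and RoBERTa is static versus dynamic masking. The sketch below illustrates dynamic masking using the standard 15% masking rate and 80/10/10 replacement split from the BERT recipe; the function name dynamic_mask and its details are an illustration under those assumptions, not the competition code.

```python
import random

MASK_TOKEN = "[MASK]"

def dynamic_mask(tokens, mask_prob=0.15, rng=None):
    """Return a freshly masked copy of `tokens` plus the MLM targets.

    Because a new mask pattern is drawn on every call, the same sequence
    receives a different pattern each time it is fed to the model
    (RoBERTa-style dynamic masking), instead of the single fixed pattern
    produced once at preprocessing time (BERT-style static masking).
    """
    rng = rng or random.Random()
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            targets.append(tok)                    # original token is the MLM target
            r = rng.random()
            if r < 0.8:
                masked.append(MASK_TOKEN)          # 80%: replace with [MASK]
            elif r < 0.9:
                masked.append(rng.choice(tokens))  # 10%: random token (a real MLM
                                                   # draws from the full vocabulary)
            else:
                masked.append(tok)                 # 10%: keep the original token
        else:
            masked.append(tok)
            targets.append(None)                   # not a prediction target
    return masked, targets

# The same sentence gets a different mask pattern on every call:
sent = "is regular exercise good for your health".split()
print(dynamic_mask(sent))
print(dynamic_mask(sent))
```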
Experimental Analysis
Conclusion
References

[1] Liu Y, Ott M, Goyal N, et al. RoBERTa: A robustly optimized BERT pretraining approach[J]. arXiv preprint arXiv:1907.11692, 2019.
[2] https://github.com/PaddlePaddle/PALM
[3] He W, Liu K, Liu J, et al. DuReader: A Chinese machine reading comprehension dataset from real-world applications[J]. arXiv preprint arXiv:1711.05073, 2017.
[4] Rajpurkar P, Zhang J, Lopyrev K, et al. SQuAD: 100,000+ questions for machine comprehension of text[J]. arXiv preprint arXiv:1606.05250, 2016.
[5] https://github.com/PaddlePaddle/Paddle
[6] Sun Y, Wang S, Li Y, et al. ERNIE: Enhanced representation through knowledge integration[J]. arXiv preprint arXiv:1904.09223, 2019.
[7] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[8] Miyato T, Dai A M, Goodfellow I. Adversarial training methods for semi-supervised text classification[J]. arXiv preprint arXiv:1605.07725, 2016.
[9] Dietterich T G. Ensemble methods in machine learning[C]//International Workshop on Multiple Classifier Systems. Springer, Berlin, Heidelberg, 2000: 1-15.
[10] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems. 2017: 5998-6008.
[11] Taylor W L. "Cloze procedure": A new tool for measuring readability[J]. Journalism Quarterly, 1953, 30(4): 415-433.
[12] Ross A S, Doshi-Velez F. Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients[C]//Thirty-Second AAAI Conference on Artificial Intelligence. 2018.
[13] Sun Y, Wang S, Li Y, et al. ERNIE 2.0: A continual pre-training framework for language understanding[J]. arXiv preprint arXiv:1907.12412, 2019.
[14] Wei J, Ren X, Li X, et al. NEZHA: Neural contextualized representation for Chinese language understanding[J]. arXiv preprint arXiv:1909.00204, 2019.
[15] Yang Z, Dai Z, Yang Y, et al. XLNet: Generalized autoregressive pretraining for language understanding[C]//Advances in Neural Information Processing Systems. 2019: 5754-5764.
[16] Lan Z, Chen M, Goodman S, et al. ALBERT: A lite BERT for self-supervised learning of language representations[J]. arXiv preprint arXiv:1909.11942, 2019.
[17] Chen T, Guestrin C. XGBoost: A scalable tree boosting system[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016: 785-794.