Python Natural Language Processing Study Notes (57): Summary

Posted by weixin_34293059 on 2011-09-05

6.8   Summary

  • Modeling the linguistic data found in corpora can help us to understand linguistic patterns, and can be used to make predictions about new language data.


  • Supervised classifiers use labeled training corpora to build models that predict the label of an input based on specific features of that input.

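A minimal sketch of this idea in plain Python (not NLTK's actual API): a feature extractor maps each input to a feature dictionary, and a trivially simple "most frequent label per feature value" rule stands in for a real classifier. The names, labels, and the gender-by-last-letter feature are illustrative assumptions, echoing the book's running example.

```python
from collections import Counter, defaultdict

def gender_features(name):
    # Map an input to a dictionary of features; here, just the last letter.
    return {"last_letter": name[-1].lower()}

# A tiny hand-labeled training corpus, invented for illustration.
train = [("Alice", "female"), ("Julia", "female"), ("Nora", "female"),
         ("John", "male"), ("Kevin", "male"), ("Aaron", "male")]

# "Training": record which label is most frequent for each feature value.
counts = defaultdict(Counter)
for name, label in train:
    counts[gender_features(name)["last_letter"]][label] += 1

def classify(name):
    letter = gender_features(name)["last_letter"]
    if letter in counts:
        return counts[letter].most_common(1)[0][0]
    return "male"  # arbitrary default for unseen feature values

print(classify("Maria"))  # names ending in 'a' were labeled "female" above
```

A real supervised classifier would weigh all features probabilistically, but the pipeline shape (extract features, train on labeled data, predict labels for new inputs) is the same.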

  • Supervised classifiers can perform a wide variety of NLP tasks, including document classification, part-of-speech tagging, sentence segmentation, dialogue act type identification, and determining entailment relations, and many other tasks.


  • When training a supervised classifier, you should split your corpus into three datasets: a training set for building the classifier model; a dev-test set for helping select and tune the model's features; and a test set for evaluating the final model's performance.

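The three-way split can be sketched as below; the corpus here is synthetic, and the 10/10/80 proportions are an assumption for illustration, not a prescription.

```python
import random

# A synthetic labeled corpus: (document, label) pairs.
corpus = [(f"doc{i}", "pos" if i % 2 == 0 else "neg") for i in range(100)]
random.Random(0).shuffle(corpus)  # shuffle so each split is representative

test_set = corpus[:10]       # held out for the FINAL evaluation only
devtest_set = corpus[10:20]  # for error analysis and feature tuning
train_set = corpus[20:]      # for fitting the classifier model

print(len(train_set), len(devtest_set), len(test_set))  # 80 10 10
```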

  • When evaluating a supervised classifier, it is important that you use fresh data, that was not included in the training or dev-test set. Otherwise, your evaluation results may be unrealistically optimistic.

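Evaluation on held-out data usually reduces to an accuracy computation like the following sketch; the one-rule classifier and the four test items are invented purely to make the example runnable.

```python
def accuracy(classifier, gold):
    """Fraction of gold-labeled items whose predicted label matches."""
    correct = sum(1 for item, label in gold if classifier(item) == label)
    return correct / len(gold)

# Hypothetical classifier and fresh test items, for illustration only.
classifier = lambda text: "pos" if "good" in text else "neg"
test_set = [("good movie", "pos"), ("bad movie", "neg"),
            ("good plot", "pos"), ("dull", "pos")]

print(accuracy(classifier, test_set))  # 0.75
```

The key point from the bullet is what `gold` contains: items the classifier never saw during training or tuning, so the score is not inflated by memorization.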

  • Decision trees are automatically constructed tree-structured flowcharts that are used to assign labels to input values based on their features. Although they're easy to interpret, they are not very good at handling cases where feature values interact in determining the proper label.

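The weakness with interacting features can be shown with a tiny invented XOR-style dataset: the label depends on features `a` and `b` jointly, so a one-level decision tree (a decision stump) splitting on either single feature cannot beat chance.

```python
from collections import Counter

# XOR-style data: label = a XOR b, so no single feature predicts it.
data = [({"a": 0, "b": 0}, 0), ({"a": 0, "b": 1}, 1),
        ({"a": 1, "b": 0}, 1), ({"a": 1, "b": 1}, 0)]

def stump_accuracy(feature):
    # Split on one feature; predict the majority label within each branch.
    branches = {}
    for feats, label in data:
        branches.setdefault(feats[feature], []).append(label)
    correct = 0
    for feats, label in data:
        majority = Counter(branches[feats[feature]]).most_common(1)[0][0]
        correct += (majority == label)
    return correct / len(data)

print(stump_accuracy("a"), stump_accuracy("b"))  # 0.5 0.5 -- chance level
```

A deeper tree can eventually capture the interaction by splitting on `a` and then `b`, but each split sees the features one at a time, which is the limitation the bullet describes.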

  • In naive Bayes classifiers, each feature independently contributes to the decision of which label should be used. This allows feature values to interact, but can be problematic when two or more features are highly correlated with one another.

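A from-scratch sketch of the naive Bayes decision rule (not NLTK's `NaiveBayesClassifier`): each feature contributes its own log-probability term independently, added to a log-prior. The tiny sentiment corpus and the 0.5 smoothing constant are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(data, smoothing=0.5):
    label_counts = Counter(label for _, label in data)
    feat_counts = defaultdict(Counter)  # label -> counts of (feature, value)
    for feats, label in data:
        for fv in feats.items():
            feat_counts[label][fv] += 1
    return label_counts, feat_counts, smoothing

def classify_nb(model, feats):
    label_counts, feat_counts, s = model
    total = sum(label_counts.values())
    best, best_score = None, float("-inf")
    for label, n in label_counts.items():
        score = math.log(n / total)  # log-prior for the label
        for fv in feats.items():
            # Each feature's smoothed log-likelihood is ADDED independently.
            score += math.log((feat_counts[label][fv] + s) / (n + 2 * s))
        if score > best_score:
            best, best_score = label, score
    return best

train = [({"good": True, "fun": True}, "pos"), ({"good": True}, "pos"),
         ({"bad": True}, "neg"), ({"bad": True, "boring": True}, "neg")]
model = train_nb(train)
print(classify_nb(model, {"good": True}))  # pos
```

The independence is visible in the single `score +=` line: if two features are near-duplicates of each other, their evidence is double-counted, which is exactly the correlated-features problem the bullet mentions.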

  • Maximum Entropy classifiers use a basic model that is similar to the model used by naive Bayes; however, they employ iterative optimization to find the set of feature weights that maximizes the probability of the training set.

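For the binary case, a maximum entropy model is equivalent to logistic regression, and the iterative optimization can be sketched as gradient ascent on the training set's log-likelihood. The data, learning rate, and iteration count below are arbitrary assumptions; real implementations use more sophisticated optimizers.

```python
import math

# Toy separable data: feature vector x, label y in {0, 1}. The label
# happens to equal the first feature, so the problem is learnable.
data = [([1.0, 0.0], 1), ([1.0, 1.0], 1), ([0.0, 1.0], 0), ([0.0, 0.0], 0)]

w = [0.0, 0.0]
bias = 0.0
lr = 0.5
for _ in range(200):  # iterative optimization: stochastic gradient ascent
    for x, y in data:
        z = sum(wi * xi for wi, xi in zip(w, x)) + bias
        p = 1.0 / (1.0 + math.exp(-z))  # model's probability of label 1
        for i in range(len(w)):
            w[i] += lr * (y - p) * x[i]  # gradient of the log-likelihood
        bias += lr * (y - p)

def predict(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1 if z > 0 else 0

print([predict(x) for x, _ in data])  # [1, 1, 0, 0]
```

Unlike naive Bayes, the weights here are fit jointly, so correlated features end up sharing their evidence rather than double-counting it.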

  • Most of the models that are automatically constructed from a corpus are descriptive — they let us know which features are relevant to a given pattern or construction, but they don't give any information about causal relationships between those features and patterns.

