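Following on from the previous article, "Machine Learning: nltk download error: Error connecting to server: [Errno -2]", this post covers how to install the NLTK test packages and what to watch out for along the way.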
>>> import nltk
>>> nltk.download()
NLTK Downloader
---------------------------------------------------------------------------
d) Download l) List c) Config h) Help q) Quit
---------------------------------------------------------------------------
Downloader> d
Download which package (l=list; x=cancel)?
Identifier>
---------------------------------------------------------------------------
d) Download l) List c) Config h) Help q) Quit
---------------------------------------------------------------------------
Note: at this step, choose l (list) rather than d.
Downloader> l
Packages:
[ ] brown_tei........... Brown Corpus (TEI XML Version)
[ ] punkt............... Punkt Tokenizer Models
[ ] maxent_treebank_pos_tagger Treebank Part of Speech Tagger (Maximum entropy)
[ ] machado............. Machado de Assis -- Obra Completa
[ ] movie_reviews....... Sentiment Polarity Dataset Version 2.0
[ ] names............... Names Corpus, Version 1.3 (1994-03-29)
[ ] nombank.1.0......... NomBank Corpus 1.0
[ ] nps_chat............ NPS Chat
[ ] paradigms........... Paradigm Corpus
[ ] pe08................ Cross-Framework and Cross-Domain Parser
Evaluation Shared Task
[ ] pil................. The Patient Information Leaflet (PIL) Corpus
[ ] pl196x.............. Polish language of the XX century sixties
[ ] ppattach............ Prepositional Phrase Attachment Corpus
[ ] problem_reports..... Problem Report Corpus
[ ] propbank............ Proposition Bank Corpus 1.0
[ ] qc.................. Experimental Data for Question Classification
[ ] reuters............. The Reuters-21578 benchmark corpus, ApteMod
version
[ ] rte................. PASCAL RTE Challenges 1, 2, and 3
Hit Enter to continue:
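Look through the full list and find the package you need. Then, instead of typing its identifier at the interactive prompt as the downloader suggests, pass the identifier to nltk.download() directly: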
>>> nltk.download('brown_tei')
Note: this call may fail with the error <urlopen error [Errno -2] Name or service not known>. If that happens, try running the downloader module from the command line instead:
python -m nltk.downloader spanish_grammars
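
For a scripted setup, the two approaches above can be wrapped in a small helper. What follows is only a minimal sketch and not part of NLTK: download_packages and downloader_command are hypothetical names, and the download parameter exists purely so the helper can be tried out with a stand-in function when no network is available. The sketch assumes that nltk.download() returns a false value when a package cannot be fetched, and that connection problems, if they are raised at all, surface as OSError (urllib's URLError is a subclass of it).

```python
import sys


def download_packages(identifiers, download=None):
    """Try to download each package identifier; return the ones that failed.

    `download` is a hypothetical hook: any callable that takes an identifier
    and returns True on success. It defaults to nltk.download.
    """
    if download is None:
        import nltk  # imported lazily so a stand-in can be used without NLTK
        download = nltk.download

    failed = []
    for identifier in identifiers:
        try:
            ok = download(identifier)
        except OSError:
            # Assumption: name-resolution and connection errors arrive here.
            ok = False
        if not ok:
            failed.append(identifier)
    return failed


def downloader_command(identifiers):
    """Build the equivalent `python -m nltk.downloader ...` argument list."""
    return [sys.executable, "-m", "nltk.downloader", *identifiers]
```

Calling download_packages(['brown_tei', 'punkt']) returns the list of identifiers that could not be fetched. Passing that list to downloader_command() gives the command-line form to retry with, for example through subprocess.run().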