NLP1: Setting Up a Python Natural Language Processing Environment

Posted by weixin_34162629 on 2017-09-21

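I have recently started looking into natural language processing and want to study it properly, so I will work through the book "Natural Language Processing with Python" and write up my notes as I go.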

Installation

macOS comes with Python 2.7 preinstalled, so nltk can be installed directly.

Running sudo pip install -U nltk as-is fails with an error:

Collecting nltk
  Downloading nltk-3.2.4.tar.gz (1.2MB)
    100% |████████████████████████████████| 1.2MB 555kB/s 
Collecting six (from nltk)
  Downloading six-1.11.0-py2.py3-none-any.whl
Installing collected packages: six, nltk
  Found existing installation: six 1.4.1
    DEPRECATION: Uninstalling a distutils installed project (six) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling six-1.4.1:

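This happens because the system already ships its own six package, which pip cannot modify. The workaround is to have pip ignore the installed six instead of trying to uninstall it, and then install nltk: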

sudo pip install -U nltk --ignore-installed six

This time the output looks like this:

Collecting nltk
  Downloading nltk-3.2.4.tar.gz (1.2MB)
    100% |████████████████████████████████| 1.2MB 552kB/s 
Collecting six
  Downloading six-1.11.0-py2.py3-none-any.whl
Installing collected packages: six, nltk
  Running setup.py install for nltk ... done

To test it:

xingoodeMacBook-Pro:~ xingoo$ python
Python 2.7.10 (default, Feb  7 2017, 00:08:15) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk

The import raises no error, which means the installation succeeded.
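The same check can be scripted instead of typed into the interpreter. The following is only a minimal sketch: check_package is a hypothetical helper name (not part of nltk or pip), and it assumes that the packages of interest expose a __version__ attribute, falling back to "unknown" when they do not.

```python
import importlib


def check_package(name):
    """Return the package's version string, or None if it cannot be imported.

    Hypothetical helper for illustration; not part of nltk.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every package defines __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # six is listed as well because it was the package that blocked the install.
    for name in ("six", "nltk"):
        version = check_package(name)
        if version is None:
            print("{}: not installed".format(name))
        else:
            print("{}: {}".format(name, version))
```

A result of None for nltk would mean that the pip install did not land in the interpreter that is running the script.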

Downloading the datasets

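Next, download the datasets. Running nltk.download() opens a download dialog; click the download button there, and the corpora that nltk provides become available.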

[Screenshot: the nltk.download() dialog]
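When no graphical dialog is available (for example over SSH), nltk.download() can also be given a package identifier directly. The sketch below rests on a few assumptions: download_corpora is a hypothetical wrapper name, the downloader parameter exists only so the function can be tried without network access, and "book" is assumed to be the identifier of the collection that accompanies the book.

```python
def download_corpora(package_ids, download_dir=None, downloader=None):
    """Download NLTK data packages without opening the GUI dialog.

    Hypothetical wrapper for illustration. Returns a dict that maps each
    package id to the value returned by the downloader (True on success).
    """
    if downloader is None:
        # Imported lazily so that defining this function does not require nltk.
        import nltk
        downloader = nltk.download

    results = {}
    for package_id in package_ids:
        # quiet=True suppresses the per-file progress messages.
        results[package_id] = downloader(
            package_id, download_dir=download_dir, quiet=True
        )
    return results


if __name__ == "__main__":
    # Assumption: "book" is the collection used by the book's examples.
    print(download_corpora(["book"]))
```

Leaving download_dir as None lets nltk pick its default data directory, which is usually what you want on a single-user machine.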

References

"Natural Language Processing with Python" (《Python自然語言處理》)
