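1. Introduction
Hunspell is a spell checker designed for languages with rich morphology and complex word compounding. It was originally written for Hungarian.
Hunspell is free software, tri-licensed under the GPL, LGPL and MPL.
Hunspell has interfaces and wrappers for the major platforms and programming languages. It is based on MySpell and is backward compatible with MySpell dictionaries. MySpell uses single-byte character encodings, whereas Hunspell can also use dictionaries encoded in Unicode UTF-8.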
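To make the encoding point concrete, here is a minimal sketch of a UTF-8 dictionary pair. The helper name `make_demo_dict`, the file names `demo.aff` and `demo.dic`, the flag `S` and the three words are all illustrative assumptions and do not come from any shipped dictionary.

```shell
# Hypothetical helper: writes a tiny Hunspell dictionary pair into directory $1.
make_demo_dict() {
  dir="$1"
  mkdir -p "$dir"
  # Affix file: declares the encoding and one suffix rule (flag S appends "s").
  cat > "$dir/demo.aff" <<'EOF'
SET UTF-8
SFX S Y 1
SFX S 0 s .
EOF
  # Word list: the first line is the approximate number of entries.
  cat > "$dir/demo.dic" <<'EOF'
3
naïve
café/S
word/S
EOF
}
```

After `make_demo_dict ./dicts`, a locally installed Hunspell started with `hunspell -d ./dicts/demo` should accept "cafés" and "words", because both entries carry the `S` flag.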
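2. The following applications use Hunspell as their spell checker:
Mac OS X 10.6 and later
Eclipse, via Hunspell4Eclipse
Google Chrome, a web browser developed by Google
Evernote, a note-taking application
LibreOffice and OpenOffice.org, open-source office suites
Mozilla Firefox, Thunderbird and SeaMonkey
Opera, a cross-platform web browser
Scribus, a desktop publishing application
Vim, a text editor
WPS Office, an office suite developed in China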
3. Testing Hunspell's features with a Docker image:
3.1 List the available dictionaries
[root@host-10-0-251-159 hunspell]# docker run --rm tmaier/hunspell -D
SEARCH PATH:
.::/usr/share/hunspell:/usr/share/myspell:/usr/share/myspell/dicts:/Library/Spelling:/root/.openoffice.org/3/user/wordbook:/root/.openoffice.org2/user/wordbook:/root/.openoffice.org2.0/user/w/lib/openoffice.org/basis3.0/share/dict/ooo:/opt/openoffice.org2.4/share/dict/ooo:/usr/lib/openoffice.org2.4/share/dict/ooo:/opt/openoffice.org2.3/share/dict/ooo:/usr/lib/openoffice.org2.3/shhare/dict/ooo:/opt/openoffice.org2.1/share/dict/ooo:/usr/lib/openoffice.org2.1/share/dict/ooo:/opt/openoffice.org2.0/share/dict/ooo:/usr/lib/openoffice.org2.0/share/dict/ooo
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
/usr/share/hunspell/en_CA
/usr/share/hunspell/de_DE_comb
/usr/share/hunspell/en_ZA
/usr/share/hunspell/en_US
/usr/share/hunspell/en_GB
/usr/share/hunspell/en_AU
/usr/share/hunspell/de_CH
/usr/share/hunspell/de_DE_neu
/usr/share/hunspell/en_NZ
/usr/share/hunspell/de_AT
/usr/share/hunspell/default
LOADED DICTIONARY:
/usr/share/hunspell/default.aff
/usr/share/hunspell/default.dic
Hunspell 1.6.2
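The names accepted by the -d option are the last path component of each entry under AVAILABLE DICTIONARIES. The following is a small sketch for extracting them. The helper name `list_dict_names` is hypothetical, and the sketch assumes the report has one entry per line, bracketed by the `AVAILABLE DICTIONARIES` and `LOADED DICTIONARY` headers as printed by Hunspell 1.6.2.

```shell
# Hypothetical helper: reads `hunspell -D` output on stdin and prints one
# dictionary name per line (the part after the last "/"), sorted.
list_dict_names() {
  awk '
    /^AVAILABLE DICTIONARIES/ { in_list = 1; next }   # entries start after this header
    /^LOADED DICTIONARY/      { in_list = 0 }         # entries end at this header
    in_list && /^\// { n = split($0, parts, "/"); print parts[n] }
  ' | sort
}

# Example call. The report may be written to stderr rather than stdout,
# so 2>&1 is used to capture both streams:
# docker run --rm tmaier/hunspell -D 2>&1 | list_dict_names
```

Any of the printed names, for example `en_US` or `de_CH`, can then be passed to -d, alone or as a comma-separated list.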
3.2 View the help text
[root@host-10-0-251-159 hunspell]# docker run --rm -v $(pwd):/workdir tmaier/hunspell -u3 -i utf-8 -d de_DE_neu,en_US,de_CH -p words -h
Usage: hunspell [OPTION]... [FILE]...
Check spelling of each FILE. Without FILE, check standard input.

  -1                  check only first field in lines (delimiter = tabulator)
  -a                  Ispell's pipe interface
  --check-url         check URLs, e-mail addresses and directory paths
  --check-apostrophe  check Unicode typographic apostrophe
  -d d[,d2,...]       use d (d2 etc.) dictionaries
  -D                  show available dictionaries
  -G                  print only correct words or lines
  -h, --help          display this help and exit
  -H                  HTML input file format
  -i enc              input encoding
  -l                  print misspelled words (note: prints only the misspelled words)
  -L                  print lines with misspelled words (note: prints the lines that contain them)
  -m                  analyze the words of the input text
  -n                  nroff/troff input file format
  -O                  OpenDocument (ODF or Flat ODF) input file format
  -p dict             set dict custom dictionary
  -r                  warn of the potential mistakes (rare words)
  -P password         set password for encrypted dictionaries
  -s                  stem the words of the input text
  -S                  suffix words of the input text
  -t                  TeX/LaTeX input file format
  -v, --version       print version number
  -vv                 print Ispell compatible version number
  -w                  print misspelled words (= lines) from one word/line input.
  -X                  XML input file format

Example: hunspell -d en_US file.txt    # interactive spelling
         hunspell -i utf-8 file.txt    # check UTF-8 encoded file
         hunspell -l *.odt             # print misspelled words of ODF files

         # Quick fix of ODF documents by personal dictionary creation

         # 1 Make a reduced list from misspelled and unknown words:

         hunspell -l *.odt | sort | uniq >words

         # 2 Delete misspelled words of the file by a text editor.
         # 3 Use this personal dictionary to fix the deleted words:

         hunspell -p words *.odt

Bug reports: http://hunspell.github.io/
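3.3 Check the spelling of a document (the output shows the line number of each misspelled word and a suggested correction). Source text: test1.TXT (link: https://pan.baidu.com/s/17JRmtnebLblVsMG05CIm-w password: l3q9)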
[root@host-10-0-251-159 hunspell]# docker run --rm -v $(pwd):/workdir tmaier/hunspell -u3 -i utf-8 -d de_DE_neu,en_US,de_CH -p words test1.TXT
test1.TXT:7: Locate: rans | Try: rand
test1.TXT:15: Locate: wew | Try: woo
test1.TXT:23: Locate: Sevenn | Try: Severn
test1.TXT:27: Locate: cannt | Try: canny
test1.TXT:203: Locate: Hmm | Try: Mm
test1.TXT:211: Locate: Lele | Try: Lee
test1.TXT:215: Locate: Lele | Try: Lee
test1.TXT:243: Locate: Lele | Try: Lee
test1.TXT:247: Locate: Lele | Try: Lee
test1.TXT:284: Locate: Hmm | Try: Mm
test1.TXT:292: Locate: Hmm | Try: Mm
test1.TXT:468: Locate: ve | Try: be
test1.TXT:500: Locate: ve | Try: be
test1.TXT:516: Locate: ve | Try: be
test1.TXT:564: Locate: Hmm | Try: Mm
test1.TXT:644: Locate: ve | Try: be
test1.TXT:776: Locate: hasn | Try: has
test1.TXT:921: Locate: isn | Try: sin
test1.TXT:945: Locate: ve | Try: be
test1.TXT:953: Locate: ve | Try: be
test1.TXT:989: Locate: Hmm | Try: Mm
test1.TXT:1005: Locate: Hmm | Try: Mm
test1.TXT:1085: Locate: wasn | Try: wans
test1.TXT:1129: Locate: isn | Try: sin
test1.TXT:1145: Locate: isn | Try: sin
test1.TXT:1173: Locate: vomeronasal | Try: astronomer
test1.TXT:1213: Locate: didn | Try: did
test1.TXT:1289: Locate: ve | Try: be
test1.TXT:1329: Locate: weren | Try: were
test1.TXT:1349: Locate: wasn | Try: wans
test1.TXT:1425: Locate: wouldn | Try: would
test1.TXT:1425: Locate: weren | Try: were
test1.TXT:1470: Locate: ve | Try: be
test1.TXT:1495: Locate: ve | Try: be
test1.TXT:1803: Locate: cefepime | Try: timepiece
test1.TXT:1807: Locate: amikacin | Try: Kamikaze
test1.TXT:1819: Locate: Mmm | Try: Mm
test1.TXT:1839: Locate: kuai | Try: Kauai
test1.TXT:1895: Locate: ve | Try: be
test1.TXT:1903: Locate: isn | Try: sin
test1.TXT:2012: Locate: ve | Try: be
test1.TXT:2096: Locate: aren | Try: earn
test1.TXT:2116: Locate: shouldn | Try: should
test1.TXT:2168: Locate: whould | Try: would
test1.TXT:2232: Locate: Hmm | Try: Mm
test1.TXT:2800: Locate: Hmm | Try: Mm
test1.TXT:2820: Locate: Hmm | Try: Mm
test1.TXT:2930: Locate: ve | Try: be
test1.TXT:2993: Locate: Hmm | Try: Mm
test1.TXT:2997: Locate: Hmm | Try: Mm
test1.TXT:3076: Locate: Uhh | Try: Shh
test1.TXT:3331: Locate: Chh | Try: Ch
test1.TXT:3376: Locate: Hmm | Try: Mm
test1.TXT:3412: Locate: isn | Try: sin
test1.TXT:3436: Locate: ve | Try: be
test1.TXT:3448: Locate: exfoliator | Try: defoliator
test1.TXT:3518: Locate: didn | Try: did
test1.TXT:3531: Locate: didn | Try: did
test1.TXT:3652: Locate: Hmm | Try: Mm
test1.TXT:3696: Locate: ve | Try: be
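
Many entries in this report repeat the same word ("Hmm", "ve", "isn"), so a per-word tally is a handy way to decide which words belong in the personal dictionary passed with -p. The following is a minimal sketch. The helper name `summarize_u3` is hypothetical, and it assumes one `file:line: Locate: word | Try: suggestion` entry per line and a file name without spaces.

```shell
# Hypothetical helper: reads `hunspell -u3` output on stdin and prints
# "<count> <word>" for every flagged word, most frequent first.
summarize_u3() {
  # With whitespace splitting, field 2 is "Locate:" and field 3 is the word.
  awk '$2 == "Locate:" { count[$3]++ }
       END { for (w in count) print count[w], w }' |
    sort -k1,1nr -k2,2   # by count (descending), then by word
}

# Example call, reusing the command from this section:
# docker run --rm -v $(pwd):/workdir tmaier/hunspell -u3 -i utf-8 \
#   -d de_DE_neu,en_US,de_CH -p words test1.TXT | summarize_u3
```

Words that are correct but unknown to the dictionaries, such as the drug names "cefepime" and "amikacin", can then be added to the `words` file so that later runs no longer report them.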