A prefix tree (trie) with autocompletion, implemented in Python

Posted by hh_good_dog on 2016-12-23

Original URL: https://blog.csdn.net/u010541796/article/details/53836842

1. The code

# Written for Python 3.
import json
import sys

# Reserved keys that hold per-node metadata rather than child nodes.
META_KEYS = ("##cnt", "##terminal", "##wordTag")


class TrieTree:
    def __init__(self, is_debug=1, is_sentence=0):
        self.tree = {}
        self.is_debug = is_debug
        # is_sentence=1: each token is a word; is_sentence=0: each token is a character.
        self.is_sentence = is_sentence
        self.prefix_list = []

    def addFromFile(self, filePath):
        # utf-8 is assumed here because the sample data contains curly apostrophes.
        with open(filePath, encoding="utf-8") as f:
            for line in f:
                line_list = line.strip().strip("#").split("#")
                # The first "#"-separated field is the key; its tokens are split on whitespace.
                main_word = line_list[0].strip().split()
                if not self.is_sentence:
                    sub_word_list = [u.replace(" ", "") for u in line_list]
                else:
                    sub_word_list = line_list

                target_dict = self.tree
                for i, w in enumerate(main_word):
                    if w not in target_dict:
                        target_dict[w] = {"##cnt": 1, "##terminal": [], "##wordTag": 0}
                    else:
                        target_dict[w]["##cnt"] += 1
                    if i == len(main_word) - 1:
                        # The last token of the key stores the completions.
                        target_dict[w]["##terminal"].extend(sub_word_list)
                        target_dict[w]["##wordTag"] = 1
                    target_dict = target_dict[w]  # descend one level
        if self.is_debug:
            # Dump the whole tree to a file for inspection.
            context = json.dumps(self.tree, indent=2, ensure_ascii=False)
            with open("./debug.json", "w", encoding="utf-8") as out:
                out.write(context + "\n")

    def searchPrefix(self, prefix_string):
        self.prefix_list = []
        target_dict = self.tree
        if not self.tree:
            return self.prefix_list
        if self.is_sentence:
            prefix_string = prefix_string.strip().split()
        if not prefix_string:
            # An empty prefix matches nothing (the root has no "##terminal" list).
            return self.prefix_list
        # Walk down the tree along the prefix tokens.
        for w in prefix_string:
            if w not in target_dict:
                return self.prefix_list
            target_dict = target_dict[w]

        def deepSearch(node):
            # Collect the completions stored at this node, then at every descendant.
            self.prefix_list.extend(node["##terminal"])
            for k, child in node.items():
                if k not in META_KEYS:
                    deepSearch(child)

        deepSearch(target_dict)
        return self.prefix_list


if __name__ == "__main__":
    trie = TrieTree(is_debug=1, is_sentence=1)
    trie.addFromFile(sys.argv[1])
    while True:
        raw = input("Please input:")
        print(trie.searchPrefix(raw))
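
To make the node layout easier to picture, here is a minimal sketch that inserts two of the sample sentences into a plain dict, using the same three metadata keys as TrieTree ("##cnt", "##terminal", "##wordTag"). The helper name insert is my own and is not part of the original class; the sketch only mirrors what addFromFile appears to do for one entry.

```python
import json


def insert(tree, tokens, payload):
    """Insert one token sequence, mirroring the node layout used by TrieTree."""
    node = tree
    for i, tok in enumerate(tokens):
        if tok not in node:
            node[tok] = {"##cnt": 1, "##terminal": [], "##wordTag": 0}
        else:
            node[tok]["##cnt"] += 1  # one more entry passes through this node
        if i == len(tokens) - 1:
            node[tok]["##terminal"].append(payload)  # full entry stored at its last token
            node[tok]["##wordTag"] = 1
        node = node[tok]  # descend one level


tree = {}
for sentence in ["I work at a bank.", "I work at a restaurant."]:
    insert(tree, sentence.split(), sentence)

# Pretty-print the nested dict, much like the debug.json dump.
print(json.dumps(tree, indent=2, ensure_ascii=False))
```

The shared path I / work / at / a ends up with "##cnt" equal to 2 on every node, while "bank." and "restaurant." are separate leaves, each holding its own sentence in "##terminal".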

2. Test data: paste the following English sentences into a file named sent.d.

Hi, my name is Steve.#
It’s nice to meet you.#
It’s a pleasure to meet you I’m Jack.#
What do you do for a living.#
I work at a restaurant.#
I work at a bank.#
I work in a software company.#
I’m a dentist.#
What is your name.#
What was that again.#
Excuse me.#
Pardon me.#
Are you ready?#
Are you free now?#
Are you Mr. Murthy?#
Are you angry with me?#
Are you afraid of them?#
Are you tired?#
Are you married?#
Are you employed?#
Are you interested in that?#
Are you awake?#
Are you aware of that?#
Are you a relative of Mr. Mohan?#
Are you not well?#
Are they your relatives?#
Are they from abroad?#
Are the shops open?#
Are you satisfied now?#
Are you joking?#
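
The trailing "#" on each line is a field separator. As far as I can tell from addFromFile, the first field is the key that gets tokenized, and every field is stored as a completion. The sketch below reproduces just that parsing step; parse_line is a hypothetical helper name, and the second input line (a character-level entry with extra completions) is an invented example of how the word mode seems to be intended, not data from the original post.

```python
def parse_line(line, is_sentence):
    """Split one data line into (key tokens, completions), as addFromFile does."""
    fields = line.strip().strip("#").split("#")
    tokens = fields[0].strip().split()
    if is_sentence:
        completions = fields
    else:
        # Word mode: keys are written with spaces between characters,
        # so spaces are removed from the stored completions.
        completions = [u.replace(" ", "") for u in fields]
    return tokens, completions


# A line from sent.d, parsed in sentence mode.
print(parse_line("I work at a bank.#\n", is_sentence=1))

# Hypothetical word-mode line: key "a p p" with two extra completions.
print(parse_line("a p p#apple#application#\n", is_sentence=0))
```

In sentence mode the key tokens are whole words with punctuation still attached ("bank." rather than "bank"), which matters when typing a prefix.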

3. Running the test
Run the following in a Linux shell:
python trieTree.py sent.d
You can then type a prefix made up of complete words to query.

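** You may be wondering about a limitation here: this algorithm can only search by prefix.
With the examples from section 2, entering Are returns only sentences that begin with Are, and entering Are you returns only sentences that begin with Are you. What if I want every sentence that contains the word shops? This is where the "suffix tree" comes in. Despite the name, it is not really a separate structure: all suffix units of every sentence are simply pushed into a single prefix tree. For example, take
Are you a lucky dog?
The word-level suffixes of this sentence are
Are you a lucky dog?
you a lucky dog?
a lucky dog?
lucky dog?
dog?
Once every suffix of every sentence has been pushed into the prefix tree, with each suffix pointing back to its full sentence, finding all sentences that contain a given word becomes an ordinary prefix query.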
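
The suffix idea can be sketched as follows. This is a minimal illustration under my own assumptions, not the author's code: the function names insert, add_all_suffixes and search are hypothetical, each suffix stores the full sentence as its completion, and duplicates are filtered so a sentence is reported once.

```python
META_KEYS = ("##cnt", "##terminal", "##wordTag")


def insert(tree, tokens, payload):
    """Insert a token sequence; the last node stores payload as a completion."""
    node = tree
    for i, tok in enumerate(tokens):
        child = node.setdefault(tok, {"##cnt": 0, "##terminal": [], "##wordTag": 0})
        child["##cnt"] += 1
        if i == len(tokens) - 1:
            if payload not in child["##terminal"]:
                child["##terminal"].append(payload)
            child["##wordTag"] = 1
        node = child


def add_all_suffixes(tree, sentence):
    """Insert every word-level suffix, each pointing back to the full sentence."""
    tokens = sentence.split()
    for start in range(len(tokens)):
        insert(tree, tokens[start:], sentence)


def search(tree, query):
    """Return the sentences in which the query words appear consecutively."""
    tokens = query.split()
    if not tokens:
        return []
    node = tree
    for tok in tokens:
        if tok not in node:
            return []
        node = node[tok]
    found = []

    def collect(n):
        for s in n["##terminal"]:
            if s not in found:
                found.append(s)
        for key, child in n.items():
            if key not in META_KEYS:
                collect(child)

    collect(node)
    return found


tree = {}
for s in ["Are the shops open?", "Are you free now?", "Are they from abroad?"]:
    add_all_suffixes(tree, s)

print(search(tree, "shops"))      # ['Are the shops open?']
print(search(tree, "they from"))  # ['Are they from abroad?']
```

Two trade-offs are worth noting. A sentence of n words adds n suffixes, so the tree grows roughly quadratically with sentence length. Tokens are split on whitespace only, so punctuation stays attached: searching for open finds nothing here, because the stored token is "open?".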
