[Preliminary Research] Search Engine Basics: inverted index (倒排索引)

Posted by aganlengzi on 2016-11-11

Notes and reflections on the fundamentals
http://blog.csdn.net/aganlengzi/article/details/53130790

inverted index: In computer science, an inverted index (also referred to as postings file or inverted file) is an index data structure storing a mapping from content, such as words or numbers, to its locations in a database file, or in a document or a set of documents (named in contrast to a forward index, which maps from documents to content). [1]

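倒排索引 (the usual Chinese term, literally "reverse-ordered index"): the awkward name is probably the translation's fault. 反向索引 ("reverse index") would arguably be a better rendering.
A conventional (forward) index maps documents to keywords: document -> keywords
An inverted index maps keywords to documents: keyword -> documents
The point of inverting the mapping is to find the relevant documents quickly and conveniently from a keyword, which makes it a key foundational technique for search engines.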
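
As a rough illustration of the two mappings above, here is a minimal Python sketch that turns a forward index into an inverted one. The document IDs, the keywords, and the helper names `build_inverted_index` and `search` are all made up for this example; real search engines also store positions, frequencies, and compressed postings lists, which this sketch leaves out.

```python
from collections import defaultdict

# Hypothetical forward index: document ID -> keywords it contains.
forward_index = {
    "doc1": ["search", "engine", "index"],
    "doc2": ["inverted", "index"],
    "doc3": ["search", "index"],
}


def build_inverted_index(forward):
    """Invert a document -> keywords mapping into keyword -> documents."""
    inverted = defaultdict(set)
    for doc_id, keywords in forward.items():
        for keyword in keywords:
            inverted[keyword].add(doc_id)
    # Sort the document IDs so each postings list has a stable order.
    return {keyword: sorted(doc_ids) for keyword, doc_ids in inverted.items()}


def search(index, keyword):
    """Return the documents containing the keyword (empty list if none)."""
    return index.get(keyword, [])


inverted_index = build_inverted_index(forward_index)
print(search(inverted_index, "search"))
```

With the forward index, answering "which documents contain this keyword?" means scanning every document; with the inverted index it is a single dictionary lookup.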

For the detailed principles behind the inverted index, the references below explain it fairly clearly.

[1] https://en.wikipedia.org/wiki/Inverted_index
[2] http://blog.csdn.net/malefactor/article/details/7256305
[3] https://www.zhihu.com/question/23202010/answer/23928943
