coreseek, PHP, MySQL full-text search deployment (Part 1)

Posted by 技術小胖子 on 2017-11-09

Installation environment:

Ubuntu 10.04.4      64-bit

mysql                    Ver 14.14 Distrib 5.1.69

php                        PHP 5.2.6 (cli)


Packages that must be installed before building coreseek:

apt-get install make gcc g++ automake libtool mysql-client libmysqlclient15-dev   libxml2-dev libexpat1-dev
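Before starting the build, it can be worth confirming that the compiler toolchain is actually on PATH. The following is a minimal sketch, not part of the original procedure: the function name missing_tools is hypothetical, and it assumes that each package provides a command of the same name, so only make, gcc, g++ and automake are checked (the development libraries are not).

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# (missing_tools is a hypothetical helper, not something coreseek ships.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the build tools named in the apt-get line above.
missing=$(missing_tools make gcc g++ automake)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi
```

If anything is reported missing, rerun the apt-get line above before continuing.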


Source package required to install coreseek:

wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz


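Directory layout after installation:

api: API interfaces and test scripts

etc: configuration files

etc/pysource: Python data source scripts

var: runtime data

var/data: index files

var/log: search logs

var/test: source data for testing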


1. Install mmseg   (provides Chinese word segmentation)

root@CC-57:~# tar zxvf coreseek-3.2.14.tar.gz

root@CC-57:~# cd coreseek-3.2.14/mmseg-3.2.14/

root@CC-57:~/coreseek-3.2.14/mmseg-3.2.14# ./bootstrap         #warnings in the output can be ignored; any error must be resolved


root@CC-57:~/coreseek-3.2.14/mmseg-3.2.14# ./configure --prefix=/usr/local/mmseg3

root@CC-57:~/coreseek-3.2.14/mmseg-3.2.14# make && make install


2. Install coreseek   with support for MySQL and XML data sources

root@CC-57:~/coreseek-3.2.14/mmseg-3.2.14# cd ../csft-3.2.14/

root@CC-57:~/coreseek-3.2.14/csft-3.2.14# sh buildconf.sh      #warnings in the output can be ignored; any error must be resolved

root@CC-57:~/coreseek-3.2.14/csft-3.2.14# ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql=/var/www/dream/mysql/

root@CC-57:~/coreseek-3.2.14/csft-3.2.14# make && make install


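Possible problems:


On some systems the build may fail with: expected `;` before ‘CSphTokenizer_UTF8SpaceSeg’,

or configure may print: configure: WARNING: unrecognized options: --with-mmseg, --with-mmseg-includes, --with-mmseg-libs

Both happen when sh buildconf.sh has not been run.

That script generates the build configuration files for the current system,

so run it before configure:

$ sh buildconf.sh

On Linux, if you hit a pthread problem, run the following two commands before running configure:

$ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

$ export LD_LIBRARY_PATH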
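The LD_LIBRARY_PATH commands above append /usr/local/lib unconditionally, which leaves a leading colon when the variable was empty and adds a duplicate entry each time they are repeated. A minimal sketch of a more careful variant follows; the function name append_ld_path is hypothetical and is not part of coreseek.

```shell
#!/bin/sh
# Append a directory to LD_LIBRARY_PATH only if it is not already listed.
# (append_ld_path is a hypothetical helper name.)
append_ld_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
    esac
    export LD_LIBRARY_PATH
}

append_ld_path /usr/local/lib
echo "$LD_LIBRARY_PATH"
```

Calling the function a second time with the same directory leaves the variable unchanged.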

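If you see an error like undefined reference to `libiconv`, it can be handled in one of the following ways:

Method 1 (for Linux): run export LIBS="-liconv" directly,

then make clean, run configure again, and build and install with make && make install

Method 2:

Run configure first, then vim src/makefile

Search for lexpat and add -liconv after it

After the change the line should read: -lexpat -liconv -L/usr/local/lib

Then run make && make install again

Method 3:

Run configure first, then vim config/config.h

Search for USE_LIBICONV and change the 1 after it to 0

Then run make && make install again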
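The manual edit in Method 3 can also be scripted. The sketch below assumes the generated header contains a line of the form "#define USE_LIBICONV 1"; the function name disable_libiconv is hypothetical, and the demonstration runs on a temporary file rather than on the real config/config.h.

```shell
#!/bin/sh
# Change "#define USE_LIBICONV 1" to 0 in the given header, keeping a .bak copy.
# (disable_libiconv is a hypothetical helper; the line format is an assumption.)
disable_libiconv() {
    sed -i.bak 's/^\(#define[[:space:]]\{1,\}USE_LIBICONV\)[[:space:]]\{1,\}1[[:space:]]*$/\1 0/' "$1"
}

# Demonstrate on a throwaway file instead of the real config/config.h.
demo=$(mktemp)
printf '%s\n' '/* generated by configure */' '#define USE_LIBICONV 1' > "$demo"
disable_libiconv "$demo"
grep USE_LIBICONV "$demo"   # prints: #define USE_LIBICONV 0
rm -f "$demo" "$demo.bak"
```

In the source tree the call would be disable_libiconv config/config.h, run after configure and followed by make && make install as in Method 3.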


3. Test mmseg segmentation and coreseek search (set the character set to zh_CN.UTF-8 beforehand so that Chinese text displays correctly)

root@CC-57:~/coreseek-3.2.14/testpack# export LANG=zh_CN.UTF-8       #set the system character set

root@CC-57:~/coreseek-3.2.14/testpack# locale

LANG=zh_CN.UTF-8

LANGUAGE=en_US:en

LC_CTYPE="zh_CN.UTF-8"

LC_NUMERIC="zh_CN.UTF-8"

LC_TIME="zh_CN.UTF-8"

LC_COLLATE="zh_CN.UTF-8"

...
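The locale check above can also be done in a script before running the tests. This is a minimal sketch: the function name is_utf8_locale is hypothetical, and it only inspects the locale name (LC_ALL, then LC_CTYPE, then LANG) without verifying that the locale is actually installed.

```shell
#!/bin/sh
# Succeed if the given locale name specifies a UTF-8 character set.
# (is_utf8_locale is a hypothetical helper name.)
is_utf8_locale() {
    case "$1" in
        *.UTF-8|*.utf8|*.UTF-8@*|*.utf8@*) return 0 ;;
        *) return 1 ;;
    esac
}

# LC_ALL overrides LC_CTYPE, which overrides LANG.
effective=${LC_ALL:-${LC_CTYPE:-${LANG:-}}}
if is_utf8_locale "$effective"; then
    echo "locale '$effective' is UTF-8"
else
    echo "locale '$effective' is not UTF-8; try: export LANG=zh_CN.UTF-8"
fi
```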

root@CC-57:~/coreseek-3.2.14/csft-3.2.14# cd ../testpack/

root@CC-57:~/coreseek-3.2.14/testpack# cat var/test/test.xml           #Chinese text should display correctly at this point

root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc/ var/test/test.xml


root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

Error:

/usr/local/coreseek/bin/indexer: error while loading shared libraries: libmysqlclient.so.15: cannot open shared object file: No such file or directory

Fix:

root@CC-57:~/coreseek-3.2.14/testpack# ln -s /var/www/dream/mysql/lib/mysql/libmysqlclient.so.15 /usr/lib/

Then run the command again:

root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

## Output when indexing all data succeeds (csft-4.0 is similar):

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/csft.conf'...

indexing index 'xml'...

collected 3 docs, 0.0 MB

sorted 0.0 Mhits, 100.0% done

total 3 docs, 7585 bytes

total 0.010 sec, 739134 bytes/sec, 292.34 docs/sec

total 2 reads, 0.000 sec, 4.2 kb/call avg, 0.0 msec/call avg

total 7 writes, 0.000 sec, 3.1 kb/call avg, 0.0 msec/call avg


root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/indexer -c etc/csft.conf var/test/test.xml

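## Output when indexing a specified target (csft-4.0 is similar). Note that indexer expects an index name defined in csft.conf (xml here), not a file path, which is why the run below prints a WARNING and indexes nothing: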

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

WARNING: no such index 'var/test/test.xml', skipping.

total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg



root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/search -c etc/csft.conf

## Output of a successful test search (csft-4.0 is similar):

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/csft.conf'...

index 'xml': query '': returned 3 matches of 3 total in 0.004 sec

displaying matches:

1. document=1, weight=1, published=Thu Apr  1 22:20:07 2010, author_id=1

2. document=2, weight=1, published=Thu Apr  1 23:25:48 2010, author_id=1

3. document=3, weight=1, published=Thu Apr  1 12:01:00 2010, author_id=2

words:


root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/search -c etc/csft.conf -a Twittter

## Output of a successful keyword search (csft-4.0 is similar):

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

index 'xml': query 'Twittter ': returned 1 matches of 1 total in 0.014 sec


displaying matches:

1. document=2, weight=1, published=Thu Apr  1 23:25:48 2010, author_id=1


words:

1. 'twittter': 1 documents, 3 hits



root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/searchd -c etc/csft.conf

## Output when the search service starts successfully (csft-4.0 is similar):

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

listening on all interfaces, port=9312


root@CC-57:~/coreseek-3.2.14/testpack# netstat -nuptl | grep :9312

tcp        0      0 0.0.0.0:9312            0.0.0.0:*               LISTEN      21818/searchd
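The last column of the netstat line above has the form PID/program, so the PID of the running searchd can be pulled out of it in a script. This is a minimal sketch: the function name pid_from_netstat_line is hypothetical, and the sample line is copied from the output above.

```shell
#!/bin/sh
# Print the PID from a "netstat -p" style line whose last field is "PID/program".
# (pid_from_netstat_line is a hypothetical helper name.)
pid_from_netstat_line() {
    printf '%s\n' "$1" | awk '{ split($NF, parts, "/"); print parts[1] }'
}

line='tcp        0      0 0.0.0.0:9312            0.0.0.0:*               LISTEN      21818/searchd'
pid_from_netstat_line "$line"   # prints 21818
```

On a live system the line would come from netstat -nuptl | grep :9312 instead of a fixed string.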

To stop the search service:

root@CC-57:~/coreseek-3.2.14/testpack# /usr/local/coreseek/bin/searchd -c etc/csft.conf --stop

Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]

Copyright (c) 2007-2011,

Beijing Choice Software Technologies Inc (http://www.coreseek.com)


using config file 'etc/csft.conf'...

stop: succesfully sent SIGTERM to pid 21818


4. With the steps above,

coreseek is installed and tested, and it can now index XML data sources and serve the corresponding search requests.



     This article is reposted from ljl_19880709's 51CTO blog. Original link: http://blog.51cto.com/luojianlong/1345678. To repost it, please contact the original author.





