This post is long, so you can skip straight to the code:
https://github.com/lijinma/laravel-scout-e...
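Over the Chinese New Year holiday I built a small site at home called “笑來搜”. The process went like this:
- I started with tntsearch, which is small, has few dependencies, and which I liked a lot.
- After using it for a while, though, I found that tntsearch has no matching Chinese word segmentation. Someone had written one, but it was far from complete.
- In the end I chose ElasticSearch, even though it is heavier than tntsearch. The ik analysis plugin for ElasticSearch is simple to use, and its dictionary is very easy to extend.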
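After 笑來搜 went live, several friends asked how to build a similar search site without much effort, so I made time to put together a comparable demo. The code is at https://github.com/lijinma/laravel-scout-e... ; please star it if it helps you. The demo has at least these two advantages:
- Every installation step is spelled out as clearly as I could manage; I assume you are a beginner.
- The demo runs on my own server, so you can try it directly: http://scout.lijinma.com/search
The full tutorial follows.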
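First: what are we building?
Something fairly simple: pull down the articles of a WeChat official account, then make the title and content of every article searchable. For this project I chose 50 articles from Li Xiaolai's (李笑來) account “學習學習再學習”.
Here is what the result looks like: http://scout.lijinma.com/search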
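Step 1: Install Laravel 5.4
Whether you use homestead, valet, docker, or a hand-built local environment, the first step is to get a Laravel 5.4 project running so that you can see the welcome page.
A note on how I develop: for simple Laravel projects that I work on alone, I don't use homestead, valet, or docker. I run everything directly on my Mac, which only needs mysql installed, and during development I just run php artisan serve. It is quick to set up and, overall, quite efficient.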
Step 2: Configuration
Set up the database
create database laravel_scout_elastic_demo;
Install the ElasticSearch Scout Engine package
$ composer require tamayo/laravel-scout-elastic
Installing this package also pulls in Laravel Scout. Next, publish its config:
$ php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"
Register the corresponding service providers:
// config/app.php, inside the providers array
...
Laravel\Scout\ScoutServiceProvider::class,
ScoutEngines\Elasticsearch\ElasticsearchProvider::class,
...
Install the Goutte client
We need to scrape each article's title and content from its URL, so install this library:
$ composer require fabpot/goutte
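Step 3: Install ElasticSearch
We want the ik plugin, and working out how to install it yourself can cost a lot of effort.
So we use this project directly: https://github.com/medcl/elasticsearch-rtf
Its current version is Elasticsearch 5.1.1, and the ik plugin comes bundled.
Install ElasticSearch, start the service, and check that it is working: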
$ curl http://localhost:9200
{
  "name" : "Rkx3vzo",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "Ww9KIfqSRA-9qnmj1TcnHQ",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}
If the information above is printed correctly, ElasticSearch is installed.
Next, check that the ik plugin is installed (run this from your ElasticSearch directory):
$ ./bin/elasticsearch-plugin list
analysis-ik
If analysis-ik appears in the output, ik is installed.
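The same check can be scripted in PHP, which may be handy in a setup script. The sketch below only parses a response body like the one shown above; the helper name esVersion and the inlined sample JSON are illustrative assumptions, and no HTTP request is made.

```php
<?php
// Hypothetical helper: extract the Elasticsearch version from the JSON body
// returned by GET http://localhost:9200 (the body is passed in as a string).
function esVersion(string $json): ?string
{
    $data = json_decode($json, true);
    // json_decode() returns null for invalid JSON; also guard against missing keys.
    if (!is_array($data) || !isset($data['version']['number'])) {
        return null;
    }
    return (string) $data['version']['number'];
}

// Sample body based on the curl output above, trimmed to a few keys.
$sample = '{"name":"Rkx3vzo","cluster_name":"elasticsearch",'
    . '"version":{"number":"5.1.1","lucene_version":"6.3.0"},'
    . '"tagline":"You Know, for Search"}';

$version = esVersion($sample);
echo $version === null
    ? "This does not look like an Elasticsearch response\n"
    : "Elasticsearch {$version}\n";
```

In a real script the string would come from an HTTP client of your choice rather than from a literal.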
Step 4: Start writing code
Add an InitEs command to initialise some ES data
$ php artisan make:command InitEs
The code of InitEs.php follows. It mainly does two things:
- Creates the index.
- Creates a template. The link below explains what an index template is:
https://www.elastic.co/guide/en/elasticsea...
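As a rough mental model, an index template is a set of settings and mappings that Elasticsearch applies to any newly created index whose name matches the template's pattern. The sketch below imitates only that name matching, with * as the sole wildcard; templateApplies is a hypothetical helper, not part of Elasticsearch or of any client library.

```php
<?php
// Hypothetical helper: does an index-template pattern match an index name?
// Only '*' is treated as a wildcard (it matches any run of characters).
function templateApplies(string $pattern, string $indexName): bool
{
    // Quote regex metacharacters, then turn the quoted '*' back into '.*'.
    $regex = '/^' . str_replace('\*', '.*', preg_quote($pattern, '/')) . '$/';
    return preg_match($regex, $indexName) === 1;
}

// The pattern '*' matches every index name.
var_dump(templateApplies('*', 'laravel'));
// A narrower, made-up pattern does not match an unrelated name.
var_dump(templateApplies('logs-*', 'laravel'));
```

The InitEs command in this step registers its template with the pattern *, so the template applies to every index created afterwards. Templates are only consulted when an index is created, which is why handle() creates the template before the index.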
<?php

namespace App\Console\Commands;

use GuzzleHttp\Client;
use Illuminate\Console\Command;

class InitEs extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'es:init';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Init es to create index';

    /**
     * Create a new command instance.
     *
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        $client = new Client();
        // The template must exist before the index is created.
        $this->createTemplate($client);
        $this->createIndex($client);
    }

    protected function createIndex(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . config('scout.elasticsearch.index');
        $client->put($url, [
            'json' => [
                'settings' => [
                    'refresh_interval' => '5s',
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => false
                        ]
                    ]
                ]
            ]
        ]);
    }

    protected function createTemplate(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . '_template/rtf';
        $client->put($url, [
            'json' => [
                'template' => '*',
                'settings' => [
                    'number_of_shards' => 1
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => true
                        ],
                        'dynamic_templates' => [
                            [
                                'strings' => [
                                    'match_mapping_type' => 'string',
                                    'mapping' => [
                                        'type' => 'text',
                                        'analyzer' => 'ik_smart',
                                        'fields' => [
                                            'keyword' => [
                                                'type' => 'keyword',
                                                // ignore_above is a keyword-field option,
                                                // so it is set on the sub-field.
                                                'ignore_above' => 256
                                            ]
                                        ]
                                    ]
                                ]
                            ]
                        ]
                    ]
                ]
            ]
        ]);
    }
}
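Both methods build their URL by taking the first entry of scout.elasticsearch.hosts and appending :9200/, so that entry is presumably expected to look like http://localhost, with a scheme but without a port. The sketch below reproduces the concatenation outside Laravel. The $elasticsearch array stands in for config('scout.elasticsearch'); its values and the helper name esUrl are assumptions for illustration.

```php
<?php
// Stand-in for config('scout.elasticsearch'); both values are assumed.
$elasticsearch = [
    'index' => 'laravel_scout_elastic_demo',
    'hosts' => ['http://localhost'],
];

// Hypothetical helper mirroring the string concatenation used in InitEs.
function esUrl(array $elasticsearch, string $path): string
{
    // Trim a trailing slash so 'http://localhost/' does not become
    // 'http://localhost/:9200/...'.
    $host = rtrim($elasticsearch['hosts'][0], '/');
    return $host . ':9200/' . ltrim($path, '/');
}

// URL of the kind used by createIndex().
echo esUrl($elasticsearch, $elasticsearch['index']), "\n";
// URL of the kind used by createTemplate().
echo esUrl($elasticsearch, '_template/rtf'), "\n";
```

If your host entry already includes a port, the hard-coded :9200 in InitEs would need to be dropped.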
Create the posts table to store the official account's articles
$ php artisan make:migration create_posts_table
Code:
<?php

use Illuminate\Support\Facades\Schema;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Database\Migrations\Migration;

class CreatePostsTable extends Migration
{
    /**
     * Run the migrations.
     *
     * @return void
     */
    public function up()
    {
        Schema::create('posts', function (Blueprint $table) {
            $table->increments('id');
            $table->text('url');
            $table->string('author', 64)->nullable()->default(null);
            $table->text('title');
            $table->longText('content');
            $table->dateTime('post_date')->nullable()->default(null);
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     *
     * @return void
     */
    public function down()
    {
        Schema::dropIfExists('posts');
    }
}
Create the table in the database:
$ php artisan migrate
Add the Post model:
$ php artisan make:model Post
Code:
<?php

namespace App;

use Illuminate\Database\Eloquent\Model;
use Laravel\Scout\Searchable;

/**
 * Class Post
 * @package App
 * @property string $url
 * @property string $author
 * @property string $content
 * @property string $title
 * @property string $post_date
 * @property string $created_at
 * @property string $updated_at
 */
class Post extends Model
{
    use Searchable;

    protected $table = 'posts';

    protected $fillable = [
        'url',
        'author',
        'title',
        'content',
        'post_date'
    ];

    public function toSearchableArray()
    {
        return [
            'title' => $this->title,
            'content' => $this->content
        ];
    }
}
Add an ImportPosts command, which scrapes the data and imports it into the posts table.
$ php artisan make:command ImportPosts
Code:
<?php

namespace App\Console\Commands;

use App\Libraries\WechatPostSpider;
use App\Post;
use Goutte\Client;
use Illuminate\Console\Command;

class ImportPosts extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'posts:import';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Import posts!';

    /**
     * Create a new command instance.
     *
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        $client = new Client();
        foreach (config('post-urls') as $url) {
            /**
             * The url column may need an index here, but url is a poor
             * unique identifier because the index would be too large.
             */
            if (Post::where('url', $url)->exists()) {
                continue;
            }
            $wechatPostSpider = new WechatPostSpider($client, $url);
            $this->savePost($wechatPostSpider);
            $this->info('create one post!');
        }
    }

    protected function savePost(WechatPostSpider $wechatPostSpider)
    {
        Post::create([
            'url' => $wechatPostSpider->getUrl(),
            'author' => $wechatPostSpider->getAuthor(),
            'title' => $wechatPostSpider->getTitle(),
            'content' => $wechatPostSpider->getContent(),
            'post_date' => $wechatPostSpider->getPostDate(),
        ]);
    }
}
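The comment in handle() notes that looking posts up by the full url column is awkward to index. One common workaround, which this demo does not use, is to store a fixed-length hash of the URL and put the unique index on that instead. The sketch below shows the idea with an in-memory array in place of the posts table; urlHash, the url_hash column, and the sample URLs are all hypothetical.

```php
<?php
// Hypothetical helper: a 40-character hex digest that could be stored in an
// indexed CHAR(40) column such as url_hash (not part of the demo schema).
function urlHash(string $url): string
{
    return sha1($url);
}

// In-memory stand-in for the posts table, keyed by url_hash.
$seen = [];

// Made-up URLs; the third repeats the first and should be skipped.
$urls = [
    'http://mp.weixin.qq.com/s?mid=1',
    'http://mp.weixin.qq.com/s?mid=2',
    'http://mp.weixin.qq.com/s?mid=1',
];

$imported = [];
foreach ($urls as $url) {
    $hash = urlHash($url);
    if (isset($seen[$hash])) {
        continue; // already imported, like the exists() check in handle()
    }
    $seen[$hash] = true;
    $imported[] = $url;
}

echo count($imported), ' of ', count($urls), " URLs imported\n";
```

With a real table, the isset() lookup would become a query on the hashed column, and the original url would still be stored alongside it for display.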
This command depends on two files: app/Libraries/WechatPostSpider.php and the config file config/post-urls.php.
WechatPostSpider.php is responsible for scraping the data:
<?php

namespace App\Libraries;

use Goutte\Client;
use Symfony\Component\DomCrawler\Crawler;

/**
 * Created by PhpStorm.
 * User: lijinma
 * Date: 04/03/2017
 * Time: 9:05 PM
 */
class WechatPostSpider
{
    /**
     * @var Crawler|null
     */
    protected $crawler;

    /**
     * @var string
     */
    protected $url;

    /**
     * WechatPostSpider constructor.
     * @param Client $client
     * @param $url
     */
    public function __construct(Client $client, $url)
    {
        $this->url = $url;
        $this->crawler = $client->request('GET', $url);
    }

    /**
     * @return string
     */
    public function getTitle()
    {
        return trim($this->crawler->filter('title')->text());
    }

    /**
     * @return string
     */
    public function getContent()
    {
        return trim($this->crawler->filter('.rich_media_content')->text());
    }

    /**
     * @return string
     */
    public function getAuthor()
    {
        return trim($this->crawler->filter('#post-date')->nextAll()->text());
    }

    /**
     * @return string
     */
    public function getPostDate()
    {
        return $this->crawler->filter('#post-date')->text();
    }

    /**
     * @return string
     */
    public function getUrl()
    {
        return $this->url;
    }
}
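getPostDate() returns the raw text of the #post-date element, and ImportPosts writes that text into the post_date DATETIME column. If the text is not already in a form the database accepts, it can be normalised first. The sketch below assumes the page prints the date as Y-m-d (for example 2017-03-04). That format and the helper name normalizePostDate are assumptions, so check them against a real page.

```php
<?php
// Hypothetical helper: turn the scraped date text into 'Y-m-d H:i:s',
// or return null when the text does not match the assumed 'Y-m-d' format.
function normalizePostDate(string $raw): ?string
{
    // The leading '!' resets unspecified fields, so the time is 00:00:00.
    $date = DateTime::createFromFormat('!Y-m-d', trim($raw));
    if ($date === false) {
        return null;
    }
    // Reject dates that parsed only with warnings, such as 2017-02-30.
    // getLastErrors() returns an array, or false on newer PHP versions
    // when there was nothing to report.
    $errors = DateTime::getLastErrors();
    if (is_array($errors) && ($errors['warning_count'] > 0 || $errors['error_count'] > 0)) {
        return null;
    }
    return $date->format('Y-m-d H:i:s');
}

var_dump(normalizePostDate(" 2017-03-04\n"));
var_dump(normalizePostDate('yesterday'));
```

Returning null for unparseable text fits the migration above, where post_date is nullable.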
post-urls.php holds the URLs of the official-account articles to scrape; only one is listed here:
<?php

return [
    "http://mp.weixin.qq.com/s?__biz=MzAxNzI4MTMwMw==&mid=2651630953&idx=1&sn=9c4d8f2b4df2605fdaa1338303acc908&chksm=801ff511b7687c07303220a0c105d979f1a4a5db45689c95111a6c6ec2f5a6c0c6cecea88ba0&scene=4#wechat_redirect",
];
Add PostController
$ php artisan make:controller PostController
The code of PostController.php:
<?php

namespace App\Http\Controllers;

use App\Post;
use Illuminate\Http\Request;

class PostController extends Controller
{
    public function search(Request $request)
    {
        $q = $request->get('q');
        $paginator = [];
        if ($q) {
            $paginator = Post::search($q)->paginate();
        }
        return view('search', compact('paginator', 'q'));
    }
}
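PostController.php depends on view files, so we create resources/views/layouts/main.blade.php and resources/views/search.blade.php. The search form submits to /search, so that path also needs a GET route pointing at PostController@search.
The code of resources/views/layouts/main.blade.php: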
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" id="viewport"
          content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1"/>
    <!-- CSRF Token -->
    <meta name="csrf-token" content="{{ csrf_token() }}">
    <title>{{ config('app.name', 'Laravel') }}</title>
    <!-- Styles -->
    <link href="https://cdn.bootcss.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet">
    <link href="/css/main.css" rel="stylesheet">
    <!-- Scripts -->
    <script>
        window.Laravel = {!! json_encode([
            'csrfToken' => csrf_token(),
        ]) !!};
    </script>
</head>
<body>
<div id="app">
    <div class="container">
        <div class="row">
            <div class="col-md-12">
                <nav class="navbar navbar-default">
                    <div class="container-fluid">
                        <!-- Brand and toggle get grouped for better mobile display -->
                        <div class="navbar-header">
                            <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false">
                                <span class="sr-only">Toggle navigation</span>
                                <span class="icon-bar"></span>
                                <span class="icon-bar"></span>
                                <span class="icon-bar"></span>
                            </button>
                            <a class="navbar-brand" href="/">Laravel Scout Elastic Demo</a>
                        </div>
                    </div><!-- /.container-fluid -->
                </nav>
            </div>
        </div>
        @yield('content')
    </div>
</div>
<!-- Scripts -->
<script src="http://cdn.bootcss.com/jquery/1.12.4/jquery.min.js"></script>
<script src="http://cdn.bootcss.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
</body>
</html>
The code of resources/views/search.blade.php:
@extends('layouts.main')
@section('content')
    <div class="row">
        <div class="col-md-12">
            <form action="/search">
                <div class="input-group">
                    <input type="text" class="form-control h50" name="q" placeholder="Keyword..." value="{{ $q }}">
                    <span class="input-group-btn"><button class="btn btn-default h50" type="submit"><span class="glyphicon glyphicon-search"></span></button></span>
                </div>
            </form>
        </div>
    </div>
    @if($q)
        <div class="row">
            <div class="col-md-12">
                <div class="panel panel-default list-panel search-results">
                    <div class="panel-heading">
                        <h3 class="panel-title ">
                            <i class="fa fa-search"></i> Search results for “<span class="highlight">{{ $q }}</span>”, {{ $paginator->total() }} in total
                        </h3>
                    </div>
                    <div class="panel-body ">
                        @foreach($paginator as $post)
                            <div class="result">
                                <h2 class="title">
                                    <a href="{{ $post->url }}" target="_blank">
                                        {{ $post->title }}
                                    </a>
                                </h2>
                                <div class="info">
                                </div>
                                <div class="desc">
                                    {{ mb_substr($post->content, 0, 150) }}......
                                </div>
                                <hr>
                            </div>
                        @endforeach
                    </div>
                    {{ $paginator->links() }}
                </div>
            </div>
        </div>
    @else
        <div class="row text-center">
            <div class="col-md-12">
                <br>
                <h2>What can you search for?</h2>
                <br>
                <p>All articles from the 學習學習再學習 official account</p>
            </div>
        </div>
    @endif
@endsection
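The view appends ...... after the first 150 characters of every result, even when the content is shorter than that. A small helper can add the ellipsis only when text was actually cut. This is a sketch; the excerpt function is hypothetical, and the demo itself calls mb_substr directly in the template.

```php
<?php
// Hypothetical helper: the first $length characters of $text, with an
// ellipsis appended only when something was cut off. It relies on the
// mbstring extension so that multi-byte characters (such as Chinese)
// count as one character each.
function excerpt(string $text, int $length = 150, string $ellipsis = '......'): string
{
    if (mb_strlen($text, 'UTF-8') <= $length) {
        return $text;
    }
    return mb_substr($text, 0, $length, 'UTF-8') . $ellipsis;
}

// Short input is returned unchanged.
echo excerpt('short text'), "\n";
// Long input is cut to 5 characters and gets the ellipsis.
echo excerpt(str_repeat('學', 200), 5), "\n";
```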
Our code is now complete, but one feature is still missing: how do we highlight the search results?
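One simple option, shown here only as a sketch, is to mark the keyword in PHP after the results come back. This is plain string post-processing, not Elasticsearch's own highlighting, which matches analysed terms rather than the literal query string. The highlight function below is hypothetical and is not taken from the demo repository.

```php
<?php
// Hypothetical helper: wrap every occurrence of $keyword in <em> tags and
// HTML-escape everything else, so the result can be printed as raw HTML.
function highlight(string $text, string $keyword): string
{
    if ($keyword === '') {
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    // Split on the keyword (case-insensitive, UTF-8 aware) and keep the
    // matched pieces; with one capture group they land at the odd indexes.
    $pattern = '/(' . preg_quote($keyword, '/') . ')/iu';
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // For example invalid UTF-8 input: fall back to plain escaping.
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    $html = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        $html .= ($i % 2 === 1) ? '<em>' . $safe . '</em>' : $safe;
    }
    return $html;
}

echo highlight('Laravel Scout & <b>Elasticsearch</b>', 'scout'), "\n";
```

Because the helper escapes the text itself, a Blade view would print its result with {!! !!} rather than {{ }}, which would escape the em tags again.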
This work is licensed under the 《CC 協議》 (CC license); reposts must credit the author and link to this article.