Tags: sea request getview function map out node params nat
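In an earlier post I wrote about a personal blog built with Laravel 5.5 and Vue. The GitHub repository is https://github.com/Johnson19900110/phpJourney. I recently had some free time, so I decided to integrate Elasticsearch into it.
Because I am lazy and write my posts on cnblogs, my personal blog was never kept in sync. So I used a PHP crawler library, fabpot/goutte, to scrape my cnblogs articles into my own blog.
The code is as follows: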
<?php

namespace App\Libraries;

use App\Post;
use Goutte\Client;
use Symfony\Component\DomCrawler\Crawler;

class CnblogsPostSpider
{
    protected $client;
    protected $crawler;
    protected $urls = [];

    public function __construct(Client $client, $url)
    {
        $this->client = $client;
        // Load the listing page that links to the individual posts
        $this->crawler = $client->request('GET', $url);
    }

    public function getUrls()
    {
        // Collect the link of every post on the listing page
        $urls = $this->crawler->filter('.postTitle > a')->each(function ($node) {
            return $node->attr('href');
        });

        foreach ($urls as $url) {
            $crawler = $this->client->request('GET', $url);
            $cnBlogId = $this->getCnBlogId($url);
            $post = new Post();

            if ($post->where('cnblogs_id', $cnBlogId)->count()) {
                // Post already crawled: only update the view and comment counts
                $post->where('cnblogs_id', $cnBlogId)->update([
                    'views' => $this->getViews($crawler),
                    'comments' => $this->getComments($crawler),
                ]);
            } else {
                $post->insert([
                    'title' => $this->getTitle($crawler),
                    'category_id' => 1,
                    'content' => $this->getContent($crawler),
                    'user_id' => 1,
                    'views' => $this->getViews($crawler),
                    'comments' => $this->getComments($crawler),
                    'cnblogs_id' => $cnBlogId,
                    'cnblogs_url' => $url,
                    'created_at' => $this->getCreatedAt($crawler),
                ]);
            }
        }
    }

    // Extract the numeric ID from a URL ending in "<id>.html"
    public function getCnBlogId($url)
    {
        $url_arr = explode('/', $url);
        $last = array_pop($url_arr);
        $path_arr = explode('.', $last);

        return intval(array_shift($path_arr));
    }

    protected function getTitle(Crawler $crawler)
    {
        return trim($crawler->filter('.postTitle > a')->text());
    }

    protected function getContent(Crawler $crawler)
    {
        return trim($crawler->filter('#cnblogs_post_body')->text());
    }

    protected function getViews(Crawler $crawler)
    {
        return intval(trim($crawler->filter('#post_view_count')->text()));
    }

    protected function getComments(Crawler $crawler)
    {
        return intval($crawler->filter('#post_comment_count')->text());
    }

    protected function getCreatedAt(Crawler $crawler)
    {
        return trim($crawler->filter('#post-date')->text());
    }
}
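The ID extraction done by getCnBlogId can be tried on its own. The sketch below is a standalone variant, not the author's code: the function name extractCnblogsId is hypothetical, and it uses parse_url and pathinfo instead of explode so that a query string or fragment cannot leak into the result. It assumes cnblogs post URLs end in a numeric ID followed by .html, as the URL of the original post does.

```php
<?php

/**
 * Hypothetical standalone helper: pull the numeric post ID out of a
 * cnblogs post URL such as https://www.cnblogs.com/<user>/p/<id>.html.
 * Returns 0 when the last path segment does not start with digits.
 */
function extractCnblogsId(string $url): int
{
    // Keep only the path so "?page=2" or "#top" cannot affect the result.
    $path = (string) parse_url($url, PHP_URL_PATH);

    // PATHINFO_FILENAME drops the directory part and the ".html" extension.
    return (int) pathinfo($path, PATHINFO_FILENAME);
}

echo extractCnblogsId('https://www.cnblogs.com/johnson108178/p/9185363.html'), PHP_EOL; // 9185363
```

The returned integer is what the spider stores in the cnblogs_id column to decide between an update and an insert.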
Next, integrate ES using Laravel Scout.
First, install the ES package:
composer require tamayo/laravel-scout-elastic
This package depends on Laravel Scout, so Scout is installed along with it.
Then publish the config and register the service providers.
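For reference, these two steps usually look like the fragment below. It is a sketch based on the package README rather than on this post, so treat the details as assumptions to check against the installed versions: the providers to add to config/app.php are Laravel\Scout\ScoutServiceProvider and ScoutEngines\Elasticsearch\ElasticsearchProvider, the publish command is `php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"`, and the default index name `laravel` is only an example. The `hosts` and `index` keys do match the `scout.elasticsearch.*` lookups used later in this post.

```php
<?php

// config/scout.php (fragment, assumed from the package README).
// Providers to register in config/app.php:
//   Laravel\Scout\ScoutServiceProvider::class
//   ScoutEngines\Elasticsearch\ElasticsearchProvider::class
return [
    // Tell Scout to use the Elasticsearch engine.
    'driver' => env('SCOUT_DRIVER', 'elasticsearch'),

    'elasticsearch' => [
        // Name of the single index that holds all searchable models.
        'index' => env('ELASTICSEARCH_INDEX', 'laravel'),
        // Host without a port; the artisan command in this post appends ":9200".
        'hosts' => [
            env('ELASTICSEARCH_HOST', 'http://localhost'),
        ],
    ],
];
```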
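Now we can install ES. We want the ik plugin for Chinese word segmentation, and installing ik by hand can cost a lot of effort.
Since I am also new to ES, we will simply use a ready-made project: https://github.com/medcl/elasticsearch-rtf.
The current version of that project ships Elasticsearch 5.1.1, with the ik plugin already installed.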
$ curl http://localhost:9200
{
  "name" : "Rkx3vzo",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "Ww9KIfqSRA-9qnmj1TcnHQ",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}
If you see this output, ES is installed correctly.
Now create an artisan command that creates the ES index and template.
<?php

namespace App\Console\Commands;

use GuzzleHttp\Client;
use Illuminate\Console\Command;

class InitEs extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'es:init';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Init es to create index';

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        // HTTP client shared by both requests
        $client = new Client();
        $this->createTemplate($client);
        $this->createIndex($client);
    }

    // Create an index template that analyzes every string field with ik_smart
    public function createTemplate(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . '_template/rtf';
        $client->put($url, [
            'json' => [
                'template' => '*',
                'settings' => [
                    'number_of_shards' => 1
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => true
                        ],
                        'dynamic_templates' => [
                            [
                                'strings' => [
                                    'match_mapping_type' => 'string',
                                    'mapping' => [
                                        'type' => 'text',
                                        'analyzer' => 'ik_smart',
                                        'ignore_above' => 256,
                                        'fields' => [
                                            'keyword' => [
                                                'type' => 'keyword'
                                            ]
                                        ]
                                    ]
                                ]
                            ]
                        ]
                    ]
                ]
            ]
        ]);
    }

    // Create the index named in config/scout.php
    public function createIndex(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . config('scout.elasticsearch.index');
        $client->put($url, [
            'json' => [
                'settings' => [
                    'refresh_interval' => '5s',
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => false
                        ]
                    ]
                ]
            ]
        ]);
    }
}
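Both methods build the request URL by concatenating the first configured host, a hard-coded `:9200/`, and a path, which means the host entry must not already contain a port. A minimal sketch of that concatenation follows; the helper name esUrl and the example values (including the index name `laravel`) are hypothetical and not taken from the project.

```php
<?php

/**
 * Hypothetical helper mirroring the string concatenation used in
 * createTemplate() and createIndex(): host + ":9200/" + path.
 */
function esUrl(string $host, string $path, int $port = 9200): string
{
    // Trim a trailing slash from the host and a leading slash from the
    // path so the pieces join with exactly one separator.
    return rtrim($host, '/') . ':' . $port . '/' . ltrim($path, '/');
}

echo esUrl('http://localhost', '_template/rtf'), PHP_EOL; // http://localhost:9200/_template/rtf
echo esUrl('http://localhost', 'laravel'), PHP_EOL;       // http://localhost:9200/laravel
```

If the host were configured as http://localhost:9200, the same concatenation would produce http://localhost:9200:9200/..., so either keep the port out of the config value or drop the hard-coded port from the command.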
Because tamayo/laravel-scout-elastic has no highlight support, we need to modify it slightly. Create an EsEngine class that extends ElasticsearchEngine and override a few methods.
<?php
/**
 * Created by PhpStorm.
 * User: johnson
 * Date: 2018/6/14
 * Time: 3:10 PM
 */

namespace App\Libraries;

use Laravel\Scout\Builder;
use ScoutEngines\Elasticsearch\ElasticsearchEngine;
use Illuminate\Database\Eloquent\Collection;

class EsEngine extends ElasticsearchEngine
{
    public function search(Builder $builder)
    {
        return $this->performSearch($builder, array_filter([
            'numericFilters' => $this->filters($builder),
            'size' => $builder->limit,
        ]));
    }

    protected function performSearch(Builder $builder, array $options = [])
    {
        $params = [
            'index' => $this->index,
            'type' => $builder->model->searchableAs(),
            'body' => [
                'query' => [
                    'bool' => [
                        'must' => [
                            [
                                'query_string' => [
                                    'query' => "*{$builder->query}*",
                                ]
                            ]
                        ]
                    ]
                ],
            ]
        ];

        /**
         * Apply the highlight configuration declared on the model
         */
        if ($builder->model->searchSettings
            && isset($builder->model->searchSettings['attributesToHighlight'])
        ) {
            $attributes = $builder->model->searchSettings['attributesToHighlight'];
            foreach ($attributes as $attribute) {
                // stdClass is encoded as {} in the JSON request body
                $params['body']['highlight']['fields'][$attribute] = new \stdClass();
            }
        }

        if ($sort = $this->sort($builder)) {
            $params['body']['sort'] = $sort;
        }

        if (isset($options['from'])) {
            $params['body']['from'] = $options['from'];
        }

        if (isset($options['size'])) {
            $params['body']['size'] = $options['size'];
        }

        if (isset($options['numericFilters']) && count($options['numericFilters'])) {
            $params['body']['query']['bool']['must'] = array_merge(
                $params['body']['query']['bool']['must'],
                $options['numericFilters']
            );
        }

        return $this->elastic->search($params);
    }

    public function map($results, $model)
    {
        if ($results['hits']['total'] === 0) {
            return Collection::make();
        }

        $keys = collect($results['hits']['hits'])
            ->pluck('_id')->values()->all();

        $models = $model->whereIn(
            $model->getKeyName(), $keys
        )->get()->keyBy($model->getKeyName());

        return collect($results['hits']['hits'])->map(function ($hit) use ($model, $models) {
            $one = $models[$hit['_id']];

            /**
             * If the hit carries a highlight, attach it to the model
             */
            if (isset($hit['highlight'])) {
                $one->highlight = $hit['highlight'];
            }

            return $one;
        });
    }
}
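The two highlight-related steps in the engine can be illustrated without Laravel or a running cluster. The sketch below uses hypothetical function names (buildHighlight, attachHighlights), plain arrays in place of Eloquent models, and made-up sample hits. It also shows why the engine assigns `new \stdClass()` to each field: an empty PHP array would be encoded as `[]`, whereas an empty object is encoded as `{}`. Unlike the map method above, this version skips hits that have no matching record.

```php
<?php

/**
 * Build the "highlight" part of a search body for the given attributes.
 * Each field maps to an empty object, which is why stdClass is used:
 * an empty PHP array would be encoded as [] instead of {}.
 */
function buildHighlight(array $attributes): array
{
    $fields = [];
    foreach ($attributes as $attribute) {
        $fields[$attribute] = new stdClass();
    }

    return ['fields' => $fields];
}

/**
 * Copy each hit's highlight fragments onto the matching record,
 * keeping the order of the hits. Hits without a record are skipped.
 */
function attachHighlights(array $hits, array $recordsById): array
{
    $out = [];
    foreach ($hits as $hit) {
        $id = $hit['_id'];
        if (!isset($recordsById[$id])) {
            continue; // indexed in ES but no longer in the database
        }
        $record = $recordsById[$id];
        $record['highlight'] = $hit['highlight'] ?? [];
        $out[] = $record;
    }

    return $out;
}

echo json_encode(buildHighlight(['*'])), PHP_EOL; // {"fields":{"*":{}}}

// Made-up sample data shaped like the hits.hits array of a search response.
$hits = [
    ['_id' => '2', 'highlight' => ['content' => ['uses <em>Laravel</em> Scout']]],
    ['_id' => '9'], // no matching record below
];
$records = [2 => ['id' => 2, 'title' => 'Post two']];

print_r(attachHighlights($hits, $records));
```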
What we want to search here is blog posts, so add the following to the Post model:
use Searchable;
public $searchSettings = [
    'attributesToHighlight' => [
        '*'
    ]
];

public $highlight = [];
Then use Scout's search method when querying data.
public function search(Request $request)
{
    $q = $request->get('q', false);
    $posts = [];

    if ($q !== false) {
        $posts = Post::search($q)->paginate();
    }

    return view('index', compact('posts', 'q'));
}
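Because `q` comes straight from the request and the engine wraps it in a `query_string` query, characters that are reserved in that syntax (for example `:`, `(` or `"`) can change the meaning of the query or cause a parse error. One possible mitigation, which is not part of the original code, is to escape them before calling `Post::search`. The helper name escapeQueryString is hypothetical, and the character list is an assumption based on the reserved characters documented for `query_string`.

```php
<?php

/**
 * Hypothetical helper: neutralize characters that are reserved in
 * Elasticsearch query_string syntax so user input is treated as text.
 */
function escapeQueryString(string $input): string
{
    // "<" and ">" cannot be escaped in query_string syntax, so drop them.
    $input = str_replace(['<', '>'], '', $input);

    // Backslash-escape the remaining reserved characters and the
    // two-character operators && and ||.
    return preg_replace('/([+\-=!(){}\[\]^"~*?:\\\\\/]|&&|\|\|)/', '\\\\$1', $input);
}

echo escapeQueryString('laravel:5.5 (scout)'), PHP_EOL; // laravel\:5.5 \(scout\)
```

In the controller this would be applied as `Post::search(escapeQueryString($q))`; the wildcard characters that the engine adds around the query are unaffected because they are added afterwards.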
The returned data includes a highlight attribute, so it can be used in the template like this:
@if(isset($post->highlight['content']))
    @foreach($post->highlight['content'] as $item)
        ...{!! $item !!}...
    @endforeach
@else
    {{ empty($post->content) ? '...' : mb_substr($post->content, 0, 300) . '...' }}
@endif
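The `{!! !!}` syntax prints each fragment without escaping, so that the `<em>` tags Elasticsearch wraps around matches by default are rendered as markup. If the indexed content could ever contain HTML, one cautious option is to escape the fragment first and then restore only those tags. This is a sketch with a hypothetical helper name (safeHighlight), and it assumes the default `<em>` and `</em>` highlight tags are in use.

```php
<?php

/**
 * Hypothetical helper: escape a highlight fragment for HTML output,
 * then re-enable only the <em> tags added by the highlighter.
 */
function safeHighlight(string $fragment): string
{
    // Escape everything, including any markup present in the content.
    $escaped = htmlspecialchars($fragment, ENT_QUOTES, 'UTF-8');

    // Restore the highlighter's own tags.
    return str_replace(['&lt;em&gt;', '&lt;/em&gt;'], ['<em>', '</em>'], $escaped);
}

echo safeHighlight('<script>x</script> <em>hit</em>'), PHP_EOL;
// &lt;script&gt;x&lt;/script&gt; <em>hit</em>
```

In the template the call would become `{!! safeHighlight($item) !!}`.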
The final result looks like this:
Integrating Elasticsearch and ik word segmentation into a Laravel personal blog
Original post: https://www.cnblogs.com/johnson108178/p/9185363.html