
Integrating Elasticsearch and the ik Analyzer into a Laravel Personal Blog

Date: 2018-06-14 23:14:43


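In an earlier post I described a personal blog built with Laravel 5.5 and Vue. The GitHub repository is https://github.com/Johnson19900110/phpJourney. Having some free time recently, I decided to integrate Elasticsearch into it.

I am lazy and write my posts on cnblogs (博客园), so my personal blog was never kept in sync. To fix that, I used a PHP scraping library, fabpot/goutte, to crawl my cnblogs articles into my own blog.

The code is as follows: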

<?php
namespace App\Libraries;

use App\Post;
use Goutte\Client;
use Symfony\Component\DomCrawler\Crawler;

class CnblogsPostSpider {

    protected $client;

    protected $crawler;

    protected $urls = [];

    public function __construct(Client $client, $url)
    {
        $this->client = $client;
        $this->crawler = $client->request('GET', $url);
    }

    public function getUrls()
    {
        // Collect the link of every post listed on the index page.
        $urls = $this->crawler->filter('.postTitle > a')->each(function ($node) {
            return $node->attr('href');
        });

        foreach ($urls as $url) {
            $crawler = $this->client->request('GET', $url);

            $cnBlogId = $this->getCnBlogId($url);

            $post = new Post();
            if ($post->where('cnblogs_id', $cnBlogId)->count()) {
                // This post was crawled before: only refresh its view and comment counts.
                $post->where('cnblogs_id', $cnBlogId)->update([
                    'views'         => $this->getViews($crawler),
                    'comments'      => $this->getComments($crawler),
                ]);
            } else {
                $post->insert([
                    'title'         => $this->getTitle($crawler),
                    'category_id'   => 1,
                    'content'       => $this->getContent($crawler),
                    'user_id'       => 1,
                    'views'         => $this->getViews($crawler),
                    'comments'      => $this->getComments($crawler),
                    'cnblogs_id'    => $cnBlogId,
                    'cnblogs_url'   => $url,
                    'created_at'    => $this->getCreatedAt($crawler),
                ]);
            }
        }
    }

    public function getCnBlogId($url)
    {
        // The last URL segment looks like "<id>.html"; keep only the numeric id.
        $url_arr = explode('/', $url);
        $last = array_pop($url_arr);
        $path_arr = explode('.', $last);
        return intval(array_shift($path_arr));
    }

    protected function getTitle(Crawler $crawler)
    {
        return trim($crawler->filter('.postTitle > a')->text());
    }

    protected function getContent(Crawler $crawler)
    {
        return trim($crawler->filter('#cnblogs_post_body')->text());
    }

    protected function getViews(Crawler $crawler)
    {
        return intval(trim($crawler->filter('#post_view_count')->text()));
    }

    protected function getComments(Crawler $crawler)
    {
        return intval($crawler->filter('#post_comment_count')->text());
    }

    protected function getCreatedAt(Crawler $crawler)
    {
        return trim($crawler->filter('#post-date')->text());
    }
}
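
To make the ID extraction in getCnBlogId concrete, here is a standalone sketch of the same idea using PHP's built-in parse_url and pathinfo instead of explode. The function name cnblogsIdFromUrl is hypothetical and not part of the original class. It assumes that post URLs end in "<numeric id>.html", as the cnblogs URL of this article does.

```php
<?php

/**
 * Hypothetical helper: extract the numeric cnblogs post id from a post URL.
 * Assumes the last path segment has the form "<id>.html".
 */
function cnblogsIdFromUrl(string $url): int
{
    // Keep only the path part, e.g. "/johnson108178/p/9185363.html".
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return 0; // no usable path in this URL
    }

    // PATHINFO_FILENAME drops the directory and the extension: "9185363".
    return (int) pathinfo($path, PATHINFO_FILENAME);
}

var_dump(cnblogsIdFromUrl('https://www.cnblogs.com/johnson108178/p/9185363.html'));
```

Like the original intval-based version, this returns 0 when the last segment does not start with digits, so a caller may want to skip such URLs rather than store a post with ID 0.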

Next, integrate ES through Laravel Scout.

First, install the ES package:

 composer require tamayo/laravel-scout-elastic

This package depends on Laravel Scout, so Scout is installed along with it.

Then publish the config and register the ServiceProviders.
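
The fragment below is a sketch of what that configuration usually looks like, based on the defaults documented in the tamayo/laravel-scout-elastic README. The index name `laravel` and the environment variable names are that package's defaults, so treat them as assumptions about this project. The config file itself is normally published with `php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"`.

```php
<?php
// config/scout.php (fragment, not the whole file)

return [
    // Tell Scout to use the Elasticsearch engine.
    'driver' => env('SCOUT_DRIVER', 'elasticsearch'),

    'elasticsearch' => [
        // Name of the single index that holds all searchable models.
        'index' => env('ELASTICSEARCH_INDEX', 'laravel'),
        // Host without a port; the code in this article appends ":9200" itself.
        'hosts' => [
            env('ELASTICSEARCH_HOST', 'http://localhost'),
        ],
    ],
];
```

The two providers, `Laravel\Scout\ScoutServiceProvider::class` and `ScoutEngines\Elasticsearch\ElasticsearchProvider::class`, go into the `providers` array of config/app.php, unless package auto-discovery has already registered them.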

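Now Elasticsearch itself can be installed. We want the ik plugin for Chinese word segmentation, and installing that plugin by hand can cost a lot of effort.

Since I am new to ES myself, we will simply use a ready-made distribution: https://github.com/medcl/elasticsearch-rtf

The current version of this project is Elasticsearch 5.1.1, and the ik plugin comes preinstalled.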

$ curl http://localhost:9200

{
  "name" : "Rkx3vzo",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "Ww9KIfqSRA-9qnmj1TcnHQ",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}

If you see this response, ES is installed and running.

Now we can create an artisan command that creates the ES index and template.

<?php

namespace App\Console\Commands;

use GuzzleHttp\Client;
use Illuminate\Console\Command;

class InitEs extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'es:init';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Init es to create index';

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        //
        $client = new Client();
        $this->createTemplate($client);
        $this->createIndex($client);
    }

    public function createTemplate(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . '_template/rtf';
        $client->put($url, [
            'json' => [
                'template' => '*',
                'settings' => [
                    'number_of_shards' => 1
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => true
                        ],
                        'dynamic_templates' => [
                            [
                                // Map every string field to text analyzed by ik_smart,
                                // plus a raw keyword sub-field.
                                'strings' => [
                                    'match_mapping_type' => 'string',
                                    'mapping' => [
                                        'type' => 'text',
                                        'analyzer' => 'ik_smart',
                                        'ignore_above' => 256,
                                        'fields' => [
                                            'keyword' => [
                                                'type' => 'keyword'
                                            ]
                                        ]
                                    ]
                                ]
                            ]
                        ]
                    ]
                ]
            ]
        ]);
    }

    public function createIndex(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . config('scout.elasticsearch.index');
        $client->put($url, [
            'json' => [
                'settings' => [
                    'refresh_interval' => '5s',
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => false
                        ]
                    ]
                ]
            ]
        ]);
    }
}

tamayo/laravel-scout-elastic has no highlight support, so we need to modify it slightly. Create an EsEngine class that extends ElasticsearchEngine and override a few methods.

<?php
/**
 * Created by PhpStorm.
 * User: johnson
 * Date: 2018/6/14
 * Time: 3:10 PM
 */

namespace App\Libraries;


use Laravel\Scout\Builder;
use ScoutEngines\Elasticsearch\ElasticsearchEngine;
use Illuminate\Database\Eloquent\Collection;

class EsEngine extends ElasticsearchEngine
{
    public function search(Builder $builder)
    {
        return $this->performSearch($builder, array_filter([
            'numericFilters' => $this->filters($builder),
            'size' => $builder->limit,
        ]));
    }

    protected function performSearch(Builder $builder, array $options = [])
    {
        $params = [
            'index' => $this->index,
            'type' => $builder->model->searchableAs(),
            'body' => [
                'query' => [
                    'bool' => [
                        'must' => [
                            [
                                'query_string' => [
                                    'query' => "*{$builder->query}*",
                                ]
                            ]
                        ]
                    ]
                ],
            ]
        ];
        /**
         * Apply the highlight configuration declared on the model.
         */
        if ($builder->model->searchSettings
            && isset($builder->model->searchSettings['attributesToHighlight'])
        ) {
            $attributes = $builder->model->searchSettings['attributesToHighlight'];
            foreach ($attributes as $attribute) {
                $params['body']['highlight']['fields'][$attribute] = new \stdClass();
            }
        }

        if ($sort = $this->sort($builder)) {
            $params['body']['sort'] = $sort;
        }

        if (isset($options['from'])) {
            $params['body']['from'] = $options['from'];
        }

        if (isset($options['size'])) {
            $params['body']['size'] = $options['size'];
        }

        if (isset($options['numericFilters']) && count($options['numericFilters'])) {
            $params['body']['query']['bool']['must'] = array_merge($params['body']['query']['bool']['must'],
                $options['numericFilters']);
        }

        return $this->elastic->search($params);
    }

    public function map($results, $model)
    {
        if ($results['hits']['total'] === 0) {
            return Collection::make();
        }

        $keys = collect($results['hits']['hits'])
            ->pluck('_id')->values()->all();

        $models = $model->whereIn(
            $model->getKeyName(), $keys
        )->get()->keyBy($model->getKeyName());

        return collect($results['hits']['hits'])->map(function ($hit) use ($model, $models) {

            $one = $models[$hit['_id']];
            /**
             * If the hit carries a highlight section, attach it to the model.
             */
            if (isset($hit['highlight'])) {
                $one->highlight = $hit['highlight'];
            }
            return $one;
        });
    }
}
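
One detail in performSearch that is easy to miss is the use of `new \stdClass()` for each highlighted field. Elasticsearch expects each entry under `highlight.fields` to be a JSON object, and an empty PHP array would be serialized as `[]` instead of `{}`. The standalone sketch below illustrates the difference. The function name buildHighlight is hypothetical and exists only for this illustration.

```php
<?php

/**
 * Hypothetical helper mirroring the highlight loop in performSearch():
 * builds the "highlight" section of a search body for the given fields.
 */
function buildHighlight(array $attributes): array
{
    $fields = [];
    foreach ($attributes as $attribute) {
        // An empty object, so json_encode() emits {} rather than [].
        $fields[$attribute] = new \stdClass();
    }

    return ['fields' => $fields];
}

// Empty object per field: {"fields":{"title":{},"content":{}}}
echo json_encode(buildHighlight(['title', 'content'])), PHP_EOL;

// For comparison, an empty array serializes as a JSON list: {"fields":{"title":[]}}
echo json_encode(['fields' => ['title' => []]]), PHP_EOL;
```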

What we want to search here is blog posts, so add the following to the Post model:

    use Searchable;

    public $searchSettings = [
        'attributesToHighlight' => ['*'],
    ];

    public $highlight = [];
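
The article does not show how Scout is told to use EsEngine instead of the package's own engine. One common way is to register it with Scout's EngineManager in a service provider, as in the sketch below. The provider name EsServiceProvider and the driver name `es` are assumptions made for this illustration, and the constructor arguments mirror how the base ElasticsearchEngine is normally built (an Elasticsearch client plus the index name).

```php
<?php

namespace App\Providers;

use App\Libraries\EsEngine;
use Elasticsearch\ClientBuilder;
use Illuminate\Support\ServiceProvider;
use Laravel\Scout\EngineManager;

// Hypothetical provider; it must itself be listed in config/app.php.
class EsServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // 'es' is an assumed driver name; SCOUT_DRIVER has to be set to the same value.
        $this->app->make(EngineManager::class)->extend('es', function () {
            return new EsEngine(
                ClientBuilder::create()
                    ->setHosts(config('scout.elasticsearch.hosts'))
                    ->build(),
                config('scout.elasticsearch.index')
            );
        });
    }
}
```

With something like this in place, setting `SCOUT_DRIVER=es` in .env would route `Post::search()` through the highlighting engine.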

Then use Scout's search method when querying data.

    public function search(Request $request)
    {
        $q = $request->get('q', false);

        $posts = [];
        if ($q !== false) {
            $posts = Post::search($q)->paginate();
        }

        return view('index', compact('posts', 'q'));
    }

The models returned by the query carry a highlight attribute, so the template can use it like this:


@if(isset($post->highlight['content']))
@foreach($post->highlight['content'] as $item)
...{!! $item !!}...
@endforeach
@else
{{ empty($post->content) ? '...' : mb_substr($post->content, 0, 300) . '...' }}
@endif
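
Because the highlight fragments are printed with `{!! !!}`, they are not HTML-escaped, and any markup-like text in a crawled post would be rendered as real HTML. One possible safeguard, sketched below, is to escape the whole fragment and then restore only the `<em>` tags that Elasticsearch inserts by default. The helper name safeHighlight is hypothetical, and the sketch assumes the default highlight tags have not been changed.

```php
<?php

/**
 * Hypothetical helper: escape a highlight fragment but keep the default
 * <em>...</em> markers that Elasticsearch wraps around matched terms.
 */
function safeHighlight(string $fragment): string
{
    // Escape everything first, including the <em> markers.
    $escaped = htmlspecialchars($fragment, ENT_QUOTES, 'UTF-8');

    // Then turn only the escaped <em> markers back into real tags.
    return str_replace(
        ['&lt;em&gt;', '&lt;/em&gt;'],
        ['<em>', '</em>'],
        $escaped
    );
}

echo safeHighlight('<em>Elasticsearch</em> with a <script> tag'), PHP_EOL;
```

In the template, `{!! safeHighlight($item) !!}` would then take the place of `{!! $item !!}`. A literal `<em>` that was already present in the post text is restored as well, which is usually harmless.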

The final result looks like this:

[Image]


Original article: https://www.cnblogs.com/johnson108178/p/9185363.html
