Elasticsearch cluster, IK analyzer, and synonyms

Date: 2016-07-12 15:37:35

wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.3/elasticsearch-2.3.3.tar.gz

Cluster installation:

Three nodes: master, slave1, slave2

vi elasticsearch.yml

cluster.name: my-application

node.name: node-3  # name of this node; must be unique within the cluster

network.host: 192.168.137.117

http.port: 9200

discovery.zen.ping.unicast.hosts: ["master","slave1", "slave2"]
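The same settings go on every node; only node.name and network.host change. A minimal Python sketch that renders them per node. The helper name render_node_config and the addresses used for node-1 and node-2 are made up for illustration; only 192.168.137.117 comes from the settings above.

```python
# Sketch: render the per-node elasticsearch.yml settings shown above.
# render_node_config is a hypothetical helper, not part of Elasticsearch.

UNICAST_HOSTS = ["master", "slave1", "slave2"]


def render_node_config(node_name, host, cluster_name="my-application",
                       http_port=9200, unicast_hosts=UNICAST_HOSTS):
    """Return the elasticsearch.yml text for one node of the cluster."""
    hosts = ", ".join('"{}"'.format(h) for h in unicast_hosts)
    lines = [
        "cluster.name: {}".format(cluster_name),   # identical on all nodes
        "node.name: {}".format(node_name),         # unique per node
        "network.host: {}".format(host),           # address this node binds to
        "http.port: {}".format(http_port),
        "discovery.zen.ping.unicast.hosts: [{}]".format(hosts),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumption: the first two addresses are placeholders.
    nodes = [("node-1", "192.168.137.115"),
             ("node-2", "192.168.137.116"),
             ("node-3", "192.168.137.117")]
    for name, host in nodes:
        print("# ---- {} ----".format(name))
        print(render_node_config(name, host))
```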

Install plugins

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install analysis-icu

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install mobz/elasticsearch-head

Marvel:

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install license

/home/qun/soft/elasticsearch-2.3.3/bin/plugin install marvel-agent

Run on each node:

elasticsearch -d

Kill a node

kill -9 `ps -ef | grep elasticsearch | grep -v grep | awk '{print $2}'`

Start

/home/qun/soft/elasticsearch-2.3.3/bin/elasticsearch -d

Access the cluster:

http://master:9200/_plugin/head/

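A node is a single Elasticsearch instance, and a cluster consists of one or more nodes that share the same cluster.name. The nodes work together, sharing data and load. When a node joins or leaves, the cluster detects the change and rebalances the data.

As a user, you can talk to any node in the cluster, including the master node. Every node knows which node each document lives on and can forward a request to the node that holds it. The node you contact collects the responses from the other nodes and returns the combined result to the client. Elasticsearch handles all of this for you.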

Get cluster health

http://master:9200/_cluster/health/

{
"cluster_name": "my-application",
"status": "green",
"timed_out": false,
"number_of_nodes": 3,
"number_of_data_nodes": 3,
"active_primary_shards": 22,
"active_shards": 44,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
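The status field is the quickest thing to check: green means every primary and replica shard is allocated, yellow means all primaries are allocated but some replicas are not, and red means at least one primary is missing. A small Python sketch that summarizes a response like the one above; summarize_health is a hypothetical helper and the JSON is a trimmed copy of that response.

```python
import json

# Trimmed copy of the _cluster/health response shown above.
HEALTH_JSON = """
{
  "cluster_name": "my-application",
  "status": "green",
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "active_primary_shards": 22,
  "active_shards": 44,
  "unassigned_shards": 0
}
"""


def summarize_health(health):
    """Reduce a cluster health dict to the fields worth checking first."""
    primaries = health["active_primary_shards"]
    # active_shards counts primaries plus replicas, so the difference
    # is the number of active replica shards.
    replicas = health["active_shards"] - primaries
    return {
        "status": health["status"],
        "nodes": health["number_of_nodes"],
        "primary_shards": primaries,
        "replica_shards": replicas,
        "all_assigned": health["unassigned_shards"] == 0,
    }


if __name__ == "__main__":
    print(summarize_health(json.loads(HEALTH_JSON)))
```

In this response 22 of the 44 active shards are primaries, so each primary has one active replica.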

Change the number of replica shards

PUT /blogs/_settings
{
  "number_of_replicas": 2
}
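The same call can be prepared from a script. A minimal sketch using only the Python standard library; build_replica_request and total_shard_copies are hypothetical helper names, and the request is built but not sent. The figure of 5 primary shards in the example is an assumption (the Elasticsearch 2.x default), since the settings of the blogs index are not shown here.

```python
import json
import urllib.request


def build_replica_request(base_url, index, replicas):
    """Build (but do not send) the PUT /<index>/_settings request."""
    body = json.dumps({"number_of_replicas": replicas}).encode("utf-8")
    url = "{}/{}/_settings".format(base_url.rstrip("/"), index)
    return urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"})


def total_shard_copies(primaries, replicas):
    """Each primary shard has `replicas` copies in addition to itself."""
    return primaries * (1 + replicas)


if __name__ == "__main__":
    req = build_replica_request("http://master:9200", "blogs", 2)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # Assumption: 5 primary shards.
    print(total_shard_copies(5, 2))
    # To actually send it: urllib.request.urlopen(req)
```

With 2 replicas on a 3-node cluster, every node can hold one copy of each shard.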

Delete indices

curl -XDELETE 'http://master:9200/.marvel-es-1-2016.05.29'

curl -XDELETE 'http://master:9200/.marvel-es-data-1'

Install the IK analyzer (https://github.com/medcl/elasticsearch-analysis-ik)

wget https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip

unzip master.zip

cd elasticsearch-analysis-ik-master

mvn package

mkdir -p /home/qun/soft/elasticsearch-2.3.3/plugins/ik

cp /home/qun/soft/elasticsearch-2.3.3/elasticsearch-analysis-ik-master/target/releases/elasticsearch-analysis-ik-1.9.3.zip /home/qun/soft/elasticsearch-2.3.3/plugins/ik

cd /home/qun/soft/elasticsearch-2.3.3/plugins/ik

unzip elasticsearch-analysis-ik-1.9.3.zip

Test tokenization

GET /twitter/_analyze?analyzer=standard&pretty=true&text=我是中国人

GET /twitter/_analyze?analyzer=ik&pretty=true&text=我是中国人
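When these requests are sent from a script or from curl, the Chinese text in the query string has to be percent-encoded as UTF-8. A small sketch with the Python standard library; analyze_url is a hypothetical helper and http://master:9200 is the node address used elsewhere in this post.

```python
from urllib.parse import urlencode


def analyze_url(base_url, index, analyzer, text):
    """Build an _analyze URL with the text safely percent-encoded."""
    query = urlencode({"analyzer": analyzer, "pretty": "true", "text": text})
    return "{}/{}/_analyze?{}".format(base_url.rstrip("/"), index, query)


if __name__ == "__main__":
    for analyzer in ("standard", "ik"):
        print(analyze_url("http://master:9200", "twitter", analyzer,
                          "我是中国人"))
```

The printed URLs can be passed to curl as they are, for example inside single quotes.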

Add a user-defined dictionary:

elasticsearch-2.3.3/plugins/ik/config/IKAnalyzer.cfg.xml

Example: add sougou.dic. Entries are separated by semicolons and use relative paths. Restart the ES cluster afterwards.

<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionaries here -->
<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>
<!-- Configure your own extension stopword dictionaries here -->
<entry key="ext_stopwords">custom/ext_stopword.dic</entry>
<!-- Configure a remote extension dictionary here -->
<!-- <entry key="remote_ext_dict">words_location</entry> -->
<!-- Configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
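Before restarting the cluster it can help to confirm which dictionary files the configuration actually lists. A minimal sketch with the Python standard library; read_dict_paths is a hypothetical helper, and SAMPLE is an ASCII-only copy of the file above rather than the file on disk.

```python
import xml.etree.ElementTree as ET

# ASCII-only copy of IKAnalyzer.cfg.xml as shown above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>
<entry key="ext_stopwords">custom/ext_stopword.dic</entry>
<!-- <entry key="remote_ext_dict">words_location</entry> -->
</properties>
"""


def read_dict_paths(xml_bytes, key="ext_dict"):
    """Return the semicolon-separated paths stored under the given key."""
    root = ET.fromstring(xml_bytes)
    for entry in root.findall("entry"):
        if entry.get("key") == key:
            text = entry.text or ""
            return [p.strip() for p in text.split(";") if p.strip()]
    return []  # key missing or commented out


if __name__ == "__main__":
    print(read_dict_paths(SAMPLE))
    print(read_dict_paths(SAMPLE, "ext_stopwords"))
```

To check the real file, read it in binary mode and pass the bytes to read_dict_paths.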

Configure synonyms

Edit elasticsearch-2.3.3/config/elasticsearch.yml and append the following:

index:
  analysis:
    analyzer:
      ik_syno:
          type: custom
          tokenizer: ik_max_word
          filter: [my_synonym_filter]
      ik_syno_smart:
          type: custom
          tokenizer: ik_smart
          filter: [my_synonym_filter]
    filter:
      my_synonym_filter:
          type: synonym
          synonyms_path: analysis/synonym.txt

Add the synonym dictionary:

mkdir -p elasticsearch-2.3.3/config/analysis

vi elasticsearch-2.3.3/config/analysis/synonym.txt

ipod, i-pod, i pod

foozball , foosball

universe , cosmos

西红柿, 番茄

马铃薯, 土豆
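Each line of the file is one group of equivalent terms separated by commas. The following Python sketch is a toy model of that format, useful for sanity-checking a file before deploying it; it is not how the synonym filter is implemented, and parse_synonyms and build_lookup are hypothetical helpers. It handles only the comma-separated form used above.

```python
SYNONYMS = """\
ipod, i-pod, i pod
foozball , foosball
universe , cosmos
西红柿, 番茄
马铃薯, 土豆
"""


def parse_synonyms(text):
    """Parse comma-separated synonym lines into lists of equivalent terms."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        groups.append([t.strip() for t in line.split(",") if t.strip()])
    return groups


def build_lookup(groups):
    """Map every term to the full group of terms it is equivalent to."""
    lookup = {}
    for group in groups:
        for term in group:
            lookup[term] = group
    return lookup


if __name__ == "__main__":
    lookup = build_lookup(parse_synonyms(SYNONYMS))
    print(lookup["马铃薯"])
    print(lookup["i-pod"])
```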

Test the synonyms:

GET /iktest/_analyze?analyzer=ik_syno_smart&pretty=true&text=马铃薯西红柿

Result:

{
"tokens": [
{
"token": "马铃薯",
"start_offset": 0,
"end_offset": 3,
"type": "CN_WORD",
"position": 0
}
,
{
"token": "土豆",
"start_offset": 0,
"end_offset": 3,
"type": "SYNONYM",
"position": 0
}
,
{
"token": "西红柿",
"start_offset": 3,
"end_offset": 6,
"type": "CN_WORD",
"position": 1
}
,
{
"token": "番茄",
"start_offset": 3,
"end_offset": 6,
"type": "SYNONYM",
"position": 1
}
]
}
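The thing to check in this output is that each synonym has the same position as the word it was derived from, so either term occupies the same slot in the token stream. A short Python sketch that groups the tokens by position; tokens_by_position is a hypothetical helper and RESPONSE is a compact copy of the result above.

```python
import json
from collections import defaultdict

# Compact copy of the _analyze result shown above.
RESPONSE = json.loads("""
{"tokens": [
  {"token": "马铃薯", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0},
  {"token": "土豆",   "start_offset": 0, "end_offset": 3, "type": "SYNONYM", "position": 0},
  {"token": "西红柿", "start_offset": 3, "end_offset": 6, "type": "CN_WORD", "position": 1},
  {"token": "番茄",   "start_offset": 3, "end_offset": 6, "type": "SYNONYM", "position": 1}
]}
""")


def tokens_by_position(response):
    """Group (token, type) pairs by their position in the token stream."""
    grouped = defaultdict(list)
    for tok in response["tokens"]:
        grouped[tok["position"]].append((tok["token"], tok["type"]))
    return dict(grouped)


if __name__ == "__main__":
    for position, tokens in sorted(tokens_by_position(RESPONSE).items()):
        print(position, tokens)
```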


Original article: http://11670039.blog.51cto.com/11660039/1825728
