HTML Strip Char Filter

Date: 2017-08-03 11:25:07

Tags: load, param, parameter, curl, nbsp, analysis, conf, should, sci

The html_strip character filter strips HTML elements from the text and replaces HTML entities with their decoded value (e.g. replacing &amp; with &).

Example output

POST _analyze
{
  "tokenizer":      "keyword",
  "char_filter":  [ "html_strip" ],
  "text": "<p>I&apos;m so <b>happy</b>!</p>"
}

The keyword tokenizer returns a single term.

The above example returns the term:

[ \nI'm so happy!\n ]

The same example with the standard tokenizer would return the following terms:

[ I'm, so, happy ]
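
To make the stripping and entity decoding concrete, here is a rough Python approximation of what the filter does to the example text. It is only a sketch built on the standard library's html.parser, not the Lucene implementation behind html_strip. The names HtmlStripSketch, html_strip and BLOCK_TAGS are hypothetical, and the set of tags that turn into newlines is an assumption chosen to match the output shown above.

```python
from html.parser import HTMLParser

# Assumption: a small hand-picked set of block-level tags that are replaced
# by a newline. The real filter has its own rules for this.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3"}


class HtmlStripSketch(HTMLParser):
    """Collects text content, dropping tags and decoding entities."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as
        # &apos; or &amp; before the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")
        # Inline tags such as <b> are dropped without a replacement.

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_strip(text):
    """Hypothetical helper: returns the text with tags stripped."""
    parser = HtmlStripSketch()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


stripped = html_strip("<p>I&apos;m so <b>happy</b>!</p>")
print(repr(stripped))  # "\nI'm so happy!\n"
```

The result is one string with a newline where each <p> tag used to be, which corresponds to the single term the keyword tokenizer returns.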

Configuration

The html_strip character filter accepts the following parameter:

escaped_tags

An array of HTML tags which should not be stripped from the original text.

Example configuration

In this example, we configure the html_strip character filter to leave <b> tags in place:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["my_char_filter"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "html_strip",
          "escaped_tags": ["b"]
        }
      }
    }
  }
}

POST my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "<p>I&apos;m so <b>happy</b>!</p>"
}

The above example produces the following term:

[ \nI'm so <b>happy</b>!\n ]
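
The effect of escaped_tags can be imitated outside Elasticsearch with a short Python sketch. This is an approximation based on the standard library's html.parser, not the filter's actual code. EscapedTagsSketch, html_strip and BLOCK_TAGS are hypothetical names, and the choice of which tags become newlines is an assumption made to reproduce the term shown above.

```python
from html.parser import HTMLParser

# Assumption: block-level tags that are replaced by a newline when stripped.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3"}


class EscapedTagsSketch(HTMLParser):
    """Strips tags except those listed in escaped_tags."""

    def __init__(self, escaped_tags=()):
        # convert_charrefs=True decodes entities before handle_data() runs.
        super().__init__(convert_charrefs=True)
        self.escaped_tags = {t.lower() for t in escaped_tags}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.escaped_tags:
            # Keep the tag exactly as it appeared in the input.
            self.parts.append(self.get_starttag_text())
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.escaped_tags:
            self.parts.append(f"</{tag}>")
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_strip(text, escaped_tags=()):
    """Hypothetical helper mirroring the escaped_tags parameter."""
    parser = EscapedTagsSketch(escaped_tags)
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


text = "<p>I&apos;m so <b>happy</b>!</p>"
print(repr(html_strip(text, escaped_tags=["b"])))  # "\nI'm so <b>happy</b>!\n"
```

Tags named in escaped_tags pass through unchanged, while every other tag is still stripped and entities are still decoded.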


Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html#analysis-htmlstrip-charfilter

Original post: http://www.cnblogs.com/a-du/p/7278302.html
