HTML Strip Char Filter

Posted: 2017-08-03 11:25:07

The html_strip character filter strips HTML elements from the text and replaces HTML entities with their decoded value (e.g. replacing &amp; with &).
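
To make "decoded value" concrete, here is a small sketch using Python's standard `html.unescape`. It only illustrates entity decoding in general; it is not how Elasticsearch implements the filter, and the sample strings other than `I&apos;m` are made up for illustration.

```python
import html

# Entities and the characters they decode to. html.unescape is Python's
# standard decoder, used here purely as an illustration of entity decoding.
samples = ["&amp;", "I&apos;m", "&lt;b&gt;", "caf&eacute;"]
decoded = [html.unescape(s) for s in samples]

for raw, out in zip(samples, decoded):
    print(f"{raw!r} -> {out!r}")
```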

Example output

POST _analyze
{
  "tokenizer": "keyword",
  "char_filter": [ "html_strip" ],
  "text": "<p>I&apos;m so <b>happy</b>!</p>"
}

The keyword tokenizer returns a single term.

The above example returns the term:

[ \nI'm so happy!\n ]

The same example with the standard tokenizer would return the following terms:

[ I'm, so, happy ]
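
For readers who want to try the standard tokenizer variant from a script, the following sketch builds the equivalent `_analyze` request with Python's standard library. The helper name `build_analyze_request` and the base URL `http://localhost:9200` are assumptions for illustration; adjust them to your cluster. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_analyze_request(text, tokenizer="standard",
                          base_url="http://localhost:9200"):
    """Build (but do not send) a POST _analyze request.

    base_url is an assumed local cluster address, not taken from the docs.
    """
    body = {
        "tokenizer": tokenizer,
        "char_filter": ["html_strip"],
        "text": text,
    }
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_analyze_request("<p>I&apos;m so <b>happy</b>!</p>")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it against a running cluster:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Passing `tokenizer="keyword"` reproduces the first request in this section.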

Configuration

The html_strip character filter accepts the following parameter:

escaped_tags

An array of HTML tags which should not be stripped from the original text.

Example configuration

In this example, we configure the html_strip character filter to leave <b> tags in place:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["my_char_filter"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "html_strip",
          "escaped_tags": ["b"]
        }
      }
    }
  }
}

POST my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "<p>I&apos;m so <b>happy</b>!</p>"
}

The above example produces the following term:

[ \nI'm so <b>happy</b>!\n ]
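
The effect of escaped_tags can be approximated offline with a rough Python sketch built on the standard `html.parser` module. This is not Elasticsearch's implementation: the class `StripSketch`, the function `strip_html`, and the `BLOCK_TAGS` set are hypothetical, and the rule that block-level tags become a newline is an assumption inferred from the outputs shown in this section.

```python
from html.parser import HTMLParser

# Assumption: tags treated as block-level and replaced by "\n".
# This list is illustrative only, not the filter's real list.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "tr"}


class StripSketch(HTMLParser):
    """Rough approximation of html_strip with an escaped_tags option."""

    def __init__(self, escaped_tags=()):
        # convert_charrefs=True decodes entities before handle_data is called.
        super().__init__(convert_charrefs=True)
        self.escaped = {t.lower() for t in escaped_tags}
        self.parts = []

    def _emit_tag(self, tag, raw):
        if tag in self.escaped:
            self.parts.append(raw)      # keep the tag exactly as written
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")     # assumed block-level behaviour
        # any other tag is dropped

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, self.get_starttag_text())

    def handle_endtag(self, tag):
        self._emit_tag(tag, f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text, escaped_tags=()):
    parser = StripSketch(escaped_tags)
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


text = "<p>I&apos;m so <b>happy</b>!</p>"
print(repr(strip_html(text)))                       # '\nI\'m so happy!\n'
print(repr(strip_html(text, escaped_tags=["b"])))   # '\nI\'m so <b>happy</b>!\n'
```

The sketch ignores details such as script and style content, so treat it as a way to reason about expected terms rather than a substitute for running `_analyze`.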


Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html#analysis-htmlstrip-charfilter

Original post: http://www.cnblogs.com/a-du/p/7278302.html