
ES highlight test

Date: 2015-07-14 13:24:47


Lucene provides three highlighters, which Elasticsearch exposes as highlighter, fast-vector-highlighter, and postings-highlighter.

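  • highlighter: the basic, default highlighter. It has to re-analyze the text taken from _source at query time, so it is the slowest of the three, but it needs no extra index storage.
  • postings-highlighter: requires "index_options" : "offsets" in the field mapping. Its speed is in the middle, but when it is combined with a phrase query it highlights each term of the phrase separately.
  • fast-vector-highlighter: requires "term_vector" : "with_positions_offsets" in the field mapping. It is the fastest of the three but uses the most storage, a typical space-for-speed trade-off.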
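
The trade-off in the list above can be illustrated with a toy model. This is only a sketch, not how Lucene implements its highlighters: `analyze`, `wrap`, `build_offset_index` and the `<em>` tags are hypothetical names chosen for the example. One path re-analyzes the text at highlight time (like the plain highlighter), the other looks up character offsets that were stored at index time (like the postings and fast-vector highlighters).

```python
import re


def analyze(text):
    """Toy analyzer: lowercase word tokens with (start, end) character offsets."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def wrap(text, spans, pre="<em>", post="</em>"):
    """Wrap each (start, end) span in tags, right to left so offsets stay valid."""
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + pre + text[start:end] + post + text[end:]
    return text


def highlight_by_reanalysis(text, term):
    """Plain-highlighter style: tokenize the stored text again for every request."""
    spans = [(s, e) for tok, s, e in analyze(text) if tok == term]
    return wrap(text, spans)


def build_offset_index(text):
    """Index-time step: keep the offsets of every term (this costs extra storage)."""
    index = {}
    for tok, s, e in analyze(text):
        index.setdefault(tok, []).append((s, e))
    return index


def highlight_by_offsets(text, term, index):
    """Offset-based style: no re-analysis, just a lookup of the stored offsets."""
    return wrap(text, index.get(term, []))


doc = "In the above case, the content field will be highlighted"
offset_index = build_offset_index(doc)
reanalyzed = highlight_by_reanalysis(doc, "the")
from_offsets = highlight_by_offsets(doc, "the", offset_index)
print(reanalyzed)
print(reanalyzed == from_offsets)
```

Both paths produce the same marked-up text; the difference is where the work happens: at query time for the first, at index time (and on disk) for the second.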

Test

  • mapping

    curl -XPUT 'localhost:9200/hl-test' -d '{
      "settings": {
        "index": {
          "number_of_shards": 2,
          "number_of_replicas": 0
        }
      },
      "mappings": {
        "tm": {
          "properties": {
            "content1": {
              "type": "string",
              "analyzer": "default",
              "store": "yes",
              "term_vector": "with_positions_offsets"
            },
            "content2": {
              "type": "string",
              "analyzer": "default",
              "store": "yes",
              "index_options": "offsets"
            },
            "content3": {
              "type": "string",
              "store": "yes",
              "analyzer": "default"
            },
            "content4": {
              "type": "string",
              "store": "yes",
              "index": "not_analyzed"
            }
          }
        }
      }
    }'

    Note

    offsets
    Store docs, freqs, positions, and the start and end character offsets of each term in the original string. This information is used by the postings highlighter but is disabled by default.
    Source: https://www.elastic.co/guide/en/elasticsearch/guide/current/stopwords-phrases.html#index-options
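
    As a quick way to read the mapping above, the following sketch reports which highlighter each field is set up for. `pick_highlighter` is a hypothetical helper, not an Elasticsearch API, and the precedence it applies when a field carries both options (term vectors first) is an assumption made for this example.

```python
def pick_highlighter(field_mapping):
    """Guess the highlighter a field is prepared for, based on its mapping options.

    Assumption: term vectors take precedence over postings offsets.
    """
    if field_mapping.get("term_vector") == "with_positions_offsets":
        return "fvh"
    if field_mapping.get("index_options") == "offsets":
        return "postings"
    return "plain"


# The "properties" section of the hl-test mapping, copied as a Python dict.
properties = {
    "content1": {"type": "string", "analyzer": "default", "store": "yes",
                 "term_vector": "with_positions_offsets"},
    "content2": {"type": "string", "analyzer": "default", "store": "yes",
                 "index_options": "offsets"},
    "content3": {"type": "string", "store": "yes", "analyzer": "default"},
    "content4": {"type": "string", "store": "yes", "index": "not_analyzed"},
}

chosen = {name: pick_highlighter(mapping) for name, mapping in properties.items()}
for name, highlighter in chosen.items():
    print(name, "->", highlighter)
```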

  • Test data
    curl -XPUT 'http://localhost:9200/hl-test/tm/1' -d '{ "content1": "In the above case, the content field will be highlighted for each search hit (there will be another element in each search hit, called highlight, which includes the highlighted fields and the highlighted fragments)." }'

  • Query DSL
    {
      "query": {
        "term": {
          "content1": "the"
        }
      },
      "highlight": {
        "pre_tags": [
          "<tag1>"
        ],
        "post_tags": [
          "</tag1>"
        ],
        "fields": {
          "content1": {
            "type": "fvh",
            "fragment_size": 30,
            "number_of_fragments": 1,
            "force_source": true,
            "order": "score",
            "fragment_offset": 3,
            "no_match_size": 2
          },
          "content2": {
            "fragment_size": 250,
            "number_of_fragments": 0
          },
          "content3": {
            "fragment_size": 250,
            "number_of_fragments": 3,
            "force_source": true
          }
        }
      }
    }
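
    The query above could be sent from a script and the `highlight` element of each hit collected, roughly as follows. This is a minimal sketch under a few assumptions: the `_search` URL is built from the index and type names used in this test, `search` and `extract_highlights` are hypothetical helper names, only the content1 part of the query is reproduced, and the `sample` response is hand-written to show the shape of a hit, not output captured from a real cluster.

```python
import json
import urllib.request

# Reduced version of the query above: term query plus fvh highlighting on content1.
QUERY = {
    "query": {"term": {"content1": "the"}},
    "highlight": {
        "pre_tags": ["<tag1>"],
        "post_tags": ["</tag1>"],
        "fields": {
            "content1": {
                "type": "fvh",
                "fragment_size": 30,
                "number_of_fragments": 1,
            }
        },
    },
}


def search(body, url="http://localhost:9200/hl-test/tm/_search"):
    """POST the query body to the search endpoint (needs a running node)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_highlights(response):
    """Map each hit id to its highlight element (empty dict if there is none)."""
    return {hit["_id"]: hit.get("highlight", {})
            for hit in response.get("hits", {}).get("hits", [])}


# Hand-written placeholder response; the fragment text is illustrative only.
sample = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"content1": ["<tag1>the</tag1> above case"]}},
            {"_id": "2"},
        ]
    }
}
highlights = extract_highlights(sample)
print(highlights)
```

    With a node running, `extract_highlights(search(QUERY))` would return the real fragments for document 1.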





Original article: http://www.cnblogs.com/jasonbrooke/p/4645049.html
