Tags: usage, images, dev, dex, ref, reports, elastic, user, log
Demonstration:
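## Query nginxlog for documents with status code 200, and count how many of them fall within the last 5 minutes ("now-5m"; in Elasticsearch date math a lowercase m means minutes, while an uppercase M means months). Note: the time reference here is @timestamp, the time the data entered ES, not the time recorded in the log. The range sits inside a filter aggregation, so it only produces a count; the returned hits are selected by the match query alone.
GET /nginxlog-2019.11.22/_search
{
  "query": { "match": { "statuscode": "200" } },
  "aggs": {
    "@timestamp": {
      "filter": { "range": { "@timestamp": { "gte": "now-5m" } } }
    }
  },
  "size": 2,
  "from": 0,
  "_source": [ "type", "verb", "upstream_response_time", "statuscode" ]
}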
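Explanation:
① "_source" selects the fields to return; in SQL terms it is roughly select "type", "verb", "upstream_response_time", "statuscode" from nginxlog;
② "size" limits the number of results, like LIMIT in MySQL (only 2 records are returned here, for demonstration);
③ "from" is the position at which output starts, also like LIMIT in MySQL: LIMIT 0,2
Note: per the official documentation, from defines the offset into the matching data and size defines how many hits the current request returns (the SQL comparison above is a loose analogy to make the idea easier to follow).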
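To make the LIMIT analogy concrete, here is a minimal sketch that emulates from, size and _source on an in-memory list. `search_page` is a hypothetical helper written for this illustration, not an Elasticsearch API, and the third sample document is made up.

```python
def search_page(docs, source_fields, from_=0, size=10):
    """Emulate from / size / _source on a plain Python list."""
    page = docs[from_:from_ + size]  # like LIMIT from_, size
    # Keep only the requested fields, like "_source": [...]
    return [{f: d[f] for f in source_fields if f in d} for d in page]


# Sample documents (illustrative values only).
docs = [
    {"type": "log", "verb": "POST", "statuscode": 200, "upstream_response_time": 0.014},
    {"type": "log", "verb": "POST", "statuscode": 200, "upstream_response_time": 0.069},
    {"type": "log", "verb": "GET", "statuscode": 200, "upstream_response_time": 0.120},
]

first_page = search_page(docs, ["verb", "statuscode"], from_=0, size=2)
second_page = search_page(docs, ["verb", "statuscode"], from_=2, size=2)
print(first_page)
print(second_page)
```

The second call starts at offset 2, so it returns whatever is left after the first page.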
## Query result
{
  "took": 17,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1448153,
    "max_score": 1,
    "hits": [
      {
        "_index": "nginxlog-2019.11.22",
        "_type": "log",
        "_id": "AW6QrSYLcN5cQJSYWJEL",
        "_score": 1,
        "_source": { "statuscode": 200, "verb": "POST", "upstream_response_time": 0.014, "type": "log" }
      },
      {
        "_index": "nginxlog-2019.11.22",
        "_type": "log",
        "_id": "AW6QrSYLcN5cQJSYWJEM",
        "_score": 1,
        "_source": { "statuscode": 200, "verb": "POST", "upstream_response_time": 0.069, "type": "log" }
      }
    ]
  },
  "aggregations": { "@timestamp": { "doc_count": 0 } }
}
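As a small illustration of how a response of this shape can be read in Python, the sketch below pulls out the total, the `_source` rows and the aggregation count. The JSON is an abridged copy of the result above (one hit kept), and `summarize` is a hypothetical helper. It assumes `hits.total` is a plain integer, as in this response; newer Elasticsearch versions return an object there instead.

```python
import json

# Abridged copy of the response shown above (one hit kept for brevity).
response_text = """
{
  "took": 17,
  "timed_out": false,
  "hits": {
    "total": 1448153,
    "hits": [
      {"_index": "nginxlog-2019.11.22", "_id": "AW6QrSYLcN5cQJSYWJEL",
       "_source": {"statuscode": 200, "verb": "POST",
                   "upstream_response_time": 0.014, "type": "log"}}
    ]
  },
  "aggregations": {"@timestamp": {"doc_count": 0}}
}
"""


def summarize(response):
    """Return (total hits, list of _source dicts, aggregation doc_count)."""
    hits = response["hits"]
    rows = [hit["_source"] for hit in hits["hits"]]
    in_window = response["aggregations"]["@timestamp"]["doc_count"]
    return hits["total"], rows, in_window


total, rows, in_window = summarize(json.loads(response_text))
print(total, in_window, rows[0]["verb"])
```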
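## As above; the following examples are not explained in detail and only outline the usage. For the exact rules, refer to the official documentation.
Time range queries:
## Form ① (the range is a filter aggregation, so it returns a count of documents in the window; the hits themselves are not restricted by time)
GET /nginxlog-2019.11.22/_search
{
  "query": { "match": { "statuscode": "200" } },
  "aggs": {
    "@timestamp": {
      "filter": { "range": { "@timestamp": { "gte": "now-1m", "lt": "now" } } }
    }
  },
  "size": 100,
  "from": 0,
  "_source": [ "log_timestamp", "statuscode", "@timestamp" ]
}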
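The relative times used in these range clauses ("now-1m", "now") are date math expressions, and the unit letter is case-sensitive: m is minutes, M is months. As a rough illustration, the sketch below resolves simple expressions against a fixed clock. `resolve_date_math` is a hypothetical helper, not part of Elasticsearch or its Python client; it covers only fixed-length units and deliberately rejects the calendar units M and y.

```python
import re
from datetime import datetime, timedelta

# Fixed-length units only. "M" (months) and "y" (years) are calendar units
# and are rejected here rather than approximated.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "H": "hours",
          "d": "days", "w": "weeks"}


def resolve_date_math(expr, now):
    """Resolve 'now' or 'now<+|-><n><unit>' against the given `now`."""
    if expr == "now":
        return now
    match = re.fullmatch(r"now([+-])(\d+)([a-zA-Z])", expr)
    if not match:
        raise ValueError("unsupported expression: " + expr)
    sign, amount, unit = match.groups()
    if unit not in _UNITS:
        raise ValueError("unit %r is not a fixed-length unit" % unit)
    delta = timedelta(**{_UNITS[unit]: int(amount)})
    return now + delta if sign == "+" else now - delta


now = datetime(2019, 11, 22, 12, 0, 0)
five_minutes_ago = resolve_date_math("now-5m", now)
print(five_minutes_ago)
```

Passing "now-5M" to this helper raises a ValueError, which is a cheap way to catch the minutes/months mix-up before a query is sent.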
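## Form ② (a query object accepts only one top-level clause, so the range and the match are combined in a bool query; the range goes in filter context, which does not affect scoring)
GET /nginxlog-2019.11.22/_search
{
  "query": {
    "bool": {
      "must": { "match": { "statuscode": "200" } },
      "filter": { "range": { "@timestamp": { "gte": "now-1m", "lt": "now" } } }
    }
  },
  "size": 100,
  "_source": [ "first_name", "last_name", "log_timestamp", "@timestamp", "statuscode" ]
}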
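Notes on the range operators (the range query accepts only these four):
gt greater than
gte greater than or equal to
lt less than
lte less than or equal to
The abbreviations eq, ne/neq, ge and le are common in other tools but are not range parameters. For equality use a term or match query; for inequality wrap one in bool must_not.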
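To show what the comparison keys mean in practice, here is a minimal sketch that checks a value against a set of bounds written the same way as a range clause. It assumes, following the Elasticsearch reference, that gt, gte, lt and lte are the only comparison keys a range clause accepts. `in_range` is a hypothetical helper for illustration only.

```python
import operator

# Assumption: these four are the only comparison keys valid in a range clause.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """Return True if `value` satisfies every bound, e.g. {"gte": 200, "lt": 300}."""
    for name, bound in bounds.items():
        if name not in RANGE_OPS:
            raise ValueError("not a range operator: " + name)
        if not RANGE_OPS[name](value, bound):
            return False
    return True


ok = in_range(200, {"gte": 200, "lt": 300})
too_big = in_range(300, {"gte": 200, "lt": 300})
print(ok, too_big)
```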
## Form ③, compound query (the must clauses are given as one array; a JSON object should not repeat the "must" key)
GET /nginxlog-2019.01.24/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "statuscode": "200" } },
        { "range": { "@timestamp": { "gte": "now-1m", "lt": "now" } } }
      ]
    }
  },
  "size": 100,
  "_source": [ "first_name", "last_name", "log_timestamp", "@timestamp", "statuscode" ]
}
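When the same compound query is needed from Python, it can help to build the body as a dict and keep the must clauses in a list. The sketch below is one way to do that; `build_status_query` and its parameter names are hypothetical and chosen for this example.

```python
import json


def build_status_query(status, since="now-1m", until="now", size=100, fields=None):
    """Build a search body: statuscode matches AND @timestamp is in [since, until)."""
    body = {
        "query": {
            "bool": {
                "must": [
                    {"match": {"statuscode": status}},
                    {"range": {"@timestamp": {"gte": since, "lt": until}}},
                ]
            }
        },
        "size": size,
    }
    if fields:
        body["_source"] = list(fields)
    return body


body = build_status_query("200", fields=["log_timestamp", "@timestamp", "statuscode"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after a GET .../_search line, or the dict can be handed to a client as the request body.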
## With the query forms above in hand, we can now write the Python code.
# -*- coding: utf-8 -*-
"""
Created on Thu Jun 7 10:08:02 2018
@author: illusioned
"""
from elasticsearch import Elasticsearch
from elasticsearch import helpers


def get_value():
    # Note the timeout: without setting it explicitly, the query timed out every time.
    es = Elasticsearch(["192.168.10.121:9200"], timeout=99999)
    # Notice how similar the query body is to the ones above.
    query = {
        "query": {
            "bool": {
                "must": [
                    {"match": {"action": "visit"}},
                    {"match": {"survey_id": "471101"}},
                    # match analyzes its input and does not treat * as a wildcard;
                    # with the default analyzer this matches on the token "http".
                    {"match": {"ref": "http*"}}
                ]
            }
        },
        "size": 100,
        "_source": ["@timestamp", "action", "survey_id", "ref", "user_id"]
    }
    # Doesn't this look a lot like SQL?
    #   select "@timestamp","action","survey_id","ref","user_id" from jsonlog-2019.11.*
    #   where action="visit" and survey_id="471101" and ref="http*"
    scan_resp = helpers.scan(client=es, query=query, scroll="3m",
                             index='jsonlog-2019.11.*', doc_type='log', timeout="10m")
    for k in scan_resp:
        yield k


def write_file(k):
    # Write out the fields to export; mind the encoding.
    fields = ['@timestamp', 'action', 'survey_id', 'ref', 'user_id']
    source = dict(k)['_source']
    line = ','.join(str(source[name]) for name in fields) + '\n'
    with open('China_news.csv', 'ab') as f:
        f.write(line.encode('gbk'))


if __name__ == "__main__":
    for index, k in enumerate(get_value(), 1):
        write_file(k)
        print('Exporting record ' + str(index))
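One caveat with joining fields by hand is that a value containing a comma (a URL in ref, for instance) shifts the columns. As a possible alternative, the sketch below uses the standard csv module, which quotes such values. `write_hits` is a hypothetical helper, the sample hit is made up, and an in-memory buffer stands in for the output file so the example runs without a cluster.

```python
import csv
import io

FIELDS = ["@timestamp", "action", "survey_id", "ref", "user_id"]


def write_hits(hits, stream):
    """Write the _source of each hit as one CSV row; return the row count."""
    writer = csv.writer(stream)
    count = 0
    for hit in hits:
        source = hit["_source"]
        # Missing fields become empty cells instead of raising KeyError.
        writer.writerow([source.get(field, "") for field in FIELDS])
        count += 1
    return count


# Made-up sample hit, shaped like the documents yielded by helpers.scan.
sample_hits = [{"_source": {"@timestamp": "2019-11-22T08:00:00.000Z",
                            "action": "visit", "survey_id": "471101",
                            "ref": "http://example.com/?a=1,2",
                            "user_id": "42"}}]
buffer = io.StringIO()
written = write_hits(sample_hits, buffer)
print(buffer.getvalue())
```

For a real export, the buffer could be replaced by a file opened once with `open('China_news.csv', 'a', newline='', encoding='gbk')`, which also avoids reopening the file for every record.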
=====================================================
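Here is a rather interesting thing you can do with Elasticsearch.
People familiar with databases will know that a SQL UPDATE statement takes a condition, and that some databases (MySQL, for instance) reject an UPDATE whose subquery selects from the table being updated.
Elasticsearch, however, really does let you select and update the same documents in one request, using update by query as shown below.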
POST /zipkin:span-2019-03-05/span/_update_by_query?conflicts=proceed
{
  "script": {
    "source": "ctx._source['parentId'] = 'bf98782dbd7e9304';"
  },
  "query": {
    "term": { "parentId": "352006966a095911" }
  }
}
## Finds the documents whose "parentId" is "352006966a095911" and updates that field to "bf98782dbd7e9304".
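Hand-written request bodies like this one are easy to break with a stray comma or an unquoted value. As a sketch of one way around that, the body can be built as a Python dict and serialized with json. `build_update_by_query` is a hypothetical helper; passing the field and value through script params rather than embedding them in the script source is a design choice of this example, and it assumes the default Painless scripting language.

```python
import json


def build_update_by_query(field, old_value, new_value):
    """Build an _update_by_query body that rewrites `field` from old_value to new_value."""
    return {
        "script": {
            "lang": "painless",
            # The values travel in params, so no quoting is needed inside the script.
            "source": "ctx._source[params.field] = params.value",
            "params": {"field": field, "value": new_value},
        },
        "query": {"term": {field: old_value}},
    }


body = build_update_by_query("parentId", "352006966a095911", "bf98782dbd7e9304")
payload = json.dumps(body)
print(payload)
```

The resulting string is valid JSON by construction and can be sent as the body of the POST request above.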
Exporting Elasticsearch data to CSV with Python
Original article: https://www.cnblogs.com/illusioned/p/11915063.html