Log analysis implementation (string splitting)

Posted: 2017-11-09 22:32:42

Tags: logs


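Approach

    Parse the log line without regular expressions:
        Split the string on whitespace
        Give special handling to content enclosed in [] or ""
        Convert each field to its appropriate type
        Simplify the code and check its efficiency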
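To see why the bracketed and quoted content needs special handling, it may help to look at what a plain whitespace split does to the sample line. The short sketch below only counts fragments and lists the words that open or close a group; the names `words`, `openers`, and `closers` are illustrative and not part of the original code.

```python
# Sample log line from the post, written as adjacent string literals.
logline = (
    '183.60.212.153 - - [19/Feb/2013:10:23:29 +0800] '
    '"GET /o2o/media.html?menu=3 HTTP/1.1" 200 16691 "-" '
    '"Mozilla/5.0 (compatible; EasouSpider; '
    '+http://www.easou.com/search/spider.html)"'
)

# A plain split cuts every space, including those inside [...] and "...".
words = logline.split()
print(len(words))  # more fragments than the line has logical fields

# Words that begin a group and words that end one; a word such as "-"
# does both, so it appears in each list.
openers = [w for w in words if w.startswith(('[', '"'))]
closers = [w for w in words if w.endswith((']', '"'))]
print(openers)
print(closers)
```

Every opener has to be merged with the words that follow it, up to and including the next closer, before the fields can be converted.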

import datetime

# Target log line
logline = '''183.60.212.153 - - [19/Feb/2013:10:23:29 +0800] \
"GET /o2o/media.html?menu=3 HTTP/1.1" 200 16691 "-" \
"Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)"'''

# A plain split() breaks the bracketed and quoted fields apart:
clean_log = logline.split()  # list
# ['183.60.212.153', '-', '-', '[19/Feb/2013:10:23:29', '+0800]',
#  '"GET', '/o2o/media.html?menu=3', 'HTTP/1.1"', '200', '16691',
#  '"-"', '"Mozilla/5.0', '(compatible;', 'EasouSpider;',
#  '+http://www.easou.com/search/spider.html)"']

# Convert the timestamp to a timezone-aware datetime
def convert_time(time: str):
    return datetime.datetime.strptime(time, '%d/%b/%Y:%H:%M:%S %z')

# Split the request string into three parts
def convert_request(request: str):
    return dict(zip(('method', 'url', 'protocol'), request.split()))

# Field names, in log order ('' marks a field that is not named)
names = [
    'remote', '', '', 'time',
    'request', 'status', 'size', '',
    'useragent'
]

# Conversion function for each field (None leaves the string unchanged)
operations = [
    None, None, None, convert_time,
    convert_request, int, int, None,
    None
]

# Split the line into fields, merging words enclosed in [] or ""
def log_clean(line: str, ret=None):
    if ret is None:
        ret = []  # create a fresh list for each call
    tmp = ''
    flag = False  # True while inside an open [...] or "..." group
    for word in line.split():
        if word.startswith(('[', '"')):
            tmp = word.strip('[]"')
            if word.endswith(('"', ']')):
                # Group opens and closes in one word, e.g. "-"
                ret.append(tmp)
                flag = False
            else:
                flag = True
        elif flag:
            tmp += ' ' + word
            if word.endswith(('"', ']')):
                ret.append(tmp.strip('"]'))
                flag = False
        else:
            ret.append(word)
    return ret

# Walk the cleaned fields and store each one, converted where needed, in a new dict.
# Unnamed fields all share the '' key, so only the last of them survives.
ret_d = {}
ret = log_clean(logline)
for i, field in enumerate(ret):
    key = names[i]
    if operations[i]:
        ret_d[key] = operations[i](field)
    else:
        ret_d[key] = field

print(ret_d)
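The approach lists a final step, simplifying the code and checking its efficiency, that the listing does not show. The following is one possible sketch of that step, not the author's version: `FIELDS`, `split_fields`, and `parse` are hypothetical names, skipping unnamed fields is an assumption, and the timing uses the standard `timeit` module with an arbitrary repeat count.

```python
import datetime
import timeit

LOGLINE = (
    '183.60.212.153 - - [19/Feb/2013:10:23:29 +0800] '
    '"GET /o2o/media.html?menu=3 HTTP/1.1" 200 16691 "-" '
    '"Mozilla/5.0 (compatible; EasouSpider; '
    '+http://www.easou.com/search/spider.html)"'
)

def convert_time(text):
    return datetime.datetime.strptime(text, '%d/%b/%Y:%H:%M:%S %z')

def convert_request(text):
    return dict(zip(('method', 'url', 'protocol'), text.split()))

# (name, converter) pairs in log order; an empty name means "skip this field".
FIELDS = (
    ('remote', None), ('', None), ('', None), ('time', convert_time),
    ('request', convert_request), ('status', int), ('size', int),
    ('', None), ('useragent', None),
)

def split_fields(line):
    """Yield fields, merging the words of each [...] or "..." group."""
    buf = []  # words of the group currently being collected
    for word in line.split():
        if buf or word[0] in '["':
            buf.append(word)
            if word[-1] in ']"':  # group is complete
                yield ' '.join(buf).strip('[]"')
                buf = []
        else:
            yield word

def parse(line):
    """Return a dict of named, converted fields for one log line."""
    record = {}
    for (name, convert), field in zip(FIELDS, split_fields(line)):
        if name:
            record[name] = convert(field) if convert else field
    return record

print(parse(LOGLINE))

# Rough efficiency check: total seconds for 10000 parses of the same line.
elapsed = timeit.timeit(lambda: parse(LOGLINE), number=10000)
print(f'{elapsed:.3f} s for 10000 parses')
```

Pairing each name with its converter in one tuple keeps the two lists from drifting out of step, and the generator avoids the shared `ret` list altogether. The measured time will vary by machine, so it is only useful for comparing variants on the same system.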

This article comes from the "12064120" blog; please retain this source link: http://12074120.blog.51cto.com/12064120/1980427
