Python Log Analysis and Regular Expressions

Posted: 2017-08-17 10:34:08

Programmers often have to analyze logs, and regular expressions are an essential tool for the job.

“Line 622: 01-01 09:04:16.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf] update_freeze_data: dabc9:bl_level=1740”
“app_log_cat new log begin”
“Line 627: 01-01 09:04:17.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf] update_freeze_data: dabc:bl_level=1720”

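Take the log above as an example: we need to find the time of each log entry and extract the data that follows a given format. This breaks down into three tasks:

Matching. Identify the real log entries; the second line above is not one.
Splitting (split). Split each entry on whitespace to find the log time.
Filtering. Find the text that matches the expected format and extract the numbers 1740 and 1720.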

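For the matching task, we need the lines that start with 'Line'. This uses the match() function of the re module, which only matches at the beginning of a string.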

import re

# Sample input: two real log entries and one line that is not a log entry.
strrs = list()
strrs.append("Line 622: 01-01 09:04:16.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf]           update_freeze_data: dabc9:bl_level=1740")
strrs.append("app_log_cat new log begin")
strrs.append("Line 627: 01-01 09:04:17.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf]           update_freeze_data: dabc:bl_level=1720")
regex = r'Line'
for strr in strrs:
    # re.match() succeeds only if the pattern matches at the start of the string.
    str_search = re.match(regex, strr)
    if str_search:
        print(True)
    else:
        print(False)

The matching results are:

True
False
True
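The choice of function matters here. As a minimal sketch, the snippet below compares re.match(), which anchors at the start of the string, with re.search(), which scans the whole string. The first sample line is a shortened version of the log entry above, and the second line is a hypothetical one made up for illustration: it contains 'Line' but does not start with it.

```python
import re

lines = [
    # Shortened version of the first log entry above.
    "Line 622: 01-01 09:04:16.727 <6> update_freeze_data: dabc9:bl_level=1740",
    # Hypothetical line: contains 'Line' but does not start with it.
    "app_log_cat new Line begin",
]

pattern = re.compile(r"Line")

# match() only looks at the beginning of each string.
match_results = [bool(pattern.match(line)) for line in lines]
# search() finds the pattern anywhere in the string.
search_results = [bool(pattern.search(line)) for line in lines]

print(match_results)   # [True, False]
print(search_results)  # [True, True]
```

With search(), the hypothetical second line would be mistaken for a log entry, so match() (or search() with a pattern anchored by ^) is the safer choice for this task.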

For the splitting task, we need to find the log time. The log above uses whitespace as the field separator.

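import re

# Adjacent string literals are concatenated into a single log line.
strr = ('Line 622: 01-01 09:04:16.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf]'
        ' update_freeze_data: dabc9:bl_level=1740')
regex = r'\s'  # any single whitespace character
str_split = re.split(regex, strr)
print(str_split)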

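The result of the split is a list; the time is simply the corresponding elements (here, the date at index 2 and the time at index 3).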

['Line', '622:', '01-01', '09:04:16.727', '<6>', '[pid:14399,', 'cpu1', 'dabc_pwym_task]', '[histbf]', 'update_freeze_data:', 'dabc9:bl_level=1740']
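One caveat: when a line contains runs of several spaces, as the sample strings in the matching example do, splitting on a single \s yields empty strings in the result. The following sketch assumes that splitting on \s+ (one or more whitespace characters) is acceptable, and then picks the date and time by index. The variable names are illustrative only.

```python
import re

# Log entry with a run of spaces between '[histbf]' and 'update_freeze_data:'.
line = ("Line 622: 01-01 09:04:16.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf]"
        "           update_freeze_data: dabc9:bl_level=1740")

# Splitting on one whitespace character leaves empty strings for each extra space.
fields_single = re.split(r"\s", line)

# Splitting on one or more whitespace characters collapses the run.
fields = re.split(r"\s+", line)

# The date is the third field and the time is the fourth.
timestamp = f"{fields[2]} {fields[3]}"
print(timestamp)  # 01-01 09:04:16.727
```

For plain whitespace splitting, the built-in str.split() with no arguments behaves much like the \s+ version and avoids the regular expression altogether.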

For the filtering task, we need to extract the value at the end of each entry.

import re

strr = """Line 622: 01-01 09:04:16.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf]
          update_freeze_data: dabc9:bl_level=1740
          Line 627: 01-01 09:04:17.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf]
          update_freeze_data: dabc:bl_level=1720"""
# 'dabc', an optional digit, then ':bl_level='; the group captures the number.
regex = r'dabc\d?:bl_level=(\d+)'
str_select = re.findall(regex, strr)
print(str_select)

The filtered result is a list:

['1740', '1720']
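The three steps can also be folded into a single pass. The sketch below is one possible way to do it, not the method used above: it assumes a combined pattern with named groups (the names time and level, the pattern LOG_RE, and the helper parse_log are all hypothetical) that anchors on the 'Line' prefix, captures the timestamp, and captures the bl_level value.

```python
import re

# Hypothetical combined pattern: entry prefix, timestamp, then the bl_level value.
LOG_RE = re.compile(
    r"^Line \d+: (?P<time>\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r".*dabc\d?:bl_level=(?P<level>\d+)"
)

def parse_log(lines):
    """Return (time, level) pairs for lines that look like real log entries."""
    results = []
    for line in lines:
        m = LOG_RE.match(line)
        if m:  # lines such as "app_log_cat new log begin" are skipped
            results.append((m.group("time"), int(m.group("level"))))
    return results

sample = [
    "Line 622: 01-01 09:04:16.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf] update_freeze_data: dabc9:bl_level=1740",
    "app_log_cat new log begin",
    "Line 627: 01-01 09:04:17.727 <6> [pid:14399, cpu1 dabc_pwym_task] [histbf] update_freeze_data: dabc:bl_level=1720",
]

parsed = parse_log(sample)
print(parsed)  # [('01-01 09:04:16.727', 1740), ('01-01 09:04:17.727', 1720)]
```

A single pattern keeps each timestamp paired with its value, which the separate split and findall steps do not do on their own. The trade-off is a longer, less readable regular expression.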


Original article: http://www.cnblogs.com/wangjingchn/p/7376684.html
