re库、正则表达式基本使用

时间：2018-03-03 14:09:46 阅读：153 评论：0 收藏：0 [点我收藏+]

标签：字符集正则表达式标准库 print 贪婪模式字符 [] image col

re库是python的标准库，主要用于字符串匹配。
Re库主要功能函数

技术分享图片

re.search()函数

技术分享图片

re.match()函数

正则表达式

1.特殊字符

^h表示以h开头，.表示任意字符，*表示任意多次

import re
line = ‘hello 123‘
# ^h表示以h开头，.表示任意字符，*表示任意多次
re_str = ‘^h.*‘
if re.match(re_str, line):
    print(‘匹配成功‘)  # 输出：匹配成功

$表示结尾字符

import re
line = ‘hello 123‘
re_str = ‘.*3$‘ # 前面可为任意多个任意字符，但结尾必须是3
if re.match(re_str, line):
    print(‘匹配成功‘)  # 输出：匹配成功

?表示非贪婪模式

import re
line = ‘heeeello123‘
re_str = ‘.*?(h.*?l).*‘  # 只要()中的子串
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) # 输出：heeeel
                              # 如果去掉?,则输出：heeeell

+表示至少出现一次

import re
line = ‘heeeello123‘
re_str = ‘.*(h.+?l).*‘
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) #输出：heeeel

{2}表示前面字符出现2次

import re
line = ‘heeeello123‘
re_str = ‘.*?(e.{2}?l).*‘ # 匹配的是e+任意2个字符+l
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) # 输出：eeel

| 表示或

import re
line = ‘hello123‘
re_str = ‘((hello|heeello)123)‘
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) # 输出：python123

[]表示对单个字符给出取值范围

import re
line = ‘hello123‘
re_str = "([jhk]ello123)"  # [jhk]表示jhk中的任一个都可以
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) # 输出：hello123

[^]表示非字符集

import re
line = ‘hello123‘
re_str = "([^j]ello123)" # [^j]表示不是j的都行
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) # 输出：hello123

\s表示空格 \S表示非空格

import re
line = ‘hello123 好‘ #字符串有空格
re_str = "(hello123\s好)"  # 匹配上空格
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) #输出：hello123 好

[\u4E00-\u9FA5]表示汉字

import re
line = ‘hello 北京大学‘
re_str = ".*?([\u4E00-\u9FA5]+大学)"
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1)) # 输出：北京大学

小例子提取出生日期

import re
line = ‘xxx出生于2000年6月1日‘
line = ‘xxx出生于2000/6/1‘
line = ‘xxx出生于2000-6-1‘
line = ‘xxx出生于2000-06-01‘
line = ‘xxx出生于2000-06‘
re_str = ".*出生于(\d{4}[年/-]\d{1,2}([月/-]|\d{1,2}|$))"
match_obj = re.match(re_str, line)
if match_obj:
    print(match_obj.group(1))

re库、正则表达式基本使用

标签：字符集正则表达式标准库 print 贪婪模式字符 [] image col

原文地址：https://www.cnblogs.com/zhang-anan/p/8496139.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行

re库、正则表达式基本使用

正则表达式

1.特殊字符

小例子 提取出生日期

小例子提取出生日期