码迷,mamicode.com

Regular Expressions in Python: Non-Capturing Groups and Alternation

Date: 2014-05-01 10:05:33


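Non-capturing groups

When you need to treat part of a pattern as a single unit and apply an operation to it, such as specifying how many times it repeats, wrap that part of the pattern in

(?:...)

where ... stands for the sub-pattern. Unlike plain parentheses, this groups the sub-pattern without capturing it.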
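As a minimal sketch of the repetition case described above (the sample strings and variable names are made up for illustration), compare a quantifier applied to a non-capturing group with the same quantifier applied without any group:

```python
import re

# (?:ab) groups "ab" as one unit, so {3} repeats the whole unit.
grouped = re.compile(r"(?:ab){3}")

# Without the group, {3} applies only to the preceding "b".
ungrouped = re.compile(r"ab{3}")

print(bool(grouped.fullmatch("ababab")))   # True
print(bool(grouped.fullmatch("abbb")))     # False
print(bool(ungrouped.fullmatch("abbb")))   # True

# The group does not capture anything, so groups() is empty.
print(grouped.search("ababab").groups())   # ()
```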

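Alternation

Alternation is one of the most commonly used constructs in regular expressions.

It applies whenever a match should satisfy either condition A or condition B.

The alternation operator is

       |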
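A small sketch of the operator on its own, using a made-up sample sentence: each side of | is a complete alternative, and the pattern matches wherever either one matches.

```python
import re

# "cat|dog" matches either the literal "cat" or the literal "dog".
pet = re.compile(r"cat|dog")

found = pet.findall("a cat, a dog and a bird")
print(found)  # ['cat', 'dog']
```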

 

 

Code example (the original code screenshot was not preserved):

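Unexpectedly, the alternation splits the whole pattern into two parts:

I have a dog  and  cat    rather than    I have a dog  and  I have a cat

The | operator has the lowest precedence, so everything to its left and everything to its right each form a complete alternative.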
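The original code is not available, so the following is only a sketch that reproduces the behavior described, assuming the pattern was along the lines of I have a dog|cat and using a hypothetical test string:

```python
import re

# Hypothetical test string; the article's original input is not preserved.
text = "I have a dog. I have a cat."

# | has the lowest precedence, so this reads as: "I have a dog" OR "cat".
matches = re.findall(r"I have a dog|cat", text)
print(matches)  # ['I have a dog', 'cat']
```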

What if we only want to choose between dog and cat? How should the pattern be written? Let me try putting parentheses around the two alternatives.

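Still not right: the leading "I have a " is missing from the result. This is the behavior of re.findall when the pattern contains a capturing group: it returns the group's contents rather than the whole match.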
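A sketch of what likely happened, assuming re.findall was used with a capturing group around the alternatives (the test string is hypothetical). The prefix is in fact matched, as re.finditer shows, but re.findall reports only the captured group:

```python
import re

# Hypothetical test string; the article's original input is not preserved.
text = "I have a dog. I have a cat."

# With one capturing group, findall returns only the group's contents.
captured = re.findall(r"I have a (dog|cat)", text)
print(captured)  # ['dog', 'cat']

# The full match still includes the prefix; group(0) is the whole match.
whole = [m.group(0) for m in re.finditer(r"I have a (dog|cat)", text)]
print(whole)  # ['I have a dog', 'I have a cat']
```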

The correct approach is to wrap the alternatives in a non-capturing group, (?:...).
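A minimal sketch of the non-capturing version, again with a hypothetical test string. Because nothing is captured, re.findall returns the whole match:

```python
import re

# Hypothetical test string; the article's original input is not preserved.
text = "I have a dog. I have a cat."

# (?:dog|cat) limits the alternation to the two words without capturing them.
full_matches = re.findall(r"I have a (?:dog|cat)", text)
print(full_matches)  # ['I have a dog', 'I have a cat']
```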

 

 

Now a slightly more advanced example:
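import re

import requests

url = "https://docs.djangoproject.com/en/1.6/intro/tutorial02/"
res = requests.get(url)
html = res.text  # decoded text, so the str pattern below also works on Python 3

# Extract the values that follow href=" and src=" in the HTML.
# The two lookbehinds locate the attribute, the optional non-capturing group
# allows an http/https/ftp/ftps scheme, and the lazy .*? stops before the closing quote.
lines = re.findall(r'(?:(?<=href=")|(?<=src="))(?:(?:ht|f)tps?)?.*?(?=")', html, re.S)

for line in lines:
    print(line)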
The matches are as follows:
root@kali:/recall/code# python re.py

/s/css/base.de56b042ddc0.css
/s/css/print.ac134bbb8dfc.css
/s/css/docs/docs.0047ef2d621d.css
/s/css/pygments.0d57d48be058.css
https://www.djangoproject.com/
/s/img/site/hdr_logo.b19c5e60269d.gif
https://www.djangoproject.com/
https://www.djangoproject.com/download/
https://docs.djangoproject.com/en/1.6/
https://www.djangoproject.com/weblog/
https://www.djangoproject.com/community/
https://code.djangoproject.com/
/en/1.6/
/en/1.2/intro/tutorial02/
/en/1.3/intro/tutorial02/
/en/1.4/intro/tutorial02/
/en/1.5/intro/tutorial02/
/en/1.7/intro/tutorial02/
/en/dev/intro/tutorial02/
#writing-your-first-django-app-part-2
../tutorial01/
#start-the-development-server
http://127.0.0.1:8000/admin/
../../_images/admin01.png
../../topics/i18n/translation/
#enter-the-admin-site
../../topics/auth/default/#topics-auth-creating-superusers
../../_images/admin02t.png
../../topics/auth/#module-django.contrib.auth
#make-the-poll-app-modifiable-in-the-admin
#explore-the-free-admin-functionality
../../_images/admin03t.png
../../_images/admin04t.png
../../_images/admin05t.png
../../ref/models/fields/#django.db.models.DateTimeField
../../ref/models/fields/#django.db.models.CharField
../../ref/models/fields/#django.db.models.DateTimeField
../../ref/settings/#std:setting-TIME_ZONE
../../_images/admin06t.png
#customize-the-admin-form
../../_images/admin07.png
../../_images/admin08t.png
../../_images/admin09.png
#adding-related-objects
../../_images/admin10.png
../../ref/models/fields/#django.db.models.ForeignKey
../../_images/admin11t.png
../../_images/admin15t.png
../../_images/admin12t.png
#customize-the-admin-change-list
../../_images/admin04t.png
../../_images/admin13t.png
../../_images/admin14t.png
../../ref/models/fields/#django.db.models.DateTimeField
#customize-the-admin-look-and-feel
#customizing-your-project-s-templates
../../ref/settings/#std:setting-TEMPLATE_DIRS
../../ref/settings/#std:setting-TEMPLATE_DIRS
#customizing-your-application-s-templates
../../ref/settings/#std:setting-TEMPLATE_DIRS
../../ref/templates/api/#template-loaders
#customize-the-admin-index-page
../../ref/settings/#std:setting-INSTALLED_APPS
../tutorial03/
../tutorial01/
../tutorial03/
/en/1.6/faq/
http://groups.google.com/group/django-users/
http://groups.google.com/group/django-users/
irc://irc.freenode.net/
http://django-irc-logs.com/
https://code.djangoproject.com/newticket?component=Documentation
#
#start-the-development-server
#enter-the-admin-site
#make-the-poll-app-modifiable-in-the-admin
#explore-the-free-admin-functionality
#customize-the-admin-form
#adding-related-objects
#customize-the-admin-change-list
#customize-the-admin-look-and-feel
#customizing-your-project-s-templates
#customizing-your-application-s-templates
#customize-the-admin-index-page
../tutorial01/
../tutorial03/
/en/1.6/contents/
/en/1.6/genindex/
/en/1.6/py-modindex/
/en/1.6/
../
/m/docs/django-docs-1.6-en.zip
http://media.readthedocs.org/pdf/django/1.6.x/django.pdf
http://media.readthedocs.org/epub/django/1.6.x/django.epub
http://readthedocs.org/
https://www.djangoproject.com/foundation/
https://www.djangoproject.com/trademarks/
http://mediatemple.net/
//ajax.googleapis.com/ajax/libs/jquery/1.10.1/jquery.min.js
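To try the same pattern without a network request, here is a sketch that runs it against a small, made-up HTML snippet (sample_html and its URLs are hypothetical, not taken from the Django page):

```python
import re

# Hypothetical HTML snippet standing in for the downloaded page.
sample_html = (
    '<link href="/s/css/base.css">'
    '<a href="https://example.com/">x</a>'
    '<img src="../img/a.png">'
)

# Same pattern as in the article: two lookbehinds combined with | inside a
# non-capturing group, an optional scheme, then a lazy match up to the quote.
pattern = re.compile(r'(?:(?<=href=")|(?<=src="))(?:(?:ht|f)tps?)?.*?(?=")', re.S)

links = pattern.findall(sample_html)
print(links)  # ['/s/css/base.css', 'https://example.com/', '../img/a.png']
```

The two lookbehinds are kept separate because Python's re module requires each lookbehind to have a fixed width, and href=" and src=" differ in length.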

 
Feedback and criticism are welcome.

 


Original article: http://www.cnblogs.com/tk091/p/3702307.html
