
Python strip_tags with support for keeping specified tags

Date: 2015-05-06 17:03:44

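# coding: utf-8

import re


def strip_tags(string, allowed_tags=''):
    """Remove HTML tags from string, keeping those named in allowed_tags.

    allowed_tags is a comma-separated list of tag names, e.g. 'p,br'.
    """
    if allowed_tags != '':
        # Get a list of all allowed tag names, ignoring spaces and empty entries.
        allowed_tags = [
            name.strip() for name in allowed_tags.split(',') if name.strip()
        ]
        # One pattern per allowed name. The \b keeps 'b' from also matching
        # <br> or <body>; re.I makes <P> match an allowed 'p'.
        allowed_tags_pattern = [
            re.compile(r'</?' + re.escape(name) + r'\b[^>]*>', re.I)
            for name in allowed_tags
        ]
        all_tags = re.findall(r'<[^>]+>', string)
        # Collect every tag that matches none of the allowed patterns.
        not_allowed_tags = [
            tag for tag in all_tags
            if not any(pattern.match(tag) for pattern in allowed_tags_pattern)
        ]
        for not_allowed_tag in not_allowed_tags:
            # The tag is a literal string, so no regex is needed to remove it.
            string = string.replace(not_allowed_tag, '')
    else:
        # If no allowed tags, remove all.
        string = re.sub(r'<[^>]*?>', '', string)

    return string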
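
The same idea can also be sketched as a single `re.sub` pass with a callback, instead of collecting the disallowed tags first and removing them one by one. This is only an illustrative variant, not the original author's code: the names `strip_tags_single_pass`, `TAG_RE` and `NAME_RE` are hypothetical, and it assumes tag names should be compared exactly and case-insensitively.

```python
import re

# Matches anything that looks like a tag: '<', one or more non-'>' chars, '>'.
TAG_RE = re.compile(r'<[^>]+>')
# Extracts the tag name from an opening or closing tag, e.g. 'p' from '</p>'.
NAME_RE = re.compile(r'</?\s*([a-zA-Z][a-zA-Z0-9]*)')


def strip_tags_single_pass(string, allowed_tags=''):
    """Hypothetical variant: drop every tag whose name is not in allowed_tags."""
    # Assumption: allowed_tags is a comma-separated string such as 'p,br'.
    allowed = {
        name.strip().lower() for name in allowed_tags.split(',') if name.strip()
    }

    def replace(match):
        tag = match.group(0)
        name = NAME_RE.match(tag)
        # Keep the tag unchanged only when its exact name is allowed.
        if name and name.group(1).lower() in allowed:
            return tag
        return ''

    return TAG_RE.sub(replace, string)


html = '<div><p>Hello <b>world</b><br/>bye</p></div>'
print(strip_tags_single_pass(html, 'p,br'))  # <p>Hello world<br/>bye</p>
print(strip_tags_single_pass(html))          # Hello worldbye
```

Because the name is compared as a whole word against a set, allowing `b` does not keep `<br>` or `<body>`. Like the function above, this is regex-based and is not a substitute for a real HTML parser when the input is untrusted.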

  


Original article: http://www.cnblogs.com/bushe/p/4482114.html
