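import re
import urllib2 as url

# Page to scrape, plus a browser-like User-Agent header
urls = 'http://www.php100.com/html/it/'
headers = {'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686)Gecko/20071127 Firefox/2.0.0.11'}

# Pass the Request object (not the bare URL) so the custom headers are actually sent
req = url.Request(urls, headers=headers)
htm = url.urlopen(req)
content = htm.read()

# Non-greedy patterns, so each match stops at the first closing tag
res = r'<h2>(.*?)</h2>'
r = re.compile(res, re.I)
res_title = r'<title>(.*?)</title>'
r_title = re.compile(res_title, re.I)

# Print the page title, then every <h2> heading
arr_title = r_title.findall(content)
print arr_title[0]
arr = r.findall(content)
for i in arr:
    print i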
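The script above targets Python 2 (`urllib2`, the `print` statement). For readers on Python 3, the same idea might be sketched as follows. This is only an illustration, not the original author's code: the helper names `extract_titles` and `fetch`, the UTF-8 default encoding, the 10-second timeout, and the patterns that tolerate tag attributes and multi-line bodies are all assumptions added here.

```python
import re
import urllib.request

# Non-greedy, case-insensitive patterns. re.S lets a tag body span lines,
# and [^>]* tolerates attributes such as <h2 class="...">. (Assumed tweaks.)
TITLE_RE = re.compile(r'<title[^>]*>(.*?)</title>', re.I | re.S)
H2_RE = re.compile(r'<h2[^>]*>(.*?)</h2>', re.I | re.S)


def extract_titles(html):
    """Return (page_title, list_of_h2_texts) from an HTML string.

    page_title is None when the page has no <title> element.
    """
    title_match = TITLE_RE.search(html)
    page_title = title_match.group(1).strip() if title_match else None
    headings = [h.strip() for h in H2_RE.findall(html)]
    return page_title, headings


def fetch(page_url, user_agent, encoding='utf-8'):
    """Download a page with a custom User-Agent and decode it to str.

    The encoding is an assumption; a given site may need e.g. 'gbk'.
    """
    req = urllib.request.Request(page_url, headers={'User-Agent': user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode(encoding, errors='replace')
```

Calling `extract_titles(fetch(urls, headers['User-Agent']))` with the values from the original script would return the page title and the list of `<h2>` texts. Keeping the parsing separate from the download means the regular expressions can be checked against a plain string without any network access. For anything beyond quick scraping, a real HTML parser is usually more robust than regular expressions.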
Original article: http://www.cnblogs.com/legend-song/p/4582154.html