Fetching Web Page Content with Python

Posted: 2014-08-11 20:32:22

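import urllib.request
import re  # for the tag matching step that follows the download


def getHtml(url):
    # Open the URL and return the raw response body as bytes.
    with urllib.request.urlopen(url) as page:
        return page.read()


html = getHtml("http://tieba.baidu.com/p/2460150866")
print('Size is:', len(html))

# The body is bytes, so the file is opened in binary mode.
with open('a.html', 'wb') as f:
    f.write(html)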

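Python's urllib module is quite handy. The script also writes the fetched page to a.html; from there you can pattern-match the HTML tags you care about and pull out whatever you need.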
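
The pattern-matching step is not shown in the post, so here is a minimal sketch of what it might look like, assuming the goal is to collect the src attribute of every img tag. The helper name extract_image_urls, the pattern, and the sample markup are all illustrative assumptions, not part of the original script. A regular expression like this is fragile on real-world HTML, so treat it as a starting point rather than a robust parser.

```python
import re

# Hypothetical pattern (not from the original post): captures the quoted
# src value of an <img> tag, case-insensitively.
IMG_SRC_RE = re.compile(
    r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def extract_image_urls(html_text):
    """Return the src value of every <img> tag in html_text, in document order."""
    return IMG_SRC_RE.findall(html_text)


# Small inline sample so the sketch runs without network access.
sample = (
    '<div class="post">'
    '<img class="pic" src="http://example.com/a.jpg" width="560">'
    '<p>text</p>'
    '<IMG SRC=\'http://example.com/b.png\'>'
    '</div>'
)
print(extract_image_urls(sample))
```

To run it against the saved page, read a.html in binary mode and decode it to text first. The right encoding depends on the page's declared charset, so any specific choice such as UTF-8 is an assumption to verify.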


Original post: http://www.cnblogs.com/liumumu2014/p/3905180.html
