Tags: gen, time, rom, dem, apple, simple, div, font, style
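A basic program (my first hands-on attempt). This is just a quick write-up; I will revise and extend it when I have time. (Uses requests, urllib.parse, and BeautifulSoup.)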
import requests
from bs4 import BeautifulSoup
import urllib.parse

# Browser-like User-Agent so the site is less likely to reject the request
headers = {'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1"}
# Page 1 of the search results; k is the URL-encoded keyword "爬虫" (web crawler)
all_url = 'http://www.shixiseng.com/interns?k=%E7%88%AC%E8%99%AB&p=1'
start_html = requests.get(all_url, headers=headers)
soup = BeautifulSoup(start_html.text, 'lxml')
# Each job title on the listing page sits in a div with this class
href = soup.find_all('div', {'class': 'names cutom_font'})
for link in href:
    for l2 in link.find_all('a'):
        title = l2.get_text()
        a = l2['href']
        # Turn the relative link into an absolute URL
        url_all = urllib.parse.urljoin('http://www.shixiseng.com/', a)
        # Send the same headers for the detail page as for the listing page
        html = requests.get(url_all, headers=headers)
        soup2 = BeautifulSoup(html.text, 'lxml')
        # Print every paragraph of the job description
        for datas in soup2.find_all('div', class_='job_detail'):
            for data2 in datas.find_all('p'):
                print(data2.get_text())
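The program hard-codes the search URL with the keyword already percent-encoded. As a minimal sketch of how that URL and the absolute detail links could be built with urllib.parse alone, the snippet below uses two helper names, build_search_url and resolve_link, which are made up for illustration and are not part of the original program. The path "/intern/example" is also a placeholder, not a real page on the site.

```python
import urllib.parse

BASE_URL = 'http://www.shixiseng.com/'


def build_search_url(keyword, page):
    """Build the listing URL for a keyword and page number (hypothetical helper)."""
    # urlencode percent-encodes the keyword as UTF-8, so "爬虫" becomes %E7%88%AC%E8%99%AB
    query = urllib.parse.urlencode({'k': keyword, 'p': page})
    return urllib.parse.urljoin(BASE_URL, 'interns') + '?' + query


def resolve_link(href):
    """Turn a relative href from the listing page into an absolute URL (hypothetical helper)."""
    return urllib.parse.urljoin(BASE_URL, href)


if __name__ == '__main__':
    print(build_search_url('爬虫', 1))
    print(resolve_link('/intern/example'))  # placeholder path for illustration
```

With a helper like this, looping over several result pages would only require changing the page argument, assuming the site keeps using the p parameter for pagination.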
Original post: http://www.cnblogs.com/realmonkeykingsun/p/7859247.html