| | Week 5 |
| Time spent | About 15 h |
| Lines of code | About 1,000 |
| Blog posts | 4 |
| Topics learned | Some Python basics |
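Summary: This week I got a working understanding of BeautifulSoup in Python and, combining it with the requests library I learned earlier, scraped the 2019 ranking of Chinese universities. (All data comes from the HTML page.)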
1. Chinese University Rankings
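import requests
import bs4
from bs4 import BeautifulSoup


def getHTMLText(url):
    """Fetch a page and return its text, or "" if the request fails."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        # Guess the encoding from the body so Chinese text decodes correctly.
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def fillUnivList(ulist, html):
    """Append [rank, name, score] for each table row to ulist."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find('tbody')
    if tbody is None:  # empty page or unexpected layout
        return
    for tr in tbody.children:
        # Skip the whitespace text nodes between <tr> tags.
        if isinstance(tr, bs4.element.Tag):
            tds = tr('td')
            ulist.append([tds[0].string, tds[1].string, tds[3].string])


def printUnivList(ulist, num):
    # chr(12288) is the full-width space, used as the fill character
    # so that the Chinese school names line up.
    tplt = "{0:^10}\t{1:{3}^10}\t{2:^10}"
    # Header: rank, school name, total score.
    print(tplt.format("排名", "学校名称", "总分", chr(12288)))
    for u in ulist[:num]:
        # str() guards against cells whose .string is None.
        print(tplt.format(str(u[0]), str(u[1]), str(u[2]), chr(12288)))


def main():
    uinfo = []
    url = 'http://www.zuihaodaxue.com/Greater_China_Ranking2019_0.html'
    html = getHTMLText(url)
    fillUnivList(uinfo, html)
    printUnivList(uinfo, 20)  # print the top 20 universities


main()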
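The `{1:{3}^10}` field in the template above is probably the least obvious part of the script, so here is a minimal sketch of what it does. The helper names `center_cjk` and `center_ascii` are my own for illustration and are not part of the original code, and the school names are just sample strings, not scraped results. The idea is that the default fill character is a half-width space, which is narrower than a Chinese character in most fonts, so padding with the full-width space `chr(12288)` (U+3000) tends to keep the columns aligned.

```python
# Full-width (ideographic) space, U+3000; the same character as chr(12288).
FULLWIDTH_SPACE = chr(12288)


def center_cjk(text, width=10):
    """Center text in a field of `width` characters, padded with U+3000."""
    # Nested replacement fields supply the fill character and the width.
    return "{0:{1}^{2}}".format(text, FULLWIDTH_SPACE, width)


def center_ascii(text, width=10):
    """Center text using the default half-width space as fill."""
    return "{0:^{1}}".format(text, width)


if __name__ == "__main__":
    # Sample strings only; they stand in for scraped school names.
    for name in ["清华大学", "北京大学", "浙江大学"]:
        print(center_cjk(name) + "|")
```

Both helpers return strings of exactly `width` characters; only the fill character differs, and that is what changes how the columns look on screen.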
Original article: https://www.cnblogs.com/MoooJL/p/12541924.html