
Beijing Jiaotong University Graduate Academic Affairs Office Scraper

Date: 2015-04-11 19:17:32


import re

import requests

student = 'eight-digit student ID'  # placeholder: replace with your student ID
password = 'password'               # placeholder: replace with your password
postdata = {
    'u': student,
    'p': password,
}

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}

# Log in; the session keeps the login cookies for the later request
session = requests.Session()
session.post(
    url='http://gsdb.bjtu.edu.cn/client/login/',
    data=postdata,
    headers=headers)

# Get the score history page
returnPage = session.get('http://gsdb.bjtu.edu.cn/score/history/',
                         headers=headers)

# Extract every table row from the page
reScore = re.compile(r'<tr>.*?</tr>', re.S)
resultList = reScore.findall(returnPage.text)

td = re.compile(r'<td>.*?</td>', re.S)
num = re.compile(r'\d+')

Points = []
Scores = []
total = 0
for res in resultList:
    tdList = td.findall(res)
    # Skip header rows and short rows; keep only degree courses (学位课)
    if len(tdList) > 10 and tdList[6] == '<td>学位课</td>':
        point = num.findall(tdList[8])   # credits
        score = num.findall(tdList[10])  # score
        if point and score:
            Points.append(int(point[0]))
            Scores.append(int(score[0]))
            total += int(point[0]) * int(score[0])

# Print the credit-weighted average score
if sum(Points) != 0:
    print(str(student) + ' score is: ' + str(total / sum(Points)))
else:
    print("Can't get scores")
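
The parsing and averaging steps of the script can be tried offline, without logging in. The following is a minimal sketch, assuming the score table keeps the column layout the script relies on (course type in cell 6, credits in cell 8, score in cell 10). The names `weighted_average` and `make_row` and the sample HTML are hypothetical, made up for illustration, and not taken from the real page.

```python
import re

ROW_RE = re.compile(r'<tr>.*?</tr>', re.S)
CELL_RE = re.compile(r'<td>(.*?)</td>', re.S)  # captures the cell text only
NUM_RE = re.compile(r'\d+')


def weighted_average(html, course_type='学位课'):
    """Return the credit-weighted average score of the rows whose
    cell 6 equals course_type, or None if no such row is found."""
    total = 0
    credits = 0
    for row in ROW_RE.findall(html):
        cells = CELL_RE.findall(row)
        # Header rows (<th>) and short rows yield too few cells: skip them.
        if len(cells) <= 10 or cells[6].strip() != course_type:
            continue
        point = NUM_RE.findall(cells[8])
        score = NUM_RE.findall(cells[10])
        if not point or not score:
            continue  # non-numeric credits or score
        total += int(point[0]) * int(score[0])
        credits += int(point[0])
    return total / credits if credits else None


def make_row(kind, point, score):
    # Hypothetical helper: builds an 11-cell row for the sample below.
    cells = ['x'] * 11
    cells[6], cells[8], cells[10] = kind, str(point), str(score)
    return '<tr>' + ''.join('<td>%s</td>' % c for c in cells) + '</tr>'


# Made-up sample: two degree courses and one row of another type.
sample = (
    '<table><tr><th>header</th></tr>'
    + make_row('学位课', 3, 90)
    + make_row('学位课', 2, 80)
    + make_row('选修课', 2, 60)
    + '</table>'
)

print(weighted_average(sample))  # (3*90 + 2*80) / (3 + 2) = 86.0
```

Keeping the calculation in a function that takes the HTML as a string separates it from the network code, so it can be checked against a saved copy of the page before it is run on a live session.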

 


Original article: http://www.cnblogs.com/asukayui/p/4418208.html
