
Web scraping practice

Date: 2018-10-27 00:05:12


#!/usr/bin/env python
# -*- coding:utf-8 -*-
import urllib.request  # the request submodule must be imported explicitly
import urllib.parse
import requests
import sys
# sys.setdefaultencoding('utf-8')  # Python 2 only; not needed in Python 3
import urllib
import json
import time
from bs4 import BeautifulSoup
# Send a GET request
# url = 'http://kaoshi.edu.sina.com.cn/college/scorelist?tab=batch&wl=1&local=2&batch=&syear=2013'
# response = urllib.request.urlopen(url=url)
# result = response.read().decode('utf-8')  # decode the bytes so the text prints correctly
# print(result)
# Send a POST request
url = "http://shuju.wdzj.com/plat-info-59.html"
# The quotes were lost when this page was copied; 'x' is assumed to be a string literal.
data = urllib.parse.urlencode({'type1': 'x', 'type2': 0, 'status': 0}).encode('utf-8')
request = urllib.request.Request(url=url, data=data)
# opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor())  # much the same as the GET request above, only with data added
response = urllib.request.urlopen(request)
result = response.read().decode('utf-8')
print(result)
# Strip the HTML wrapper so that only the text inside <pre> remains
result = result.replace('<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">', '')
result = result.replace('</pre></body></html>', '')
for key in json.loads(result, strict=False).keys():
    print(key)
# Error: json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
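
The error above says the parser gave up at the second character, so a reasonable first step is to look at what the server actually sent back. The sketch below is one possible way to do that, not a confirmed fix: it pulls the text out of the `<pre>` wrapper with a regular expression instead of two exact-match `replace` calls, and on failure it reports the text around the position where parsing stopped. The helper names `extract_payload` and `parse_or_explain` are hypothetical, and the sample page is made up for illustration, since the real response of the site is not shown in the post.

```python
import html
import json
import re

# Hypothetical helpers for illustration; neither name appears in the original script.
_PRE_RE = re.compile(r"<pre[^>]*>(.*?)</pre>", re.DOTALL | re.IGNORECASE)


def extract_payload(page: str) -> str:
    """Return the text inside the first <pre> element, or the whole page if none is found."""
    match = _PRE_RE.search(page)
    body = match.group(1) if match else page
    # Undo HTML escaping such as &quot; and trim surrounding whitespace.
    return html.unescape(body).strip()


def parse_or_explain(text: str, context: int = 40):
    """Parse JSON; on failure raise ValueError that quotes the text near the error."""
    try:
        return json.loads(text, strict=False)
    except json.JSONDecodeError as err:
        start = max(err.pos - context, 0)
        snippet = text[start:err.pos + context]
        raise ValueError(
            f"{err.msg} at char {err.pos}; nearby text: {snippet!r}"
        ) from err


if __name__ == "__main__":
    # Made-up sample page; the real server response may look different.
    sample = (
        '<html><head></head><body><pre style="word-wrap: break-word; '
        'white-space: pre-wrap;">{"a": 1, "b": [2, 3]}</pre></body></html>'
    )
    parsed = parse_or_explain(extract_payload(sample))
    print(list(parsed.keys()))  # prints ['a', 'b']
```

If the quoted snippet turns out to be HTML or an error message rather than JSON, the problem lies in the request (URL, form fields, headers or cookies) and not in the parsing step.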

 


Original article: https://www.cnblogs.com/lifengwu/p/9858998.html
