Today, while parsing a web page with Python 3.x, I ran into this error: "UnicodeEncodeError: 'ascii' codec can't encode characters". Here is the full error output:
C:\Python\Python35-32\python.exe C:/Users/XXXXXXX/XXXX.py
Traceback (most recent call last):
  File "C:/Users/XXXXXXX/XXXX.py", line 22, in <module>
    text = gethtml(searchUrl)
  File "C:/Users/XXXXXXX/XXXX.py", line 10, in gethtml
    htmlData = urllib.request.urlopen(url)
  File "C:\Python\Python35-32\lib\urllib\request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python\Python35-32\lib\urllib\request.py", line 465, in open
    response = self._open(req, data)
  File "C:\Python\Python35-32\lib\urllib\request.py", line 483, in _open
    '_open', req)
  File "C:\Python\Python35-32\lib\urllib\request.py", line 443, in _call_chain
    result = func(*args)
  File "C:\Python\Python35-32\lib\urllib\request.py", line 1268, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python\Python35-32\lib\urllib\request.py", line 1240, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python\Python35-32\lib\http\client.py", line 1083, in request
    self._send_request(method, url, body, headers)
  File "C:\Python\Python35-32\lib\http\client.py", line 1118, in _send_request
    self.putrequest(method, url, **skips)
  File "C:\Python\Python35-32\lib\http\client.py", line 960, in putrequest
    self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 17-20: ordinal not in range(128)

Process finished with exit code 1
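So the problem is clearly a character-encoding issue in the request being sent: the last frame shows http.client encoding the request line as ASCII (request.encode('ascii') in putrequest), and the URL contains characters outside the ASCII range.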
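The failure can be reproduced without any network access. The sketch below is only an illustration: the `request_line` string is a simplified stand-in for what http.client builds internally, not its exact output. It shows that encoding such a line as ASCII raises the same exception, while a percent-encoded path encodes cleanly.

```python
from urllib.parse import quote

# Simplified stand-in for an HTTP request line with a non-ASCII path (assumption)
request_line = "GET /测试 HTTP/1.1"

try:
    request_line.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    # exc.object is the string being encoded; start/end mark the offending span
    print("cannot encode:", exc.object[exc.start:exc.end])

# quote() converts the text to UTF-8 bytes and percent-encodes each byte
encoded_path = quote("测试")
safe_line = "GET /%s HTTP/1.1" % encoded_path

# This no longer raises, because the line is now pure ASCII
safe_bytes = safe_line.encode("ascii")
print(safe_line)
```

By default `quote()` uses UTF-8, which is what most servers expect for percent-encoded paths.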
Here is the code that triggered the error:
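import urllib.request

def gethtml(url):
    # Open the URL and return the raw response body (bytes)
    htmlData = urllib.request.urlopen(url)
    htmlText = htmlData.read()
    return htmlText

# Fails: the URL path contains non-ASCII characters
print(gethtml("http://xxxx/测试"))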
Change it to:
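import urllib.parse
import urllib.request

def gethtml(url):
    # Open the URL and return the raw response body (bytes)
    htmlData = urllib.request.urlopen(url)
    htmlText = htmlData.read()
    return htmlText

# Percent-encode the non-ASCII path segment before building the URL
print(gethtml("http://xxxx/%s" % urllib.parse.quote("测试")))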
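When the whole URL arrives as a single string rather than being assembled piece by piece, the same idea can be applied to just the path and query. The following is a rough sketch, not part of the original fix: `quote_url` is a hypothetical helper name, `example.com` is a placeholder host, and non-ASCII host names (which need IDNA encoding) are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def quote_url(url):
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Hypothetical helper for illustration; the host name is left untouched.
    """
    parts = urlsplit(url)
    # Keep "/" and "%" as-is so existing escapes are not encoded twice
    path = quote(parts.path, safe="/%")
    # Keep the usual query delimiters as-is
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

# example.com is a placeholder host (assumption)
safe_url = quote_url("http://example.com/测试?q=测试")
print(safe_url)
```

Because `%` is treated as safe, passing an already-encoded URL through `quote_url` again leaves it unchanged. The trade-off is that a literal `%` in the input is not escaped.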
This article comes from the "lybing" blog. Please keep this attribution: http://lybing.blog.51cto.com/3286625/1792396