
Day 1: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 2490: illegal multibyte sequence

Date: 2017-09-28 20:49:23

Tags: code   bs4   tps   ror   rom   text   error message   csharp   png

Fetching a web page with a GET request


# coding=utf-8
# pip install requests beautifulsoup4

# Fetch the page directly with a GET request
import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.sogou.com/web?query=搞基建')
print(response.text)  # print the full search result page

# From response.text, find every <div class="vrwrap"> ... </div>
soup = BeautifulSoup(response.text, 'html.parser')
new_list = soup.find_all(name='div', class_='vrwrap')
print(new_list)

# The search can continue inside each <div class="vrwrap"> ... </div>
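
Continuing the search inside each matched div could look roughly like the sketch below. The `sample_html` string and the `results` list are assumptions made for illustration: the sample stands in for `response.text` so the example runs without network access, and the real Sogou markup may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text; the real page markup may differ.
sample_html = """
<div class="vrwrap"><h3><a href="https://example.com/a">First result</a></h3></div>
<div class="vrwrap"><h3><a href="https://example.com/b">Second result</a></h3></div>
<div class="other"><a href="https://example.com/c">Not a result</a></div>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

results = []
for div in soup.find_all(name='div', class_='vrwrap'):
    link = div.find('a')      # search again, but only inside this div
    if link is None:          # skip blocks that contain no link
        continue
    results.append((link.get_text(strip=True), link.get('href')))

for title, href in results:
    print(title, href)
```

Each element returned by `find_all` is itself searchable, so `find` and `find_all` can be called on it the same way as on `soup`.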

  

 

  

 

1. The error

Traceback (most recent call last):
  File "D:/PycharmProjects/爬虫/day1/s1.py", line 12, in <module>
    print(new_list)
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 2490: illegal multibyte sequence

 


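2. The encoding is wrong

print() encodes its text with the output stream's encoding, which here is gbk, and gbk has no byte sequence for '\xa0' (U+00A0, the non-breaking space that HTML pages write as &nbsp;).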
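
The failure can be reproduced without fetching any page, which helps confirm that the codec, not the scraper, is at fault. This is a minimal sketch; the `text` sample and the `gbk_ok` flag are made up for illustration.

```python
# '\xa0' is U+00A0, the non-breaking space (written as &nbsp; in HTML).
text = 'price:\xa0100'

try:
    text.encode('gbk')        # the same encoding step print() performs on a gbk stream
    gbk_ok = True
except UnicodeEncodeError as exc:
    gbk_ok = False
    print('gbk failed:', exc.reason)

# utf-8 has an encoding for this character, so the same text encodes cleanly.
utf8_bytes = text.encode('utf-8')
print(utf8_bytes)
```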


3. Change every encoding setting to utf-8
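
When the environment settings cannot be changed, a similar effect can probably be achieved from inside the script. The sketch below is an alternative under stated assumptions, not the fix used in this post: `utf8_writer` and `force_utf8_stdout` are hypothetical helper names, and `sys.stdout.reconfigure` requires Python 3.7 or later.

```python
import io
import sys

def utf8_writer(binary_stream):
    """Hypothetical helper: wrap a binary stream so written text is encoded as utf-8."""
    return io.TextIOWrapper(binary_stream, encoding='utf-8',
                            errors='replace', write_through=True)

def force_utf8_stdout():
    """Hypothetical helper: switch sys.stdout to utf-8 when the stream supports it."""
    if hasattr(sys.stdout, 'reconfigure'):      # available from Python 3.7
        sys.stdout.reconfigure(encoding='utf-8', errors='replace')

# Demonstration against an in-memory buffer standing in for the console.
buffer = io.BytesIO()
out = utf8_writer(buffer)
print('a\xa0b', file=out)     # no UnicodeEncodeError: the stream encodes as utf-8
written = buffer.getvalue()
```

Calling `force_utf8_stdout()` at the top of the script, or setting the environment variable `PYTHONIOENCODING=utf-8` before launching Python, changes how `print()` encodes its output. Whether the characters then display correctly still depends on the terminal that receives the bytes.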


4. The script runs successfully


Original post: http://www.cnblogs.com/venicid/p/7608181.html
