码迷,mamicode.com
首页 > 其他好文 > 详细

bs4取数

时间:2017-06-17 14:22:57      阅读:137      评论:0      收藏:0      [点我收藏+]

标签:hidden   append   方式   list   orm   rom   引号   排名   odi   

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

def file2soup(ffile):
with open(ffile,‘r+b‘) as f:
f.encoding=‘utf-8‘
soup=BeautifulSoup(f,‘lxml‘)
# print (soup)
return soup
pass

# def soup2list(soup,ulist):
# # print (soup.div)
# for i in soup.find(‘tbody‘).children:
# if isinstance(i,bs4.element.Tag):
# o=i(‘td‘)
# ulist.append([o[0].string,o[1].string,o[3].string])
# return ulist

# def soup2list(soup,ulist):
# # print (soup.div)
# for i in soup.find(‘tbody‘).children:
# if len(i)>1:#去空行
# o=i(‘td‘)#取出td标签 "td"加引号 o:<class ‘bs4.element.ResultSet‘> !!!
# a,b,c=0,1,3
# ulist.append([o[a].string,o[b].string,o[c].string])
# return ulist

def soup2list(soup,ulist):
i=soup.find(‘tbody‘,id="hidden_zhpm")
# print (len(i))
print (i)



url = ‘http://bj.58.com/pinpaigongyu/pn/{ppp}/?minprice=2000_4000‘
ffile=‘d://best.txt‘
ulist=[]

soup=file2soup(ffile)
ulist=soup2list(soup,ulist)
# print (ulist)
# print (‘{:0>3}\t {:+<15}\t {: >5}\t‘.format(‘排名‘,‘校名‘,‘总分‘))
# for i in range(11):
# u=ulist[i]
# #{:起头+一个填充符+对齐方式+字符长度}
# print (‘{:0>3}\t {:+<15}\t {: >5}\t‘.format(u[0], u[1], u[2] ))

bs4取数

标签:hidden   append   方式   list   orm   rom   引号   排名   odi   

原文地址:http://www.cnblogs.com/e35t/p/7040051.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!