码迷,mamicode.com
首页 > 其他好文 > 详细

automate the boring stuff_chapter11

时间:2019-10-26 17:11:04      阅读:63      评论:0      收藏:0      [点我收藏+]

标签:down   save   put   modules   desc   keyboard   object   open   running   

exampleFile = open(example.html)
print(exampleFile)
print(**********)
soup = bs4.BeautifulSoup(exampleFile,html.parser)
print(type(soup))
print(2**********)
print(soup)
print(3**********)
print(soup.title)
print(soup.title.string)
print(soup.title.parent.name)
print(4**********)
# print(soup.p)
# print(soup.p[‘class‘])
print(soup.p.a)
print(soup.p.a[href])
print(soup.find_all(p))
for link in soup.find_all(a):
    print(link.get(href,None))

 

 

soup = bs4.BeautifulSoup(exampleFile.read(),lxml)
elems = soup.select(#author) #the value of id of a tag is auther
print(elems)
print(type(elems))
print(len(elems))
print(elems[0])
print(******)
print(elems[0].get_text())
print(elems[0].getText())
print(elems[0].string)
print(elems[0].text)
print(******)
print(elems[0].attrs)
print(elems[0].attrs.get(id,None))
# .get_text is a function that returns the text of a tag as a string
# .text is a property that calls get_text (so it‘s identical, except you don‘t use parantheses)
# .getText is an alias of get_text

 

 

 

课后习题

1. Brie?y describe the differences between the webbrowser, requests, BeautifulSoup, and selenium modules.
Answer:
The webbrowser has an open() method that will launch a web browser to a speci?c URL, and that’s it.
The requests module can download ?les and pages from the Web. The BeautifulSoup module parses HTML.
The selenium module can launch and control a browser.
2. What type of object is returned by requests.get()? How can you access the downloaded content as a string value?
Answer: The requests.get() returns a Response object, which has a text attribute that contains the downloaded content as a string.
3. What Requests method checks that the download worked?
Answer: The raise_for_status() raises an exception if the download had problems and does nothing if the download succeeded.
4. How can you get the HTTP status code of a Requests response?
Answer: The status_code attribute of the Response object contains the HTTP status code.
5. How do you save a Requests response to a ?le?
Answer: After opening the new ?le on your computer in ‘wb‘ “write binary” mode, use a for loop that iterates over the Response object’s iter_content() method to write out chunks to the ?le. Here’s an example:
saveFile = open(‘filename.html‘, ‘wb‘) for chunk in res.iter_content(100000): saveFile.write(chunk)
6. What is the keyboard shortcut for opening a browser’s developer tools?
Answer: F12 brings up the developer tools in Chrome. Pressing CTRL-SHIFT-C (on Windows and Linux) or z-OPTION-C (on OS X) brings up the developer tools in Firefox.
7. How can you view (in the developer tools) the HTML of a speci?c element on a web page?
Answer: Right-click the element in the page, and select Inspect Element from the menu.
8. What is the CSS selector string that would ?nd the element with an id attribute of main?
Answer: ‘#main‘
9. What is the CSS selector string that would ?nd the elements with a CSS class of highlight?
Answer: ‘.highlight‘
10. What is the CSS selector string that would ?nd all the <div> elements inside another <div> element?
Answer: ‘div div‘
11. What is the CSS selector string that would ?nd the <button> element with a value attribute set to favorite?
Answer: ‘button[value="favorite"]‘
12. Say you have a Beautiful Soup Tag object stored in the variable spam for the element <div>Hello world!</div>. How could you get a string ‘Hello world!‘ from the Tag object?
Answer: spam.getText()
13. How would you store all the attributes of a Beautiful Soup Tag object in a variable named linkElem?
Answer: linkElem.attrs
14. Running import selenium doesn’t work. How do you properly import the selenium module?
Answer: The selenium module is imported with from selenium
import webdriver.
15. What’s the difference between the find_element_* and find_elements_* methods?
Answer: The find_element_* methods return the ?rst matching element as a WebElement object. The find_elements_* methods return a list of all matching elements as WebElement objects.
16. What methods do Selenium’s WebElement objects have for simulating mouse clicks and keyboard keys?
Answer: The click() and send_keys() methods simulate mouse clicks and keyboard keys, respectively.
17. You could call send_keys(Keys.ENTER) on the Submit button’s WebElement object, but what is an easier way to submit a form with Selenium?
Answer: Calling the submit() method on any element within a form submits the form.
18. How can you simulate clicking a browser’s Forward, Back, and Refresh buttons with Selenium?
Answer: The forward(), back(), and refresh() WebDriver object methods simulate these browser buttons.

automate the boring stuff_chapter11

标签:down   save   put   modules   desc   keyboard   object   open   running   

原文地址:https://www.cnblogs.com/shshp/p/11743591.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!