Basic information
Source name: Python + headless Chrome: simulating a Baidu keyword search
Source size: 0.14M
File format: .zip
Language: Python
Last updated: 2018-06-09
Source code overview
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# NOTE: written against the Selenium 3 API (chrome_options=, find_element_by_*)

from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType

options = webdriver.ChromeOptions()

# tell selenium to use the dev channel version of chrome
# NOTE: only do this if you have a good reason to
# options.binary_location = '/usr/bin/google-chrome-unstable'  # path to google Chrome bin

options.add_argument('headless')

# set the window size (Chrome expects width,height separated by a comma)
options.add_argument('window-size=1200,600')

# with proxy
proxy_url = 'ip:port'
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': proxy_url,
    'sslProxy': proxy_url  # requires trusting the proxy server's CA certificate
})
desired_capabilities = options.to_capabilities()
proxy.add_to_capabilities(desired_capabilities)

# initialize the driver
# driver = webdriver.Chrome(chrome_options=options)
driver = webdriver.Chrome(chrome_options=options, desired_capabilities=desired_capabilities)

driver.get('https://www.baidu.com')

# wait up to 10 seconds for the elements to become available
driver.implicitly_wait(10)
driver.get_screenshot_as_file('baidu_index.png')

# use css selectors to grab the search inputs
text = driver.find_element_by_css_selector('#kw')
search = driver.find_element_by_css_selector('#su')

text.send_keys('headless chrome')
driver.get_screenshot_as_file('baidu_main-page.png')

# search
search.click()
driver.get_screenshot_as_file('search-result.png')

# print the title and link of each search result
results = driver.find_elements_by_xpath('//div[@class="result c-container "]')
for result in results:
    res = result.find_element_by_css_selector('a')
    title = res.text
    link = res.get_attribute('href')
    print('Title: %s \nLink: %s\n' % (title, link))
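The script above targets the Selenium API as it was in 2018. In Selenium 4 the `chrome_options=` keyword and the `find_element_by_*` helpers were removed, so the following is a rough sketch of how the same flow might look today. It assumes Selenium 4 and a matching Chrome driver are installed. The names `build_chrome_args`, `search_baidu` and `RESULT_XPATH` are hypothetical and not part of the original download. The proxy is passed through Chrome's `--proxy-server` switch rather than the `Proxy` capability object used above, which is a substitution and not the original author's method. The result XPath is copied unchanged from the script and may no longer match Baidu's markup.

```python
# XPath copied from the original script; Baidu's page structure may have changed.
RESULT_XPATH = '//div[@class="result c-container "]'


def build_chrome_args(proxy_url=None, width=1200, height=600):
    """Return the Chrome switches for a headless session (hypothetical helper)."""
    args = ['--headless', '--window-size=%d,%d' % (width, height)]
    if proxy_url:
        # proxy_url is a 'host:port' placeholder, as in the original script
        args.append('--proxy-server=http://%s' % proxy_url)
    return args


def search_baidu(keyword, proxy_url=None, timeout=10):
    """Search Baidu for keyword and return a list of (title, link) pairs."""
    # Imported lazily so build_chrome_args() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(proxy_url):
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        # wait up to `timeout` seconds for elements to become available
        driver.implicitly_wait(timeout)
        driver.get('https://www.baidu.com')

        driver.find_element(By.CSS_SELECTOR, '#kw').send_keys(keyword)
        driver.find_element(By.CSS_SELECTOR, '#su').click()

        pairs = []
        for result in driver.find_elements(By.XPATH, RESULT_XPATH):
            link = result.find_element(By.CSS_SELECTOR, 'a')
            pairs.append((link.text, link.get_attribute('href')))
        return pairs
    finally:
        # the original script never closes the browser; quit() releases it
        driver.quit()


if __name__ == '__main__':
    for title, link in search_baidu('headless chrome'):
        print('Title: %s \nLink: %s\n' % (title, link))
```

Keeping the switch list in a separate function makes it possible to check the configuration without launching a browser, and the `try`/`finally` block ensures the headless Chrome process is shut down even if a selector fails.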