Open web in new tab Selenium + Python

2019-01-04 11:13发布

So I am trying to open websites on new tabs inside my WebDriver. I want to do this, because opening a new WebDriver for each website takes about 3.5secs using PhantomJS, I want more speed...

I'm using a multiprocess python script, and I want to get some elements from each page, so the workflow is like this:

Open Browser

Loop throught my array
For element in array -> Open website in new tab -> do my business -> close it

But I can't find any way to achieve this.

Here's the code I'm using. It takes forever between websites, I need it to be fast... Other tools are allowed, but I don't know too many tools for scrapping website content that loads with JavaScript (divs created when some event is triggered on load etc) That's why I need Selenium... BeautifulSoup can't be used for some of my pages.

#!/usr/bin/env python
import multiprocessing, time, pika, json, traceback, logging, sys, os, itertools, urllib, urllib2, cStringIO, mysql.connector, shutil, hashlib, socket, urllib2, re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from PIL import Image
from os import listdir
from os.path import isfile, join
from bs4 import BeautifulSoup
from pprint import pprint

def getPhantomData(parameters):
    try:
        # We create WebDriver
        browser = webdriver.Firefox()
        # Navigate to URL
        browser.get(parameters['target_url'])
        # Find all links by Selector
        links = browser.find_elements_by_css_selector(parameters['selector'])

        result = []
        for link in links:
            # Extract link attribute and append to our list
            result.append(link.get_attribute(parameters['attribute']))
        browser.close()
        browser.quit()
        return json.dumps({'data': result})
    except Exception, err:
        browser.close()
        browser.quit()
        print err

def callback(ch, method, properties, body):
    parameters = json.loads(body)
    message = getPhantomData(parameters)

    if message['data']:
        ch.basic_ack(delivery_tag=method.delivery_tag)
    else:
        ch.basic_reject(delivery_tag=method.delivery_tag, requeue=True)

def consume():
    credentials = pika.PlainCredentials('invitado', 'invitado')
    rabbit = pika.ConnectionParameters('localhost',5672,'/',credentials)
    connection = pika.BlockingConnection(rabbit)
    channel = connection.channel()

    # Conectamos al canal
    channel.queue_declare(queue='com.stuff.images', durable=True)
    channel.basic_consume(callback,queue='com.stuff.images')

    print ' [*] Waiting for messages. To exit press CTRL^C'
    try:
        channel.start_consuming()
    except KeyboardInterrupt:
        pass

workers = 5
pool = multiprocessing.Pool(processes=workers)
for i in xrange(0, workers):
    pool.apply_async(consume)

try:
    while True:
        continue
except KeyboardInterrupt:
    print ' [*] Exiting...'
    pool.terminate()
    pool.join()

5条回答
欢心
2楼-- · 2019-01-04 11:49

After struggling for so long the below method worked for me:

driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 't')
driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + Keys.TAB)

windows = driver.window_handles

time.sleep(3)
driver.switch_to.window(windows[1])
查看更多
戒情不戒烟
3楼-- · 2019-01-04 11:51

This is a common code adapted from another examples:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()
driver.get("http://www.google.com/")

#open tab
# ... take the code from the options below

# Load a page 
driver.get('http://bings.com')
# Make the tests...

# close the tab
driver.quit()

the possible ways were:

  1. Sending <CTRL> + <T> to one element

    #open tab
    driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 't')
    
  2. Sending <CTRL> + <T> via Action chains

    ActionChains(driver).key_down(Keys.CONTROL).send_keys('t').key_up(Keys.CONTROL).perform()
    
  3. Execute a javascript snippet

    driver.execute_script('''window.open("http://bings.com","_blank");''')
    

    In order to achieve this you need to ensure that the preferences browser.link.open_newwindow and browser.link.open_newwindow.restriction are properly set. The default values in the last versions are ok, otherwise you supposedly need:

    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.link.open_newwindow", 3)
    fp.set_preference("browser.link.open_newwindow.restriction", 2)
    
    driver = webdriver.Firefox(browser_profile=fp)
    

    the problem is that those preferences preset to other values and are frozen at least selenium 3.4.0. When you use the profile to set them with the java binding there comes an exception and with the python binding the new values are ignored.

    In Java there is a way to set those preferences without specifying a profile object when talking to geckodriver, but it seem to be not implemented yet in the python binding:

    FirefoxOptions options = new FirefoxOptions().setProfile(fp);
    options.addPreference("browser.link.open_newwindow", 3);
    options.addPreference("browser.link.open_newwindow.restriction", 2);
    FirefoxDriver driver = new FirefoxDriver(options);
    

The third option did stop working for python in selenium 3.4.0.

The first two options also did seem to stop working in selenium 3.4.0. They do depend on sending CTRL key event to an element. At first glance it seem that is a problem of the CTRL key, but it is failing because of the new multiprocess feature of Firefox. It might be that this new architecture impose new ways of doing that, or maybe is a temporary implementation problem. Anyway we can disable it via:

fp = webdriver.FirefoxProfile()
fp.set_preference("browser.tabs.remote.autostart", False)
fp.set_preference("browser.tabs.remote.autostart.1", False)
fp.set_preference("browser.tabs.remote.autostart.2", False)

driver = webdriver.Firefox(browser_profile=fp)

... and then you can use successfully the first way.

查看更多
冷血范
4楼-- · 2019-01-04 11:52
browser.execute_script('''window.open("http://bings.com","_blank");''')

Where browser is the webDriver

查看更多
爷、活的狠高调
5楼-- · 2019-01-04 12:00

You can achieve the opening/closing of a tab by the combination of keys COMMAND + T or COMMAND + W (OSX). On other OSs you can use CONTROL + T / CONTROL + W.

In selenium you can emulate such behavior. You will need to create one webdriver and as many tabs as the tests you need.

Here it is the code.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()
driver.get("http://www.google.com/")

#open tab
driver.find_element_by_tag_name('body').send_keys(Keys.COMMAND + 't') 
# You can use (Keys.CONTROL + 't') on other OSs

# Load a page 
driver.get('http://stackoverflow.com/')
# Make the tests...

# close the tab
# (Keys.CONTROL + 'w') on other OSs.
driver.find_element_by_tag_name('body').send_keys(Keys.COMMAND + 'w') 


driver.close()
查看更多
再贱就再见
6楼-- · 2019-01-04 12:04

With Selenium v3.x opening a website in New Tab through Python is much easier now. Here is a solution where you can open http://www.google.co.in in the initial TAB and https://www.yahoo.com in the adjacent TAB:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_argument('disable-infobars')
    driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get("http://www.google.co.in")
    print("Initial Page Title is : %s" %driver.title)
    windows_before  = driver.current_window_handle
    print("First Window Handle is : %s" %windows_before)
    driver.execute_script("window.open('https://www.yahoo.com')")
    WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2))
    windows_after = driver.window_handles
    new_window = [x for x in windows_after if x != windows_before][0]
    driver.switch_to_window(new_window)
    print("Page Title after Tab Switching is : %s" %driver.title)
    print("Second Window Handle is : %s" %new_window)
    
  • Console Output:

    Initial Page Title is : Google
    First Window Handle is : CDwindow-B2B3DE3A222B3DA5237840FA574AF780
    Page Title after Tab Switching is : Yahoo
    Second Window Handle is : CDwindow-D7DA7666A0008ED91991C623105A2EC4
    
  • Browser Snapshot:

multiple__tabs

查看更多
登录 后发表回答