So I am trying to open websites in new tabs inside my WebDriver. I want to do this because opening a new WebDriver for each website takes about 3.5 seconds using PhantomJS, and I need it to be faster.
I'm using a multiprocess python script, and I want to get some elements from each page, so the workflow is like this:
Open Browser
Loop through my array
For element in array -> Open website in new tab -> do my business -> close it
But I can't find any way to achieve this.
Here's the code I'm using. It takes forever between websites, and I need it to be fast. Other tools are allowed, but I don't know many tools for scraping website content that loads with JavaScript (divs created when some event is triggered on load, etc.). That's why I need Selenium; BeautifulSoup can't be used for some of my pages.
#!/usr/bin/env python
import multiprocessing, time, pika, json, traceback, logging, sys, os, itertools, urllib, urllib2, cStringIO, mysql.connector, shutil, hashlib, socket, urllib2, re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from PIL import Image
from os import listdir
from os.path import isfile, join
from bs4 import BeautifulSoup
from pprint import pprint

def getPhantomData(parameters):
    try:
        # We create WebDriver
        browser = webdriver.Firefox()
        # Navigate to URL
        browser.get(parameters['target_url'])
        # Find all links by Selector
        links = browser.find_elements_by_css_selector(parameters['selector'])
        result = []
        for link in links:
            # Extract link attribute and append to our list
            result.append(link.get_attribute(parameters['attribute']))
        browser.close()
        browser.quit()
        return json.dumps({'data': result})
    except Exception, err:
        browser.close()
        browser.quit()
        print err

def callback(ch, method, properties, body):
    parameters = json.loads(body)
    message = getPhantomData(parameters)
    if message['data']:
        ch.basic_ack(delivery_tag=method.delivery_tag)
    else:
        ch.basic_reject(delivery_tag=method.delivery_tag, requeue=True)

def consume():
    credentials = pika.PlainCredentials('invitado', 'invitado')
    rabbit = pika.ConnectionParameters('localhost',5672,'/',credentials)
    connection = pika.BlockingConnection(rabbit)
    channel = connection.channel()
    # Connect to the channel
    channel.queue_declare(queue='com.stuff.images', durable=True)
    channel.basic_consume(callback,queue='com.stuff.images')
    print ' [*] Waiting for messages. To exit press CTRL^C'
    try:
        channel.start_consuming()
    except KeyboardInterrupt:
        pass

workers = 5
pool = multiprocessing.Pool(processes=workers)
for i in xrange(0, workers):
    pool.apply_async(consume)
try:
    while True:
        continue
except KeyboardInterrupt:
    print ' [*] Exiting...'
    pool.terminate()
    pool.join()
After struggling for so long the below method worked for me:
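The snippet this answer referred to did not survive here, so what follows is not the original code. As a rough sketch of the workflow from the question (one browser, one tab per URL), something like this may work with drivers that honor `window.open`. The helper name `scrape_in_tabs` and its `selector` / `attribute` parameters are made up for illustration; `"css selector"` is the string Selenium uses for `By.CSS_SELECTOR`.

```python
def scrape_in_tabs(driver, urls, selector, attribute):
    """Visit each URL in its own tab of one already-running browser.

    `driver` is assumed to be a live Selenium WebDriver. Only these members
    are used: current_window_handle, window_handles, execute_script,
    switch_to.window, get, find_elements and close.
    """
    results = {}
    main_tab = driver.current_window_handle
    for url in urls:
        known = set(driver.window_handles)
        # Ask the browser itself to open a blank tab.
        driver.execute_script("window.open('about:blank', '_blank');")
        # The new tab is whichever handle was not there before.
        new_tab = [h for h in driver.window_handles if h not in known][0]
        driver.switch_to.window(new_tab)
        try:
            driver.get(url)
            elements = driver.find_elements("css selector", selector)
            results[url] = [e.get_attribute(attribute) for e in elements]
        finally:
            # Close only this tab, then go back to the original one.
            driver.close()
            driver.switch_to.window(main_tab)
    return results
```

With a real browser this would be called as `scrape_in_tabs(webdriver.Firefox(), urls, 'a', 'href')`. On slow drivers the new handle may take a moment to show up, so a short wait before reading `window_handles` might be needed.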
This is common code adapted from other examples:
The possible ways were:
1. Sending <CTRL> + <T> to one element
2. Sending <CTRL> + <T> via Action chains
3. Executing a JavaScript snippet
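To make the three ways concrete, here is a minimal sketch of each. The function names are invented for this example, and `driver` is assumed to be a live Selenium WebDriver. `"tag name"` is the string behind `By.TAG_NAME`. Whether the key-based variants actually open a tab depends on the browser and driver version.

```python
try:
    from selenium.webdriver.common.keys import Keys
    CONTROL = Keys.CONTROL
except ImportError:
    # Fallback so the sketch loads without Selenium installed: this is the
    # code point Selenium assigns to Keys.CONTROL.
    CONTROL = u"\ue009"


def new_tab_via_element(driver):
    """Way 1: send CTRL+T to the page's <body> element."""
    body = driver.find_element("tag name", "body")
    body.send_keys(CONTROL + "t")


def new_tab_via_action_chain(driver):
    """Way 2: hold CTRL, type 't', release CTRL through an action chain."""
    # Imported here because ActionChains needs a real Selenium install.
    from selenium.webdriver.common.action_chains import ActionChains
    chain = ActionChains(driver)
    chain.key_down(CONTROL).send_keys("t").key_up(CONTROL).perform()


def new_tab_via_javascript(driver, url="about:blank"):
    """Way 3: let the page open a tab itself with window.open."""
    driver.execute_script("window.open(arguments[0], '_blank');", url)
```

The JavaScript variant does not rely on keyboard shortcuts reaching the browser, which is why it tends to be the least fragile of the three.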
In order to achieve this you need to ensure that the preferences browser.link.open_newwindow and browser.link.open_newwindow.restriction are properly set. The default values in recent versions are fine; otherwise you supposedly need:
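The snippet that followed is missing, so this is only a sketch of how the two preferences could be set on a Firefox profile. The values 3 and 2 are what I believe the Firefox defaults to be (3 meaning "open in a new tab"), so treat them as an assumption. `apply_tab_preferences` and `make_firefox_with_tab_preferences` are made-up helper names, and the `firefox_profile=` argument is the Selenium 3.x way of passing a profile.

```python
# Assumed values, believed to match the Firefox defaults:
# open_newwindow = 3 opens new windows as tabs.
TAB_PREFERENCES = {
    "browser.link.open_newwindow": 3,
    "browser.link.open_newwindow.restriction": 2,
}


def apply_tab_preferences(profile, prefs=TAB_PREFERENCES):
    """Copy each preference onto any object exposing
    set_preference(name, value), such as webdriver.FirefoxProfile."""
    for name, value in prefs.items():
        profile.set_preference(name, value)
    return profile


def make_firefox_with_tab_preferences():
    """Start Firefox with the profile (needs Selenium 3.x and Firefox)."""
    # Imported lazily so the helpers above load without Selenium.
    from selenium import webdriver
    profile = apply_tab_preferences(webdriver.FirefoxProfile())
    return webdriver.Firefox(firefox_profile=profile)
```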
The problem is that those preferences are preset to other values and are frozen, at least as of Selenium 3.4.0. When you use a profile to set them, the Java binding throws an exception and the Python binding ignores the new values.
In Java there is a way to set those preferences without specifying a profile object when talking to geckodriver, but it does not seem to be implemented yet in the Python binding.
The third option stopped working for Python in Selenium 3.4.0.
The first two options also seem to have stopped working in Selenium 3.4.0. They depend on sending a CTRL key event to an element. At first glance it looks like a problem with the CTRL key, but it actually fails because of the new multiprocess feature of Firefox. It might be that this new architecture imposes new ways of doing that, or it may be a temporary implementation problem. Either way, we can disable it via:
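The original snippet is missing here. Multiprocess Firefox has, as far as I know, been switched by the `browser.tabs.remote.autostart` preferences, so the sketch below is a guess at what was meant and not the original code. The preference names are an assumption, the helper names are invented, and newer Firefox releases may ignore these preferences entirely.

```python
# Assumed names of the preferences that toggle multiprocess Firefox.
MULTIPROCESS_PREFERENCES = (
    "browser.tabs.remote.autostart",
    "browser.tabs.remote.autostart.2",
)


def disable_multiprocess(profile):
    """Turn the multiprocess switches off on any object exposing
    set_preference(name, value), such as webdriver.FirefoxProfile."""
    for name in MULTIPROCESS_PREFERENCES:
        profile.set_preference(name, False)
    return profile


def make_single_process_firefox():
    """Start Firefox with the profile (needs Selenium 3.x and Firefox)."""
    # Imported lazily so disable_multiprocess can be used without Selenium.
    from selenium import webdriver
    profile = disable_multiprocess(webdriver.FirefoxProfile())
    return webdriver.Firefox(firefox_profile=profile)
```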
... and then you can successfully use the first way.
Here, browser is the WebDriver instance.
You can achieve the opening/closing of a tab by the combination of keys COMMAND + T or COMMAND + W (OSX). On other OSs you can use CONTROL + T / CONTROL + W.
In selenium you can emulate such behavior. You will need to create one webdriver and as many tabs as the tests you need.
Here is the code.
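The code for this answer was lost, so here is a small sketch of the idea under a few assumptions: `driver` is a live Selenium WebDriver, the helper names (`tab_modifier`, `open_tab`, `close_tab`, `run_in_tabs`) are made up, and `check` is whatever per-page test you want to run. Whether the shortcut reaches the browser, and whether the driver follows focus into the new tab, depends on the browser and driver version.

```python
import sys

try:
    from selenium.webdriver.common.keys import Keys
    COMMAND, CONTROL = Keys.COMMAND, Keys.CONTROL
except ImportError:
    # Fallback so the helpers load without Selenium: these are the code
    # points Selenium assigns to Keys.COMMAND and Keys.CONTROL.
    COMMAND, CONTROL = u"\ue03d", u"\ue009"


def tab_modifier(platform=None):
    """Return COMMAND on OS X and CONTROL everywhere else."""
    platform = sys.platform if platform is None else platform
    return COMMAND if platform == "darwin" else CONTROL


def open_tab(driver, platform=None):
    """Send <modifier> + T to the page body."""
    body = driver.find_element("tag name", "body")
    body.send_keys(tab_modifier(platform) + "t")


def close_tab(driver, platform=None):
    """Send <modifier> + W to the page body."""
    body = driver.find_element("tag name", "body")
    body.send_keys(tab_modifier(platform) + "w")


def run_in_tabs(driver, urls, check, platform=None):
    """One WebDriver, one tab per URL: open, load, check, close."""
    for url in urls:
        open_tab(driver, platform)
        driver.get(url)
        check(driver)
        close_tab(driver, platform)
```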
With Selenium v3.x, opening a website in a new tab through Python is much easier now. Here is a solution where you can open http://www.google.co.in in the initial tab and https://www.yahoo.com in the adjacent tab:
Code Block:
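The original code block did not survive, so this is a sketch of the approach rather than the answer's exact code. It opens the second URL with `window.open`, waits for the second window handle to appear, and switches to it. The polling helper `wait_for_window_count` is a hand-rolled stand-in for Selenium's `WebDriverWait` with `expected_conditions.number_of_windows_to_be`; it and `open_adjacent_tab` are names invented for this example.

```python
import time


def wait_for_window_count(driver, count, timeout=10.0, poll=0.2):
    """Poll until the browser reports exactly `count` windows/tabs.

    Returns True if the count was reached before the timeout.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if len(driver.window_handles) == count:
            return True
        time.sleep(poll)
    return len(driver.window_handles) == count


def open_adjacent_tab(driver, first_url, second_url):
    """Load first_url in the current tab and second_url in a new one.

    Returns (first_handle, second_handle) and leaves the driver
    focused on the new tab.
    """
    driver.get(first_url)
    first = driver.current_window_handle
    driver.execute_script("window.open(arguments[0]);", second_url)
    if not wait_for_window_count(driver, 2):
        raise RuntimeError("the second tab did not appear in time")
    second = [h for h in driver.window_handles if h != first][0]
    driver.switch_to.window(second)
    return first, second


def main():
    """Run against a real browser (needs Selenium and Firefox installed)."""
    from selenium import webdriver
    driver = webdriver.Firefox()
    try:
        first, second = open_adjacent_tab(
            driver, "http://www.google.co.in", "https://www.yahoo.com")
        print("First tab: %s, second tab: %s" % (first, second))
        print("Title of the new tab: %s" % driver.title)
    finally:
        driver.quit()
```

Calling `main()` runs it end to end; `open_adjacent_tab` assumes the browser starts with a single tab.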