Selenium: How to "Mount" an Existing Driver

Background

Recently I ran into a question: is it possible to create a new Selenium/Appium driver that attaches to an existing, already open driver, without passing the driver state through the solution hierarchy?

In other words, create a brand-new driver object from an existing one and keep working in the existing driver session through that new object.

How the Selenium Client Actually Works

To understand how to "reconnect" to an existing driver, we must first understand how the Selenium client works.

Basically, Selenium is just a client that builds and sends HTTP requests. It builds the requests to comply with the W3C WebDriver protocol and then sends them to the driver (Chrome, FF, IE, Edge, Android, etc.). The driver, in turn, performs the automation against the application under test.
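To make the "just an HTTP client" point concrete, here is a minimal sketch that builds (but does not send) the kind of request a client issues for a navigation command. It assumes the W3C "Navigate To" endpoint (POST /session/<session_id>/url with a JSON body). The server address, the session id and the function name build_navigate_request are placeholders invented for illustration, not part of Selenium.

```python
import json
from urllib.request import Request


def build_navigate_request(server_url: str, session_id: str, target: str) -> Request:
    """Build (but do not send) the HTTP request for a W3C 'Navigate To' command."""
    endpoint = f"{server_url.rstrip('/')}/session/{session_id}/url"
    body = json.dumps({"url": target}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder address and session id, for illustration only.
request = build_navigate_request("http://localhost:9515", "abc123", "https://www.google.com")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Anything that can produce such a request for a known session id can drive the browser, which is what makes "mounting" possible in the first place.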

Solution

The solution requires a bit of hacking, since the Selenium client implementation does not expose the internal information of the session and the driver service. Furthermore, the client implementation differs considerably between languages (e.g. Python, Java, C#, etc.), so the driver "mount/reconnect" code looks quite different in each of them.

The code can be found here: savanna-projects/automation-examples.

The repository contains examples in Python, Java and C#.

Session & Endpoint

The commands sent to the driver are formatted in the following manner:

<http|https>://<address>:<port>/(wd/hub)/session/<session_id>/<command>        

In order to "mount" an existing driver, we need to accomplish the following:

  1. Get the address and the port of the server.
  2. Get the session id of the currently open browser.
  3. Get the capabilities of the currently open browser (for C# and Java).
  4. Override the "execute" method of the driver command executor (to avoid a shadow driver each time we "mount" an existing driver).
  5. Create a new remote driver with the new executor pointing to the same address with the same session id.
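As a small illustration of steps 1 and 2, the sketch below takes a command URL in the format shown above and splits it into address, port, session id and command. The names CommandEndpoint and parse_command_endpoint, as well as the sample URL, are hypothetical and used only for this example.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class CommandEndpoint(NamedTuple):
    scheme: str
    address: str
    port: Optional[int]
    session_id: str
    command: str


def parse_command_endpoint(url: str) -> CommandEndpoint:
    """Split a WebDriver command URL into its parts."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    # Everything before "session" is an optional prefix such as wd/hub.
    index = segments.index("session")
    session_id = segments[index + 1]
    command = "/".join(segments[index + 2:])
    return CommandEndpoint(parts.scheme, parts.hostname or "", parts.port, session_id, command)


endpoint = parse_command_endpoint("http://localhost:4444/wd/hub/session/abc123/element")
print(endpoint)
```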

Extracting the Information (Python Examples)

Unfortunately, the information we need in order to "mount" a driver is private and was not designed to be tampered with.

Languages like C# and Java, which are strictly object-oriented and enforce access modifiers, make it harder to get this information, and doing so requires some knowledge of reflection. Please see the link above for examples of how to "mount" a driver with C# and Java.

Code Snippet: get session id and server address from existing driver

url = driver.command_executor._url
session_id = driver.session_id        

The url and the session id can be taken directly from the driver. Please note that the url attribute is "private" (its name starts with an underscore). It will not show up in IntelliSense and might trigger an IDE warning for accessing a private field.
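Because _url is a private attribute, it may be renamed or moved between Selenium versions. A defensive read, sketched below, fails with a clear message instead of an obscure error. The helper name read_server_url is hypothetical, and the fake_driver object is only a stand-in shaped like a driver so the sketch runs without a browser.

```python
from types import SimpleNamespace


def read_server_url(driver) -> str:
    """Return the server URL stored on the driver's command executor."""
    url = getattr(driver.command_executor, "_url", None)
    if url is None:
        raise AttributeError(
            "command executor exposes no '_url' attribute; "
            "the private layout may differ in this Selenium version"
        )
    return url


# Stand-in object shaped like a driver, so the sketch runs without a browser.
fake_driver = SimpleNamespace(
    command_executor=SimpleNamespace(_url="http://localhost:9515"),
    session_id="abc123",
)
print(read_server_url(fake_driver), fake_driver.session_id)
```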

Code Snippet: override the execute method and avoid "newSession" command

# the original executor from the WebDriver state
execute = WebDriver.execute

# override newSession command
def local_executor(self, command, params=None):
    if command != "newSession":
        return execute(self, command, params)
    return {'success': 0, 'value': None, 'sessionId': session_id}

# mount
WebDriver.execute = local_executor        

Python lets us treat functions like any other class member, so a method can be reassigned just as you would reassign a string or a number.
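For readers less familiar with this Python feature, here is a tiny sketch unrelated to Selenium. The Greeter class is made up for illustration. It shows that replacing a function on the class affects existing instances, and that keeping a reference to the original makes it possible to restore it afterwards.

```python
class Greeter:
    def greet(self, name):
        return f"Hello, {name}"


# Keep a reference to the original function object.
original_greet = Greeter.greet


def loud_greet(self, name):
    # Delegate to the original, then change the result.
    return original_greet(self, name).upper()


greeter = Greeter()
Greeter.greet = loud_greet        # override on the class
patched = greeter.greet("Selenium")
Greeter.greet = original_greet    # restore the original behavior
restored = greeter.greet("Selenium")
print(patched, "|", restored)
```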

We create a new executor method that replaces the original one, so that the "newSession" command, which the WebDriver base class sends when a driver object is created, is answered locally instead of being sent to the server.

This is a very important step. If we skip it, another empty browser (a "shadow" browser) will open each time we "mount" an existing one.
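The effect can be demonstrated without a real browser. In the sketch below, FakeDriver is a made-up stand-in whose constructor always asks for a new session, and a counter plays the role of opened browser windows. The reuse_session context manager is likewise hypothetical. It applies the same interception idea and restores the original method even if something fails in between.

```python
from contextlib import contextmanager


class FakeDriver:
    """Stand-in for a remote driver: its constructor always asks for a new session."""
    opened_sessions = 0

    def __init__(self):
        response = self.execute("newSession")
        self.session_id = response["sessionId"]

    def execute(self, command, params=None):
        if command == "newSession":
            FakeDriver.opened_sessions += 1   # simulates a browser window opening
            return {"sessionId": f"session-{FakeDriver.opened_sessions}"}
        return {"sessionId": self.session_id, "value": None}


@contextmanager
def reuse_session(session_id):
    """Temporarily answer 'newSession' locally with an existing session id."""
    original = FakeDriver.execute

    def local_executor(self, command, params=None):
        if command != "newSession":
            return original(self, command, params)
        return {"sessionId": session_id}

    FakeDriver.execute = local_executor
    try:
        yield
    finally:
        FakeDriver.execute = original  # always restore, even on errors


first = FakeDriver()                    # opens the first (and only) session
with reuse_session(first.session_id):
    second = FakeDriver()               # no new session is opened
print(second.session_id, FakeDriver.opened_sessions)
```

Without the interception, constructing the second object would have incremented the counter, which corresponds to the shadow browser described above.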

Code Snippet: "mount" the existing driver

# start with browser one
browser_one: WebDriver = webdriver.Chrome("chromedriver.exe")
browser_one.maximize_window()

# mount browser one with browser two
# (mount_session is defined in the end to end snippet below)
browser_two = mount_session(driver=browser_one)
browser_two.get('https://www.google.com')
browser_one.quit()        

Putting It All Together

Code Snippet: end to end driver creation & "mount"

from selenium import webdriver
from selenium.webdriver.remote.webdriver import WebDriver


def mount_session(driver: WebDriver) -> WebDriver:
    """Create a new driver object attached to the session of an existing driver."""
    # setup: "_url" is a private attribute of the command executor
    url = driver.command_executor._url
    session_id = driver.session_id

    # the original executor from the WebDriver state
    execute = WebDriver.execute

    # override newSession command so no second browser is opened
    def local_executor(self, command, params=None):
        if command != "newSession":
            return execute(self, command, params)
        return {'success': 0, 'value': None, 'sessionId': session_id}

    # mount
    WebDriver.execute = local_executor
    try:
        # desired_capabilities is the Selenium 3 style argument
        new_driver = webdriver.Remote(command_executor=url, desired_capabilities={})
        new_driver.session_id = session_id
    finally:
        # restore original functionality, even if Remote() raises
        WebDriver.execute = execute

    # get
    return new_driver


# start with browser one
browser_one: WebDriver = webdriver.Chrome("chromedriver.exe")
browser_one.maximize_window()

# mount browser one with browser two
browser_two = mount_session(driver=browser_one)
browser_two.get('https://www.google.com')
browser_one.quit()        
