Selenium WebDriver Classic vs Selenium WebDriver BiDi

WebDriver BiDi overview for Test Automation Engineers who interact with Web Browsers, Test Web Apps and Plan for the future.

Selenium WebDriver W3C (WD) uses HTTP and works as follows:

  • WD commands are generally synchronous. The client sends an HTTP request and waits for the response from the browser driver (the server) before it proceeds to the next command.
  • For example, to click a button we first need to verify that the button is visible and clickable, and only then perform the click. To achieve this, WD sends 3 synchronous requests one by one: 1) check that the element is visible, 2) check that it is clickable, and 3) perform the click.
  • Because of this synchronous nature, WD blocks until each operation has been processed on the server side, and the accumulated round trips are a performance concern.
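
The three-request sequence described above can be sketched as the HTTP calls a WebDriver Classic client would issue. This is only an illustration, not Selenium's internal code: the class name, the session id and the element id are hypothetical, and a real binding may check visibility differently. The endpoint paths are the ones listed or recommended in the W3C WebDriver specification.

```java
import java.util.List;

// Hypothetical helper: lists the HTTP requests for "check, check, click" in WebDriver Classic.
class ClassicClickSketch {

    // Builds the request lines in the order the client would send them.
    // Each request must receive its response before the next one is sent.
    static List<String> clickSequence(String sessionId, String elementId) {
        String base = "/session/" + sessionId + "/element/" + elementId;
        return List.of(
            "GET " + base + "/displayed",  // 1) is the element visible?
            "GET " + base + "/enabled",    // 2) is the element enabled?
            "POST " + base + "/click");    // 3) perform the click
    }

    public static void main(String[] args) {
        // "session-1" and "element-1" are placeholder ids.
        clickSequence("session-1", "element-1").forEach(System.out::println);
    }
}
```

Every entry in the list costs one full HTTP round trip, which is where the waiting described above comes from.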

Selenium WebDriver BiDi (WDB) uses WebSockets and works as follows:

  • WebDriver BiDi (BiDirectional) represents a significant evolution from WD in browser automation. Unlike WD, which relies on HTTP request/response, BiDi exchanges JSON payloads over WebSockets. This shift aims at more direct communication between the automation script and the browser, with less reliance on a separate browser driver as an intermediary. Commands are sent to the browser, which responds with success or failure notifications.
  • The bidirectional nature of the protocol allows browsers to send events back to the automation tool asynchronously, opening up new automation scenarios. This capability mirrors the functionality of the Chrome DevTools Protocol (CDP) but extends it to a standardized approach that aims to be browser-agnostic, not limited to Chromium-based browsers.
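
To make the JSON-over-WebSocket idea concrete, here is a minimal sketch of what BiDi-style messages look like. It builds the strings by hand and does not open a real connection. The class name and the context id "ctx-1" are hypothetical; the method names (session.subscribe, browsingContext.navigate, log.entryAdded) and the "type" field come from the WebDriver BiDi specification.

```java
// Hypothetical sketch: builds WebDriver BiDi-style JSON messages by hand.
class BiDiMessageSketch {
    private int nextId = 1;

    // Wraps a method name and its JSON params in a command envelope.
    // The id lets the client match the browser's later response to this command.
    String command(String method, String paramsJson) {
        return "{\"id\":" + (nextId++) + ",\"method\":\"" + method
                + "\",\"params\":" + paramsJson + "}";
    }

    // Messages pushed by the browser on its own initiative carry "type":"event";
    // replies to commands carry "type":"success" or "type":"error" plus the command id.
    // A real client would parse the JSON; a substring check keeps the sketch short.
    static boolean isEvent(String message) {
        return message.contains("\"type\":\"event\"");
    }

    public static void main(String[] args) {
        BiDiMessageSketch client = new BiDiMessageSketch();
        // Ask the browser to push log entries as they occur.
        System.out.println(client.command("session.subscribe",
                "{\"events\":[\"log.entryAdded\"]}"));
        // Navigate; "ctx-1" is a placeholder browsing-context id.
        System.out.println(client.command("browsingContext.navigate",
                "{\"context\":\"ctx-1\",\"url\":\"https://example.com\"}"));
        // An incoming message of this shape would be routed to event listeners.
        System.out.println(isEvent(
                "{\"type\":\"event\",\"method\":\"log.entryAdded\",\"params\":{}}"));
    }
}
```

Because responses are matched by id and events arrive on the same socket, the client does not have to wait for one command to finish before it hears about something else happening in the browser.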


This is a major shift from the current structure and processing mechanism of Selenium WebDriver W3C to Selenium WebDriver BiDi. Let's first understand the internal workings through the infographic below:


[Infographic: Selenium WebDriver W3C vs Selenium WebDriver BiDi]


As we can clearly see, W3C uses a request/response (HTTP-based) mechanism in which each exchange ends once the Selenium WebDriver command has executed. This has limitations:

  • Modern web apps hold a lot of internal state, usually managed with JavaScript. These scripts trigger additional I/O in the form of network requests, which leads to the non-deterministic test behavior known as flakiness.
  • It lacks access to internal browser state such as console logs and network traffic.
  • Simulating user interactions works, but efficient workflow tests no longer fit a pure command/response model; they need a session that maintains state across steps.


Now with Selenium WebDriver BiDi

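Selenium's bidirectional features currently leverage the DevTools protocol, which uses WebSockets (see the description above). The WebDriver BiDi standard itself is a separate W3C protocol and does not depend on DevTools. Developer tools are integrated into modern browsers such as Chrome, Edge, Safari and Firefox.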

DevTools help us to:

1) Check the look and feel of our application
2) Verify that the correct styles are applied
3) View the HTML and DOM structure at a high level
4) Drill down into specific details as needed

Alongside WebDriver BiDi, Selenium supports CDP (Chrome DevTools Protocol), which communicates with Chromium-based browsers over WebSockets.


Why only chromium based browsers?

Because CDP is specific to Chromium, and the WebDriver BiDi protocol is not yet fully implemented. BiDi is still under development; once it is complete, it should reduce the overhead of changing our code for each browser.

Below is a code example for leveraging CDP:

The Selenium project provides support for Chrome DevTools. For example, we can set a cookie with the help of CDP Network API

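public void setCookie() {
  ChromeDriver driver = new ChromeDriver();
  DevTools devTools = driver.getDevTools();
  devTools.createSession();

  // Network comes from a version-specific package (org.openqa.selenium.devtools.vNN.network).
  // The trailing Optional arguments follow the Selenium documentation example;
  // their number depends on the CDP version you import.
  devTools.send(
      Network.setCookie(
          "cheese",                          // name
          "gouda",                           // value
          Optional.empty(),                  // url
          Optional.of("www.selenium.dev"),   // domain the cookie applies to
          Optional.empty(),                  // path
          Optional.of(true),                 // secure
          Optional.empty(),                  // httpOnly
          Optional.empty(),                  // sameSite
          Optional.empty(),                  // expires
          Optional.empty(),                  // priority
          Optional.empty(),                  // sameParty
          Optional.empty(),                  // sourceScheme
          Optional.empty(),                  // sourcePort
          Optional.empty()));                // partitionKey

  driver.get("https://www.selenium.dev");
  Cookie cheese = driver.manage().getCookieNamed("cheese");
  Assertions.assertEquals("gouda", cheese.getValue());
  driver.quit();
}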

Let’s walk through this code sample step by step for better comprehension. First, we initialize the ChromeDriver. A simple session gets created with that driver.

ChromeDriver driver = new ChromeDriver();        

We get the DevTools instance (for later use).

DevTools devTools = driver.getDevTools();        

Then we establish a DevTools session. Calling createSession() performs the initial WebSocket handshake and sets up the connection.

devTools.createSession();        

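Now we move on to actually sending commands. We send a command by calling devTools.send().

devTools.send(
    Network.setCookie(
        "cheese",
        "gouda"
        /* followed by the Optional arguments that the versioned
           Network.setCookie signature requires (url, domain, path, ...) */));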

After that, we can use traditional Selenium WD [Classic] commands to go to a particular URL, verify the results, and quit the session.

driver.get("https://www.selenium.dev");
Cookie cheese = driver.manage().getCookieNamed("cheese");
Assertions.assertEquals("gouda", cheese.getValue());
driver.quit();        

We can also use CDP to print console logs, which helps when diagnosing flakiness or intermittent network problems.

Console Logs

Let’s review another CDP domain supported by Selenium — the Logging Domain:

ChromeDriver driver = new ChromeDriver();
DevTools devTools = driver.getDevTools();
devTools.createSession();
// Log comes from a version-specific package (org.openqa.selenium.devtools.vNN.log).
devTools.send(Log.enable());
// Print every console entry the browser reports.
devTools.addListener(Log.entryAdded(),
       logEntry -> {
        System.out.println("log: " + logEntry.getText());
        System.out.println("level: " + logEntry.getLevel());
       });
driver.get("https://example.com");
// Check the terminal output for the browser console messages.
driver.quit();

First we enable the Log domain. This command does not return any information; it merely tells the browser to start reporting log entries.

devTools.send(Log.enable());        

Now we add a listener. Adding a listener is what allows us to listen to events.

devTools.addListener(Log.entryAdded(),
	logEntry -> {
		System.out.println("log: " + logEntry.getText());
		System.out.println("level: " + logEntry.getLevel());
	});

This tells the browser to notify our script whenever a new log entry is added.


Read more in the Selenium CDP documentation: https://www.selenium.dev/documentation/webdriver/bidi/cdp/


Selenium WebDriver W3C vs Selenium WebDriver BiDi

Selenium WebDriver W3C is a standard protocol designed according to the W3C specification. It provides multiple language bindings for full flexibility.

CDP is a protocol, but it only supports Chromium-based browsers, such as Chrome and Edge.

With Selenium WebDriver W3C, the browser driver runs an HTTP server in the back end. The client sends commands to the driver, and the driver carries these instructions on to the browser. Communication happens via the traditional HTTP request/response protocol in a REST-style API. To wait for an element, the client polls, asking the server whether the element is available, often multiple times.

CDP uses a WebSocket which is bidirectional in nature. WebSocket has the capacity to send the commands and concurrently listen to the events/messages from the server in real time.
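
The difference between polling over request/response and listening on a bidirectional channel can be sketched without a browser. The following is a plain-Java simulation under stated assumptions: the class, the method names and the "element-available" event are hypothetical and stand in for real driver calls and real browser events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical simulation: polling (request/response) versus pushed events (WebSocket style).
class PollingVsPushSketch {

    // Request/response style: keep asking until the condition holds.
    // Returns how many checks were needed, or -1 if maxAttempts ran out.
    // A real client would also sleep between attempts.
    static int pollUntil(BooleanSupplier condition, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return attempt;
            }
        }
        return -1;
    }

    // Event style: listeners register once and are called when something happens.
    static class EventSource {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }

        void emit(String event) {
            listeners.forEach(listener -> listener.accept(event));
        }
    }

    public static void main(String[] args) {
        // Simulated element that becomes available on the third check.
        int[] checks = {0};
        int attempts = pollUntil(() -> ++checks[0] >= 3, 10);
        System.out.println("polling needed " + attempts + " round trips");

        // The same information delivered as a pushed event: no repeated asking.
        EventSource browser = new EventSource();
        browser.addListener(event -> System.out.println("received: " + event));
        browser.emit("element-available");
    }
}
```

With polling, the cost grows with the number of checks; with a listener, the client is told once, at the moment the event occurs.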

Selenium WebDriver W3C can perform operations in the browser UI, but cannot perform those operations in the DevTools console. It can’t control the DevTools programmatically. We cannot access network requests, console components, errors or events that happen in the DevTools.

CDP has the power to access the browser DevTools. It can get messages or errors from the console, mock network requests, or wait until the DOM changes.


WD BiDi = CDP + Selenium WebDriver W3C

This combination currently works for Chromium-based browsers only!

  • 14/02/2025

-x-x-

Subscribe to my YouTube channel for more updates: https://www.youtube.com/@IamJapneetSachdeva



Christian DeLaphante

C# Test Automation Architect | Mentor

1 month ago

Actually, you are wrong: it doesn't use DevTools. It's being rebuilt around the WebDriver standard to utilise WebSockets, similar to how Playwright uses them via CDP. This time all browsers will have the same protocol for driving them in a bidirectional manner. Playwright is also experimenting with implementing it: https://github.com/microsoft/playwright/pulls?q=bidi

Thasarathan Saminathan

Manager - Projects / Automation at Cognizant | Leading Automation Projects

1 month ago

Insightful and Thanks for the detailed write-up!

Olli Kulkki

Bughunter, Testing and Quality Assurance Specialist in Tech | Skilled in Cross-Disciplinary Projects | Expert in FinTech, Telecom, Media | Focused on Long-term Client Satisfaction & Team Innovation

1 month ago

Insightful, thank you for sharing

Hussain Ahmed

Passionate about Software testing, QA and technology.

1 month ago

Exciting times ahead in the SDET community with these updates.
