Selenium WebDriver Classic vs Selenium WebDriver BiDi
Japneet Sachdeva
A WebDriver BiDi overview for test automation engineers who interact with web browsers, test web apps, and plan for the future.
Selenium WebDriver W3C (WD), also known as WebDriver Classic, uses HTTP.
Selenium WebDriver BiDi (WDB) uses WebSockets.
Moving from Selenium WebDriver W3C to Selenium WebDriver BiDi is a major shift in structure and communication mechanism. Let's first understand the internal working.
W3C uses a request/response mechanism (an HTTP-based protocol): each Selenium WebDriver command is a separate exchange that ends once the response arrives. The main limitation is that the browser has no channel for pushing events to the test script; the script always has to ask.
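To make the request/response cycle concrete, here is a minimal sketch of what one Classic command looks like on the wire. It only builds the HTTP request and does not send it. The endpoint shape (POST /session/{session id}/url for navigation) follows the W3C WebDriver specification; the driver address, the session id, and the class and method names are assumptions made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: the HTTP request behind a Classic "navigate to" command.
// The driver address and session id used in main() are hypothetical placeholders.
class ClassicNavigateRequest {

    // W3C WebDriver commands carry their parameters as a JSON body.
    // No JSON escaping is done here; fine for a plain URL, not for arbitrary text.
    static String body(String targetUrl) {
        return "{\"url\": \"" + targetUrl + "\"}";
    }

    // Builds (but does not send) the request for one navigation command.
    static HttpRequest build(String driverBase, String sessionId, String targetUrl) {
        return HttpRequest.newBuilder()
                .uri(URI.create(driverBase + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(body(targetUrl)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
                "http://localhost:9515", "hypothetical-session-id", "https://www.selenium.dev");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body("https://www.selenium.dev"));
    }
}
```

Every Classic command (click, find element, get title) is a separate request of this kind, and nothing travels back to the script until it asks again.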
Now with Selenium WebDriver BiDi
Selenium's bidirectional functionality currently leverages the DevTools protocol, which runs over WebSockets (as described above). DevTools are integrated into modern browsers like Chrome, Edge, Safari and Firefox.
DevTools help us:
1) Check the look and feel of our application
2) Verify that correct styles are applied
3) View the HTML and DOM structure at a high level
4) Drill down into specific details as needed
Selenium's bidirectional support currently includes CDP, the Chrome DevTools Protocol, which communicates with Chromium-based browsers over WebSockets.
Why only Chromium-based browsers?
Because CDP is specific to Chromium, and the cross-browser BiDi protocol is not fully implemented yet. It is still in development; the goal is a single protocol, so that our code does not have to diverge from browser to browser.
The Selenium project provides support for Chrome DevTools. Below is a code example that leverages CDP to set a cookie through the CDP Network API:
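// Network is generated per CDP version: org.openqa.selenium.devtools.v<NN>.network.Network
import org.junit.jupiter.api.Assertions;
import org.openqa.selenium.Cookie;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.devtools.DevTools;

public void setCookie() {
    ChromeDriver driver = new ChromeDriver();
    DevTools devTools = driver.getDevTools();
    devTools.createSession();
    devTools.send(
        Network.setCookie(
            "cheese",
            "gouda"
            /* , ... remaining Optional arguments elided; their number depends on the CDP version */));
    driver.get("https://www.selenium.dev");
    Cookie cheese = driver.manage().getCookieNamed("cheese");
    Assertions.assertEquals("gouda", cheese.getValue());
    driver.quit();
}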
Let’s walk through this code sample step by step for better comprehension. First, we initialize the ChromeDriver. A simple session gets created with that driver.
ChromeDriver driver = new ChromeDriver();
We get the DevTools instance (for later use).
DevTools devTools = driver.getDevTools();
Then we establish a session. Calling createSession() performs the initial WebSocket handshake and sets up the WebSocket connection.
devTools.createSession();
Now we move on to actually sending commands. We send a command by calling devTools.send().
devTools.send(
    Network.setCookie(
        "cheese",
        "gouda"
        /* , ... remaining Optional arguments elided */));
After that, we can use traditional Selenium WD [Classic] commands to go to a particular URL, verify the results, and quit the session.
driver.get("https://www.selenium.dev");
Cookie cheese = driver.manage().getCookieNamed("cheese");
Assertions.assertEquals("gouda", cheese.getValue());
driver.quit();
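For a feel of what devTools.send() puts on the WebSocket, here is a rough sketch of a CDP command as JSON text. CDP commands are JSON objects with an id (used to match the reply), a method name, and params. The class and helper names are hypothetical, and only the two cookie fields from the walkthrough are filled in; a real Network.setCookie message would normally carry more fields, such as a domain or url.

```java
// Sketch: the JSON text that a CDP command travels as over the WebSocket.
// Only "name" and "value" are included; other cookie fields are left out on purpose.
class CdpCommandMessage {

    // Builds the message by plain string formatting (no JSON escaping is done,
    // so this is only safe for simple values like the ones in this article).
    static String setCookie(int id, String name, String value) {
        return String.format(
                "{\"id\":%d,\"method\":\"Network.setCookie\",\"params\":{\"name\":\"%s\",\"value\":\"%s\"}}",
                id, name, value);
    }

    public static void main(String[] args) {
        System.out.println(setCookie(1, "cheese", "gouda"));
    }
}
```

Selenium builds and sends these messages for us; the sketch is only meant to show that a command is a small JSON document rather than an HTTP request.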
We can also use CDP to print browser console logs, which helps when diagnosing flaky tests or intermittent network problems.
Console Logs
Let’s review another CDP domain supported by Selenium, the Log domain:
ChromeDriver driver = new ChromeDriver();
DevTools devTools = driver.getDevTools();
devTools.createSession();
// Log comes from the versioned package org.openqa.selenium.devtools.v<NN>.log
devTools.send(Log.enable());
devTools.addListener(Log.entryAdded(),
    logEntry -> {
        System.out.println("log: " + logEntry.getText());
        System.out.println("level: " + logEntry.getLevel());
    });
driver.get("https://example.com");
// Check the terminal output for the browser console messages.
driver.quit();
We enable the Log domain. This command does not return any information; it merely instructs the browser to enable the Log domain.
devTools.send(Log.enable());
Now we add a listener, which is what lets us react to events.
devTools.addListener(Log.entryAdded(),
    logEntry -> {
        System.out.println("log: " + logEntry.getText());
        System.out.println("level: " + logEntry.getLevel());
    });
This tells Selenium to call our lambda whenever the browser adds a new log entry.
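Conceptually, a listener is just a callback stored under an event name and invoked when a matching event arrives. The sketch below shows that idea in plain Java. It is not Selenium's implementation; EventRegistry and its methods are hypothetical names, and the events are fed in by hand instead of coming from a browser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Conceptual sketch of an event-listener registry; not Selenium's implementation.
class EventRegistry {
    private final Map<String, List<Consumer<String>>> listeners = new HashMap<>();

    // Register a callback for one event name, e.g. "Log.entryAdded".
    void addListener(String eventName, Consumer<String> handler) {
        listeners.computeIfAbsent(eventName, key -> new ArrayList<>()).add(handler);
    }

    // Called when an event arrives; runs every matching handler and
    // returns how many handlers were invoked.
    int dispatch(String eventName, String payload) {
        List<Consumer<String>> handlers = listeners.getOrDefault(eventName, List.of());
        handlers.forEach(handler -> handler.accept(payload));
        return handlers.size();
    }

    public static void main(String[] args) {
        EventRegistry registry = new EventRegistry();
        registry.addListener("Log.entryAdded", text -> System.out.println("log: " + text));
        // Simulated events; a real client would receive these over the WebSocket.
        registry.dispatch("Log.entryAdded", "hypothetical console message");
        registry.dispatch("Network.requestWillBeSent", "ignored, nobody subscribed");
    }
}
```

The key point is the direction of the call: the test code does not ask for log entries, it is called back when one shows up.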
Read more in the Selenium CDP documentation: https://www.selenium.dev/documentation/webdriver/bidi/cdp/
Selenium WebDriver W3C vs Selenium WebDriver BiDi
Selenium WebDriver W3C is a standard protocol designed according to the W3C specification. Selenium provides multiple language bindings for it, for full flexibility.
CDP is a protocol, but it only supports Chromium-based browsers, such as Chrome and Edge.
Selenium WebDriver W3C starts an HTTP server in the back end and sends the commands to the browser driver. The driver carries these instructions on to the browser. Communication happens via the traditional HTTP request/response protocol in REST API format. To wait for an element, we poll, asking the server whether the element is available, often multiple times.
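The polling pattern can be sketched in a few lines of plain Java. This is an illustration of the idea only, not Selenium's wait implementation: PollingWait and pollUntil are hypothetical names, and the "element" is simulated by a counter that becomes ready on the third check.

```java
import java.util.function.BooleanSupplier;

// Sketch of client-side polling: ask again and again until the condition holds.
class PollingWait {

    // Returns the number of checks made, or -1 if the timeout passed
    // (or the thread was interrupted) before the condition became true.
    static int pollUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int checks = 0;
        while (System.currentTimeMillis() <= deadline) {
            checks++; // with a real driver, each check would be one HTTP round trip
            if (condition.getAsBoolean()) {
                return checks;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical element that "appears" on the third check.
        int checks = pollUntil(() -> ++calls[0] >= 3, 2000, 10);
        System.out.println("checks needed: " + checks);
    }
}
```

Every failed check is a wasted round trip, which is the overhead an event-driven connection avoids.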
CDP uses a WebSocket which is bidirectional in nature. WebSocket has the capacity to send the commands and concurrently listen to the events/messages from the server in real time.
Selenium WebDriver W3C can perform operations in the browser UI, but cannot perform those operations in the DevTools console. It can’t control the DevTools programmatically. We cannot access network requests, console components, errors or events that happen in the DevTools.
CDP has the power to access the browser DevTools. It can get the messages or errors from the console, mock the network requests, or wait until the DOM changes.
WD BiDi = CDP + Selenium WebDriver W3C
For Chromium-based browsers only (currently)!
C# Test Automation Architect | Mentor
1 month ago: Actually, you are wrong: it doesn't use DevTools. It's being rebuilt around the WebDriver standard to utilise WebSockets, similar to how Playwright uses them via CDP. This time all browsers will have the same protocol for driving them in a bidirectional manner. Playwright is also experimenting with implementing it. https://github.com/microsoft/playwright/pulls?q=bidi