Selenium Basic for Chrome browser
Vadim Tyuryaev
Data Scientist | PhD Candidate in Statistics | Executive MBA candidate | ML & AI Expert | Digital Innovation Advocate | International Educator |
Modern data science demands a high degree of flexibility, knowledge of multiple programming languages such as Python, SAS, and R, as well as, a plethora of skills and tools. One of said skills that I had to learn and use early in my career was Web Scraping. Another skill which will definitely not hurt knowing or utilizing is VBA. This tutorial will guide you through the installation process of Selenium Basic and its usage in tandem with Excel’s VBA and Chrome Browser. You will also be equipped with several real-world examples.
Requirements:
·??????Microsoft Excel
·??????Chrome browser
·??????Basic proficiency in VBA
·??????Basic understanding of web page elements inspection?
Installation:
1.????Install Selenium IDE?for Chrome.
2.????Install Selenium Basic.
3.????Access Visual Basic from the Developer Tab in Excel. If developer tab is not showing, check here.
4.????Add Selenium Type Library in the VBA project references (select References from the Tools menu). Note the Location, we will need it at the last step.?
5.????Download the latest Chrome Driver appropriate for the current version of your Chrome browser. My Chrome version is 102.0.5005.115 (click on the three dots menu, then select Settings and proceed to About Chrome).?Therefore, I had downloaded Chrome Driver version 102.
6.????Place the downloaded chromedriver.exe file in the folder where your Selenium Basic is. For me it is: C:\Users\vadim\AppData\Local\SeleniumBasic
Check:
The following script should open the main Wikipedia page, maximize the window, and wait for 15 seconds.
Sub Test()
Dim Bot As New WebDriver
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Bot.Window.Maximize
Application.Wait Now + TimeValue("00:00:15")
End Sub
Each new script below will build on the previous scripts.?
Example 1: (Automated Wikipedia Search):
We will be searching for an article on machine learning.
Right click on the Wikipedia search window and select Inspect. You should be able to easily locate web element id (“searchInput”)?as shown below.
The following command will enter the expression “machine learning” into the Wikipedia search window.
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Now all that we have left to do is to click the search button. Action buttons can be difficult to work with, therefore, we will take advantage of the full XPath information (Inspect search button, then click Copy and select Copy full XPath):?
Now, here is the entire script for our automated search:
Sub WikiSearch()
Dim Bot As New WebDriver
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Application.Wait Now + TimeValue("00:00:05")
Bot.Window.Maximize
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Bot.FindElementByXPath("/html/body/div[3]/form/fieldset/button/i").Click
End Sub
Your Sub will take you to the following article:?
Example 2: (Text Analysis):
Using the script below, we can count occurrences of words, expressions, sentences etc. We will count how many times the expression “deep learning” arises in the Wikipedia's machine learning article.?
Sub WordCounter()
Dim Bot As New WebDriver
Dim Count As Long
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Application.Wait Now + TimeValue("00:00:05")
Bot.Window.Maximize
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Bot.FindElementByXPath("/html/body/div[3]/form/fieldset/button/i").Click
Count = Bot.FindElementsByXPath("https://*[contains(text(),'deep learning')]").Count
Debug.Print Count
End Sub
Please, note that the algorithm is case-sensitive.
The following adjustment can be used for case-insensitive matching:?
contains(translate(text(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'deep learning')
Example 3: (Extracting Links):
Let’s say we are interested in web links on the Wikipedia's machine learning page that contain the word “TensorFlow”. The following code will write down names of the links in the first column of Excel’ Sheet 1, and extract links in the second column.
Sub ExtractLinks()
Dim Bot As New WebDriver
Dim Count As Long
Dim eleList As New List
Dim ele
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Application.Wait Now + TimeValue("00:00:05")
Bot.Window.Maximize
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Bot.FindElementByXPath("/html/body/div[3]/form/fieldset/button/i").Click
Set eleList = Bot.FindElementsByXPath("https://*[contains(text(),'TensorFlow')]")
j = 0
For Each ele In eleList
? ? Sheet1.Cells(j + 1, 1) = ele.Text
? ? Sheet1.Cells(j + 1, 2) = ele.Attribute("href")
? ? j = j + 1
Next ele
Debug.Print j
End Sub
This is how Sheet 1 in Excel should look like now:?
Bonus (LinkedIn automated login):
All that you should change is your email/phone number and your password in the script below.?
Sub LinkedInLogin()
Dim Bot As New WebDriver
Bot.Start "chrome"
Bot.Get "https://www.dhirubhai.net/"
Application.Wait Now + TimeValue("00:00:03")
Bot.Window.Maximize
Bot.FindElementById("session_key").SendKeys ("[email protected]")
Bot.FindElementById("session_password").SendKeys ("yourpassword12345678")
Application.Wait Now + TimeValue("00:00:05")
Bot.FindElementByXPath("/html/body/main/section[1]/div/div/form/button").Click
End Sub
Associate Professor of Agricultural Economics at Kansas State University
1 年The instructions are very specific. Thank you. However, I get a Run-time error Automation Error on Bot.Start "chrome". Any idea why? I'm using Chrome Version 118.0.5993.118 (Official Build) (64-bit) and downloaded the appropriate chromedriver.
Great article bro. I will definitely have this as a resource for when I need to do VBA development.