Selenium Basic for Chrome browser
https://www.selenium.dev/

Selenium Basic for Chrome browser

Modern data science demands a high degree of flexibility, knowledge of multiple programming languages such as Python, SAS, and R, as well as, a plethora of skills and tools. One of said skills that I had to learn and use early in my career was Web Scraping. Another skill which will definitely not hurt knowing or utilizing is VBA. This tutorial will guide you through the installation process of Selenium Basic and its usage in tandem with Excel’s VBA and Chrome Browser. You will also be equipped with several real-world examples.

Requirements:

·??????Microsoft Excel

·??????Chrome browser

·??????Basic proficiency in VBA

·??????Basic understanding of web page elements inspection?

Installation:

1.????Install Selenium IDE?for Chrome.

2.????Install Selenium Basic.

3.????Access Visual Basic from the Developer Tab in Excel. If developer tab is not showing, check here.

4.????Add Selenium Type Library in the VBA project references (select References from the Tools menu). Note the Location, we will need it at the last step.?

No alt text provided for this image

5.????Download the latest Chrome Driver appropriate for the current version of your Chrome browser. My Chrome version is 102.0.5005.115 (click on the three dots menu, then select Settings and proceed to About Chrome).?Therefore, I had downloaded Chrome Driver version 102.

6.????Place the downloaded chromedriver.exe file in the folder where your Selenium Basic is. For me it is: C:\Users\vadim\AppData\Local\SeleniumBasic

Check:

The following script should open the main Wikipedia page, maximize the window, and wait for 15 seconds.

Sub Test()
Dim Bot As New WebDriver
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Bot.Window.Maximize
Application.Wait Now + TimeValue("00:00:15")
End Sub
        

Each new script below will build on the previous scripts.?

Example 1: (Automated Wikipedia Search):

We will be searching for an article on machine learning.

Right click on the Wikipedia search window and select Inspect. You should be able to easily locate web element id (“searchInput”)?as shown below.

No alt text provided for this image

The following command will enter the expression “machine learning” into the Wikipedia search window.


Bot.FindElementById("searchInput").SendKeys ("machine learning")
        

Now all that we have left to do is to click the search button. Action buttons can be difficult to work with, therefore, we will take advantage of the full XPath information (Inspect search button, then click Copy and select Copy full XPath):?

No alt text provided for this image

Now, here is the entire script for our automated search:


Sub WikiSearch()


Dim Bot As New WebDriver
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Application.Wait Now + TimeValue("00:00:05")
Bot.Window.Maximize
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Bot.FindElementByXPath("/html/body/div[3]/form/fieldset/button/i").Click


End Sub
        

Your Sub will take you to the following article:?

No alt text provided for this image

Example 2: (Text Analysis):

Using the script below, we can count occurrences of words, expressions, sentences etc. We will count how many times the expression “deep learning” arises in the Wikipedia's machine learning article.?


Sub WordCounter()


Dim Bot As New WebDriver
Dim Count As Long
Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Application.Wait Now + TimeValue("00:00:05")
Bot.Window.Maximize
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Bot.FindElementByXPath("/html/body/div[3]/form/fieldset/button/i").Click
Count = Bot.FindElementsByXPath("https://*[contains(text(),'deep learning')]").Count


Debug.Print Count


End Sub
        

Please, note that the algorithm is case-sensitive.

The following adjustment can be used for case-insensitive matching:?


contains(translate(text(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'deep learning')
        

Example 3: (Extracting Links):

Let’s say we are interested in web links on the Wikipedia's machine learning page that contain the word “TensorFlow”. The following code will write down names of the links in the first column of Excel’ Sheet 1, and extract links in the second column.


Sub ExtractLinks()


Dim Bot As New WebDriver
Dim Count As Long
Dim eleList As New List
Dim ele


Bot.Start "chrome"
Bot.Get "https://www.wikipedia.org/"
Application.Wait Now + TimeValue("00:00:05")
Bot.Window.Maximize
Bot.FindElementById("searchInput").SendKeys ("machine learning")
Bot.FindElementByXPath("/html/body/div[3]/form/fieldset/button/i").Click
Set eleList = Bot.FindElementsByXPath("https://*[contains(text(),'TensorFlow')]")


j = 0


For Each ele In eleList
? ? Sheet1.Cells(j + 1, 1) = ele.Text
? ? Sheet1.Cells(j + 1, 2) = ele.Attribute("href")
? ? j = j + 1
Next ele


Debug.Print j


End Sub
        

This is how Sheet 1 in Excel should look like now:?

No alt text provided for this image

Bonus (LinkedIn automated login):

All that you should change is your email/phone number and your password in the script below.?


Sub LinkedInLogin()


Dim Bot As New WebDriver
Bot.Start "chrome"
Bot.Get "https://www.dhirubhai.net/"
Application.Wait Now + TimeValue("00:00:03")
Bot.Window.Maximize
Bot.FindElementById("session_key").SendKeys ("[email protected]")
Bot.FindElementById("session_password").SendKeys ("yourpassword12345678")
Application.Wait Now + TimeValue("00:00:05")
Bot.FindElementByXPath("/html/body/main/section[1]/div/div/form/button").Click


End Sub

        
Brian Coffey

Associate Professor of Agricultural Economics at Kansas State University

1 年

The instructions are very specific. Thank you. However, I get a Run-time error Automation Error on Bot.Start "chrome". Any idea why? I'm using Chrome Version 118.0.5993.118 (Official Build) (64-bit) and downloaded the appropriate chromedriver.

Great article bro. I will definitely have this as a resource for when I need to do VBA development.

要查看或添加评论,请登录

Vadim Tyuryaev的更多文章

  • A Deep Dive into ANOVA(part 3)

    A Deep Dive into ANOVA(part 3)

    In Parts 1 and 2, we engaged in a detailed and methodological discourse on one-way and two-way Analysis of Variance…

  • A Deep Dive into ANOVA(part 2)

    A Deep Dive into ANOVA(part 2)

    In part 1 of the ANOVA series, our discussion encompassed the principles of one-way ANOVA, along with the…

  • A Deep Dive into ANOVA (part 1)

    A Deep Dive into ANOVA (part 1)

    Analysis of Variance (ANOVA) is a statistical method used to assess the equality of means across multiple groups. In…

  • The Importance of Avoiding Data Snooping and the Vast Search Effect in Data Science

    The Importance of Avoiding Data Snooping and the Vast Search Effect in Data Science

    In the dynamic landscape of data analysis, researchers and analysts often find themselves grappling with the challenges…

  • MS VBA to reorder columns in Excel

    MS VBA to reorder columns in Excel

    INTRODUCTION Recently, I encountered a challenge when using SharePoint surveys. It came to my attention that SharePoint…

  • Versioning Large Files with Git LFS

    Versioning Large Files with Git LFS

    Git is a popular version control system used for managing code repositories. However, one limitation of Git is that it…

  • GPU version of TensorFlow? for R

    GPU version of TensorFlow? for R

    Modern statistical and machine learning (ML) algorithms require fast, reliable and efficient computations. The very…

社区洞察