Macros for simple website automation

Macros for simple website automation

This article is meant for beginners who want to learn Macro automation in the lock-down period. Macros are VB programs written within the Excel application that can help automate repetitive tasks. 

To enable Development mode to write Macros. In Excel go to File-> Options-> Customize Ribbon-> Check “Developer” Tab

No alt text provided for this image

To trigger the VB Script using a button. In the Developer tab, Drag and drop the Command Button(ActiveX Control) using the Insert option. 

No alt text provided for this image

Drag and drop the Command Button to the spreadsheet and double-click it to open the VB Editor. 

No alt text provided for this image

The code between Private Sub CommandButton1_Click() and End Sub will be triggered on click of the Command Button. The file should be saved as a Macro-Enabled Workbook. 

To open Internet Explorer through VB:

 Dim objIE As Object
 Set objIE = CreateObject("InternetExplorer.Application")
 objIE.Visible = True

This Opens the a new IE window and assigns its control to the Object objIE. It is better to use 

Set objIE As InternetExplorerMedium
Set objIE=New InternetExplorerMedium 

Instead of  CreateObject("InternetExplorer.Application"). Using InternetExplorerMedium allows the VB Compiler to allocate memory in advance and make other important optimizations(early Binding).

To enable the InternetExplorerMedium object, select Microsoft Internet Controls from Tool -> References in the VB Editor. Also select Microsoft Excel 16.0 Object Library

No alt text provided for this image

Now, we will see how VB can be used to collect information about products from the e-commerce website MuFubu.com 

To make the IE Window to go the website MuFuBu.com, we can use the command.

objIE.Navigate "https://mufubu.com/collections/backpacks?page=1"

To find out how many pages need to be visited, we can look at the page navigation bar. 

No alt text provided for this image

To get the value of totalItems, we can use the DOM to spy the webpage using Chrome(F12)

No alt text provided for this image

Ctrl+F in the DOM page with the search value //span[@class=”count”] returns a unique value. So this path can be used to identify the navigation bar 

No alt text provided for this image

We can find the text in the Navigation bar using the function getElementsByClassName

Set navigationTextObject = objIE.document.getElementsByClassName("count")(0)

The only element in the navigationTextObject is the navigation Bar. The text in the element in navigationTextObject will give “Showing Items **-** of **” . We can get the Total Items by getting the substring of the inner text after “of”. The Split function returns an Array with elements before and after the string "of". Cint typecasts the String to an Integer. 

totalItems = CInt(Split(navigationTextObject.innerText, "of")(1))

If totalItems is a whole multiple of 20, the number of pages will be totalItems/20. Else it will be totalItems/20+1

 If totalItems Mod 20 = 0 Then
       maxPage = totalItems / 20
 Else
       maxPage = totalItems / 20 + 1
 End If

By inspecting the list structure of the products displayed in the page, we can see that the product information is contained in a div object with class “product-info”. There are 20 items with the class “product-info” in the first page. 

No alt text provided for this image

By comparing the details on the page with HTML, we can identify the corresponding elements.

No alt text provided for this image
No alt text provided for this image

To get the first object with class “product-info”, we use the command

 Set productInfo = objIE.document.getElementsByClassName("product-info")(0)

 From this productInfo html object, we can get the html link with detailed product details by

linkAddress = productInfo.getElementsByTagName("a")(0).href

Similarly, we can get

description = productInfo.getElementsByTagName("a")(0).innerText

offeredPrice = productInfo.getElementsByClassName("price")(0).getElementsByClassName("onsale")(0) .getElementsByTagName("span")(0).innerText


 actualPrice = productInfo.getElementsByClassName("price")(0).getElementsByClassName("was")(0).getElementsByTagName("span")(0).innerText

We use a For Loop to iterate over the pages and an inner For Loop to iterate over each Item

There are 20 items on all pages except the last page. If the total number of items is a multiple of 20, the last page has 20 items. Otherwise, the number of items on the last page is the reminder when the total number of Items is divided by 20. Writing that as a For loop we have. 

   For pageNo = 1 To maxPage
        If pageNo = maxPage Then
            If totalItems Mod 20 = 0 Then
                maxItem = 20
            Else
                maxItem = totalItems Mod 20
            End If
        Else
            maxItem = 20
        End If

    Next

  After fetching the current page details, we need to navigate to the next page. We can do so by clicking on the next button. 

No alt text provided for this image

The inspecting the page Html, we can see that the Next button is inside a pagination div with a unique id.

No alt text provided for this image

So we can visit the next page by using the line 

objIE.document.getElementById("Pagination").getElementsByTagName("a")(2).Click

The next button directs to the url of the next page. Since the url is already contained in the page url, we can avoid clicking on the link and replace the page number in the url with the page number variable. 

No alt text provided for this image

Therefore, instead of clicking on the link, we can use

pageURL = "https://mufubu.com/collections/backpacks?page="

objIE.Navigate pageURL&CStr(pageNo)  

CStr is used to typecast pageNo as a String variable. To make the Vbscript wait for the page to load after navigating, we can use 

Do Until objIE.readyState = 4: DoEvents: Loop

This waits until ready State =4, ie until the browser is ready to accept input.

To set the cell values of Current workbook(Which the Macro belongs to), use Application.ThisWorkbook. To set the headings, we can use, 

    rowNumber=1

    With Application.ThisWorkbook.Worksheets("Sheet2")

        .Cells(rowNumber, 1).Value = "Sl No"

        .Cells(rowNumber, 2).Value = "Description"

        .Cells(rowNumber, 3).Value = "Offered Price"

        .Cells(rowNumber, 4).Value = "Actual Price"

        .Cells(rowNumber, 5).Value = "Url"

    End With

The With function helps us reduce the number of times we refer to the WorkBook and sheet. Inside the With- End With container, Application.ThisWorkbook.Worksheets("Sheet2"). Cells(rowNumber, 1).Value = "Sl No" can be replaced with .Cells(rowNumber, 1).Value = "Sl No"

To iterate through each item in the page and get the corresponding product information we can use an inner loop. By incrementing rowNumber every time we obtain product details, we can append the product details in a continuous list.

        For itemNumber = 0 To (maxItem - 1)

            rowNumber = rowNumber + 1

            Set productInfo = objIE.document.getElementsByClassName("product-info")(itemNumber)

            linkAddress = productInfo.getElementsByTagName("a")(0).href

            description = productInfo.getElementsByTagName("a")(0).innerText

            offeredPrice = productInfo.getElementsByClassName("price")(0).getElementsByClassName("onsale")(0).getElementsByTagName("span")(0).innerText

            actualPrice = productInfo.getElementsByClassName("price")(0).getElementsByClassName("was")(0).getElementsByTagName("span")(0).innerText

            With Application.ThisWorkbook.Worksheets("Sheet2")

                .Cells(rowNumber, 1).Value = rowNumber - 1

                .Cells(rowNumber, 2).Value = description

                .Cells(rowNumber, 3).Value = offeredPrice

                .Cells(rowNumber, 4).Value = actualPrice

                .Cells(rowNumber, 5).Value = linkAddress

            End With

        Next

Clicking the execute button gives the list of 45 product details in Sheet2

No alt text provided for this image

Please find the entire Code below

Private Sub CommandButton1_Click()
    Dim objIE As InternetExplorerMedium
    Set objIE = Nothing
    Dim totalItems As Integer, maxPage As Integer, pageNo, itemNumber, linkAddress, offeredPrice, actualPrice, description, maxItem, rowNumber, pageURL, navigationTextObject, productInfo
    pageURL = "https://mufubu.com/collections/backpacks?page="

    Set objIE = New InternetExplorerMedium
    objIE.Visible = True
    objIE.Navigate pageURL & CStr(1)
    Do Until objIE.readyState = 4: DoEvents: Loop

    Set navigationTextObject = objIE.document.getElementsByClassName("count")(0)
    totalItems = CInt(Split(navigationTextObject.innerText, "of")(1))
    
    rowNumber = 1
    With Application.ThisWorkbook.Worksheets("Sheet2")
        .Cells(rowNumber, 1).Value = "Sl No"
        .Cells(rowNumber, 2).Value = "Description"
        .Cells(rowNumber, 3).Value = "Offered Price"
        .Cells(rowNumber, 4).Value = "Actual Price"
        .Cells(rowNumber, 5).Value = "Url"
    End With
    
    If totalItems Mod 20 = 0 Then
        maxPage = totalItems / 20
    Else
        maxPage = totalItems / 20 + 1
    End If
    
    For pageNo = 1 To maxPage
        objIE.Navigate pageURL & CStr(pageNo)
        Do Until objIE.readyState = 4: DoEvents: Loop
        
        If pageNo = maxPage Then
            If totalItems Mod 20 = 0 Then
                maxItem = 20
            Else
                maxItem = totalItems Mod 20
            End If
        Else
            maxItem = 20
        End If
        
        For itemNumber = 0 To (maxItem - 1)
            rowNumber = rowNumber + 1
            Set productInfo = objIE.document.getElementsByClassName("product-info")(itemNumber)
            linkAddress = productInfo.getElementsByTagName("a")(0).href
            description = productInfo.getElementsByTagName("a")(0).innerText
            offeredPrice = productInfo.getElementsByClassName("price")(0).getElementsByClassName("onsale")(0).getElementsByTagName("span")(0).innerText
            actualPrice = productInfo.getElementsByClassName("price")(0).getElementsByClassName("was")(0).getElementsByTagName("span")(0).innerText
            With Application.ThisWorkbook.Worksheets("Sheet2")
                .Cells(rowNumber, 1).Value = rowNumber - 1
                .Cells(rowNumber, 2).Value = description
                .Cells(rowNumber, 3).Value = offeredPrice
                .Cells(rowNumber, 4).Value = actualPrice
                .Cells(rowNumber, 5).Value = linkAddress
            End With
        Next
    Next
    
    objIE.Quit
    
    Set objIE = Nothing
End Sub

Edit: The mufubu site is currently down as they are only delivering essentials(not because of too many people running VBA on their site). I have written the below code to fetch data from saucedemo.com (site meant for testing automation). You can test VBA on saucedemo.

Private Sub CommandButton1_Click()
    Dim objIE As InternetExplorerMedium
    Set objIE = Nothing
    Dim loginButtonObj, inventoryItems, nameElement, pageURL As String, rowNumber As Integer
    pageURL = "https://www.saucedemo.com/index.html"
    Dim objWSS
    Set objIE = New InternetExplorerMedium
    objIE.Visible = True
    objIE.Navigate pageURL
    Do Until objIE.readyState = 4: DoEvents: Loop
    Set objWSS = CreateObject("WScript.Shell")
    Application.Wait (Now + TimeValue("00:00:04"))
    Set nameElement = objIE.Document.getElementById("user-name")
    nameElement.Click
    nameElement.Focus
    objWSS.SendKeys "standard_user"
    objWSS.SendKeys "{TAB}"
    objWSS.SendKeys "secret_sauce"
    Application.Wait (Now + TimeValue("00:00:04"))
    'objIE.Document.getElementById("user-name").innerText = "value"
    Set loginButtonObj = objIE.Document.getElementsByClassName("btn_action")
    For Each Elem In loginButtonObj
        If Elem.Value = "LOGIN" Then
            Elem.Click
            Do Until objIE.readyState = 4: DoEvents: Loop
            Application.Wait (Now + TimeValue("00:00:05"))
        End If
    Next Elem
    Set inventoryItems = objIE.Document.getElementsByClassName("inventory_item")
    rowNumber = 1
    With Application.ThisWorkbook.Worksheets("Sheet2")
        .Cells(rowNumber, 1).Value = "Sl No"
        .Cells(rowNumber, 2).Value = "Item Name"
        .Cells(rowNumber, 3).Value = "Description"
        .Cells(rowNumber, 4).Value = "Price"
        rowNumber = rowNumber + 1
        For Each Item In inventoryItems
            .Cells(rowNumber, 1).Value = rowNumber - 1
            .Cells(rowNumber, 2).Value = Item.getElementsByClassName("inventory_item_name")(0).innerText
            .Cells(rowNumber, 3).Value = Item.getElementsByClassName("inventory_item_desc")(0).innerText
            .Cells(rowNumber, 4).Value = Item.getElementsByClassName("inventory_item_price")(0).innerText
            rowNumber = rowNumber + 1
        Next Item
    End With
End Sub

No alt text provided for this image

The login page at saucedemo.com had a peculiar property where the username and password set through getElementById("").Value and getElementById("").InnerText were cleared when clicking the login button. This might be due to javascript functions called onchange and onclick of input fields. So I entered the values as keystroker using sendKeys. It is otherwise not a good pratice to use sendkeys. getElementById("").fireEvent("onchange") might solve the problem and should be tried before going ahead with sendkeys.

In case of 462 Internet Explorer error, go to IE Settings -> Internet Options -> Security -> Uncheck Enable Protected Mode

No alt text provided for this image


Gururaj Kulkarni

Visionary AI Leader | Senior Manager | Expert in Generative AI, Automation, and Digital Transformation

4 年

Amazing article... looking forward to seeing some more from you!

要查看或添加评论,请登录

Sachdeep Sivakumar的更多文章

社区洞察

其他会员也浏览了