Data Analytics and manipulation – Part 2

Data Analytics and manipulation – Part 2

Welcome back to the second part of data analytics and manipulation, which will accelerate your journey in this future world! In last part, we learned to bring random data from product table to our dataset and in this practical approach we learned the use of following funtions:

  • COUNTA
  • INDEX
  • RANDARRAY
  • SEQUENCE

Let's learn and move forward from where we left. Our data looked like below:

Bring Type of product using XLOOKUP (Exact match)

In order to stop our product name refresh each time, let's fix it by copy/paste values in column B. Select Column B, press Ctrl + C, Press Ctrl + Alt + V, Press V, Press Enter. This will convert your data into fixed values.

XLOOKUP has certain advantages over other lookup functions (VLOOKUP, HLOOKUP, LOOKUP, INDEX/MATCH). Find 5 most relevant advantages below:

As you have seen, how powerful this XLOOKUP function is and how it eliminates various complex structures that were previously being used to get required results. Let's apply this in our data:

=XLOOKUP(B2,ProductTable[Product Name],ProductTable[Product Type])

Syntax:

XLOOKUP(lookup value, lookup array, return array, if not found, match mode, search mode)

B2 is the product name column that we want to lookup in our product table to bring product type

ProductTable[Product Name] is the column where our unique product names are appearing.

ProductTable[Product Type] is the column where product type is appearing and that's what we want xlookup function to return in our dataset.

Just copy paste formula in B2 till B101. That's exactly what we wanted for our data.


Bonus - More uses of XLOOKUP


  1. Formula used in B2 finds USA in data from top to bottom and returns New York because we have used 1 in the end.
  2. In B3, when we changed last 1 into -1 which starts searching from bottom to top and brings Washington as its last value
  3. In B4, we asked xlookup to search value that starts from U but could be anything after U. This brought Kyiv as a result because first word that start with "U" in the list is Ukraine and Kyiv is the city written next to it. In order to achieve it, we changed 2nd last argument of formula from 0 to 2.
  4. In B5, we used same formula to see how dynamically this formula changes results. Now try to change A5 cell to I and it will bring result "New Dehli"
  5. In B6, we used formula used in B2, however we changed A6 value to France which is not present in Data and its bringing value that we wanted to present in case data is not found.
  6. In B9, we are searching country based on city name. However, city names are not right side, while country names are on left side. Vlookup cannot perform reverse lookup and previously data analyst used INDEX + MATCH to lookup on the left side of data. But powerful function of XLOOKUP can perform this task easily.


Generating Random Sales Rep Names using RandArray, Index, CountA

Now come back to our data and bring random names of Sales Person using same technique that we used in our part 1.

=INDEX(

SaleRepTable[Sales Person],

RANDARRAY(100,1,1,

COUNTA(SaleRepTable[Sales Person]),

TRUE)

)

Above formula will fill 100 rows with random sales person names extracted from SaleRepTable. Now again prevent sales person name being refreshed by paste special value in column D. So repeat what we have learned above: Select Column D, press Ctrl + C, Press Ctrl + Alt + V, Press V, Press Enter.

Bring Region and Locality of Sales Rep using Vlookup and Table Columns

You might be aware of using vlookup function and bring data from normal excel datasheet. Have you ever tried bringing data from Table using Table column names? Let's try now and type formula in column E as follows:

=VLOOKUP(D2:D101,SaleRepTable,2,FALSE)

Syntax:

VLOOKUP(lookup value, table array, column index, range lookup)

Above formula searches all data in rows D2 to D101 in one go and brings back result from SaleRepTable from column1. It is important to note that first column of SaleRepTable contains values that we are trying to search from.

What if we want to search from some middle column of table and want to bring value from next column or any column on the right. Let's try it in case of locality and type following formula:

=VLOOKUP(E2,SaleRepTable[TRegion]:SaleRepTable[TLocality],2,FALSE)

Now in above example, we have not given name of table but we have selected range of columns starting from TRegion to TLocality. We wanted to search column E "Region" values and for vlookup that should be our first column. Therfore, we have started our table array argument from TRegion column of SalesRepTable.

Now copy paste above formula till E101 row.

Generate Random sales dates using RANDBETWEEN

Type 01/01/23 in column T1 and 31/12/23 in column T2

Now in column G2 type formula:

=RANDBETWEEN($T$1,$T$2)

Syntax:

RANDBETWEEN(botton, top)

$T$1 defines the lowest date that we want to use as sales date in our table

$T$2 defines the highest date that we want to use

Above will generate random sales date for the data. If your data is reflecting incorrect date format, leave it for now and we will correct it in data formatting part. Now copy paste fromula till G101.

Difference between RANDARRAY and RANDBETWEEN is that RANDBETWEEN generates random number in one cell and RANDARRAY generates random number in multiple cells and has capability to generate numbers in decimal also.

Now Let's have a look, how data appears now:

Now convert Column G into fixed values by using: Select Column G, press Ctrl + C, Press Ctrl + Alt + V, Press V, Press Enter. We will generate random sales quantity by using following formula in H2:

=RANDBETWEEN(1,10)

Now copy/paste the formula till H101 and convert it into fixed values.

Bring rates from Product Table using INDEX + MATCH

We can use XLOOKUP and VLOOKUP to bring rates. However, our objective is to learn new and alternate data anlaytics techniques. So let's use index with match. As we have already learned that index brings row number and exploiting this technique with match function, we can bring rates of products. Let's explore this:

=INDEX(ProductTable[Rates],

MATCH(B2,ProductTable[Product Name])

)

With match function, we want to match B2 value in Product Name column of Product Table. This brings us the row number in which such value matches.

We can use the number generated by match function as second argument in index function. This will direct INDEX function to call data from the row of selected column and we will dodge here by selecting Rate column. Now copy paste data from I2 to I101.

Complete out data for anlaysis

Now in column J multiply column H and I: =H2*I2 and finalize our data manipulation. In next article we will find answers to few questions in our data using various formulas used for data analytics. Let's see what our final file looks now:

Stay connected and improve your skills for accelerated career and growth in your Job!


要查看或添加评论,请登录

Junaid Ahmed的更多文章

  • Data Analytics - Finding answers in data Part 1

    Data Analytics - Finding answers in data Part 1

    Hello Valued Readers and learning enthusiasts! In our last two articles (Part 1 and Part 2 of Data Analytics and…

    2 条评论
  • Data Analytics and manipulation – Part 1

    Data Analytics and manipulation – Part 1

    Data Analytics and manipulation – Part 1 In this tutorial, let’s learn how we can create various data types using…

    1 条评论
  • Data Analytic - How to

    Data Analytic - How to

    What is Data Analytics and its importance Anything that leads to easy decision making is data analytics. Data-driven…

社区洞察

其他会员也浏览了