Data Analytics and manipulation – Part 2
Junaid Ahmed
5x profit | C-suite | xPwc | 2 ERP implementation | Financial Modelling | Power BI | Advanced Excel | Head of Finance | Forecasting | Budgeting | Training | Python basic
Welcome back to the second part of data analytics and manipulation, which will accelerate your journey in this future world! In last part, we learned to bring random data from product table to our dataset and in this practical approach we learned the use of following funtions:
Let's learn and move forward from where we left. Our data looked like below:
Bring Type of product using XLOOKUP (Exact match)
In order to stop our product name refresh each time, let's fix it by copy/paste values in column B. Select Column B, press Ctrl + C, Press Ctrl + Alt + V, Press V, Press Enter. This will convert your data into fixed values.
XLOOKUP has certain advantages over other lookup functions (VLOOKUP, HLOOKUP, LOOKUP, INDEX/MATCH). Find 5 most relevant advantages below:
As you have seen, how powerful this XLOOKUP function is and how it eliminates various complex structures that were previously being used to get required results. Let's apply this in our data:
=XLOOKUP(B2,ProductTable[Product Name],ProductTable[Product Type])
Syntax:
XLOOKUP(lookup value, lookup array, return array, if not found, match mode, search mode)
B2 is the product name column that we want to lookup in our product table to bring product type
ProductTable[Product Name] is the column where our unique product names are appearing.
ProductTable[Product Type] is the column where product type is appearing and that's what we want xlookup function to return in our dataset.
Just copy paste formula in B2 till B101. That's exactly what we wanted for our data.
Bonus - More uses of XLOOKUP
Generating Random Sales Rep Names using RandArray, Index, CountA
Now come back to our data and bring random names of Sales Person using same technique that we used in our part 1.
=INDEX(
SaleRepTable[Sales Person],
RANDARRAY(100,1,1,
COUNTA(SaleRepTable[Sales Person]),
TRUE)
)
Above formula will fill 100 rows with random sales person names extracted from SaleRepTable. Now again prevent sales person name being refreshed by paste special value in column D. So repeat what we have learned above: Select Column D, press Ctrl + C, Press Ctrl + Alt + V, Press V, Press Enter.
Bring Region and Locality of Sales Rep using Vlookup and Table Columns
You might be aware of using vlookup function and bring data from normal excel datasheet. Have you ever tried bringing data from Table using Table column names? Let's try now and type formula in column E as follows:
=VLOOKUP(D2:D101,SaleRepTable,2,FALSE)
Syntax:
VLOOKUP(lookup value, table array, column index, range lookup)
Above formula searches all data in rows D2 to D101 in one go and brings back result from SaleRepTable from column1. It is important to note that first column of SaleRepTable contains values that we are trying to search from.
What if we want to search from some middle column of table and want to bring value from next column or any column on the right. Let's try it in case of locality and type following formula:
=VLOOKUP(E2,SaleRepTable[TRegion]:SaleRepTable[TLocality],2,FALSE)
Now in above example, we have not given name of table but we have selected range of columns starting from TRegion to TLocality. We wanted to search column E "Region" values and for vlookup that should be our first column. Therfore, we have started our table array argument from TRegion column of SalesRepTable.
Now copy paste above formula till E101 row.
Generate Random sales dates using RANDBETWEEN
Type 01/01/23 in column T1 and 31/12/23 in column T2
Now in column G2 type formula:
=RANDBETWEEN($T$1,$T$2)
Syntax:
RANDBETWEEN(botton, top)
$T$1 defines the lowest date that we want to use as sales date in our table
$T$2 defines the highest date that we want to use
Above will generate random sales date for the data. If your data is reflecting incorrect date format, leave it for now and we will correct it in data formatting part. Now copy paste fromula till G101.
Difference between RANDARRAY and RANDBETWEEN is that RANDBETWEEN generates random number in one cell and RANDARRAY generates random number in multiple cells and has capability to generate numbers in decimal also.
Now Let's have a look, how data appears now:
Now convert Column G into fixed values by using: Select Column G, press Ctrl + C, Press Ctrl + Alt + V, Press V, Press Enter. We will generate random sales quantity by using following formula in H2:
=RANDBETWEEN(1,10)
Now copy/paste the formula till H101 and convert it into fixed values.
Bring rates from Product Table using INDEX + MATCH
We can use XLOOKUP and VLOOKUP to bring rates. However, our objective is to learn new and alternate data anlaytics techniques. So let's use index with match. As we have already learned that index brings row number and exploiting this technique with match function, we can bring rates of products. Let's explore this:
=INDEX(ProductTable[Rates],
MATCH(B2,ProductTable[Product Name])
)
With match function, we want to match B2 value in Product Name column of Product Table. This brings us the row number in which such value matches.
We can use the number generated by match function as second argument in index function. This will direct INDEX function to call data from the row of selected column and we will dodge here by selecting Rate column. Now copy paste data from I2 to I101.
Complete out data for anlaysis
Now in column J multiply column H and I: =H2*I2 and finalize our data manipulation. In next article we will find answers to few questions in our data using various formulas used for data analytics. Let's see what our final file looks now:
Stay connected and improve your skills for accelerated career and growth in your Job!