Gideon's Excel Tips, no. 2: Picking a random sample from a list
If you have a list of items and need to randomly choose a certain number of them, there are many ways to do that. Here is the simplest way to do it in Excel.
Let’s say I have been asked to conduct a survey of the company’s employees. But I have to do it all on my own and, rather than surveying all 92 of them, I have decided to choose just 25% of them, or 23, at random.
The principle of the technique is very simple: sort the names of the employees in random order and use the top 23 people from the re-sorted list as my sample. Here’s what we do, in steps.
1. Record the original order of the names, so we can return to it if we need to
Next to my list of names add a new column called Original order. Enter 1 and 2 as the top two entries and use autofill (double-click on the autofill handle) to number all the names in the list.
2. Add a new column containing random numbers
Add the RAND() function in a new column next to the names. Use autofill again to add a random number next to each name.
3. Freeze the random numbers
The problem with the random number formulae is that they change every time Excel recalculates. This is a minor point really, and not strictly necessary to pick our random sample, but I’d like to stick with one set of random numbers for my sort order so that, if anyone asks how the names were selected, I can show them that particular set of random numbers. So I’ll use Copy, Paste Values to remove the formulae, leaving just the values of the random numbers. That way they won’t keep changing.
Select all of the random number formulae in column C. Copy them (e.g. by typing Ctrl+c). Then select Paste Special with the Values option. In the animated GIF below I use the main Paste Special dialog box rather than using one of the shortcut buttons in the dropdown.
Note that after we have done that cell C4 just contains a number. The =RAND() formula has gone.
4. Sort the data using the random numbers
5. Use the top 23 people in the re-sorted list as your sample
There are many refinements of this idea, which include using LARGE and MATCH functions together as a dynamic alternative to manually sorting the list. If you want to be really fancy you can use a multi-cell array formula to do the whole thing! But the rough-and-ready method described here is the easiest and quickest that I use.
Excel Masterclasses for aspiring power users
We are launching a range of Masterclasses to fill the demand for Excel courses that are at the top end of the skill range. The courses are intended for Management Consultants and other analysts and managers who need to become true Excel power users.
The full range of courses is outlined here. The first course to be launched is a two-day course on the fundamental subject of Formulae. For more information or to book take a look here.
The courses will be run using webcasting technology in real-time and with small groups, so there will be plenty of interaction and lots of practical exercises. For the first course we will be using UK timings, starting at 9AM UK local time (but you are welcome to join from outside the UK) and the first course is scheduled for 13-14 June 2019.