Your Function Should Be Smaller Than That
Thirugnanasambandar Kuppusamy
Principal Member Of Technical Staff at athenahealth
The first rule of functions is that they should be small. Robert C. Martin's book "Clean Code" goes further and says that they "should be smaller than that". But how small a function should be, or what the correct size of a function is, is always a source of confusion, and I have often seen developers frown at this question. In this document, I list why functions should be small, along with guidelines for deciding whether a function is the right size or should be broken down further.
A function should do only the one thing it is supposed to do
A function should do only what its name suggests. The function name and the logic inside the function should be very tightly coupled. There are 2 aspects here:
- The function name should clearly explain the logic handled inside the function
- The function should follow the "Single Responsibility" principle
The purpose of the function should be explicit from its name, and the logic inside the function should not do more than the name suggests. There should not be any hidden additional logic inside the function; hidden logic always hinders readability and maintainability. In general, all functions should follow the Single Responsibility principle, which is the first of the S.O.L.I.D design principles. Some of the advantages of following the Single Responsibility principle are:
Functions should do one thing. They should do it well. They should do it only. (Ref. Clean Code by Robert C. Martin)
If you try to explain your function's logic in a few words and your explanation is "the function does this and that", then "this" has to be a separate function and "that" has to be a separate function.
Example: the FetchStudentsData() function does both fetching and processing.
Title: Cluttered code Code Language: Perl
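use POSIX qw(strftime);

sub FetchStudentsData {
    my @StudentsDataSet = ();    # placeholder: run the SQL query that retrieves the data
    foreach my $data (@StudentsDataSet) {
        # DOB is assumed to be epoch seconds, so Age is computed before DOB is reformatted
        $data->{Age}      = GetDifferenceInYears($data->{DOB}, Today());
        $data->{FullName} = $data->{FirstName} . " " . $data->{LastName};
        $data->{DOB}      = strftime("%m/%d/%Y", localtime($data->{DOB}));    # date format assumed
    }
    return \@StudentsDataSet;
}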
Here FetchStudentsData does 2 things: fetching and processing. The function has to be retested extensively whenever either the query or the processing changes. This is a very small function, so readability is not an issue here, but readability becomes a big problem when the function is large and incorporates complex logic.
Small functions that do multiple things might not have a readability issue, but they will definitely hurt scalability and evolution.
Title: Refactored code Code Language: Perl
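use POSIX qw(strftime);

sub GetProcessedStudentsData {
    my @StudentsDataSet = FetchStudentsDataFromDB();
    ProcessStudentsData(\@StudentsDataSet);
    return \@StudentsDataSet;
}

sub FetchStudentsDataFromDB {
    my @StudentsDataSet = ();    # placeholder: run the SQL query that retrieves the data
    return @StudentsDataSet;
}

sub ProcessStudentsData {
    my ($StudentsDataSet) = @_;
    foreach my $student (@$StudentsDataSet) {
        SetAge($student);        # runs first: it needs DOB before it is reformatted
        SetFullName($student);
        FormatDOB($student);
    }
}

sub SetAge {
    my ($student) = @_;
    $student->{Age} = GetDifferenceInYears($student->{DOB}, Today());
}

sub SetFullName {
    my ($student) = @_;
    $student->{FullName} = $student->{FirstName} . " " . $student->{LastName};
}

sub FormatDOB {
    my ($student) = @_;
    # DOB is assumed to be epoch seconds; the date format is assumed
    $student->{DOB} = strftime("%m/%d/%Y", localtime($student->{DOB}));
}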
Should have a maximum of 2 levels of indentation
A function should contain at most 2 levels of code indentation. It would be great if the indentation level were 1, but most of the time a simple "if" comparison or "for" loop pushes it to 2 levels, which cannot be avoided. If there are more than 2 levels of indentation, move the logic at the 2nd level into a separate function. This gives us the following benefits:
Should be fewer than 10 lines
Sometimes, even when the code indentation is only 1 or 2 levels, we can end up with a big function. Code is never easy to read when the function has many lines, as the reader has to remember a lot of code to understand the complete logic. Try to group related lines or related function calls into a separate function, so that a developer reading the code top-down doesn't have to read through vast stretches of code and can instead read the function names to understand the logic. In other words, create abstraction by moving related logic into a separate function. When readers want the details of a function, they can dive into that function's code, but most of the time the high-level information provided by the function name should suffice. So enable easy readability by using smaller functions.
In the previous example of GetProcessedStudentsData(), the reader has to go through only 2-3 lines to understand the logic when skimming the code. A reader who wants to understand the processing can dive into ProcessStudentsData(). But with FetchStudentsData(), the reader has to go through 7-8 lines of code to understand the logic. This is a small function, so going through 7-8 lines is not an issue, but when the function is bigger it will definitely hinder readability.
A function should have a minimum number of arguments
An argument is at a different level of abstraction than the function name, and it forces the reader to know a detail that is not needed at that point. Readers have to interpret the arguments each time they read the call. It is also difficult to tell which argument is an input and which is an output unless a strict naming convention is followed for the variables. The ideal number of arguments for a function is zero, but that is not possible most of the time, so developers should try their best to keep the count to a minimum.
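One common way to keep the argument count low, shown here only as an illustration, is to bundle related values into a single hash reference. This is a minimal sketch; the function names and student fields are hypothetical and do not come from the examples above.

```perl
use strict;
use warnings;

# Hypothetical example: the field names are illustrative only.

# Before: five positional arguments; the caller must remember their order.
sub FormatStudentLabelPositional {
    my ($first_name, $last_name, $grade, $section, $roll_number) = @_;
    return "$first_name $last_name ($grade-$section, #$roll_number)";
}

# After: one argument that bundles the related values under named keys.
sub FormatStudentLabel {
    my ($student) = @_;
    return "$student->{FirstName} $student->{LastName} "
         . "($student->{Grade}-$student->{Section}, #$student->{RollNumber})";
}

my $student = {
    FirstName  => 'Asha',
    LastName   => 'Rao',
    Grade      => 7,
    Section    => 'B',
    RollNumber => 12,
};
print FormatStudentLabel($student), "\n";
```

With the hash reference, the call site names every value it passes, so the reader no longer has to look up the parameter order.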
Some suggestions for reducing or removing unnecessary arguments are:
No Nested If and Loops
Nested loops and nested if statements are always a BIG NO. Some of the reasons for avoiding them are:
A second-level loop can always be moved to a separate function.
Example: one famous nested-loop implementation is bubble sort. The usual implementation of bubble sort looks like this:
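Title: Cluttered code Code Language: C
/* Exchange the two integers that a and b point to. */
void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

void bubbleSort(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n-1; i++)
        for (j = 0; j < n-i-1; j++)
            if (arr[j] > arr[j+1])
                swap(&arr[j], &arr[j+1]);
}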
Let us refactor the bubble sort function to remove nested loop.
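Title: Refactored code Code Language: C
/* swap() is the same helper used in the listing above. */

/* Put the element at curInd and its right neighbour in ascending order. */
void sortCurAndNextElement(int arr[], int curInd)
{
    if (arr[curInd] > arr[curInd+1])
        swap(&arr[curInd], &arr[curInd+1]);
}

/* One pass: moves the largest of the first len elements to the end. */
void fixLastMinusIElement(int intArr[], int len)
{
    int j;
    for (j = 0; j < len-1; j++)
        sortCurAndNextElement(intArr, j);
}

void bubbleSort(int intArr[], int len)
{
    int i;
    for (i = 0; i < len-1; i++)
        fixLastMinusIElement(intArr, len-i);
}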
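The same idea applies to nested if statements, which the bubble sort example does not cover. Below is a minimal Perl sketch of flattening nested conditions with early returns; the eligibility rules and field names are hypothetical and exist only to show the shape of the refactoring.

```perl
use strict;
use warnings;

# Hypothetical eligibility rules, used only for illustration.

# Before: three levels of nesting inside the function body.
sub IsEligibleNested {
    my ($student) = @_;
    if (defined $student) {
        if ($student->{Age} >= 18) {
            if ($student->{FeesPaid}) {
                return 1;
            }
        }
    }
    return 0;
}

# After: each guard returns early, so the body stays at one level.
sub IsEligible {
    my ($student) = @_;
    return 0 unless defined $student;
    return 0 unless $student->{Age} >= 18;
    return 0 unless $student->{FeesPaid};
    return 1;
}

print IsEligible({ Age => 19, FeesPaid => 1 }) ? "eligible\n" : "not eligible\n";
```

Both versions return the same result for every input; the second simply reads as a flat list of conditions.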
Code should be unit testable
Modularise the logic into functions such that all the logic inside each function is easily unit testable. In most cases, unit tests are given less importance and only very few cases are tested. Developers tend to leave the remaining scenarios to be caught in later testing phases. This habit eventually affects both code quality and the project timeline, due to endless debugging. Many hours of debugging can be avoided if functions are unit tested properly with positive, negative and edge cases. Unit tests should always have 100% coverage. Though 100% test coverage is much needed, it alone cannot ensure that the code works: when a function is small, a single positive case can by itself give 100% coverage. Developers have to understand the need for and advantages of unit testing, and test comprehensively with positive, negative and edge cases to uncover issues.
In the refactored code, each piece of functionality can be unit tested easily. Below are the benefits of writing code with testability in mind:
Example: the code below identifies the fruits that appear more than once.
Title: Cluttered code Code Language: Perl
my @fruits = qw(ban ban ban apple cherry apple);

sub GetAndPrintDuplicateItems {
    my %fruitscount;
    foreach my $fruit (@fruits) {
        $fruitscount{$fruit}++;
    }
    my @dups = grep { $fruitscount{$_} > 1 } keys %fruitscount;
    print "@dups\n";
}
In the above example, it's not easy to test the foreach and grep statements separately, as they are coupled inside one function. A better, refactored version looks like this:
Title: Refactored code Code Language: Perl
my @fruits = qw(ban ban ban apple cherry apple);

# Count how many times each fruit appears.
sub GetCountOfUniqItem {
    my %fruitscount;
    foreach my $fruit (@fruits) {
        $fruitscount{$fruit}++;
    }
    return \%fruitscount;
}

# Return the items whose count is greater than one.
sub GetItemsWithCountGrtrThanOne {
    my ($fruitscount) = @_;
    return grep { $fruitscount->{$_} > 1 } keys %$fruitscount;
}

sub GetDuplicateItems {
    my $fruitscount = GetCountOfUniqItem();
    return GetItemsWithCountGrtrThanOne($fruitscount);
}

sub PrintItems {
    my (@items) = @_;
    print "@items\n";
}

my @dupitems = GetDuplicateItems();
PrintItems(@dupitems);
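To show what testing these small functions separately might look like, here is a minimal sketch using Perl's core Test::More module. It uses stand-in versions of the counting and filtering functions that take their input as a parameter instead of reading the global @fruits; that parameter, and the sorted return order, are assumptions made so the tests can supply their own data and compare results deterministically.

```perl
use strict;
use warnings;
use Test::More;

# Stand-in version: takes the item list as an argument (an assumption).
sub GetCountOfUniqItem {
    my ($items) = @_;
    my %count;
    $count{$_}++ for @$items;
    return \%count;
}

# Stand-in version: returns the duplicated items in sorted order (an assumption).
sub GetItemsWithCountGrtrThanOne {
    my ($count) = @_;
    my @dups = sort { $a cmp $b } grep { $count->{$_} > 1 } keys %$count;
    return @dups;
}

# Positive case: repeated items are counted.
is_deeply(GetCountOfUniqItem([qw(ban apple ban)]), { ban => 2, apple => 1 },
    'counts each item');

# Edge case: an empty list gives an empty count.
is_deeply(GetCountOfUniqItem([]), {}, 'empty list gives empty counts');

# Positive case: only items seen more than once are returned.
is_deeply([GetItemsWithCountGrtrThanOne({ ban => 3, apple => 2, cherry => 1 })],
    [qw(apple ban)], 'returns only duplicated items');

# Negative case: no duplicates means an empty result.
is_deeply([GetItemsWithCountGrtrThanOne({ cherry => 1 })], [],
    'no duplicates gives empty result');

done_testing();
```

Because counting and filtering are separate functions, each test can feed one of them a hand-built input without going through the other.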
Final thoughts:
Creating smaller functions might initially look difficult, as we are used to creating bigger functions. But it can be achieved easily with practice, and in the longer run it even improves the thought process of working out the logic of a problem. Developers might feel that creating smaller functions leads to an excessive number of functions, but the benefits of maintaining smaller functions always outweigh the cost of having more functions. Overall summary of the benefits of smaller functions:
Technical Lead at Tech Mahindra
5 months ago: Excellent content. However, in certain sections, the code seems to be written in C, though it's labeled as Perl.
Sr. Architect |Executive Director Technology| Cloud Engineer | web specialist | Hands on technologist |
5 months ago: Good one Thiru. When dealing with a new type of feature, it's very natural for a developer to focus on getting things right first and end up with a 200-line function. Once the unknowns are figured out, the most important thing to do is to dedicate the time to refactor the code to make it readable and maintainable. Your tips here give a lot of insight into how to do just that.