Your Function Should Be Smaller Than That

The first rule of functions is that they should be small. In "Clean Code", Robert C. Martin adds a second rule: they should be smaller than that. But how small a function should be, or what the right size of a function is, remains a source of confusion, and I have often seen developers frown at the question. In this article, I list the reasons why functions should be small, along with guidelines for deciding whether a function is the right size or should be broken down further.

Function should do only the one thing it is supposed to do

A function should do only what its name suggests. The function name and the logic inside the function should be very tightly coupled. There are 2 aspects here:

  • The function name should clearly explain the logic handled inside the function
  • The function should follow the "Single Responsibility" principle

The purpose of the function should be explicit from its name, and the logic inside the function should not do more than what the name suggests. There should not be any hidden additional logic inside the function. Hidden logic always hinders readability and maintainability. In general, all functions should follow the "Single Responsibility" principle, the first of the S.O.L.I.D design principles. Some of the advantages of following the single responsibility principle are:

  • Easy to read/understand
  • Avoids unnecessary/frequent changes
  • Reduces the impact of changes to other parts of the program.
  • Reduces coupling between components which helps to create evolvable code
  • Bugs are easy to identify, or reveal themselves
  • Easy to scale
  • Avoids code duplication

Functions should do one thing. They should do it well. They should do it only. (Ref. Clean Code by Robert C. Martin)
If you try to explain your function's logic in a few words and the explanation is "the function does this and that", then "this" has to be one function and "that" has to be another.


Example: FetchStudentsData() does both fetching and processing.

Title: Cluttered code Code Language: Perl

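use POSIX qw(strftime);

sub FetchStudentsData {
    # Placeholder: replace the empty list with the SQL query that retrieves the data
    my @StudentsDataSet = ();

    foreach my $data (@StudentsDataSet) {
        # Compute the age before DOB is overwritten with its formatted form.
        # GetDifferentInYears() and Today() are assumed to be defined elsewhere.
        $data->{Age} = GetDifferentInYears($data->{DOB}, Today());
        # Assumes DOB is stored as epoch seconds; formats it as month, day, 4-digit year
        $data->{DOB} = strftime("%m%d%Y", localtime($data->{DOB}));
        $data->{FullName} = $data->{FirstName}." ".$data->{LastName};
    }

    return \@StudentsDataSet;
}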

Here FetchStudentsData does 2 things: fetching and processing. The function has to be retested extensively whenever either the query or the processing changes. This is a very small function, so readability is not an issue here, but readability becomes a big problem when the function is big and incorporates complex logic.

Small functions that do multiple things might not have a readability issue, but they will definitely hurt scalability and evolution.

Title: Refactored code Code Language: Perl

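use POSIX qw(strftime);

sub GetProcessedStudentsData {
    my @StudentsDataSet = FetchStudentsDataFromDB();

    ProcessStudentsData(\@StudentsDataSet);

    return \@StudentsDataSet;
}

sub FetchStudentsDataFromDB {
    # Placeholder: replace the empty list with the SQL query that retrieves the data
    my @StudentsDataSet = ();

    return @StudentsDataSet;
}

sub ProcessStudentsData {
    my ($StudentsDataSet) = @_;

    foreach my $student (@$StudentsDataSet) {
        SetAge($student);        # needs the raw DOB, so it runs before FormatDOB
        SetFullName($student);
        FormatDOB($student);
    }
}

sub FormatDOB {
    my ($student) = @_;
    # Assumes DOB is stored as epoch seconds; formats it as month, day, 4-digit year
    $student->{DOB} = strftime("%m%d%Y", localtime($student->{DOB}));
}

sub SetFullName {
    my ($student) = @_;
    $student->{FullName} = $student->{FirstName}." ".$student->{LastName};
}

sub SetAge {
    my ($student) = @_;
    # GetDifferentInYears() and Today() are assumed to be defined elsewhere
    $student->{Age} = GetDifferentInYears($student->{DOB}, Today());
}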

Should have a Maximum of 2 levels of Indentation

A function should contain at most 2 levels of code indentation. It would be great if the indentation level were 1, but most of the time a simple "if" comparison or "for" loop pushes it to 2 levels, which cannot be avoided. If there are more than 2 levels of indentation, move the logic at the 2nd level of indentation into a separate function. This gives us the benefits below:

  • Improves readability
  • Reduces cyclomatic complexity of the function
  • Easy to test and have high code coverage
  • Establishes "Single Responsibility" principle
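
As a minimal sketch of this guideline, the example below takes a function with three levels of indentation and moves the inner logic into separate functions. The data layout (classes holding students with Name and Score fields), the pass mark of 50 and all function names are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Before: "if" inside "foreach" inside "foreach" gives three levels of
# indentation in the function body.
sub CollectPassedStudentsNested {
    my ($classes) = @_;
    my @passed;
    foreach my $class (@$classes) {
        foreach my $student (@{ $class->{Students} }) {
            if ($student->{Score} >= 50) {
                push @passed, $student->{Name};
            }
        }
    }
    return \@passed;
}

# After: each function stays within two levels of indentation.
sub HasPassed {
    my ($student) = @_;
    return $student->{Score} >= 50;
}

sub PassedStudentNames {
    my ($students) = @_;
    return map { $_->{Name} } grep { HasPassed($_) } @$students;
}

sub CollectPassedStudents {
    my ($classes) = @_;
    my @passed;
    foreach my $class (@$classes) {
        push @passed, PassedStudentNames($class->{Students});
    }
    return \@passed;
}

# Hypothetical sample data
my $classes = [
    { Students => [ { Name => 'Asha', Score => 72 }, { Name => 'Ben', Score => 41 } ] },
    { Students => [ { Name => 'Chen', Score => 50 } ] },
];

print join(", ", @{ CollectPassedStudents($classes) }), "\n";   # prints "Asha, Chen"
```

Both versions return the same names, but in the second one HasPassed() and PassedStudentNames() can each be read and tested on their own.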


Should be less than 10 lines

Even when the code indentation is only 1 or 2 levels, we can end up with a big function. Code is never easy to read when the function has many lines, as the reader has to keep a lot of code in mind to understand the complete logic. Try to group related lines or related function calls into a separate function, so that a developer reading the code top-down doesn't have to read through many lines of code and can instead read the function names to understand the logic. In other words, create abstraction by moving related logic into a separate function. When the reader wants the details of a function, they can dive into that function's code, but most of the time the high-level information provided by the function name should suffice. So enable easy readability by using smaller functions.

In the previous example of GetProcessedStudentsData(), the reader has to go through only 2-3 lines to understand the logic when skimming the code. If they decide to understand the processing, they can dive into ProcessStudentsData(). But with FetchStudentsData(), they have to go through 7-8 lines of code to understand the logic. This is a small function, so going through 7-8 lines is not an issue, but when a function is bigger it will definitely hinder readability.

Function should have a minimum number of arguments

An argument is at a different level of abstraction than the function name, and it forces the reader to know a detail that is not needed at that point. Readers have to interpret the arguments each time they read the call. It is also difficult to tell which argument is an input and which is an output unless a strict naming convention is followed for variables. The ideal number of arguments for a function is zero, but that is not possible most of the time, so developers should try their best to keep the number to a minimum.

Some suggestions to reduce or remove unnecessary arguments are:

  • Instead of passing a reference variable for the method to set, use a return statement. E.g. String getFooter() is better than includeFooter(String data)
  • Break up the method if it has a long list of arguments
  • Avoid flag arguments; instead, create separate functions to handle the action for each flag value. More details: Martin Fowler, "FlagArgument"
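
The suggestions above can be sketched roughly as follows. All function names, the student fields and the footer text are hypothetical; the sketch only illustrates replacing a flag argument with separate functions and replacing an output argument with a return value.

```perl
use strict;
use warnings;

# Before: the flag argument hides two behaviours behind one name, and the
# caller has to remember what a bare 1 or 0 means at the call site.
sub FormatName {
    my ($student, $lastNameFirst) = @_;
    return "$student->{LastName}, $student->{FirstName}" if $lastNameFirst;
    return "$student->{FirstName} $student->{LastName}";
}

# After: one function per behaviour, each taking a single argument.
sub FormatFullName {
    my ($student) = @_;
    return "$student->{FirstName} $student->{LastName}";
}

sub FormatSortableName {
    my ($student) = @_;
    return "$student->{LastName}, $student->{FirstName}";
}

# Before: the caller passes a reference for the function to modify.
sub AppendFooter {
    my ($reportRef) = @_;
    $$reportRef .= "\n-- end of report --";
    return;
}

# After: no arguments, the value simply comes back as the return value.
sub GetFooter {
    return "\n-- end of report --";
}

my $student = { FirstName => 'Ada', LastName => 'Lovelace' };
print FormatFullName($student), "\n";       # prints "Ada Lovelace"
print FormatSortableName($student), "\n";   # prints "Lovelace, Ada"

my $report = "Report body" . GetFooter();
```

A call such as FormatName($student, 1) tells the reader nothing about what the 1 does, while FormatSortableName($student) states its intent in the name.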


No Nested If and Loops

Nested loops and nested if statements are always a BIG NO. Some of the reasons for avoiding them are:

  • Hinders readability
  • Increases cyclomatic complexity
  • Decreases testability

The second-level loop can always be moved to a separate function.


Example: One famous nested loop implementation is bubble sort. The usual implementation of bubble sort looks like the one below.

Title: Cluttered code Code Language: C

void bubbleSort(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n-1; i++)
        for (j = 0; j < n-i-1; j++)
            if (arr[j] > arr[j+1])
                swap(&arr[j], &arr[j+1]);
}        

Let us refactor the bubble sort function to remove the nested loop.

Title: Refactored code Code Language: C

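/* Helpers are defined before their callers so that no prototypes are needed. */
void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

void sortCurAndNextElement(int arr[], int curInd)
{
    if (arr[curInd] > arr[curInd+1])
        swap(&arr[curInd], &arr[curInd+1]);
}

/* One pass over the first len elements moves the largest of them to the end. */
void fixLastMinusIElement(int arr[], int len)
{
    int j;
    for (j = 0; j < len-1; j++)
        sortCurAndNextElement(arr, j);
}

void bubbleSort(int arr[], int len)
{
    int i;
    for (i = 0; i < len-1; i++)
        fixLastMinusIElement(arr, len-i);
}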


Code should be Unit testable

Modularise the logic into functions such that all the logic inside each function is easily unit testable. In most cases, unit tests are given less importance and only very few cases are tested. Developers tend to leave the remaining scenarios to be caught in later testing phases. This habit eventually affects both code quality and the project timeline due to endless debugging. Many hours of debugging can be avoided if functions are unit tested properly with positive, negative and edge cases. Unit tests should always have 100% coverage. Though 100% test coverage is much needed, it alone cannot ensure that the code works correctly. Sometimes a single positive case can itself give 100% coverage when the function is small. Developers have to understand the need for and advantages of unit testing, and test comprehensively by including positive, negative and edge cases to uncover issues.

In the refactored code, each piece of functionality can be unit tested easily. Below are the benefits of writing code with testability in mind:
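
To illustrate why coverage alone is not enough, here is a small sketch using the core Test::More module. AverageScore() is a hypothetical helper invented for this example. A single positive test executes every line of it, yet the empty-list edge case still fails.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: returns the average of a list of scores.
sub AverageScore {
    my (@scores) = @_;
    my $total = 0;
    $total += $_ foreach @scores;
    return $total / scalar(@scores);
}

# Positive case: this one test already executes every line of AverageScore(),
# so line coverage is 100%.
is(AverageScore(40, 60), 50, 'positive: average of two scores');

# Edge case: an empty list divides by zero, a bug the positive case never
# reaches even though the coverage report looks complete.
my $result = eval { AverageScore() };
ok(!defined $result, 'edge: empty list is not handled');
like($@, qr/division by zero/, 'edge: the call dies with a division by zero');

done_testing();
```

The edge-case test documents the problem. The fix, for example returning 0 or undef for an empty list, is a design decision for the function's author.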

  • Bugs reveal themselves and it will be difficult for bugs to hide
  • Easy to read and understand
  • Easy to scale
  • Improves re-usability


Example: The code below identifies the fruits that appear more than once.

Title: Cluttered code Code Language: Perl

my @fruits = qw(ban ban ban apple cherry apple);

sub GetAndPrintDuplicateItems {
    my %fruitscount;

    foreach my $fruit (@fruits) {
        $fruitscount{$fruit}++;
    }

    my @dups = grep { $fruitscount{$_} > 1 } keys %fruitscount;

    # Interpolating the array separates the items with spaces
    print "@dups\n";
}

In the above example, it's not easy to test the foreach and grep statements separately, as they are coupled inside one function. A better, refactored version looks like the one below:

Title: Refactored code Code Language: Perl

my @fruits = qw(ban ban ban apple cherry apple);

sub GetCountOfUniqItem {
    my %fruitscount;

    foreach my $fruit (@fruits) {
        $fruitscount{$fruit}++;
    }
    return \%fruitscount;
}

sub GetItemsWithCountGrtrThanOne {
    my ($fruitscount) = @_;

    # Return an array reference so the caller can hold the result in a scalar
    return [ grep { $fruitscount->{$_} > 1 } keys %$fruitscount ];
}

sub GetDuplicateItems {
    my $fruitscount = GetCountOfUniqItem();

    return GetItemsWithCountGrtrThanOne($fruitscount);
}

sub PrintItems {
    my ($items) = @_;

    # Interpolating the dereferenced array separates the items with spaces
    print "@$items\n";
}

my $dupitems = GetDuplicateItems();
PrintItems($dupitems);
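
Once counting and filtering are separate functions, each can be tested on its own with positive, negative and edge cases. The sketch below uses the core Test::More module. CountItems() and ItemsWithCountGreaterThanOne() are hypothetical variants of the functions above that take their input as arguments, an assumption made here so that the tests do not depend on a file-level variable.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical variant of the counting function: takes the items as arguments.
sub CountItems {
    my (@items) = @_;
    my %count;
    $count{$_}++ foreach @items;
    return \%count;
}

# Hypothetical variant of the filtering function: returns a sorted list so
# that the result does not depend on hash ordering.
sub ItemsWithCountGreaterThanOne {
    my ($count) = @_;
    my @dups = grep { $count->{$_} > 1 } keys %$count;
    return sort @dups;
}

# Positive cases
is_deeply(CountItems(qw(ban ban apple)), { ban => 2, apple => 1 },
    'positive: counts each item');
is_deeply([ ItemsWithCountGreaterThanOne({ ban => 3, apple => 2, cherry => 1 }) ],
    [qw(apple ban)], 'positive: returns items seen more than once');

# Negative case
is_deeply([ ItemsWithCountGreaterThanOne({ apple => 1, cherry => 1 }) ],
    [], 'negative: no duplicates gives an empty list');

# Edge cases
is_deeply(CountItems(), {}, 'edge: empty input gives empty counts');
is_deeply([ ItemsWithCountGreaterThanOne({}) ], [], 'edge: empty counts give an empty list');

done_testing();
```

The filtering tests build their own count hashes, so a bug in the counting step cannot mask or cause a failure in the filtering step.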

Final thoughts:

Creating smaller functions might initially look difficult, as we are used to creating bigger functions. But it can be achieved easily with practice, and in the longer run it even improves the thought process for working out the logic of a problem. Developers might feel that creating smaller functions leads to an excessive number of functions. The benefits reaped from maintaining smaller functions always outweigh the cost of having more functions. An overall summary of the benefits of smaller functions:

  • Easy to read/understand
  • Easy to maintain
  • Avoids unnecessary/frequent changes
  • Reduces the impact of changes to other parts of the program.
  • Reduces coupling between components which helps to create evolvable code
  • Bugs are easy to identify, or reveal themselves
  • Increases testability
  • Easy to scale
  • Avoids code duplication
  • Less cyclomatic complexity

Raja Savarimuthu

Technical Lead at Tech Mahindra

5 months

Excellent content. However, in certain sections, the code seems to be written in C, though it's labeled as Perl.

Venkatesh Chandrasekharan

Sr. Architect |Executive Director Technology| Cloud Engineer | web specialist | Hands on technologist |

5 months

Good one Thiru. When dealing with a new type of feature, its very natural for a developer to focus on getting things right first and ending up with a 200 line function. Once the unknowns are figured out, the most important thing to do is to dedicate the time to refactor the code to make it readable and maintainable . Your tips here give a lot of insights on how to do just that.
