The Joy of Non-Functional Requirements
Neil Schiller PhD, MBCS
Senior Manager - Commercial Transformation (Analysis) at Investec Wealth & Investment UK
Non-Functional Requirements, NFRs, meh. Nobody likes them. I've never met a Business Analyst who shows any enthusiasm when asked to look at them, and I include myself in that. They're the most painful type of requirements to gather, and they always result in some difficult conversations, some conflict somewhere that needs to be resolved. Mainly because they expose, in sharp relief, the gaps between the ideal world and the real world. As a rule of thumb, in the functional universe, pretty much anything is possible to achieve, somehow. In the non-functional universe, some things become practically, sometimes literally, impossible.
What are we talking about?
Strictly speaking, anything that your deliverable needs to provide that isn't going to be coded by a developer is a non-functional requirement. Which is problematic in itself because there's no easy way to identify them, or to know which ones are strictly relevant to your solution, or to know when to stop trying to gather them. NFRs are always a little bit of a grey area. They can include things like operational process changes and amendments to roles and responsibilities, but more generally they cover things such as security, usability, performance, reliability, any regulatory constraints that might apply, failover scenarios, and Service Level Agreements.
Despite being a real pain in the neck for a BA, they are a critical part of requirements elicitation. In a previous article I wrote, which attempted to describe what a Business Analyst does, I used a scenario about capturing requirements for building a car. Under NFRs for my car requirements I included questions about resilience, about whether the car would be used for short trips to the shops or on regular, long motorway journeys. The point being that the answer to these questions will determine some critical decisions to be made in the design process: engine size, fuel type and the like. A missed or misunderstood NFR is much more critical than a missed functional requirement. Code can be fixed or rewritten relatively easily. A server or an operating platform that you realise after the fact is fundamentally not fit for purpose is infinitely more problematic and costly to replace. That Fiat Punto with a one litre engine that is starting to fall apart after doing three hundred miles a day for a prolonged period is going to have to be scrapped eventually: there will come a point where you just can't really patch it up any more. You'll have to replace it with a two litre diesel. You could have started with the two litre diesel and saved yourself all manner of heartache.
Pitfalls
There are quite a few mistakes you can make with NFRs, but there is one more than any other that I come across time and time again. It's a performance requirement, specifically data latency: the speed with which a solution is required to perform a certain function within a data flow sequence. Stakeholders always want this to be 'as quickly as humanly possible'. Of course they do, why wouldn't they? But the trap is to then go and write up the requirement as "this is required in realtime". Gah. Realtime. What does that actually mean? Whenever I ask that I get looked at like I'm insane. Obviously it means immediately. And then I have to explain that there is no such thing as 'immediately'. Running a coded function takes time. Retrieving data takes time. Calculating something takes time, as does pushing data between systems. The more of these things that your function or string of functions needs to do, the more time it will take. Now that may be something we're measuring in milliseconds, but it's just as likely something that we're measuring in whole seconds, or minutes.
So what? Well, in the old days of Windows-based software and batch processing, I'd agree there's a healthy dose of 'so what' in there. But with web-based applications, wireless devices, apps and websites, it's much more important. I don't want the process of buying something from Amazon to have big long periods of me sitting there waiting while a little loading icon spins round and round on my browser. Neither do you. If that keeps happening we'll start shopping somewhere else. And we're so impatient, we're not talking about waiting for several minutes before we get bored, we're talking four or five seconds before we decide this website is rubbish and give up on it.
Let me give an example I came across a few years back to illustrate how important it is to get this right. A company I was at used handheld terminals which would transmit data to a central system via a collection of data services. That data would then be passed on to a couple of other systems downstream of the central database, one of them an SMS gateway which sent texts to customers to inform them of the status of their order. I was asked to rewrite the NFRs that had been originally captured because they were causing an impasse between the company in question and the supplier that was building their solution. The reason for this was that the customer management team needed data to get from the handheld terminal out into a text message in under ten minutes. If not, then the text message was useless as it was no longer timely enough to warn the customer of when their order might be arriving with them. The original NFR was documented as 'realtime'. The supplier had said that was meaningless and nobody had bothered to rewrite it with a sensible target. So when the SMS messages were taking almost an hour to get out, the customer management team were understandably insisting that wasn't acceptable, and the supplier was insisting there was no benchmark against which to assess what acceptable meant.
OK, so you'd be forgiven for thinking then that the answer to this problem is simply an NFR that states "an order update needs to trigger a text message within ten minutes of a scan being applied on the handheld terminal (HHT)". Except this is still not quite right, because that assumes the whole solution is always working at absolutely 100% efficiency at all times, forever. Meanwhile, in the real world, there are times when far more scans are being applied at once than at other times. Which means a greater load on the services, which means they work a little slower. There are times when one of those services falls over and needs to be fixed. While it's being fixed, the notifications that eventually need to get down into a text message are all just sitting there while the clock ticks. So really, the requirement needs to have two parts to it: how long the data should take to get to where it needs to be when everything is running smoothly, and the absolute maximum time the business can accept at those times when things aren't running smoothly. Typically you'd express it as "a target of ten minutes from HHT scan to SMS message in 90% of instances, and an absolute maximum of twenty minutes overall".
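One advantage of the two-part wording is that it can be tested against real measurements, which 'realtime' never could. As a rough sketch only, assuming the scan-to-SMS times are already available as a list of seconds (the names `LatencyRequirement` and `evaluate` are made up for illustration, not taken from any real system), the check might look like this in Python:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyRequirement:
    """Two-part latency NFR: a target for most cases plus a hard ceiling."""
    target_seconds: float    # e.g. ten minutes when things run smoothly
    target_fraction: float   # share of instances that must hit the target
    max_seconds: float       # absolute maximum the business can accept


def evaluate(req: LatencyRequirement, latencies: list[float]) -> dict:
    """Assess measured scan-to-SMS latencies (in seconds) against the NFR."""
    if not latencies:
        raise ValueError("no measurements to assess")
    within_target = sum(1 for s in latencies if s <= req.target_seconds)
    fraction = within_target / len(latencies)
    worst = max(latencies)
    return {
        "fraction_within_target": fraction,
        "worst_seconds": worst,
        "meets_target": fraction >= req.target_fraction,
        "meets_maximum": worst <= req.max_seconds,
    }


# The wording from the text: ten minutes in 90% of instances, twenty at most.
sms_nfr = LatencyRequirement(target_seconds=600, target_fraction=0.9,
                             max_seconds=1200)

# Illustrative measurements only: nine quick messages and one slow one.
sample = [300.0] * 9 + [1100.0]
result = evaluate(sms_nfr, sample)
print(result["meets_target"], result["meets_maximum"])
```

With measurements like the 'almost an hour' case from the story, `meets_maximum` comes back false, and both the business and the supplier have a benchmark they can point at.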
Conflict
As I said above, there is always an element of conflict when gathering NFRs. Typically because the business want something to work in under ten seconds but only have the budget for hardware and components that can't work that quickly. In these instances, you have to fall back on the NFR assessment process. This essentially works as follows:
- You capture the requirement. If it seems a bit challenging, you mention that, but you don't otherwise tell the stakeholder they have no chance. You just capture the requirement.
- You get someone technical (usually an architect) to assess the requirement. They tell you it's not achievable.
- You ask the architect to work out what is realistically achievable and you take that back to the stakeholder, explaining there is a technical constraint and this is as close as you can get to their requirement.
- The stakeholder tells you this isn't acceptable. They always say this. Essentially, at this point, you have to remind yourself this is a negotiation process, like haggling over the price of a watch with a market stall owner when you're on holiday.
- You go back to the architect and ask them to look at what it would take to meet the requirement. What would you need to do to get there? Critically, how much would it cost?
- You go back and tell the stakeholder how much it would cost to give them what they say they want. The stakeholder panics and asks what's the best you can do with the budget they do have.
- You get the architect to work out the best that can be achieved within the budgetary constraints and you play this back.
- The stakeholder agrees to this as it's better than what you originally said could be done, and it's not as scarily expensive as what you told them previously, so they can exit the process with a win.
Failover Requirements and Service Level Agreements
If things can't work at 100% efficiency 100% of the time, it logically follows that you might need to understand what the risk is to the business and your stakeholders when the solution drops below this benchmark. Is it acceptable, for example, if your web server goes down that your website is out of action for two hours while you fix it? If said website is the company's only or primary way of making money, then I would say probably not. But it's not just that kind of level that failover requirements should be defined for. Consider one of the projects I've worked on recently which was to collate data from a whole host of different sources into a single packaged deliverable for submission to an external agency. The team had concentrated, understandably, on the functional elements of this, in getting it coded and getting it working. But we were pulling from such an array of sources that invariably, every week, one of them wouldn't be there in time. It made sense to categorise the data into levels of criticality and to define a set of rules around when and if we would send the package depending on what was available and what wasn't. If the most important data the agency needed wasn't ready, we'd hold the whole thing back until it was. If some minor element of it wasn't there, you know what, they could live with that and we could send what we did have and circle back to the other stuff later.
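Rules like these are simple enough to write down precisely. The sketch below is one possible way of expressing them in Python, assuming just two levels of criticality; the source names and the `decide_submission` function are invented for illustration and are not what the project actually built:

```python
from enum import Enum


class Criticality(Enum):
    CRITICAL = "critical"   # the agency cannot do without this data
    MINOR = "minor"         # can follow later without holding things up


def decide_submission(sources: dict[str, Criticality],
                      available: set[str]) -> tuple[str, list[str]]:
    """Decide whether to send the package, given which sources have arrived.

    Returns an action ("hold", "send_partial" or "send") and the names of
    the missing sources that drove the decision.
    """
    missing_critical = sorted(
        name for name, level in sources.items()
        if level is Criticality.CRITICAL and name not in available
    )
    missing_minor = sorted(
        name for name, level in sources.items()
        if level is Criticality.MINOR and name not in available
    )
    if missing_critical:
        # Most important data not ready: hold the whole package back.
        return "hold", missing_critical
    if missing_minor:
        # Send what we have and circle back to the rest later.
        return "send_partial", missing_minor
    return "send", []


# Hypothetical source names, purely for illustration.
sources = {
    "transactions": Criticality.CRITICAL,
    "client_details": Criticality.CRITICAL,
    "marketing_flags": Criticality.MINOR,
}

print(decide_submission(sources, {"transactions", "client_details"}))
```

The value is less in the code than in the fact that each branch had to be agreed with the business up front, which is exactly the conversation a failover requirement is meant to force.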
Service Level Agreements, SLAs, are perhaps most important when your deliverable is being provided, and supported, by a third party. But they're pretty critical if it's all being done in-house too. Because if your website does go down, to return to that scenario, then you'll want to set a limit on the amount of money and reputation you can afford to lose before it's available again. In reality, every site goes down from time to time. It happens. I've even seen Google wobble once or twice. If you set the SLA for getting it back up and running as thirty minutes, then you can plan and design the support model that would enable you to achieve that. Or you can include a clause in your contract that enables you to claw back some of your lost revenue from the supplier who failed to meet the target you agreed.
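An SLA expressed as a number can be checked against the outage log in the same way. A minimal sketch in Python, assuming each outage is recorded as a pair of timestamps (when the site went down, when it was restored); the function name `sla_breaches` and the dates are illustrative only:

```python
from datetime import datetime, timedelta

# The thirty-minute restore target used as the example in the text.
RESTORE_TARGET = timedelta(minutes=30)


def sla_breaches(outages: list[tuple[datetime, datetime]],
                 target: timedelta = RESTORE_TARGET
                 ) -> list[tuple[datetime, timedelta]]:
    """Return (start, overrun) for each outage that took longer than the target."""
    breaches = []
    for went_down, restored in outages:
        duration = restored - went_down
        if duration > target:
            breaches.append((went_down, duration - target))
    return breaches


# Made-up outage log: one twenty-minute blip and one two-hour outage.
outage_log = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 20)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 16, 0)),
]

for started, overrun in sla_breaches(outage_log):
    print(f"Outage starting {started} missed the SLA by {overrun}")
```

The overrun figure is the sort of thing a clawback clause would be calculated from, although the actual terms would be whatever the contract says.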
Other Types of NFR
Let's not go into all the types of NFR I'm aware of because a) I'll be here forever, and b) it will almost certainly not be a complete list anyway. But let's look at one more: usability. Like the latency requirement above, this is an easy one to get massively wrong. I can't count the number of times I've seen this expressed as "the system needs to be user friendly". What on earth does that mean? "Easy to navigate" - as opposed to what? Hidden fields and buttons on a screen that you have to work out the location of via cryptic clues given out in Morse code? OK, I'm being facetious because actually this is a tricky one. You don't want to encroach on the design phase of the project, but at the same time you don't want to document platitudes that ultimately mean nothing.
I only tend to include usability NFRs if there is a clear need for them, some specific requirement that's arisen out of the need to support customers with poor vision, for example. Or, if there is a corporate standard on how things need to look, I might add an NFR that references a stylesheet or a set of design principles.
Ultimately, NFRs are just like functional requirements in one key respect: they need to be sufficiently detailed to be meaningful. If you write something down and then you find you can't articulate it without saying something like "you know what I mean?", or the word "obviously", then it's probably not detailed enough. Because nothing is obvious actually, and nobody automatically knows what you mean.
Business Analysis Centre of Excellence Practice Manager at The Nottingham Building Society
7 years ago: Ahhh, NFRs.... Never once been easy!
Founder and Strategic Advisor
7 years ago: Great article - there's something incredibly satisfying about a good set of NFRs.
Technology Program/Project Manager | Dependable diligent delivery.
7 years ago: Another great article Neil that highlights why NFRs can be just as important as the FRs. I used to take my crib sheet along with me when I was a BA to NFR sessions. Raise NFR 1 and watch the bombs go off! Easily way more potent...