The Dashboards Every Software Leader Should Watch
This time of year is performance review season for many companies. It is a reminder that, as engineering leaders, we need to be objective in assessing our individual performance, the performance of our teams, the health of the systems we built, and the business impact we drove. We take great pride in supporting our direct reports, watching them grow, and seeing them succeed, and that pride can lead to performance reviews that aren't always as objective as they should be. Doing this well is not easy! It requires data, and it requires presenting that data in an easily consumable way; it is very easy to lapse into subjectivity. In this article I will share how I try to ground my organization in data, best practices around presenting that data, and finally the dashboards I have come to rely on every day. But before I do, let me start with an introductory aviation metaphor.
Do you have the right gauges?
I love flying. On a commercial flight, you’ll always find me nestled into a window seat, staring out that tiny porthole, observing the well-choreographed flow of air traffic around airport terminals, admiring the intricate mechanics of the wing as the spoilers bleed airspeed, and guessing which direction passing planes are flying based on the colors of their navigation lights. There is so much amazing engineering at work.
I remember being a kid and thinking that the barrier between our atmosphere and outer space was like the surface of a swimming pool. I thought that if we only flew just a tiny bit higher, we would cross a magical threshold where the blue sky transitions abruptly into black space, much like how the ambient sounds of a summer day are quickly muffled by the gurgling sound of water when you jump into a pool. But climbing in a plane to 40,000 feet now, I can appreciate how the sky’s transition to a darker sapphire blue is gradual. Beautiful.
Airlines want to fly high in that sapphire blue sky because they get much better fuel economy at those altitudes. And that improved economy translates to longer range, allowing them to connect passengers between more distant cities. Being altitude aware is not only important for fuel efficiency; it is also paramount to safety. At 40,000 feet, planes flying in opposite directions are vertically separated by at least one thousand feet (in North America, eastbound flights fly at odd-numbered altitudes, like 37,000 or 39,000 feet, and westbound flights take the even-numbered altitudes, like 38,000 or 40,000 feet).
So how do pilots know that they’re at the right altitude? Imagine, for a second, that you asked a pilot this question and he answered, “well I take off, fly really high, and just see if it feels right.” As a passenger, you would probably want to get off that plane as quickly as possible, or say, “what about all those gauges!” Yes, pilots use gauges called altimeters to judge their altitude. But the common altimeter doesn’t actually measure true altitude, it measures air pressure. And, much to the surprise of my childhood self, air pressure decreases gradually as you ascend in the air, which makes this pressure-sensitive gauge a pretty good approximation of altitude. Before taking off, pilots have to calibrate these altimeters based on local weather reports in order to improve their accuracy and achieve safer flying.
This leads me to the topic of this post: How do you know that your software engineering team is flying at the right altitude? How do you know that they’re executing well, that their systems are healthy, and that they’re delivering impact? As an engineering leader, what gauges are you looking at to answer these questions?
All too often, I hear managers admit they don't have a good way to track these things. For a North American team of 8 engineers plus a PM and a manager, this is like having a two-million-dollar annual budget to procure software without knowing whether or not those two million dollars will even buy what you want (call it roughly $200,000 per person fully loaded, and ten people works out to about $2 million a year). I don't know about you, but I don't have two million dollars to lose, so if my organization is spending that kind of money on something, I feel a great responsibility to make sure that I'm not that fictitious pilot who says, "I just take off and see if it feels right."
So how do you measure your team’s output? At this point, I bet most readers are thinking one of these things:
Those are all good, but are they enough? Let’s ask a few questions to test how well those tools are serving you in your responsibility to your business:
If you think you already have everything you need to answer these questions, then you can probably stop reading here. But my guess is that most of the off-the-shelf dashboards in the software you use every day are insufficient. As a manager you need better “gauges” to really know where you stand. Unfortunately, you may need to build these yourself (or at least configure them).
What does good look like?
Building methods to track answers to all of the questions above is a daunting task. So first, I want to start with some design principles that I try to follow any time I build a dashboard.
Write down what you want before you use what’s available. There are many dashboarding tools embedded in the systems we’ve grown accustomed to. At Wayfair, we make extensive use of Jira (for project management), PagerDuty (for incident management), and Datadog (for application performance monitoring). Each system has dashboards that provide decent “defaults,” but the danger with defaults is that they have none of the domain knowledge that you possess. These tools aren’t aware of what success looks like for your business—they’re only aware of what’s common across the industry. So I would recommend starting with capturing what you want to get out of a dashboard, what questions you want answered every time you look at them. Only after doing that should you attempt to use and ultimately customize these tools. Otherwise, you will get comfortable viewing the default dashboards in these systems and lose sight of things that matter more (for example, you may end up just using a default service dashboard to track HTTP errors, and miss the fact that your service’s primary endpoint has been sending empty JSON responses along with 200s for every client request—the error rate may look fine, but the API is not serving your clients and may be impacting business performance).
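To make that failure mode concrete, here is a minimal sketch in Python of a check that validates what an endpoint actually returns rather than just its status code. The endpoint and field names are hypothetical; the idea is to feed a signal like this into whatever monitoring tool you use, right next to the business KPIs.

```python
import json
import urllib.request

# Hypothetical endpoint: the URL and field names below are illustrative, not a real API.
ENDPOINT = "https://api.example.com/v1/recommendations"

def check_payload_health(url: str) -> bool:
    """Return True only if the endpoint returns a 200 *and* a usable payload."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        if resp.status != 200:
            return False
        body = json.loads(resp.read().decode("utf-8"))
    # A 200 with an empty list looks healthy to a default APM dashboard,
    # but it is not serving the business.
    return isinstance(body, dict) and len(body.get("items", [])) > 0

if __name__ == "__main__":
    print("payload healthy:", check_payload_health(ENDPOINT))
    # In practice, emit this result as a custom metric or synthetic check
    # so it can sit at the top of the dashboard next to the business KPIs.
```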
Keep high-level takeaways at the top, then progressively provide more detail, down to the root data elements at the bottom of the dashboard. The idea here is that your key takeaways are always right at the top. If the data looks good and doesn't provoke any further questions, then you're done viewing the dashboard. If the high-level charts at the top provoke further inquiry, all you need to do is scroll down for more detail. As you reach the bottom of a dashboard, it should be easy to view the source data elements so that you can better understand why the metrics at the top behaved the way they did. For example, if your dashboard shows Jira ticket delivery, and you're investigating why a team had an anomalously low delivery against their goal for the sprint, the very bottom of that dashboard could show a simple table of Jira tickets that carried over to the next sprint, perhaps along with other signals like the original estimate versus the revised estimate, or even actuals. Take the opportunity to link rows in your dashboard to the source of truth; in this case, the dashboard would link to the actual Jira issues, removing any doubt about the quality of the data presented.
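As an illustration of that bottom-of-the-dashboard table, here is a rough sketch using the Jira REST API. The base URL, sprint IDs, JQL, and story-point custom field are assumptions that vary by Jira instance, but the shape is the point: carried-over issues with links back to the source of truth.

```python
import requests

# Assumptions: the base URL and story-point custom field differ per Jira
# instance, and the JQL below is one way to find issues that appear in two
# consecutive sprints (i.e., issues that carried over).
JIRA_BASE = "https://yourcompany.atlassian.net"   # hypothetical
STORY_POINTS_FIELD = "customfield_10016"          # differs per instance

def carried_over_issues(prev_sprint_id, next_sprint_id, auth):
    jql = f"sprint = {prev_sprint_id} AND sprint = {next_sprint_id}"
    resp = requests.get(
        f"{JIRA_BASE}/rest/api/2/search",
        params={"jql": jql, "fields": f"summary,{STORY_POINTS_FIELD}"},
        auth=auth,
        timeout=10,
    )
    resp.raise_for_status()
    rows = []
    for issue in resp.json()["issues"]:
        rows.append({
            "key": issue["key"],
            "summary": issue["fields"]["summary"],
            "points": issue["fields"].get(STORY_POINTS_FIELD),
            # Link every row back to the source of truth.
            "link": f"{JIRA_BASE}/browse/{issue['key']}",
        })
    return rows
```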
Always visualize within a larger context. If I'm showing a single metric, I also want to show how that metric has changed over time. For example, instead of a single burn-down chart, I might show many burn-down charts (as small multiples), or even a sprint-over-sprint burn-down, so that a viewer can quickly and easily compare over time. A good dashboard lets you view the data objectively not only in its current state but also compare it against a previous state, without relying on your own memory or shifting context (such as by filtering or adjusting time windows).
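Here is what small multiples can look like in practice, sketched with matplotlib and made-up burn-down numbers; the sprint names and values are purely illustrative.

```python
import matplotlib.pyplot as plt

# Illustrative data: remaining story points per day for four recent sprints.
# Small multiples let a viewer compare sprints at a glance instead of relying
# on memory or re-filtering a single chart.
sprints = {
    "Sprint 41": [40, 38, 35, 30, 26, 20, 14, 9, 4, 0],
    "Sprint 42": [42, 42, 40, 37, 33, 30, 24, 18, 12, 6],
    "Sprint 43": [38, 35, 31, 27, 22, 18, 13, 8, 3, 0],
    "Sprint 44": [45, 44, 41, 39, 36, 30, 25, 21, 15, 10],
}

fig, axes = plt.subplots(1, len(sprints), figsize=(12, 2.5), sharey=True)
for ax, (name, remaining) in zip(axes, sprints.items()):
    ax.plot(range(len(remaining)), remaining, color="steelblue")
    ax.set_title(name, fontsize=9, color="dimgray")
    ax.set_xticks([])  # keep the panels quiet; the trend is what matters
fig.suptitle("Burn down, sprint over sprint", color="dimgray")
plt.tight_layout()
plt.show()
```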
Enable drilling down without interactions. Dashboards often contain area or bar charts drawn in a single color, and the only way to understand the largest contributors to an anomaly is to filter or drill into a specific subcategory. When users filter to drill down, the original chart often refreshes, and they lose the context of what they were just looking at. Instead, design your charts and dashboards to use colors or labels to highlight the different subcategories from the start. For example, imagine you're trying to understand why there was a spike in bugs in November by looking at a bar chart of bugs by month. If the chart broke the bugs down by root-cause category, you might see that the November spike is attributable to a specific category rather than to all of them, without needing to filter and risk losing context. Any time you want to chart a single metric, ask yourself what question you would ask next if you saw an anomaly in it, and consider adding a series that answers that question.
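A small illustration with made-up data: the same bugs-by-month chart, but stacked by root-cause category so the November spike explains itself without any filtering.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up counts: bugs per month, broken down by root-cause category up front,
# so a spike can be attributed without losing the surrounding context.
bugs = pd.DataFrame(
    {
        "regression":  [5, 6, 4, 7, 6, 18],
        "config":      [3, 2, 4, 3, 2, 3],
        "third_party": [2, 3, 2, 2, 3, 2],
    },
    index=["Jun", "Jul", "Aug", "Sep", "Oct", "Nov"],
)

ax = bugs.plot(kind="bar", stacked=True, figsize=(8, 3))
ax.set_ylabel("bugs opened")
ax.set_title("The November spike is mostly regressions", color="dimgray")
plt.tight_layout()
plt.show()
```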
Reduce visual noise as much as possible. If you're doing everything listed above, you're going to have a lot going on in your dashboard. To make it more visually digestible, reduce the visual noise. Charts in most visualization tools come with lots of "chartjunk" (credit to Edward Tufte for coining the term): unnecessary gridlines, labels with harsh contrast (gray lines and labels are usually more pleasing than pure black text on a white background), and borders on shapes (when a simple fill color would suffice). Be critical of everything that comes for "free" in a chart in Google Sheets, Tableau, or Looker. Reduce the quantity or contrast of everything in the chart that is not the data itself, and let the data speak for itself.
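Most of this cleanup can be scripted once and reused. Here is one possible matplotlib helper, a sketch rather than a house style.

```python
import matplotlib.pyplot as plt

def declutter(ax):
    """Strip the 'free' chartjunk most tools add by default."""
    ax.grid(False)                              # no gridlines
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)      # no box around the plot
    for spine in ax.spines.values():
        spine.set_color("lightgray")            # soften what remains
    ax.tick_params(colors="gray", labelcolor="gray")
    ax.title.set_color("dimgray")
    return ax
```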
Use semantic coloring. Many charting systems come with default palettes that do not understand your data, but our brains have learned that red is bad, yellow is caution, and green is good. Light gray works well for de-emphasized context, while blue can mark the focal part of the chart. It is also wise to vary the brightness between your reds and greens for viewers with a red-green color vision deficiency (it also keeps charts legible if they are ever printed in black and white).
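One way to encode this is a small semantic palette that every chart on the dashboard shares. The hex values and thresholds below are illustrative, chosen so that red and green differ in brightness as well as hue.

```python
# Hypothetical status palette: red, yellow, and green at deliberately
# different brightness levels so the chart still reads for color-blind
# viewers and in a black-and-white printout. Gray stays for context series.
STATUS_COLORS = {
    "good":    "#1a7f37",   # darker green
    "caution": "#d4a72c",   # mid-brightness yellow
    "bad":     "#ff6b6b",   # lighter red, so it differs from green in brightness, not just hue
    "context": "#c0c0c0",   # light gray, de-emphasized
}

def color_for(value: float, target: float, warn_ratio: float = 0.9) -> str:
    """Map a metric against its target to a semantic color."""
    if value >= target:
        return STATUS_COLORS["good"]
    if value >= warn_ratio * target:
        return STATUS_COLORS["caution"]
    return STATUS_COLORS["bad"]
```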
What dashboards do you need?
I like to imagine my responsibilities as an engineering leader as three legs of a stool: system health, team health, and business health. If any of the legs fails, the stool falls down.
System health
You need to know the health of your teams' systems. There are many off-the-shelf solutions that do this today, in the form of application performance monitoring (APM) products. These days they bundle logging, network traces, intelligent alerting, custom dashboarding, session recording, service cataloging, SLI and SLO setting, and even AI integrations. Given the richness of these offerings, it is too easy to lose sight of the fact that these systems still do not know your business like you do. While they might be able to tell you that there is an anomalous error spike, a node running out of memory, or an increase in latency, they cannot tell you which service is your most important or what happens to your business when it goes down. For this reason, my teams have tailored their dashboards to start with business KPIs tracked in real time at the top, followed by error signals, then latency, and finally resource signals at the bottom. Too often I see dashboards that start with the resource visualizations (for example, Kubernetes pod health, memory consumption, and CPU utilization charts) because they are available off the shelf, and they omit everything above them. Without the business and error signals, teams are unable to contextualize an outage or estimate its business impact.
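As a sketch of that ordering (not any particular vendor's dashboard schema), the layout looks roughly like the structure below; every metric name is a placeholder.

```python
# Illustrative, vendor-neutral dashboard-as-code layout. The point is the
# ordering: business KPIs first, then error, latency, and resource signals,
# not the other way around. All metric names are placeholders.
SERVICE_DASHBOARD = [
    {"section": "Business KPIs",
     "widgets": ["orders_placed_per_minute", "checkout_conversion_rate"]},
    {"section": "Errors",
     "widgets": ["http_5xx_rate", "empty_payload_rate"]},
    {"section": "Latency",
     "widgets": ["p50_latency_ms", "p99_latency_ms"]},
    {"section": "Resources",
     "widgets": ["pod_restarts", "memory_utilization", "cpu_utilization"]},
]
```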
Another challenge with many APM solutions is the sheer proliferation of dashboards, many of which end up unmaintained. And in larger enterprises, your team is just one among many: your team's systems may depend on five other teams' systems, or may themselves be a dependency of another team's system. I recommend that your team keep links (either inside your dashboard or in an operational runbook) to the dashboards of your top consumers and dependencies, so that anyone on call can find the root cause of an incident, or understand its broader impact, more quickly. Imagine how helpful it would be if every team's dashboard included links to its consumers and dependencies; discovering the health of a complex enterprise's systems would become very easy.
Team health
For the second leg of the stool, I want to make sure that my team is high performing. One of the dashboards I watch weekly tracks the output of my teams at the Jira ticket and GitHub pull request levels. It provides summary-level metrics with the ability to drill down by team, plus direct links to the underlying Jira issues and GitHub PRs. With this view, I can see how the number of story points per person per sprint has changed over the last year in order to assess changes in our efficiency. I can see whether the accuracy of our estimates improved, stayed the same, or got worse. As our team has launched newly replatformed experiences, I can see whether the bug tax has come down as much as we hoped it would. I can see how the time to merge a pull request has trended up or down, and whether the time to commit is a significant driver of that time to merge.
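The underlying computation is not complicated. Assuming a hypothetical export with one row per completed ticket (sprint, assignee, story points), the points-per-person-per-sprint trend is a few lines of pandas.

```python
import pandas as pd

# Hypothetical export: one row per completed ticket with columns
# sprint, assignee, story_points. Column names are assumptions.
tickets = pd.read_csv("jira_export.csv")

per_sprint = (
    tickets.groupby("sprint")
    .agg(points=("story_points", "sum"), people=("assignee", "nunique"))
)
per_sprint["points_per_person"] = per_sprint["points"] / per_sprint["people"]
print(per_sprint[["points_per_person"]])  # the trend to chart over the year
```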
Additionally, the dashboard helps facilitate the 1:1s I have with my team members. With a little bit of preparation, I'm able to see what work they completed and what kinds of blockers they ran into. I can see if any of their work was canceled (as this would be a source of frustration for many). I can easily pivot to GitHub to see what kinds of feedback they are receiving in code reviews. Most importantly, I can come into the 1:1 with better knowledge of the complexity of their most recent projects.
Business health
However, work accomplished is not necessarily progress. Sailing has a navigation term for this: Velocity Made Good (VMG). Just because you are moving fast in some direction doesn't mean you're moving in the right direction (after all, in sailing you have to zig-zag toward your destination if it is upwind). So, to estimate our velocity made good, I might look at two measures: 1) the average points per person per sprint on work that is directly supporting the business, and 2) whether we are actually moving our key results.
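As a rough proxy for the first measure, I think of it as a simple ratio; the names below are illustrative rather than an official metric.

```python
# A rough "velocity made good" proxy: of the points a team delivered,
# how many were on work tagged to a business objective? Names are illustrative.
def velocity_made_good(delivered_points: float, points_on_objectives: float) -> float:
    """Fraction of delivered points that directly supported the business."""
    if delivered_points == 0:
        return 0.0
    return points_on_objectives / delivered_points

print(velocity_made_good(delivered_points=120, points_on_objectives=84))  # 0.7
```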
I am a big believer in OKRs. However, it’s very easy to do them poorly. Any OKR that doesn’t have a live dashboard plotting the objective against its key results doesn’t count as an OKR in my book. Like many organizations, mine has suffered from all of the following OKR syndromes and therefore had a lower than desired Velocity Made Good:
Although I don't show an example of how to visually track these, I suggest that all teams do. I also recommend following the practices in the next section to promote team awareness of OKRs and provide constant reminders of where the team stands against them.
Many have been inspired to follow in the OKR footsteps of Measure What Matters by John Doerr. It is a fun read that motivated me to get smarter about OKRs. But if you're looking for more on-the-ground field advice for how to put OKRs into practice, I would recommend Objectives and Key Results: Driving Focus, Alignment, and Engagement with OKRs by Paul Niven and Ben Lamorte.
Reading your gauges
I will wrap up with a few best practices that I like to follow. First, don't just set and forget. Your dashboards are meant to be viewed, explored, and improved over time. Make sure that you are a regular visitor to your dashboards. Share them with your partners (for example, your Product Management, Experience Design, and Analytics counterparts), and also with your direct reports. Be transparent about the themes and challenges you see surfaced in the data, and ask for your team's help in making improvements in your organization or your systems. And don't just share them; subscribe to them. Make sure you're automatically receiving copies of these dashboards in your inbox, Slack, or whatever communication system your team uses.
Beware of the pitfall of instrument fixation. A behavioral trap that some pilot trainees fall into is becoming overly reliant on their instruments. They lose the "feel" of the plane and ignore signals that are directly observable by looking out the window. Gauges fail; sometimes they are inconsistent with each other. As leaders, we are required to use our own critical thinking, and sometimes that means questioning the data rather than taking it at face value. So, while I recommend becoming fluent in the data around your systems, your people, and your business, I would also caution against becoming overly fixated. We know there are many ways to approximate a team's productivity (PRs merged, lines of code written, story points delivered, number of goals accomplished), but like the altimeter in a plane, these are just approximations, and sometimes anomalies have good explanations.
So, with performance season upon us, I want to take a fresh look at the dashboards that I watch weekly. What stories do they tell me about my individual performance? Was my organization high-performing? Were our systems resilient in 2023? Did my teams and their systems deliver impact to the business? As I look forward to 2024, how will we move the needle in different ways? What changes do I need to make to my dashboards to ensure that I’m delivering the value that the business needs from me?