What is behind this number?
In data-driven conversations, numbers are used to support arguments. Some companies, like Amazon, nurture data-driven cultures, and I experienced this firsthand. As a Solutions Architect dealing with web performance, I had to dive into latency numbers to identify areas for improvement. Later, in a leadership role, I had to make decisions based on numbers (e.g., where to hire the next Solutions Architect), challenge numbers used in stakeholder arguments (e.g., revenue growth figures), and defend opinions using numbers (e.g., the impact of a report during year-end reviews or promotion processes).
Through these experiences, I saw how numbers can help make better decisions, but also how they can be weaponized by individuals or competitors to advance their own agendas. In this article, I will share several tips to be more resilient against weaponized numbers.
Tip 1 - Where is the data?
Let's start with the obvious: do not accept statements not backed by numbers. Challenge claims such as:
At Amazon, we call such vague adjectives "weasel words," and we encourage individuals to replace them with actual data.
Tip 2 - Where does the data come from?
To audit the quality and legitimacy of presented data, it's crucial to know its source. It's easy to fabricate numbers to sound data-driven in arguments. A data point that is not verifiable is not a reliable data point.
Tip 3 - How is it calculated?
Some data points are straightforward to understand when presented. For example, when you hear "revenue growth of 30% YoY," we all know how it's calculated. However, other data points can be trickier to understand, and the calculation method must be inspected.
For example, if someone claims, "The latency of your competitor's CDN service is 15% lower than yours," try to understand what they mean by "latency":
As an anecdote, the importance of histograms in representing performance data was driven home for me by Jim Roskind, the inventor of the QUIC protocol, on which HTTP/3 runs today. At a specialist meeting in Seattle some years ago, Jim was invited to explain QUIC to us. To our surprise, he spent most of the time explaining how he set up latency data collection in Chrome browsers to build latency histograms. Only in the last 20 minutes did he explain how he used these histograms to develop the QUIC protocol and iterate on its performance. This approach underscores how much a full view of the distribution matters in performance analysis.
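To make the point about "latency" concrete, here is a minimal Python sketch of how the same two sets of measurements can support opposite conclusions depending on which statistic is quoted. The samples are invented for illustration, and the helper names (`percentile`, `summarize`, `relative_change`) are hypothetical. It assumes a simple nearest-rank percentile, which is only one of several common definitions.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Return three common ways of reducing a latency distribution to one number."""
    return {
        "mean": mean(samples),
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
    }


def relative_change(theirs, ours):
    """Percentage difference of `theirs` relative to `ours` (negative means lower)."""
    return (theirs - ours) / ours * 100


# Invented latency samples in milliseconds (assumption, not real measurements).
ours = [100] * 90 + [200] * 10        # steady, with a mild tail
competitor = [80] * 95 + [600] * 5    # faster in the typical case, with a heavy tail

our_stats = summarize(ours)
their_stats = summarize(competitor)

for name in our_stats:
    change = relative_change(their_stats[name], our_stats[name])
    print(
        f"{name}: ours={our_stats[name]:.0f} ms, "
        f"competitor={their_stats[name]:.0f} ms ({change:+.0f}%)"
    )
```

With this made-up data, the competitor is 20% lower at the median yet three times higher at the 99th percentile, so a claim such as "X% lower latency" says little until you know which statistic and which population it refers to.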
Tip 4 - Can you break it down?
Sometimes, you need to break down the provided number to better understand it. For example:
Tip 5 - How does it compare?
Finally, consider how the number compares to relevant benchmarks:
Recommended Reading
I recommend the book "Factfulness" by Swedish physician and statistician Hans Rosling (2018). While I don't entirely agree with its premise that the world is in a much better state than most people believe, the book provides valuable insights into the instincts that distort our perspective on the world and offers tools for more fact-based thinking.
I learned about this book from Arthur Petitpierre, a friend and a Principal SA at AWS, during re:Invent last year. Arthur has spent years of his career diving into latency numbers within his specialization in High Performance Computing ^^
By applying these tips and maintaining a critical mindset, you can become more resilient to potential misuse of data in decision-making processes.
Vice President and Distinguished Engineer at Amazon.com
(7 months ago) Very nice posting! It is great to see you and others push so methodically for measurements, with deep dives into the metrics. You mentioned a talk I gave about the evolution of the QUIC protocol, which grew into HTTP/3. Here is a link to the 2016 pre-Amazon talk about QUIC and other real-world measurements, which was the basis of that Amazon talk: https://www.facebook.com/watch/?v=1695131504093280