Latency metrics

Latency metrics are essential for assessing how well your applications and services perform. Latency is the total time it takes for a piece of data to travel from its source to its destination, usually across a network. It is one of the main indicators of service quality and is usually measured in milliseconds. The lower the latency, the better the user experience.

By tracking P90, P95, and P99 latencies, you can find where requests are slowing down, improve the experience for users, and confirm that your systems perform well for the vast majority of requests.

Metrics

P99 (99th percentile): The P99 metric indicates that 99% of requests complete within the recorded latency. For example, if an application has a P99 latency of 5 milliseconds, then 99% of calls receive a response in 5 milliseconds or less, and the remaining 1% take longer.

P95 (95th percentile): P95 latency indicates that 95% of requests complete within the given latency value, while the remaining 5% take longer.

P90 (90th percentile): This metric signifies that 90% of requests complete within the given latency value, while the remaining 10% take longer.
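To make the three definitions above concrete, here is a minimal sketch that computes them from a list of request latencies. It assumes the nearest-rank method (one of several common percentile definitions; monitoring tools may interpolate or estimate from histograms instead). The `percentile` helper and the sample latencies are hypothetical and exist only for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds for 20 requests.
latencies_ms = [12, 15, 11, 14, 13, 18, 16, 12, 17, 15,
                14, 13, 19, 16, 12, 25, 14, 40, 15, 250]

p90 = percentile(latencies_ms, 90)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"P90={p90} ms, P95={p95} ms, P99={p99} ms")
```

With this sample, most requests finish in under 20 ms, yet the higher percentiles climb quickly because they capture the few slow requests at the tail.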

P99 latency is a metric we use to monitor and improve network latency or application response time. Percentiles help separate unusual occurrences from typical performance patterns. Network administrators work to reduce P99 network latency to improve overall responsiveness, particularly during periods of high demand.
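As a rough illustration of how percentiles separate typical behavior from unusual occurrences, the sketch below compares the mean, the median, and P99 for a made-up sample in which 2 of 100 requests are very slow. The data is hypothetical, and `statistics.quantiles` with `method="inclusive"` is just one way to estimate percentiles; it interpolates between samples, so its result can differ slightly from other definitions.

```python
import statistics

# Hypothetical sample: 98 fast requests and 2 slow outliers, in milliseconds.
latencies_ms = [10] * 98 + [2000, 3000]

mean_ms = statistics.mean(latencies_ms)      # pulled upward by the two outliers
median_ms = statistics.median(latencies_ms)  # what a typical request sees

# quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
cut_points = statistics.quantiles(latencies_ms, n=100, method="inclusive")
p99_ms = cut_points[98]

print(f"mean={mean_ms} ms, median={median_ms} ms, P99={p99_ms} ms")
```

Here the median stays at 10 ms, the mean of 59.8 ms describes no real request, and P99 exposes the slow tail directly, which is why tail percentiles are usually tracked alongside (or instead of) averages.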

