
A guide to diagnosing network issues using MTR

Priyanka Shyam

Network Geek with a robust skill set | CCDE (Written) | CCIE | CWNA | Cisco SCOR | Cisco SD-WAN Expert | Technical Writer | Multitasker | Considerate & Empathic Communicator

Published: August 15, 2022

In my previous article, I discussed two very important diagnostic tools: ping and traceroute. However, one tool that offers extra features is MTR (formerly known as Matt's Traceroute; now My Traceroute).

With MTR, administrators can diagnose and isolate network errors and report network status to upstream providers. MTR combines the ping and traceroute utilities into a single, robust troubleshooting tool. Since MTR builds on both, we will first look at each utility individually (I covered them in my previous article) and see how they can be used for troubleshooting.

The most common tool for testing network connectivity is ping. The sender sends ICMP echo request packets (ICMP type 8 code 0) to the receiver, and the receiver replies with ICMP echo reply packets (ICMP type 0 code 0) if it is available.

Network diagnostic tools such as ping, traceroute, and MTR use ICMP packets to test contention and traffic between two points on the Internet. When a user pings a host on the Internet, ICMP packets are sent to that host, and the host responds by sending packets back. The user's client can then calculate the round-trip time between the two points.

Note that ping is not always accurate - a firewall may be in the path between the sender and the receiver, filtering ICMP packets. Thus, a host is not unavailable just because it does not respond to ICMP.

Ping behaves differently depending on your operating system. On Windows, ping sends four packets by default and then stops on its own. On Unix-based systems and macOS, ping runs until you stop it (using CTRL+C). With the -t option, you can also run a continuous ping on Windows.

Traceroute

Ping and traceroute work differently. Unlike ping, traceroute tells you the path between the sender and receiver. This is especially useful if you have administrative control over the entire network.

On Unix-based systems and macOS, traceroute sends UDP packets from the sender to the destination. On Windows systems, traceroute uses ICMP echo requests. The command to invoke traceroute on Windows is "tracert", whereas on most other operating systems it is "traceroute".

MTR

Now that we've seen the two utilities that make it up, let's examine MTR itself. Unlike ping and traceroute, which are available by default on most systems, MTR may require installation.

In the same way that you run ping and traceroute, you run MTR by using the mtr command followed by the destination address.

You get real-time connectivity information when you run MTR, since it continuously polls the destination (and devices in the path). By pressing CTRL+C or the Q key, you can stop it at any time.

Let's run MTR against 8.8.8.8:

mtr 8.8.8.8

According to the output above, MTR combines ping (RTT and packet loss) with traceroute (devices in the path). On your network, you can determine the following:

Connectivity:

You know there is connectivity between source and destination if MTR successfully reaches the destination. However, if it cannot reach the destination, it does not mean there is no connectivity - there could be something blocking the path. Later, we'll discuss other options.

Packet Loss:

If there is excessive packet loss between the source and destination, you may need to troubleshoot further. Keep in mind that loss reported along the path may only be apparent, because some devices rate limit or filter the ICMP packets used by ping, traceroute, and MTR.

In general, ICMP rate limiting is configured to prevent DDoS attacks. A built-in denial-of-service protection mechanism limits the number of ICMP packets transmitted out of an interface. With a default of one ICMP destination unreachable message every 500 milliseconds (1/2 second), the destination router silently discards a second packet that arrives within that window instead of answering it.

Round-Trip Time:

If packets take too long to get from source to destination, a link may be malfunctioning. Alternatively, the distance between source and destination may simply be quite large.

Report Mode

The default interactive mode of MTR can result in a large number of packets being sent continuously, which can have a negative impact on network performance. Thus, we can run MTR in "report" mode, which sends 10 packets by default to each device and displays the network statistics:

mtr --report 8.8.8.8

You enable report mode by using two hyphens (--) followed by report, i.e. --report. This report was generated using mtr --report 8.8.8.8. This uses the report option, which sends 10 packets to the IP address 8.8.8.8 and generates a report. Without the --report option, MTR runs continuously in interactive mode, where the round-trip time to each host is updated as it changes. In most cases, the --report mode provides sufficient data in a useful format.

Each numbered line in the report represents a hop. Hops are the nodes that packets pass through to reach their destination. The names of the hosts (e.g. a72-247-36-1.deploy.stati and xe2-3-0.hh-sjc5-a.netarch in the example) are determined by reverse DNS lookups. The seven columns that follow the path provide valuable statistics about the reliability of the connection. The Loss% column shows the percentage of packets lost at each hop. The Snt column counts the packets sent. With the --report option, MTR sends 10 packets unless you specify --report-cycles=[number-of-packets], where [number-of-packets] is the number of packets you want to send.

The next four columns measure latency in milliseconds (ms): Last, Avg, Best, and Worst. Last is the latency of the last packet sent, Avg is the average latency of all packets, and Best and Worst display the best (shortest) and worst (longest) round-trip times. Most of the time, you should focus on the average (Avg) column.

The last column, StDev, shows the standard deviation of the latencies to each host. The higher the standard deviation, the greater the difference between latency measurements. The standard deviation lets you judge whether the mean (average) represents the true center of the data set or has been skewed by some phenomenon or measurement error. A high standard deviation indicates inconsistent latency measurements. In that case, the average of the latencies of the 10 packets sent may appear normal yet not represent the data accurately. If the standard deviation is high, look at the best and worst latency measurements to make sure the average is a good representation of the true latency.
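To make the relationship between these columns concrete, here is a minimal Python sketch that derives them from a list of round-trip times for one hop. The function name hop_stats and the sample values are hypothetical, and the use of the population standard deviation is an assumption; MTR's own calculation may differ.

```python
import statistics

def hop_stats(rtts_ms):
    """Summarise the probes sent to one hop (hypothetical helper).

    rtts_ms holds one round-trip time in milliseconds per probe;
    None marks a probe that received no reply.
    """
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    stats = {"Loss%": 100.0 * (sent - len(replies)) / sent, "Snt": sent}
    if replies:
        stats.update({
            "Last": replies[-1],
            "Avg": statistics.mean(replies),
            "Best": min(replies),
            "Worst": max(replies),
            # Assumption: population standard deviation of the replies.
            "StDev": statistics.pstdev(replies),
        })
    return stats

# Two made-up hops with nearly the same average but very different spread.
steady = hop_stats([20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7, 20.0, 19.9])
spiky = hop_stats([5.0, 5.2, 4.9, 5.1, 80.0, 5.0, 5.3, 4.8, 75.0, 5.1])
print(steady["Avg"], steady["StDev"])
print(spiky["Avg"], spiky["StDev"], spiky["Best"], spiky["Worst"])
```

Both sample hops average roughly 20 ms, but only the second has a high standard deviation, which is the cue to check Best and Worst before trusting Avg.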

Increase Test Speed

When you run MTR, it uses reverse DNS to resolve IP addresses to hostnames by default. If you are not interested in DNS or do not use DNS on your network, this can slow down your troubleshooting process. With -n or --no-dns, we can disable DNS resolution.

mtr -r -n 8.8.8.8

By default, MTR sends successive packets one second apart. When a network is operating normally, this may be fine. During congestion, however, packets usually arrive faster. To simulate a congested network, we can use the -i option (or --interval) to specify how often MTR should send packets:

mtr -r -i 0.1 4.2.2.2

With a shorter interval between packets and 50 packets sent, I now notice some packet loss between the source and some devices in the path.
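If you run these reports often, it can help to assemble the command line in a script. The following Python sketch builds an argument list from the options discussed so far (--report, --report-cycles, --no-dns and --interval). The helper name build_mtr_command and its parameters are hypothetical; the sketch only builds the command and does not run it.

```python
def build_mtr_command(target, report=True, cycles=None, no_dns=False, interval=None):
    """Assemble an mtr argument list (hypothetical helper, not part of MTR)."""
    cmd = ["mtr"]
    if report:
        cmd.append("--report")
    if cycles is not None:
        # Number of packets to send to each hop in report mode.
        cmd += ["--report-cycles", str(cycles)]
    if no_dns:
        # Skip reverse DNS lookups to speed up the test.
        cmd.append("--no-dns")
    if interval is not None:
        # Seconds between successive packets.
        cmd += ["--interval", str(interval)]
    cmd.append(target)
    return cmd

print(" ".join(build_mtr_command("8.8.8.8", no_dns=True)))
print(" ".join(build_mtr_command("4.2.2.2", cycles=50, interval=0.1)))
```

The resulting list can be passed to subprocess.run on a machine where MTR is installed.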

Analyze the MTR reports

Verify the packet loss

Two things should be considered when analyzing MTR output: loss and latency. If you see a percentage of loss at a particular hop, there may be a problem with that router. However, some service providers commonly rate limit the ICMP traffic that MTR uses, which can make it appear that packets are lost when they are not. To determine whether the loss is real or due to rate limiting, look at the subsequent hop. If the subsequent hop shows a loss of 0.0%, you are likely seeing ICMP rate limiting rather than actual loss:

The loss reported between hops 1 and 2 is likely due to rate limiting on the second hop. Traffic to the remaining eight hops all passes through the second hop, yet no packets are lost there. If loss persists for more than one hop, it may be caused by genuine packet loss or routing issues. Rate limiting and loss can also occur simultaneously. To determine the actual loss, take the lowest percentage of loss in the sequence:

Between hops 2 and 3 and between hops 3 and 4, there is a 60% loss. No subsequent host reports zero loss, so you can assume the third and fourth hops are losing some traffic. However, part of that loss is due to rate limiting, since several of the final hops show only 40% loss. When different loss amounts are reported, always trust the reports from later hops.
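The rule of thumb of taking the lowest loss percentage in the sequence and trusting later hops can be written down as a short Python sketch. The function name effective_loss and the sample loss figures are hypothetical; they only mirror the pattern described in the text. The sketch also assumes the final hop answers probes, which is not true of a destination that drops ICMP.

```python
def effective_loss(loss_by_hop):
    """Estimate real loss per hop as the lowest Loss% seen from that hop onward.

    loss_by_hop lists the reported Loss% values in hop order (hop 1 first).
    Loss that disappears at a later hop is treated as ICMP rate limiting.
    """
    effective = []
    floor = float("inf")
    for loss in reversed(loss_by_hop):
        floor = min(floor, loss)
        effective.append(floor)
    effective.reverse()
    return effective

# Loss at hop 2 only: later hops report 0%, so it is likely rate limiting.
print(effective_loss([0.0, 50.0, 0.0, 0.0]))
# 60% at hops 3 and 4, 40% afterwards: about 40% is likely real loss.
print(effective_loss([0.0, 0.0, 60.0, 60.0, 40.0, 40.0]))
```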

Problems with the return route can also explain some loss. It is not uncommon for packets to reach their destination without error, but they have trouble returning. If you have an issue, it is often best to collect MTR reports in both directions.

A route's latency may also be affected by the connection quality. A high latency is shown in the following MTR report:

Network Latency

MTR also lets you assess the latency of the connection between your host and the target host. Latency generally increases with the number of hops in a route, but the increases should be consistent and linear. Unfortunately, latency is relative: it depends heavily on the quality of both hosts' connections and the physical distance between them.

Between hops 3 and 4, latency jumps significantly and remains high. Considering that round trip times remain high after the fourth hop, we can assume, based on the data, that the latency might be caused by a poorly configured router or a congested link.

However, high latency does not always indicate a problem with the current route. A report like the one above shows that traffic still reaches the destination host and returns to the source host despite some sort of issue at the 4th hop. The return route could also cause latency. Your MTR report does not show the return route, and packets can take completely different routes to and from a destination.

There is a large jump in latency between hops 3 and 4, but the latency does not increase unusually in subsequent hops. As a result, we can assume that there is an issue at the 4th router.

Like packet loss, ICMP rate limiting can also create the appearance of latency:

At first glance, the latency between hops 4 and 5 stands out. However, it drops dramatically after the fifth hop. The actual latency measured here is 40 milliseconds. In cases like this, MTR draws attention to an issue that does not affect the service. When evaluating an MTR report, consider the latency to the final hop.
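The difference between a latency jump that persists to the final hop and a spike at a single hop can be illustrated with a small Python sketch. The function name persistent_jump, the 50 ms threshold and the sample averages are all hypothetical; choose a threshold that suits your own baseline.

```python
def persistent_jump(avg_ms, threshold_ms=50.0):
    """Return the 1-based hop where latency jumps and stays high, or None.

    avg_ms lists the Avg column in hop order. A jump counts only if every
    hop from that point to the final hop stays at least threshold_ms above
    the hop just before the jump.
    """
    for i in range(1, len(avg_ms)):
        baseline = avg_ms[i - 1]
        if avg_ms[i] - baseline >= threshold_ms and all(
            later - baseline >= threshold_ms for later in avg_ms[i:]
        ):
            return i + 1
    return None

# Latency jumps at hop 4 and stays high: worth investigating.
print(persistent_jump([1.0, 5.0, 8.0, 120.0, 125.0, 122.0, 130.0]))
# Latency spikes at hop 5 only, then falls back: likely ICMP rate limiting.
print(persistent_jump([1.0, 5.0, 8.0, 10.0, 250.0, 38.0, 40.0]))
```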

An incorrect configuration of the destination host's network

In the next example, an incorrectly configured router appears to cause 100% loss to the destination host. At first glance it looks as if packets are not reaching the host, but this is not the case.

Traffic does reach the destination host. The MTR report shows loss anyway, because the destination host is not responding. This can be caused by improperly configured networking or by firewall (iptables) rules that drop ICMP packets. To tell that the loss stems from a misconfigured host, look at the hop that shows 100% loss. As in the previous reports, MTR does not attempt additional hops beyond it. Without a baseline measurement, it is difficult to isolate this issue, but these types of errors are quite common.

Routers for residential or business use

Reports from residential gateways can sometimes be misleading:

The 100% loss at the second hop does not indicate a problem, because there is no loss on subsequent hops.

Incorrectly configured ISP router

Your packets may never reach their destination if a router on the route your packet takes is incorrectly configured:

The question marks appear when there is no additional route information. A poorly configured router can also send packets in a loop. The following example illustrates that:

According to these reports, the router at hop 4 is not configured correctly. In these situations, the only way to resolve the issue is to contact the source host's network administrator.

ICMP Rate Limiting

The purpose of ICMP rate limiting is generally to prevent DDoS attacks. A built-in denial-of-service protection mechanism limits the rate of ICMP packets transmitted out of an interface. The default value is one unreachable message per 500 milliseconds (1/2 second), so the destination router silently discards the second packet, and 1 in 3 requests to the destination appears as a timeout.

ICMP rate limiting can cause apparent packet loss. It shows up as loss at one hop that does not persist to subsequent hops. The following example illustrates this:
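To see why one reply per 500 milliseconds turns into a timeout for 1 in 3 probes, consider the simplified Python model below. It is not any vendor's actual implementation. The function name, the probe timings and the assumed 3 second wait for the unanswered second probe are all hypothetical.

```python
def rate_limited_replies(probe_times_ms, min_gap_ms=500):
    """Return one True/False per probe: does the router answer it?

    Simplified model: the router answers a probe only if at least
    min_gap_ms has passed since the last probe it answered.
    """
    answered = []
    last_reply = None
    for t in probe_times_ms:
        if last_reply is None or t - last_reply >= min_gap_ms:
            answered.append(True)
            last_reply = t
        else:
            answered.append(False)  # silently discarded
    return answered

# Probe 2 follows probe 1 almost immediately and is dropped; probe 3 is only
# sent after the sender has waited (assumed 3 s) for probe 2 to time out.
print(rate_limited_replies([0, 10, 3010]))
# Ten probes sent 100 ms apart: most of them go unanswered.
print(rate_limited_replies(list(range(0, 1000, 100))))
```

In this model, shortening the interval between probes increases the apparent loss even though no traffic is being lost on the forwarding path.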

The following points should be kept in mind:

Advanced MTR techniques

Newer versions of MTR can run in TCP mode on a specified TCP port instead of using ICMP (ping), which is the default. In most cases, this mode should not be used because TCP reports can be misleading. In TCP mode, MTR sends SYN packets instead of ICMP pings, and most Internet routers won't respond to them, so MTR erroneously reports loss.

The purpose of a TCP test is to determine whether firewall rules on a router somewhere are blocking a protocol or port, perhaps due to improper port forwarding settings. A TCP test over a certain port would reveal this more clearly than an ICMP test.

MTR vs traceroute: what's the difference?

The traceroute command (tracert in Windows) prints the route that packets take in a TCP/IP network.

The command traceroute hostname sends three UDP packets with a TTL of 1. When they arrive at the closest router, it decrements the TTL by one, making it 0. A router that sees a packet with a TTL of 0 discards it and sends back an ICMP Time Exceeded message, so three ICMP packets come back and traceroute notes the IP address of the router that sent them. After calculating the time taken to receive each reply, it sends out three more UDP packets with a TTL value of 2.

MTR, however, combines the functionality of traceroute and ping. As soon as it starts, MTR investigates the network connection between the host on which it runs and the destination host you specify. After determining the address of each network hop between the two, it sends ICMP ECHO requests to each one to determine the quality of the link. MTR relies on the ICMP Time Exceeded (type 11) packets returned by routers, and on the ICMP Echo Reply packets returned once probes reach the destination. As it runs, it prints running statistics about each machine.
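The TTL mechanism can be simulated without touching the network. The Python sketch below walks a hypothetical path (the addresses come from the ranges reserved for documentation) and reports which device answers for each TTL value. It models one probe per TTL rather than three, and the function name trace is an assumption for illustration only.

```python
def trace(path, max_ttl=30):
    """Simulate TTL-limited probing over a list of hop addresses.

    path[-1] is the destination; every earlier entry is a router.
    Returns (ttl, address, event) tuples, one per probe sent.
    """
    hops = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        for index, address in enumerate(path):
            if index == len(path) - 1:
                # The probe reached the destination, which replies itself.
                hops.append((ttl, address, "reply from destination"))
                return hops
            remaining -= 1  # a router decrements the TTL before forwarding
            if remaining == 0:
                # TTL expired: the router discards the probe and reports back.
                hops.append((ttl, address, "time exceeded"))
                break
    return hops

for hop in trace(["192.0.2.1", "198.51.100.1", "203.0.113.9"]):
    print(hop)
```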

What makes mtr faster than traceroute

A primary reason for this is the way traceroute runs. UDP (or ICMP on Windows) packets are sent to the first host with a TTL of one, and when the host replies with a Time Exceeded message (or an internal timeout passes), the next packet is generated with a TTL of two, and so on. For each host, traceroute's total time therefore includes sending and receiving packets sequentially.

MTR sends all the ICMP ECHO packets in parallel once it determines the path the packets will take.
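As a rough back-of-the-envelope comparison, the Python sketch below contrasts the two approaches under deliberately simple assumptions: every probe is answered, sequential probing waits for each reply before sending the next probe, and parallel probing takes as long as the slowest hop. The function names and round-trip times are hypothetical, and real tools behave in more nuanced ways.

```python
def sequential_time_ms(rtts_ms, probes_per_hop=3):
    """Total wait when each probe is sent only after the previous reply arrives."""
    return sum(rtt * probes_per_hop for rtt in rtts_ms)

def parallel_time_ms(rtts_ms):
    """Total wait for one round when all hops are probed at the same time."""
    return max(rtts_ms)

rtts = [2, 10, 25, 40]  # made-up round-trip time to each hop, in milliseconds
print(sequential_time_ms(rtts))
print(parallel_time_ms(rtts))
```

A hop that never answers makes the gap larger still, because sequential probing has to sit through a full timeout for each unanswered probe.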

Unlike ping, tools like traceroute and MTR send ICMP packets with incrementally increasing TTLs to view the route, or series of hops, between origin and destination. The TTL, or time to live, controls how many hops a packet can make before it dies. By sending packets and watching them return after one hop, two hops, three hops, and so on, MTR assembles the route that traffic takes between hosts on the Internet.

Advantage of MTR over ping or traceroute

In comparison to ping or traceroute, MTR shows exactly where packet loss occurs in the route to the destination host. In addition to showing the loss percentage for each host, it gives us valuable insight into which specific provider is experiencing a problem with their network. Moreover, since MTR uses ICMP ECHO requests, it will go through routers that block UDP traffic. MTR may work where traceroute does not.

