MTR - Understanding and Troubleshooting Network Connectivity

Verified and Tested 12/23/15

Introduction

MTR (originally, Matt’s TraceRoute, now just My TraceRoute) is a handy, lightweight tool in a UNIX/Linux administrator’s arsenal that can help to identify and diagnose common network issues such as latency, packet loss, and routing errors. It is a powerful 2-in-1 tool that combines and displays the results of a traceroute and a ping with one command. Let’s go over the basics of using MTR and how to interpret the data it provides.

Installing MTR

MTR packages are available for most of today’s popular Linux (or UNIX-based) operating systems.

Install MTR on Ubuntu/Debian:

sudo apt-get install mtr-tiny

The mtr-tiny package is the command-line-only version of the mtr package. The mtr package includes support for the X11 graphical interface.

Install MTR on CentOS/Fedora:

yum install mtr

Install MTR on Arch Linux:

pacman -S mtr

Install MTR on FreeBSD:

pkg install mtr

How MTR Works

To understand the output that MTR generates, you might find it helpful to know how it works. If you’re already acquainted with how the traceroute command works, then this explanation will sound familiar.

MTR generates an ICMP Echo Request packet destined for the target IP/hostname of your mtr command. The first packet has a time-to-live (TTL) value of 1. When that packet arrives at the first router on the path to its destination, that router decrements the TTL by 1, making it 0. When a TTL reaches 0, the router drops the packet and sends the original sender an ICMP Time Exceeded packet. The source address of this return packet is the router's own IP address, and MTR displays this IP (or hostname, if it can resolve it) as the first hop. MTR then sends a separate ICMP Echo Request packet with a TTL of 2, and when it receives the ICMP Time Exceeded packet triggered by that TTL expiration, it lists the responding device as the second hop, and so on until the destination returns an ICMP Echo Reply packet.
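The TTL walk described above can be sketched as a small simulation. This is only an illustration: no packets are sent, the function name trace_path is made up for this example, and the addresses are documentation placeholders rather than a real route.

```shell
# Simulate how increasing TTL values reveal one hop at a time.
# Usage: trace_path DESTINATION ROUTER1 ROUTER2 ...
# (hypothetical helper; the path is supplied by the caller, not discovered)
trace_path() {
    target=$1
    shift
    ttl=1
    for router in "$@"; do
        # This router decrements the TTL to 0, drops the probe,
        # and answers with ICMP Time Exceeded.
        echo "ttl=$ttl $router time-exceeded"
        ttl=$((ttl + 1))
    done
    # The probe finally reaches the destination, which answers
    # with ICMP Echo Reply and ends the trace.
    echo "ttl=$ttl $target echo-reply"
}

trace_path 203.0.113.9 192.0.2.1 198.51.100.1
```

Each output line corresponds to one row in an MTR report: the TTL is the hop number and the responder is the host shown for that hop.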

Reading MTR Reports

In addition to listing each network hop between the originator and destination, MTR also keeps track of statistics related to the round-trip time for packets from the originating host to each hop along the way. This round-trip time is often called latency.

To get a better idea of what MTR tells us, let’s take a look at an example that traces the route to Google’s public DNS.

sudo mtr -r 8.8.8.8

    [sample results below]

    HOST: endor                       Loss%   Snt   Last   Avg  Best  Wrst StDev
     1. 69.28.84.2                    0.0%    10    0.4   0.4   0.3   0.6   0.1
     2. 38.104.37.141                 0.0%    10    1.2   1.4   1.0   3.2   0.7
     3. te0-3-1-1.rcr21.dfw02.atlas.  0.0%    10    0.8   0.9   0.8   1.0   0.1
     4. be2285.ccr21.dfw01.atlas.cog  0.0%    10    1.1   1.1   0.9   1.4   0.1
     5. be2432.ccr21.mci01.atlas.cog  0.0%    10   10.8  11.1  10.8  11.5   0.2
     6. be2156.ccr41.ord01.atlas.cog  0.0%    10   22.9  23.1  22.9  23.3   0.1
     7. be2765.ccr41.ord03.atlas.cog  0.0%    10   22.8  22.9  22.8  23.1   0.1
     8. 38.88.204.78                  0.0%    10   22.9  23.0  22.8  23.9   0.4
     9. 209.85.143.186                0.0%    10   22.7  23.7  22.7  31.7   2.8
    10. 72.14.238.89                  0.0%    10   23.0  23.9  22.9  32.0   2.9
    11. 216.239.47.103                0.0%    10   50.4  61.9  50.4  92.0  11.9
    12. 216.239.46.191                0.0%    10   32.7  32.7  32.7  32.8   0.1
    13. ???                          100.0    10    0.0   0.0   0.0   0.0   0.0
    14. google-public-dns-a.google.c  0.0%    10   32.7  32.7  32.7  32.8   0.0

MTR reports, by default, display the following columns:
– Loss% = The percentage of packets for which an ICMP reply was not received.
– Snt = The number of packets sent to each hop.
– Last = The round trip time of the last traceroute probe, in milliseconds.
– Avg = The average round-trip time of all traceroute probes, in milliseconds.
– Best = The shortest round-trip time of all traceroute probes, in milliseconds.
– Wrst = The longest round-trip time of all traceroute probes, in milliseconds.
– StDev = The standard deviation of the round-trip times to each hop, in milliseconds.
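To make the column definitions concrete, here is a rough sketch that derives the same figures from a list of raw probe results for a single hop. The hop_stats name and the "lost" marker are inventions for this example, and the sketch uses the population standard deviation, which may not match MTR's own calculation exactly.

```shell
# Summarize probe results for one hop in the style of the report columns.
# Input: one probe per line, either an RTT in milliseconds or the word "lost".
# (hypothetical helper and input format, for illustration only)
hop_stats() {
    awk '
        $1 == "lost" { sent++; next }
        NF {
            sent++; n++
            last = $1; sum += $1; sumsq += $1 * $1
            if (n == 1 || $1 < best) best = $1
            if (n == 1 || $1 > worst) worst = $1
        }
        END {
            if (sent == 0) exit
            loss = 100 * (sent - n) / sent
            avg = n ? sum / n : 0
            var = n ? sumsq / n - avg * avg : 0
            if (var < 0) var = 0   # guard against rounding error
            printf "Loss%%=%.1f Snt=%d Last=%.1f Avg=%.1f Best=%.1f Wrst=%.1f StDev=%.1f\n",
                loss, sent, last, avg, best, worst, sqrt(var)
        }'
}

# Four probes: three replies and one with no reply.
printf '%s\n' 10 20 lost 30 | hop_stats
# prints: Loss%=25.0 Snt=4 Last=30.0 Avg=20.0 Best=10.0 Wrst=30.0 StDev=8.2
```

Note that a lost probe counts toward Snt and Loss% but contributes nothing to the timing columns.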

Getting a Live Report from MTR

If you run MTR with just a target IP (or hostname), you’ll get a live report that will keep going until you end your session or run the break command (Ctrl+C).

sudo mtr 8.8.8.8

Some operating systems require sudo to run the mtr command; others do not.

If you’d like to pause MTR, press p. MTR will preserve all the counts collected up to that point while it is paused, allowing you to take a screenshot or to copy the data to your clipboard. Unpause MTR with the spacebar.

Generating a Fixed-Count MTR Report

You can also generate an MTR report after a specific number of trace probes with the -r option (long form is --report). The default number of counts is 10, but if you’d like to run MTR for a different number of counts, use the -c (--report-cycles) option as well. For example, if you would like to generate a report over 200 counts:

sudo mtr -rc 200 8.8.8.8

[long form]
sudo mtr --report --report-cycles 200 8.8.8.8

Unlike the live report above, an MTR run with the -r option produces no output until it completes the full number of counts. MTR sends one series of trace probes per second by default, so a report takes slightly longer in seconds than its count (just over 200 seconds in the above example).
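The timing estimate is simple multiplication: cycles times the interval between them. A minimal sketch, assuming whole-second values (report_seconds is a made-up helper name, and the result is a lower bound because MTR also waits for the final replies):

```shell
# Rough lower bound on how long a report takes: cycles x interval.
# Defaults mirror the article: 10 cycles, 1 second apart.
# (hypothetical helper; whole seconds only)
report_seconds() {
    cycles=${1:-10}
    interval=${2:-1}
    echo $((cycles * interval))
}

report_seconds 200      # prints 200
report_seconds 200 5    # prints 1000
```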

Other Useful Options

There are several other options you might find useful while using MTR. Those that do not require arguments (such as -r) can be chained together in the same option string (e.g., -rn). An option requiring an argument can be included in one of these chains only if it is the last option and is followed by its argument (e.g., -rnc 200).

-n: (long form --no-dns) Disable DNS hostname lookups. The n key can also be used during a live report to toggle between disabling and enabling DNS lookups.
-u: Send UDP datagrams instead of ICMP echo request packets. The u key can also toggle between UDP and ICMP during a live report.
-w: (long form --report-wide) Substitutes for -r but produces a report that does not truncate longer hostnames.
-i: (long form --interval) Specify the interval, in seconds, between test probes. The default interval is 1 second.
-4: Restrict test to IPv4.
-6: Restrict test to IPv6.

Analyzing MTR Data for Latency

The output of MTR data can help you to identify issues you or one of the internet carriers may be having with routing. It can also help to set a baseline for expected latency between endpoints.

By running an MTR report over an extended period, you can get an idea of what ordinary round-trip times look like between two endpoints. The more counts you allow MTR to run, the more accurate your average latency result is likely to be. If you run the default 10 count, one stray result could throw off your average to a significant degree; run 100, 500, or 1000 counts, though, and the occasional delayed packet won’t have nearly so dramatic an effect on the overall average.
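For baselining, it can help to pull just the destination's average out of each report so that runs can be compared over time. The following is a sketch under stated assumptions: dest_avg is a hypothetical helper, and it assumes the default report columns shown earlier, counting from the right so that hostname width does not matter. The sample input is the first report in this article, abridged to two hops.

```shell
# Print the Avg column of the last hop (the destination) in an MTR report.
# Assumes the default columns: Loss% Snt Last Avg Best Wrst StDev.
# (hypothetical helper, for illustration only)
dest_avg() {
    awk '$1 ~ /^[0-9]+\./ { avg = $(NF - 3) } END { if (avg != "") print avg }'
}

dest_avg <<'EOF'
HOST: endor                       Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. 69.28.84.2                    0.0%    10    0.4   0.4   0.3   0.6   0.1
 14. google-public-dns-a.google.c  0.0%    10   32.7  32.7  32.7  32.8   0.0
EOF
# prints: 32.7
```

In practice you would pipe a long report into it, for example sudo mtr -rwn -c 500 8.8.8.8 | dest_avg, and record the result alongside the date.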

Let’s reiterate here that the times generated by MTR are the round-trip times for an ICMP packet to reach the hop at which its TTL expires, for the device processing that expiration to generate an ICMP Time Exceeded packet, and for that packet to return to the originating device. For many routers, performing the ICMP response for dropped packets is a low priority–and on some devices, it’s disabled entirely.

For these reasons, you will likely see instances where one intermediate hop shows the occasional spike in the “Last” or “Wrst” time column. As long as the jump in latency doesn’t also propagate to each subsequent hop, then what you are seeing is the delay from the response mechanism on that one device, as opposed to true throughput latency. Take, for example, the following MTR output:

                                                             Packets               Pings
 Host                                                      Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. vl223-ar-01.nyc-ny.atlantic.net                         0.0%    66    0.5   6.1   0.5 140.2  25.5
 2. te0-0-1-1.rcr11.ewr04.atlas.cogentco.com                0.0%    66    1.0   1.0   0.8   2.8   0.2
 3. te0-3-0-4.rcr21.ewr02.atlas.cogentco.com                0.0%    66    1.1   1.1   0.9   2.5   0.2
 4. be2601.rcr24.jfk01.atlas.cogentco.com                   0.0%    66    1.6   1.7   1.5   2.0   0.0
 5. be2632.ccr42.jfk02.atlas.cogentco.com                   0.0%    66    1.7   1.8   1.6   3.0   0.1
 6. be2807.ccr42.dca01.atlas.cogentco.com                   0.0%    66    8.3   8.1   7.7  12.1   0.6
 7. be2113.ccr42.atl01.atlas.cogentco.com                   9.1%    66   27.5  21.7  18.6  34.7   3.9
 8. be2123.ccr22.mia01.atlas.cogentco.com                   0.0%    66   33.0  33.4  33.0  41.5   1.1
 9. be2055.ccr21.mia03.atlas.cogentco.com                   0.0%    66   33.3  33.6  33.1  36.3   0.4
10. 38.104.95.170                                           0.0%    65   40.8  40.9  40.7  42.0   0.1
11. 209.208.7.42                                            0.0%    65   41.6  43.3  40.9 187.9  18.2
12. [target host]                                           0.0%    65   41.1  41.1  40.9  41.3   0.0

See how the first hop and the eleventh hop each have a worst time much higher than their averages? Many look at an indicator like this one and assume that it represents evidence of throughput latency. But notice how the second and twelfth hops don’t show a similarly high worst time? If the worst time column for each subsequent hop showed the same or greater time, then we could take that result as a sign of a potential latency issue. Note, in contrast, the average time column, particularly between hops 6 and 7. The average jumps from 8.1 milliseconds to 21.7 milliseconds, and every subsequent hop stays above that figure. This column shows an example of true latency, in this case between routers in Washington, D.C., and Atlanta, GA (this result is pretty normal by 2015 standards).
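The "does the spike propagate?" test can be approximated with a short script. This is a sketch rather than a definitive method: spike_check is a hypothetical helper, the threshold of three times the average is an arbitrary assumption you can override, and hops that never replied are ignored. The sample input is hops 1, 2, 11, and 12 from the report above.

```shell
# Flag hops whose Wrst far exceeds their Avg, and say whether every later
# hop shows a Wrst at least as high (propagates) or not (isolated).
# Usage: spike_check [FACTOR] < report    (FACTOR defaults to 3, an assumption)
spike_check() {
    awk -v factor="${1:-3}" '
        # Keep only hop lines that received at least one reply.
        $1 ~ /^[0-9]+\./ && $(NF - 1) > 0 {
            n++
            hop[n] = $1 + 0            # "11." -> 11
            avg[n] = $(NF - 3) + 0
            wrst[n] = $(NF - 1) + 0
        }
        END {
            for (i = 1; i <= n; i++) {
                if (wrst[i] <= factor * avg[i]) continue
                verdict = (i == n) ? "at-destination" : "propagates"
                for (j = i + 1; j <= n; j++)
                    if (wrst[j] < wrst[i]) { verdict = "isolated"; break }
                print "hop " hop[i] ": Wrst " wrst[i] " vs Avg " avg[i] " (" verdict ")"
            }
        }'
}

spike_check <<'EOF'
 1. vl223-ar-01.nyc-ny.atlantic.net           0.0%    66    0.5   6.1   0.5 140.2  25.5
 2. te0-0-1-1.rcr11.ewr04.atlas.cogentco.com  0.0%    66    1.0   1.0   0.8   2.8   0.2
11. 209.208.7.42                              0.0%    65   41.6  43.3  40.9 187.9  18.2
12. [target host]                             0.0%    65   41.1  41.1  40.9  41.3   0.0
EOF
# prints:
# hop 1: Wrst 140.2 vs Avg 6.1 (isolated)
# hop 11: Wrst 187.9 vs Avg 43.3 (isolated)
```

An "isolated" verdict matches the reading given above: the delay comes from that one device's response mechanism, not from the path itself.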

You may also see intermediate hops sporadically or even consistently dropping packets altogether. Again, as long as these drops are isolated to the one device and not consistent across all subsequent hops, then it’s very likely a symptom of the ICMP Time Exceeded response messages being deprioritized in favor of more important transit traffic (you can see an example of this drop in hop 7 above). In some cases, network administrators configure routers not to reply with any ICMP Time Exceeded responses at all. You may see these hops show up as dropping 100% of traffic while hops beyond them are still responsive (in the first example in this article, hop 13 shows this behavior, with 100% loss and no host IP or hostname).
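The same reasoning applies to the Loss% column, and it can be sketched in a few lines. In this illustration, loss_check is a hypothetical helper that treats the last hop's loss as the end-to-end figure and lists any intermediate hop that reports more loss than that. The sample input is hops 6, 7, 8, and 12 from the second report above.

```shell
# Compare each intermediate hop's Loss% with the destination's.
# Loss that does not persist to the destination is most likely limited
# or deprioritized ICMP replies, not real packet loss on the path.
# (hypothetical helper; assumes the default report columns)
loss_check() {
    awk '
        $1 ~ /^[0-9]+\./ {
            n++
            hop[n] = $1 + 0
            loss[n] = $(NF - 6) + 0    # "9.1%" -> 9.1
        }
        END {
            if (n == 0) exit
            for (i = 1; i < n; i++)
                if (loss[i] > loss[n])
                    printf "hop %d: %.1f%% loss, destination only %.1f%% (likely ICMP replies limited)\n", hop[i], loss[i], loss[n]
            printf "destination: %.1f%% loss\n", loss[n]
        }'
}

loss_check <<'EOF'
 6. be2807.ccr42.dca01.atlas.cogentco.com  0.0%    66    8.3   8.1   7.7  12.1   0.6
 7. be2113.ccr42.atl01.atlas.cogentco.com  9.1%    66   27.5  21.7  18.6  34.7   3.9
 8. be2123.ccr22.mia01.atlas.cogentco.com  0.0%    66   33.0  33.4  33.0  41.5   1.1
12. [target host]                          0.0%    65   41.1  41.1  40.9  41.3   0.0
EOF
# prints:
# hop 7: 9.1% loss, destination only 0.0% (likely ICMP replies limited)
# destination: 0.0% loss
```

If the destination line itself reports loss, that is the number worth investigating further.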

What Next?

This article is only an introduction to how you can use the MTR tool to examine your network connectivity to various endpoints across the Internet. While there is plenty more to learn, the information presented here should give you a good start to being able to diagnose network issues you may be experiencing. Thank you for reading, and please check back with us again for further updates and more advanced VPS hosting tutorials and articles like What is: Networking Basics – Switches, Routers, and Firewalls.

Learn more about our VPS hosting services and Virtual private servers.

Atlantic.Net Blog
