
IPv4/IPv6 Path MTU (PMTU) Discovery Demystified

Check out these additional IPv6 Resources:
Our IPv6 overview course at Udemy
Our IPv6 Custom Profiles for Wireshark
Our IPv6 classes at the Online School

A major change in the behavior of IP networks between IPv4 and IPv6 is that in IPv6 networks, all hosts are required to support something called Path MTU (PMTU) discovery.  I have listened to a lot of people talk about this over the years, and for those of us who have been working with IPv6 for a long time, many felt the requirement was long overdue.  The result of the requirement is simple: a reduced, if not eliminated, need for fragmentation in the IP network.  More specifically, this means routers along a packet's path will not need to fragment packets.

If you examine the illustration to the right, we see a path from Host1 to Host2, where all the links except one have an MTU of 1500 bytes or more.  That one link has an MTU of 1330 bytes.

In an IP network that does not support PMTU discovery, the second-to-last router will fragment any packets larger than 1330 bytes.  Those fragments must travel all the way to Host2, and Host2 will reassemble/defragment the packets as needed.

In a network that supports PMTU discovery, Host1 sends its first packet assuming the MTU of its local link (standard Ethernet would be 1500 bytes).  The router that would previously have fragmented the packet instead sends an ICMP error message to Host1 advising what the MTU should be.  Host1 updates its PMTU cache for that specific destination, and then sends any further packets to Host2 with an MTU of 1330.  The cache keeps this entry for some number of minutes, and then forgets it.
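The exchange above can be sketched as a toy model.  This is not real networking code, and all the names here are illustrative; it only shows the cache behavior: start at the link MTU, then lower the cached value for a destination when a too-big error comes back.

```python
# Toy model of PMTU discovery -- illustrative only, not a real API.
LINK_MTU = 1500  # Host1's local link MTU (standard Ethernet)

class Host:
    def __init__(self, link_mtu):
        self.link_mtu = link_mtu
        self.pmtu_cache = {}  # destination -> discovered path MTU

    def pmtu_for(self, dest):
        # Until told otherwise, assume the local link MTU.
        return self.pmtu_cache.get(dest, self.link_mtu)

    def on_packet_too_big(self, dest, reported_mtu):
        # ICMP "Fragmentation Needed" (IPv4) / "Packet Too Big" (IPv6):
        # remember the smaller MTU for this destination.
        self.pmtu_cache[dest] = reported_mtu

host1 = Host(LINK_MTU)
print(host1.pmtu_for("host2"))   # 1500: the first packet uses the link MTU
host1.on_packet_too_big("host2", 1330)
print(host1.pmtu_for("host2"))   # 1330: later packets fit the narrow link
```

A real stack also ages these entries out after a timeout, which is covered further below.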

So you can’t block ICMP.

In IPv4 networks, the ICMP Type 3, Code 4 error existed, but hosts did not have to listen to it or do anything about it.  In some cases, ICMP errors like this were blocked entirely.

RFC 1191 says “The basic idea is that a source host initially assumes that the PMTU of a path is the (known) MTU of its first hop, and sends all datagrams on that path with the DF [don’t fragment] bit set. If any of the datagrams are too large to be forwarded without fragmentation by some router along the path, that router will discard them and return ICMP Destination Unreachable messages with a code meaning “fragmentation needed and DF set” [7]. Upon receipt of such a message (henceforth called a “Datagram Too Big” message), the source host reduces its assumed PMTU for the path.”  The RFC goes on to suggest this should be kept for 10 minutes.
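On Linux you can ask the kernel to do exactly what RFC 1191 describes, setting DF on every outgoing packet for a socket.  The sketch below is Linux-specific; the numeric fallbacks (10 and 2) are the Linux values for these socket options, used in case the Python build does not export the names, and other platforms differ.

```python
import socket

# Linux-specific: set DF on outgoing IPv4 packets for this socket.
# 10 and 2 are the Linux values for these options; other OSes differ.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)  # always set DF

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
# The kernel will now refuse to fragment sends larger than the cached
# path MTU, and will update that cache from incoming ICMP errors.
print(s.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER))
s.close()
```

TCP sockets on Linux typically behave this way by default; the explicit option matters mostly for UDP and raw sockets.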

RFC 1981 provides details of how PMTU works with IPv6. Otherwise, most of the document is the same as RFC 1191.

Seems simple enough.  But it raises several questions.

How do we see this?  The answer is that it is all done auto-magically by your protocol stack/operating system.  There are no errors reported.

Can we examine anything regarding PMTU?  The answer is absolutely, yes.  In Windows, the command for IPv4 is:

netsh interface ipv4 show destinationcache

I have put some sample outputs below. 

The command for IPv6 is:

netsh interface ipv6 show destinationcache

To look at a specific address, just specify the address:

netsh interface ipv4 show destinationcache address={some IP address}

Here are the IPv4 and IPv6 response examples for Windows:

C:\Users\amwal>netsh interface ipv4 show destinationcache

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 1.2.3.4 192.168.1.1
1500 5.62.44.21 192.168.1.1
1500 5.62.44.22 192.168.1.1
1500 5.62.44.23 192.168.1.1
1500 5.62.44.103 192.168.1.1
1500 5.62.44.130 192.168.1.1
1500 5.62.44.204 192.168.1.1
1500 5.62.44.205 192.168.1.1
1500 5.62.44.206 192.168.1.1
1500 5.62.48.53 192.168.1.1
1500 5.62.48.54 192.168.1.1
1500 5.62.48.55 192.168.1.1
1500 5.62.48.200 192.168.1.1
1500 5.62.48.202 192.168.1.1
1500 8.8.8.8 192.168.1.1
1500 13.68.92.143 192.168.1.1
1500 13.84.220.179 192.168.1.1
1500 13.107.3.128 192.168.1.1
1500 13.107.6.254 192.168.1.1
1500 13.107.18.11 192.168.1.1
1500 13.107.21.200 192.168.1.1
1500 13.107.22.200 192.168.1.1
1500 13.107.42.12 192.168.1.1
1500 13.107.43.254 192.168.1.1
1500 13.107.246.254 192.168.1.1
1500 18.208.9.84 192.168.1.1
1500 78.46.39.215 192.168.1.1
1500 88.99.186.153 192.168.1.1
1500 91.213.143.254 192.168.1.1
1500 94.130.73.103 192.168.1.1
1500 216.58.194.144 192.168.1.1
1500 216.58.211.99 192.168.1.1
1500 216.58.212.131 192.168.1.1
1500 216.58.212.163 192.168.1.1
1500 216.239.32.116 192.168.1.1
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250

Interface 1: Loopback Pseudo-Interface 1

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 127.0.0.1 127.0.0.1
1500 239.255.255.250 239.255.255.250

Interface 10: VMware Network Adapter VMnet0

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 192.168.206.255 192.168.206.255
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250

Interface 5: VirtualBox Host-Only Network

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 192.168.56.1 192.168.56.1
1500 192.168.56.255 192.168.56.255
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250
1500 255.255.255.255 255.255.255.255

Interface 23: Ethernet 4

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 169.254.9.161 169.254.9.161
1500 169.254.255.255 169.254.255.255
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250
1500 255.255.255.255 255.255.255.255

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 54.210.1.1 192.168.1.1
1500 199.59.148.23 192.168.1.1
1500 5.62.48.201 192.168.1.1
1500 23.209.119.248 192.168.1.1
1500 34.200.148.138 192.168.1.1
1500 34.214.245.56 192.168.1.1
1500 52.24.247.243 192.168.1.1
1500 52.202.83.221 192.168.1.1
1500 54.191.222.233 192.168.1.1
1500 94.130.104.85 192.168.1.1
1500 162.125.34.137 192.168.1.1
1500 199.59.150.46 192.168.1.1
1500 216.58.194.42 192.168.1.1

Interface 10: VMware Network Adapter VMnet0

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 192.168.206.1 192.168.206.1

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 40.121.213.159 192.168.1.1

C:\Users\amwal>netsh interface ipv6 show destinationcache

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 fe80::d599:a404:8cac:a7e8 fe80::d599:a404:8cac:a7e8
1500 ff02::c ff02::c
1500 ff02::fb ff02::fb
1500 ff02::1:2 ff02::1:2
1500 ff02::1:3 ff02::1:3

Interface 1: Loopback Pseudo-Interface 1

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 ::1 ::1
1500 ff02::c ff02::c

Interface 10: VMware Network Adapter VMnet0

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 fe80::30f2:dbe7:8279:f8af fe80::30f2:dbe7:8279:f8af
1500 ff02::c ff02::c
1500 ff02::16 ff02::16
1500 ff02::fb ff02::fb
1500 ff02::1:3 ff02::1:3

Interface 5: VirtualBox Host-Only Network

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 fe80::916e:62d5:6e25:2f5a fe80::916e:62d5:6e25:2f5a
1500 ff02::c ff02::c
1500 ff02::16 ff02::16
1500 ff02::1:3 ff02::1:3

Interface 23: Ethernet 4

PMTU Destination Address Next Hop Address
----  ---------------------------------------------  -------------------------
1500 fe80::e536:f42d:d87a:9a1 fe80::e536:f42d:d87a:9a1
1500 ff02::c ff02::c
1500 ff02::16 ff02::16
1500 ff02::fb ff02::fb
1500 ff02::1:2 ff02::1:2
1500 ff02::1:3 ff02::1:3

 

We can see that all the PMTU settings are actually the same as my local link MTU.  For me this is a good sign.  However, if I were experiencing slow response from a particular web site, I would definitely want to check the PMTU to see whether I was having to send small packets.

In Linux (Debian) you can use this command:

ip route get {some IP address}

You can also use the ‘tracepath’ command as the PMTU will be displayed:

tracepath -n {some IP address}

Here is an example:

[screenshot: tracepath output showing the pmtu at each hop]


RFC 4821 allows the host to do two additional things:

  1. Use regular PMTU discovery; however, if no acknowledgments are received and no ICMP messages are received, start to “probe”
  2. Ignore regular PMTU discovery, and always probe

What does “probe” mean?  Probing is when the host sends a packet at the minimum configured MTU and then attempts to increase that size.  If acknowledgements are received at the larger size, it tries to increase it once again.

Option 1 will wait for a timeout, so on broken PMTU discovery paths it starts slowly.  It will, however, use regular PMTU discovery whenever it can, so it is a lot more efficient.

Option 2 simply probes all the time.  It starts faster on smaller-MTU paths, but the server also sends smaller packets on ALL paths in the beginning, so this is not as efficient.
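The probing idea can be sketched with a toy search.  This is illustrative only: the path model, step strategy, and function names are mine, not from RFC 4821, which does not mandate any particular search strategy.  It shows how a sender can converge on the path MTU purely from which probe sizes get acknowledged, with no ICMP at all.

```python
# Toy sketch of probing: grow the probe while probes are acknowledged,
# back off when one is silently dropped. Illustrative, not RFC 4821 itself.
PATH_MTU = 1330  # the bottleneck link, unknown to the sender

def probe_succeeds(size):
    # Stand-in for "did an ACK come back for a probe of this size?"
    return size <= PATH_MTU

def discover_by_probing(min_mtu=576, max_mtu=1500):
    lo, hi = min_mtu, max_mtu
    # Search between a size known to work and the largest size tried.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe_succeeds(mid):
            lo = mid       # probe acknowledged: path is at least this big
        else:
            hi = mid - 1   # probe lost: path must be smaller
    return lo

print(discover_by_probing())  # converges on 1330 without any ICMP
```

The trade-off the two options describe is visible here: probing always starts below the true path MTU, so it wastes capacity on paths where regular PMTU discovery would have worked immediately.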

Probing is something you can turn on in Linux.  To see whether it is turned on, use the following command:

sysctl net.ipv4.tcp_mtu_probing

or you can more generally use:

sysctl net | grep "prob"

Example output:

[screenshot: sysctl output showing net.ipv4.tcp_mtu_probing = 0]

My system is set to ‘0’.

You can change this by using:

sudo sysctl -w net.ipv4.tcp_mtu_probing=1

You can then explore what happens in Wireshark.

My understanding from reading the Ubuntu docs on this is that whatever you have set for the IPv4 probing, the IPv6 stack mimics the setting.

Another thing you may be interested in is the amount of time the MTU setting is kept in the cache.  You can look at this in Debian Linux with:

sudo sysctl net | grep "mtu_expires" 

You will see 600 seconds – which translates to 10 minutes:

[screenshot: sysctl output showing mtu_expires = 600]

I hope you find this article and its content helpful.  Comments are welcome below.  If you would like to see more articles like this, please support us by clicking the patron link, where you will receive free bonus access to courses and more, or simply buy us a cup of coffee!

 

 
