Networking/Computing Tips/Tricks

A major change in the behavior of IP networks between IPv4 and IPv6 is that IPv6 routers never fragment packets in transit, so all hosts are expected to support something called Path MTU (PMTU) discovery (or else limit themselves to the 1280-byte IPv6 minimum MTU).  I have listened to a lot of people talk about this over the years, and many of us who have been working with IPv6 for a long time felt the requirement was long overdue.  The result of the requirement is simple: a reduced, if not eliminated, need for fragmentation in the IP network.  More specifically, this means routers along a packet's path will not need to fragment packets.

In the illustration, we see a path from Host1 to Host2, where all the links except one have an MTU of 1500 bytes or more.  That one link has an MTU of 1330 bytes.

In an IP network that does not support PMTU discovery, the second-to-last router will fragment any packets larger than 1330 bytes.  Those fragments must travel all the way to Host2, and Host2 will reassemble (defragment) the packets as needed.
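To make the fragmentation case concrete, here is a minimal Python sketch of the arithmetic for a 1500-byte IPv4 packet crossing the 1330-byte link. It assumes a 20-byte IPv4 header with no options; the function name is only for illustration.

```python
def fragment_sizes(packet_len, link_mtu, header_len=20):
    """Return the total length of each IPv4 fragment for one packet."""
    if packet_len <= link_mtu:
        return [packet_len]  # fits as-is, no fragmentation needed
    payload = packet_len - header_len
    # Every fragment except the last must carry a multiple of 8 payload
    # bytes, because the fragment offset field counts in 8-byte units.
    max_chunk = (link_mtu - header_len) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("link MTU too small to carry any payload")
    sizes = []
    while payload > 0:
        chunk = min(max_chunk, payload)
        sizes.append(header_len + chunk)  # each fragment repeats the header
        payload -= chunk
    return sizes


print(fragment_sizes(1500, 1330))  # [1324, 196]
```

One full-size packet becomes two packets, and the second carries only 176 bytes of payload behind its own 20-byte header.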

In a network that supports PMTU discovery, Host1 sends its first packet assuming the MTU of its local connection (standard Ethernet would be 1500 bytes).  The router that would previously have fragmented the packet instead drops it, sends an ICMP error message to Host1, and advises what the MTU should be.  Host1 updates its PMTU cache for that specific destination, and then sends any further packets to Host2 with an MTU of 1330.  The cache keeps this information for some number of minutes and then forgets the entry.
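The host-side bookkeeping can be sketched in a few lines of Python. This is only a toy model of the idea, not how any real protocol stack is written: the class name, the 600-second lifetime, and the 192.0.2.10 address standing in for Host2 are all assumptions for illustration.

```python
import time


class PmtuCache:
    """Toy per-destination PMTU cache with expiring entries."""

    def __init__(self, link_mtu=1500, lifetime=600.0, clock=time.monotonic):
        self._link_mtu = link_mtu
        self._lifetime = lifetime  # seconds an entry is remembered
        self._clock = clock
        self._entries = {}  # destination -> (pmtu, expiry time)

    def lookup(self, destination):
        """Return the MTU to use for this destination right now."""
        entry = self._entries.get(destination)
        if entry is None:
            return self._link_mtu
        pmtu, expires = entry
        if self._clock() >= expires:
            del self._entries[destination]  # forget the stale entry
            return self._link_mtu
        return pmtu

    def on_packet_too_big(self, destination, reported_mtu):
        """Record the MTU advised by an ICMP error; never raise the estimate."""
        pmtu = min(self.lookup(destination), reported_mtu)
        self._entries[destination] = (pmtu, self._clock() + self._lifetime)


cache = PmtuCache()
print(cache.lookup("192.0.2.10"))  # 1500: nothing learned yet
cache.on_packet_too_big("192.0.2.10", 1330)
print(cache.lookup("192.0.2.10"))  # 1330: cached for this destination
```

Once the entry expires, the host goes back to assuming its link MTU and rediscovers the path limit if it is still there.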

So you can't block ICMP: without these error messages, PMTU discovery breaks.

In IPv4 networks, the ICMP Type 3, Code 4 error was generated, but hosts did not have to listen to it or do anything about it.  In some cases ICMP errors like this were blocked.

RFC 1191 says "The basic idea is that a source host initially assumes that the PMTU of a path is the (known) MTU of its first hop, and sends all datagrams on that path with the DF [don't fragment] bit set. If any of the datagrams are too large to be forwarded without fragmentation by some router along the path, that router will discard them and return ICMP Destination Unreachable messages with a code meaning “fragmentation needed and DF set” [7]. Upon receipt of such a message (henceforth called a “Datagram Too Big” message), the source host reduces its assumed PMTU for the path."  The RFC goes on to suggest this should be kept for 10 minutes.
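For those who want to see what the host actually reads out of a "Datagram Too Big" message, here is a small Python sketch. It assumes the RFC 1191 layout of the first eight ICMP bytes (type, code, checksum, 16 unused bits, 16-bit Next-Hop MTU). The sample message is hand-built for illustration, and its checksum is left at zero and not validated.

```python
import struct


def next_hop_mtu(icmp_message):
    """Return the advised MTU from an ICMP Type 3, Code 4 message, else None."""
    if len(icmp_message) < 8:
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_message[:8]
    )
    if icmp_type != 3 or code != 4:
        return None  # some other ICMP message
    # Routers that predate RFC 1191 leave this field as 0.
    return mtu


# Hand-built example: Type 3, Code 4, advising an MTU of 1330.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1330)
print(next_hop_mtu(sample))  # 1330
```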

RFC 1981 (since replaced by RFC 8201) provides details of how PMTU works with IPv6. Otherwise, most of the document is the same as RFC 1191.

Seems simple enough, but it raises several questions.

But how do we see this?  The answer is, it is all done auto-magically by your protocol stack/operating system.  There are no errors reported.  

Can we examine anything regarding PMTU?  The answer is absolutely, yes.  In Windows, the command for IPv4 is:

netsh interface ipv4 show destinationcache

I have put some sample outputs below. 

The command for IPv6 is:

netsh interface ipv6 show destinationcache

To look at a specific address, just specify the address:

netsh interface ipv4 show destinationcache address={some IP address}

Here are the IPv4 and IPv6 response examples for Windows:


C:\Users\amwal>netsh interface ipv4 show destinationcache

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 1.2.3.4 192.168.1.1
1500 5.62.44.21 192.168.1.1
1500 5.62.44.22 192.168.1.1
1500 5.62.44.23 192.168.1.1
1500 5.62.44.103 192.168.1.1
1500 5.62.44.130 192.168.1.1
1500 5.62.44.204 192.168.1.1
1500 5.62.44.205 192.168.1.1
1500 5.62.44.206 192.168.1.1
1500 5.62.48.53 192.168.1.1
1500 5.62.48.54 192.168.1.1
1500 5.62.48.55 192.168.1.1
1500 5.62.48.200 192.168.1.1
1500 5.62.48.202 192.168.1.1
1500 8.8.8.8 192.168.1.1
1500 13.68.92.143 192.168.1.1
1500 13.84.220.179 192.168.1.1
1500 13.107.3.128 192.168.1.1
1500 13.107.6.254 192.168.1.1
1500 13.107.18.11 192.168.1.1
1500 13.107.21.200 192.168.1.1
1500 13.107.22.200 192.168.1.1
1500 13.107.42.12 192.168.1.1
1500 13.107.43.254 192.168.1.1
1500 13.107.246.254 192.168.1.1
1500 18.208.9.84 192.168.1.1
1500 78.46.39.215 192.168.1.1
1500 88.99.186.153 192.168.1.1
1500 91.213.143.254 192.168.1.1
1500 94.130.73.103 192.168.1.1
1500 216.58.194.144 192.168.1.1
1500 216.58.211.99 192.168.1.1
1500 216.58.212.131 192.168.1.1
1500 216.58.212.163 192.168.1.1
1500 216.239.32.116 192.168.1.1
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250

Interface 1: Loopback Pseudo-Interface 1

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 127.0.0.1 127.0.0.1
1500 239.255.255.250 239.255.255.250

Interface 10: VMware Network Adapter VMnet0

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 192.168.206.255 192.168.206.255
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250

Interface 5: VirtualBox Host-Only Network

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 192.168.56.1 192.168.56.1
1500 192.168.56.255 192.168.56.255
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250
1500 255.255.255.255 255.255.255.255

Interface 23: Ethernet 4

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 169.254.9.161 169.254.9.161
1500 169.254.255.255 169.254.255.255
1500 224.0.0.22 224.0.0.22
1500 224.0.0.251 224.0.0.251
1500 224.0.0.252 224.0.0.252
1500 239.255.255.250 239.255.255.250
1500 255.255.255.255 255.255.255.255

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 54.210.1.1 192.168.1.1
1500 199.59.148.23 192.168.1.1
1500 5.62.48.201 192.168.1.1
1500 23.209.119.248 192.168.1.1
1500 34.200.148.138 192.168.1.1
1500 34.214.245.56 192.168.1.1
1500 52.24.247.243 192.168.1.1
1500 52.202.83.221 192.168.1.1
1500 54.191.222.233 192.168.1.1
1500 94.130.104.85 192.168.1.1
1500 162.125.34.137 192.168.1.1
1500 199.59.150.46 192.168.1.1
1500 216.58.194.42 192.168.1.1

Interface 10: VMware Network Adapter VMnet0

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 192.168.206.1 192.168.206.1

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 40.121.213.159 192.168.1.1

C:\Users\amwal>netsh interface ipv6 show destinationcache

Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 fe80::d599:a404:8cac:a7e8 fe80::d599:a404:8cac:a7e8
1500 ff02::c ff02::c
1500 ff02::fb ff02::fb
1500 ff02::1:2 ff02::1:2
1500 ff02::1:3 ff02::1:3

Interface 1: Loopback Pseudo-Interface 1

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 ::1 ::1
1500 ff02::c ff02::c

Interface 10: VMware Network Adapter VMnet0

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 fe80::30f2:dbe7:8279:f8af fe80::30f2:dbe7:8279:f8af
1500 ff02::c ff02::c
1500 ff02::16 ff02::16
1500 ff02::fb ff02::fb
1500 ff02::1:3 ff02::1:3

Interface 5: VirtualBox Host-Only Network

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 fe80::916e:62d5:6e25:2f5a fe80::916e:62d5:6e25:2f5a
1500 ff02::c ff02::c
1500 ff02::16 ff02::16
1500 ff02::1:3 ff02::1:3

Interface 23: Ethernet 4

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 fe80::e536:f42d:d87a:9a1 fe80::e536:f42d:d87a:9a1
1500 ff02::c ff02::c
1500 ff02::16 ff02::16
1500 ff02::fb ff02::fb
1500 ff02::1:2 ff02::1:2
1500 ff02::1:3 ff02::1:3

 

We can see that all the PMTU settings are actually the same as my local link MTU.  For me this is a good sign.  However, if I were experiencing slow response from a particular web site, I would definitely want to check the PMTU to see if I was having to send small packets.
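If the list is long, scanning it by eye gets tedious. Here is a rough Python sketch that parses the text printed by the destinationcache command and lists any destination whose PMTU is below the link MTU. The sample text is a couple of rows from the output above; the function names and the 1500-byte default are assumptions for illustration.

```python
import re

SAMPLE = """\
Interface 15: Wi-Fi

PMTU Destination Address Next Hop Address
---- --------------------------------------------- -------------------------
1500 1.2.3.4 192.168.1.1
1500 8.8.8.8 192.168.1.1
"""


def parse_destination_cache(text):
    """Return (interface, pmtu, destination, next_hop) tuples."""
    entries = []
    interface = None
    for line in text.splitlines():
        header = re.match(r"Interface\s+\d+:\s*(.+)", line.strip())
        if header:
            interface = header.group(1)
            continue
        fields = line.split()
        # Data rows are "PMTU destination next-hop"; column headings and
        # the dashed separator do not start with a number.
        if len(fields) == 3 and fields[0].isdigit():
            entries.append((interface, int(fields[0]), fields[1], fields[2]))
    return entries


def reduced_pmtu(entries, link_mtu=1500):
    """Keep only the destinations that need smaller packets."""
    return [entry for entry in entries if entry[1] < link_mtu]


entries = parse_destination_cache(SAMPLE)
print(len(entries), reduced_pmtu(entries))  # 2 []
```

An empty list matches what we saw above: nothing on this machine is being forced to a smaller packet size.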

In Linux (Debian) you can use this command:

ip route get {some IP address}

You can also use the 'tracepath' command as the PMTU will be displayed:

tracepath -n {some IP address}

Here is an example:

[Screenshot: example tracepath output]
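On Linux you can also ask the kernel for its current PMTU estimate from a program. The sketch below is a minimal Python example that uses the IP_MTU socket option on a connected UDP socket (connecting a UDP socket does not send any packets). The numeric fallbacks are the Linux values from linux/in.h, the port number is arbitrary, and the function simply returns None on other platforms or on any error.

```python
import socket
import sys

# Linux option numbers from <linux/in.h>, used only when this Python build
# does not export the names itself.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)
IP_MTU = getattr(socket, "IP_MTU", 14)


def kernel_path_mtu(address, port=33434):
    """Return the kernel's current PMTU estimate for an IPv4 address, or None."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # Ask the kernel to set DF and track the path MTU for this socket.
            sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
            sock.connect((address, port))  # picks a route, sends nothing
            return sock.getsockopt(socket.IPPROTO_IP, IP_MTU)
    except OSError:
        return None


print(kernel_path_mtu("127.0.0.1"))
```

Until real traffic has triggered an ICMP error, the value returned is normally just the MTU of the outgoing route.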

 

RFC 4821 allows for the host to do two additional things:

  1. Use regular PMTU discovery; however, if no acknowledgments are received and no ICMP messages are received, start to "probe"
  2. Ignore regular PMTU discovery, and always probe

What does "probe" mean?  Probing is when the host sends a packet at the minimum configured MTU and then attempts to increase that size. If acknowledgments are received at the larger size, it tries to increase the size once again.

Option 1 waits for a timeout, so on paths where PMTU discovery is broken it starts slowly. It will, however, use regular PMTU discovery whenever it can, so it is a lot more efficient.

Option 2 simply probes all the time. It starts faster on smaller-MTU paths, but the server also sends smaller packets on ALL paths in the beginning, so this is not as efficient.
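The probing idea can be illustrated with a short Python simulation. Note that this sketch searches by bisection between a known-good size and the link MTU, which is my own simplification and not the step schedule of any particular stack. The 576-byte starting point and the `delivered` callback (a stand-in for "was this probe acknowledged?") are assumptions for illustration.

```python
def probe_path_mtu(delivered, low, high):
    """Find the largest packet size the path accepts, using probes only.

    delivered(size) reports whether a probe of that size was acknowledged.
    low is a size assumed to work; high is the local link MTU.
    """
    if not delivered(low):
        raise ValueError("even the smallest probe was lost")
    while low < high:
        candidate = (low + high + 1) // 2  # try a size above the known-good one
        if delivered(candidate):
            low = candidate  # acknowledged: raise the floor
        else:
            high = candidate - 1  # lost: the limit is below this size
    return low


# Simulated path with the 1330-byte link from the illustration.
print(probe_path_mtu(lambda size: size <= 1330, 576, 1500))  # 1330
```

No ICMP is involved at any point, which is why probing still works on paths where the error messages are being blocked.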

Probing is something you can turn on in Linux.  To see whether it is turned on, use the following command:

sysctl net.ipv4.tcp_mtu_probing

or you can more generally use:

sysctl net | grep "prob"

Example output:

[Screenshot: example sysctl output]

My system is set to '0'.

You can change this by using:

sudo sysctl -w net.ipv4.tcp_mtu_probing=1

You can then explore what happens in Wireshark.

My understanding from reading the Ubuntu docs on this is that whatever you have set for the IPv4 probing, the IPv6 stack mimics the setting.
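If you want to check the setting from a script, the same value is exposed under /proc on Linux. Here is a minimal Python sketch; the descriptions of the values 0, 1 and 2 follow the Linux kernel's ip-sysctl documentation, and the function names are only for illustration.

```python
from pathlib import Path

# Meaning of each value, per the Linux kernel ip-sysctl documentation.
MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled",
}


def describe_mtu_probing(raw):
    """Turn the raw sysctl text (for example "0\\n") into a description."""
    value = int(raw.strip())
    return MODES.get(value, f"unrecognised value {value}")


def read_mtu_probing(path="/proc/sys/net/ipv4/tcp_mtu_probing"):
    """Return the raw setting, or None when not on Linux or not readable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


raw = read_mtu_probing()
print(describe_mtu_probing(raw) if raw is not None else "setting not available")
```

Values 1 and 2 correspond to the two options described above.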

Another thing you may be interested in is the amount of time the MTU setting is kept in the cache.  You can look at this in Debian Linux with:

sudo sysctl net | grep "mtu_expires" 

You will see 600 seconds - which translates to 10 minutes:

[Screenshot: example sysctl output]

We hope this helps and welcome questions or comments!

 

 
