Troubleshooting Shoretel Switch Communication across a Site to Site VPN with Cisco ASAs – Fragmentation and MTU size

Scenario:  Shoretel IP Phone system deployed across multiple sites.  One remote site connects into HQ with a Site-to-Site VPN over the internet using Cisco ASAs as the termination points.  The HQ site can call the remote site using extension dialing.  Calls from the remote site to HQ fail and hit the failover DID of the user or receive the ‘Extension cannot be reached at this time’ backup auto attendant message.  THIS IS NOT A ONE WAY AUDIO ISSUE.  Audio works in both directions if the call sets up and is established.  This scenario is a failure of the call to set up even though IP connectivity *appears* to be there.

Topology

[Figure: Shoretel topology scenario]

Running basic ICMP and LSP_pings proves that full connectivity exists between the two switches (10.10.10.11 and 10.10.20.11).  Since these work, I need to look deeper at the packet flow to get a sense of what is going on. I did this in two ways: using etherMon on the Shoretel switches and ‘capture’ filters on the ASAs.

Running etherMon from a Shoretel Switch.

etherMon allows you to do a full packet capture on Shoretel switches.  This blog post details it very cleanly.  http://netdungeon.com/packet-capture-shoretel-shoregear-switches

In the .pcap files from running etherMon while making test calls, I could see 1514-byte frames (a full 1500-byte IP packet plus the 14-byte Ethernet header) leaving the Shoretel switch in the remote site, but never arriving in the HQ site.  I could see all the other UDP traffic hitting the HQ side.  I don’t have a cleaned up screenshot of these .pcaps.

Looking back to the network.

This pointed me away from the Shoretel configuration and toward the ASAs, to verify the MTU and fragmentation settings.  Per Shoretel KB14603:

ShoreTel versions 7.5 and below had a maximum transmission unit of 1380 bytes.  Versions 8 and above increased the size of the MTU to 1472 for ShoreSIP messages.  Larger MTU sizes were needed to accommodate new features in the ShoreTel switches. One important consideration when deploying an IPsec VPN is how to deal with maximum transmission unit (MTU) and fragmentation issues caused by large IPsec packets. IPsec adds considerable overhead to the IP packets that it encapsulates which can cause fragmentation and dropped packets in the network.  Most IPSEC VPN devices can be set to allow fragmentation and reassembly in the network.

 I confirmed the MTU on both firewalls was set to 1500.

REMOTE-ASA# show run mtu
mtu inside 1500
mtu outside 1500
HQ-ASA# show run mtu
mtu inside 1500
mtu outside 1500

So, if Shoretel needs 1472 and IPSEC has a ~46 byte overhead, that means I need to be able to pass 1472+46 = 1518 bytes.  I can’t fit 1518 bytes into a 1500-byte packet, so I’ll have to fragment the packets.
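The arithmetic above can be sketched in Python. One assumption is labeled loudly: the KB's 1472 figure is treated here as the UDP payload size, which would match the 1514-byte frames seen in the etherMon capture (1472 + 8 UDP + 20 IP + 14 Ethernet). Under that reading the packet is already 1500 bytes before IPsec is added, so the total is even larger than 1518, but the conclusion is the same either way. The ~46 byte overhead is this post's own estimate; real ESP overhead varies with cipher and mode.

```python
# Rough size check, not a measurement. All constants except the header
# sizes come from the post; the "1472 is UDP payload" reading is an assumption.
ETHERNET_HEADER = 14   # bytes, no FCS (as a capture tool typically shows it)
IPV4_HEADER = 20       # bytes, no IP options
UDP_HEADER = 8         # bytes
SHORESIP_PAYLOAD = 1472  # from the KB; ASSUMED to mean UDP payload
IPSEC_OVERHEAD = 46      # the post's approximate figure
LINK_MTU = 1500          # 'show run mtu' on both ASAs


def ip_packet_size(payload: int) -> int:
    """Size of the IPv4 packet carrying a UDP payload of this length."""
    return payload + UDP_HEADER + IPV4_HEADER


def frame_size(payload: int) -> int:
    """Size of the Ethernet frame as it would appear in a capture."""
    return ip_packet_size(payload) + ETHERNET_HEADER


def needs_fragmentation(ip_size: int, overhead: int, mtu: int) -> bool:
    """True when the packet plus tunnel overhead no longer fits the MTU."""
    return ip_size + overhead > mtu


inner = ip_packet_size(SHORESIP_PAYLOAD)
print("inner IP packet:", inner)
print("frame on the wire:", frame_size(SHORESIP_PAYLOAD))
print("after IPsec:", inner + IPSEC_OVERHEAD)
print("must fragment:", needs_fragmentation(inner, IPSEC_OVERHEAD, LINK_MTU))
```

Even with the more conservative reading in the text (1472 + 46 = 1518), `needs_fragmentation(1472, 46, 1500)` is still true.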

Per the KB, Shoretel creates packets with the ‘Do Not Fragment’ (DF) bit set.  We can override this in the firewalls.

crypto ipsec df-bit clear-df inside
crypto ipsec df-bit clear-df outside

This should allow the Shoretel packets to be fragmented.  Let’s set up some packet captures in the ASAs and make some test calls!

Capturing Packets on the ASAs

First we need to define some interesting traffic.  I want to see all TCP/UDP packets into and out of the SG switch at 10.10.20.11 heading to the HQ location.  I add this ACL to both firewalls.

access-list calls extended permit tcp 10.10.10.0 255.255.255.0 host 10.10.20.11
access-list calls extended permit udp 10.10.10.0 255.255.255.0 host 10.10.20.11
access-list calls extended permit tcp host 10.10.20.11 10.10.10.0 255.255.255.0
access-list calls extended permit udp host 10.10.20.11 10.10.10.0 255.255.255.0

Now we build the packet capture on that ACL on both firewalls.

 capture capture_call access-list calls interface inside

Then make the test call from REMOTE to HQ. Review the results on each firewall.

show capture capture_call detail

I got the same results here. I could see the traffic leaving the REMOTE site, but never arriving on the HQ site.  I needed better captures and ACLs.  I need to see something more about WHY these packets are being dropped and WHERE.

no capture capture_call
no access-list calls extended permit tcp 10.10.10.0 255.255.255.0 host 10.10.20.11
no access-list calls extended permit udp 10.10.10.0 255.255.255.0 host 10.10.20.11
no access-list calls extended permit tcp host 10.10.20.11 10.10.10.0 255.255.255.0
no access-list calls extended permit udp host 10.10.20.11 10.10.10.0 255.255.255.0

You can use ASP to get some more detail on what’s going on with packets traversing your ASA.  More here (http://www.fir3net.com/Cisco-ASA/what-is-asp-and-how-to-troubleshoot-asp-drops-on-an-asa.html)

capture DROPS type asp-drop all
show cap DROPS
!Remove the capture when finished.
no cap DROPS

This gave a lot of information, but nothing that I could see was particularly helpful to this situation. I really needed to get a capture on the inside and outside interface at the same time, even if the outside interface is encrypted via IPSEC.

!This one works fine for the calls on each firewall LAN side.
access-list calls extended permit tcp 10.10.10.0 255.255.255.0 host 10.10.20.11
access-list calls extended permit udp 10.10.10.0 255.255.255.0 host 10.10.20.11
access-list calls extended permit tcp host 10.10.20.11 10.10.10.0 255.255.255.0
access-list calls extended permit udp host 10.10.20.11 10.10.10.0 255.255.255.0
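!This one should get all traffic from each firewall to the other on the outside interface.
!The 'permit ip' entries already cover everything, including ESP (IP protocol 50), so the tcp/udp entries are redundant but harmless.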
access-list peer extended permit tcp host 2.2.2.2 host 1.1.1.1
access-list peer extended permit tcp host 1.1.1.1 host 2.2.2.2
access-list peer extended permit udp host 2.2.2.2 host 1.1.1.1
access-list peer extended permit udp host 1.1.1.1 host 2.2.2.2
access-list peer extended permit ip host 2.2.2.2 host 1.1.1.1
access-list peer extended permit ip host 1.1.1.1 host 2.2.2.2
!Then we build the inside and outside captures. (structure found here (http://isamology.blogspot.com/2009/12/asapix-packet-capture-feature.html))
capture pi interface inside access-list calls circular-buffer buffer 10000000
capture po interface outside access-list peer circular-buffer buffer 10000000

From here, we should be good to make a test call and check the caps.

show capture pi
show capture po

This is it! I can see the large packets leaving the REMOTE-INSIDE, REMOTE-OUTSIDE, hitting the HQ-OUTSIDE and THEN JUST DISAPPEARING!

At that point, I knew it wasn’t a carrier issue and was something in the firewall: either a bug or a configuration issue.  I pored through line after line of configuration on the ASA until I hit a few lines I didn’t recognize and didn’t add in myself.

fragment chain 1 outside

Huh, that sounds interesting.  Fragment.  Quick search yields this. ASA Packet Captures

By default, the ASA allows up to 24 fragments per IP packet, and up to 200 fragments awaiting reassembly. . .To set disallow fragments, enter the following command:
hostname(config)# fragment chain 1 [interface_name]

Sounds EXACTLY like the problem we’re seeing, right?  I ran a quick ‘no fragment chain 1 outside’ and now calls work just fine in both directions!
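To see why that one line matters, here is a minimal sketch. It assumes standard IPv4 fragmentation (20-byte header, fragment data carried in multiples of 8 bytes) and models the ASA as simply dropping any packet that arrives in more fragments than the configured chain limit. The 1546-byte figure is a hypothetical post-encryption size (1500 plus the ~46 byte overhead estimated earlier), not a measured value.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options


def fragment_count(packet_size: int, mtu: int = 1500) -> int:
    """Number of IPv4 fragments needed to carry packet_size over this MTU."""
    if packet_size <= mtu:
        return 1  # fits whole, no fragmentation
    data = packet_size - IPV4_HEADER
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(data / per_fragment)


def accepted(packet_size: int, chain_limit: int, mtu: int = 1500) -> bool:
    """Simplified model: accept only if the fragment count is within the limit."""
    return fragment_count(packet_size, mtu) <= chain_limit


oversized = 1546  # hypothetical encrypted packet size
print(fragment_count(oversized))            # fragments needed
print(accepted(oversized, chain_limit=1))   # 'fragment chain 1'
print(accepted(oversized, chain_limit=24))  # the documented default
```

With a limit of 1, only whole packets get through, so the small signaling and ping packets pass while the large call-setup packets vanish, which matches what the captures showed.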

This was a tough one, with a ton of packet tracing on numerous devices to confirm settings and what was actually traversing the networks.  I’ll know to look for this first in the future.  You can check quickly now by doing a ‘show run | inc fragment’.  Good luck!
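If you keep saved copies of your running configs, the same check can be scripted. This is a rough sketch, not a Cisco tool: `find_fragment_blockers` is a hypothetical helper name, and the sample text is pieced together from the config lines shown in this post. It assumes that a `fragment chain` line with no interface name applies to all interfaces.

```python
import re

# Matches lines such as 'fragment chain 1 outside' in a saved running-config.
FRAGMENT_CHAIN = re.compile(r"^\s*fragment\s+chain\s+(\d+)(?:\s+(\S+))?\s*$")


def find_fragment_blockers(config_text: str) -> list[str]:
    """Return the interfaces whose fragment chain limit is set to 1."""
    blockers = []
    for line in config_text.splitlines():
        match = FRAGMENT_CHAIN.match(line)
        if match and int(match.group(1)) == 1:
            # No interface name is ASSUMED to mean the line applies globally.
            blockers.append(match.group(2) or "all interfaces")
    return blockers


sample = """\
mtu inside 1500
mtu outside 1500
fragment chain 1 outside
crypto ipsec df-bit clear-df outside
"""

print(find_fragment_blockers(sample))  # ['outside']
```

Lines that start with `no fragment chain` are not matched, so a config where the restriction has already been removed comes back empty.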

TL;DR: The Cisco ASA has a command called ‘fragment chain 1 outside’.  If you see this, any fragmented packets arriving on that interface will be discarded.  Remove it with ‘no fragment chain 1 outside’.