After getting NSX running in a nested environment, I started last week to build an NSX environment that spans my physical and nested ESXi hosts. To be honest, this was more complicated than I had expected. Still, it was a good way to improve my NSX troubleshooting skills, and maybe the key findings will help others avoid the problems I ran into.
At the logical level my goal was pretty straightforward. I have 3 physical (vSAN) ESXi hosts running n nested ESXi hosts. All of them are managed by a single vCenter and should be part of a single transport zone in which n VXLANs (an incredible number of them) will be deployed.
When I got to the physical implementation of the logical design, it looked much like the following figure. The example shows 2 nested ESXi hosts running on my physical ESX01, while another nested ESXi host runs on ESX02. My transport VLAN 30 (for VXLAN) is configured on the physical switch and as a VLAN trunk on the distributed / NSX vSwitch of the physical hosts. That's where our VXLAN frames flow between the nested and physical hosts. Of course, the MTU was increased end-to-end across the whole environment.
In theory everything should work fine with this setup… but well… it didn't, and that's where the fun part began. L2 connectivity between VMs on my VXLANs was not working as expected. Sometimes my virtual machines on a specific VXLAN could reach each other, sometimes they couldn't, which is not what you expect from a robust protocol like VXLAN. So it was time to go through everything we learned at the (NSX) academy.
There are a lot of great resources to check / test / troubleshoot problems within the virtual / physical network, e.g. Roi Ben Haim’s great collection of useful L2 troubleshooting tools.
One thing that really confused me was that vmkping (ping from a specific VMkernel port, in this case the VTEP VMkernel port) worked fine with jumbo frames in every combination:
- Nested – nested (same ESXi)
- Nested – nested (different ESXi)
- Nested – physical
vmkping ++netstack=vxlan <vtepvmk IP> -d -s 8972
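The 8972 in the command above is not arbitrary. Here is a minimal sketch of the arithmetic, assuming a transport MTU of 9000 bytes and IPv4 headers without options (the function names are made up for illustration): the largest ping payload is the MTU minus the IP and ICMP headers, and the same kind of sum gives the smallest transport MTU that can carry an encapsulated guest frame.

```python
# Header sizes in bytes (IPv4 without options).
IPV4_HEADER = 20
ICMP_HEADER = 8
UDP_HEADER = 8
VXLAN_HEADER = 8
INNER_ETHERNET_HEADER = 14  # untagged inner frame
DOT1Q_TAG = 4               # optional 802.1Q tag on the inner frame


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def min_transport_mtu(guest_mtu: int, inner_vlan_tag: bool = False) -> int:
    """Smallest IP MTU on the transport network that carries a guest
    packet of guest_mtu bytes inside VXLAN without fragmentation."""
    inner_frame = guest_mtu + INNER_ETHERNET_HEADER
    if inner_vlan_tag:
        inner_frame += DOT1Q_TAG
    return inner_frame + VXLAN_HEADER + UDP_HEADER + IPV4_HEADER


print(max_ping_payload(9000))   # 8972, the value passed to vmkping -s
print(min_transport_mtu(1500))  # 1550
```

With -d (don't fragment) set, a successful ping with this payload only proves that the path between the two VTEP VMkernel ports carries full-size frames. It says nothing about what happens to encapsulated VXLAN traffic on the way.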
All relevant NSX tables (MAC, ARP, VTEP table) had valid data, and the increased log level of netcpa showed me that the relevant information had been exchanged between the ESXi hosts and the NSX Controller.
The problem had to be somewhere else in the network stack. Step by step I worked out in which combinations VXLAN connectivity worked and in which it did not. I reduced them to the following 2 scenarios:
1. Scenario: 2 nested ESXi on a single physical ESXi
In this scenario every VXLAN worked across every ESXi host (nested and physical): all of them could communicate with each other.
2. Scenario: 3 nested ESXi on two physical ESXi
This is where it got complicated (and, to be honest, it is the scenario that really needs to work). As soon as a VXLAN frame had to leave the physical host, connectivity broke.
After creating some test VMs in VXLANs on each type of host, I checked for dropped packets in esxtop: nothing. Luckily, vSphere ships with another network troubleshooting tool that I hadn't used for quite a while: pktcap-uw. This little tool lets us monitor ESXi traffic at very specific points within the network stack (and store it as a pcap file for analysis in e.g. Wireshark). It can also capture dropped packets via
pktcap-uw --capture drop
So in my environment I looked for dropped packets in my transport VLAN 30 with
pktcap-uw --capture drop --vmk vmk1 drop --ng --vlan 30
The output was interesting:
“… Captured at Drop point, Drop Reason ‘VXLAN Module Drop’. Drop Function ‘OverlayWrapperUplinkOutputCB’ …”
So it seems that frames from the nested/virtual world were being dropped on the NSX vSwitch of the physical ESXi host.
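When a capture produces many such lines, it helps to tally them by drop reason and drop function. The following is only a sketch: it assumes every dropped frame yields a line shaped like the fragment quoted above, which is all I have of the format, and the regular expression and the count_drops helper are my own invention, not part of pktcap-uw.

```python
import re
from collections import Counter

# Matches the quoted fragment; accepts straight and typographic quotes.
DROP_LINE = re.compile(
    r"Drop Reason\s+['‘](?P<reason>[^'’]+)['’]\.?\s*"
    r"Drop Function\s+['‘](?P<function>[^'’]+)['’]"
)


def count_drops(lines):
    """Count (reason, function) pairs over an iterable of output lines."""
    counts = Counter()
    for line in lines:
        match = DROP_LINE.search(line)
        if match:
            counts[(match["reason"], match["function"])] += 1
    return counts


sample = [
    "Captured at Drop point, Drop Reason 'VXLAN Module Drop'. "
    "Drop Function 'OverlayWrapperUplinkOutputCB'",
]
for (reason, function), n in count_drops(sample).most_common():
    print(f"{n:5d}  {reason}  ({function})")
```

If a single reason dominates the tally, as 'VXLAN Module Drop' did in my case, that points at one place in the stack rather than at a general connectivity problem.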
It seems that ESXi is blocking frames that carry UDP datagrams with the port used for VXLAN (8472 in VMware's implementation; 4789 is the port assigned in RFC 7348). I am still not sure what the exact reason is. If I get more feedback, I will of course add it here.
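To check in a saved capture whether a UDP datagram really is VXLAN and which segment it belongs to, the header layout from RFC 7348 is enough. A minimal sketch, assuming you already have the UDP destination port and payload bytes from your pcap tooling; the helper names are hypothetical, and 8472 is simply the port mentioned above for VMware's implementation.

```python
import struct

IANA_VXLAN_PORT = 4789    # port assigned in RFC 7348
LEGACY_VXLAN_PORT = 8472  # port reported above for VMware's implementation


def is_vxlan_port(udp_dst_port, ports=(IANA_VXLAN_PORT, LEGACY_VXLAN_PORT)):
    """True if the UDP destination port is one of the known VXLAN ports."""
    return udp_dst_port in ports


def parse_vni(udp_payload: bytes) -> int:
    """Return the 24-bit VNI from the 8-byte VXLAN header (RFC 7348)."""
    if len(udp_payload) < 8:
        raise ValueError("VXLAN header needs 8 bytes")
    # Layout: 1 byte flags, 3 reserved bytes, 3 bytes VNI, 1 reserved byte.
    flags, vni_and_reserved = struct.unpack("!B3xI", udp_payload[:8])
    if not flags & 0x08:
        raise ValueError("I flag not set, header carries no valid VNI")
    return vni_and_reserved >> 8


# Hand-built header for VNI 5001 (an arbitrary example segment ID).
header = bytes([0x08, 0, 0, 0]) + (5001 << 8).to_bytes(4, "big")
print(is_vxlan_port(8472), parse_vni(header))
```

Comparing the VNI of dropped frames with the segment IDs of the affected logical switches is a quick way to confirm that the drops hit the traffic you care about.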
The only workaround I found is to separate the NSX vSwitch (including the VXLAN port groups and the VTEP VMkernel port) on my physical ESXi hosts from a second virtual switch that connects the nested ESXi hosts to the transport VLAN. Others have made similar observations, and I should have found and read this article a little earlier: Dmitri Kalintsev came to a similar conclusion.
So if you want to integrate your physical ESXi cluster with your nested ones, keep this dropping behaviour in mind. In Intel NUC setups with only a single network adapter in particular, a different workaround would be needed (please comment with any suggestions or workarounds to avoid the frame drops).
So… enjoy your NSX environment at home, and bring the knowledge you gain into your organization so it can benefit from this really nice piece of technology.