NSX Troubleshooting tips

ESXi Host Level Troubleshooting

1. How to verify that VIBs are successfully installed on the ESXi host:

Verify that the NSX VIBs are installed on the ESXi host and are the correct version:

esxcli software vib list

(Displays all VIBs installed on the host; grep the output for the vxlan and vsip VIBs.)

esxcli software vib get --vibname esx-vxlan

esxcli software vib get --vibname esx-vsip

Verify that the VXLAN kernel module vdl2 is loaded on the ESXi host:

vmkload_mod -l | grep vdl2

Find the VDS name associated with this host’s VTEP:

esxcli network vswitch dvs vmware vxlan list

If any of these commands does not return the expected output, this indicates a problem and the logs should be checked.
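
The VIB check above can be wrapped in a small helper. This is only a minimal sketch: it assumes the VIB name is the first column of the `esxcli software vib list` output, and the function name `missing_vibs` is hypothetical. It reads the listing from standard input, so it can be tried against saved output as well as on a live host.

```shell
# missing_vibs NAME...: print each required VIB name that is absent from the
# "esxcli software vib list" output supplied on standard input.
# Assumption: the VIB name is the first whitespace-separated column.
missing_vibs() {
    listing=$(cat)
    for vib in "$@"; do
        # Match the name only as a whole first column, not as a substring.
        if ! printf '%s\n' "$listing" |
            awk -v v="$vib" '$1 == v { found = 1 } END { exit !found }'; then
            printf '%s\n' "$vib"
        fi
    done
}

# On an ESXi host:
#   esxcli software vib list | missing_vibs esx-vxlan esx-vsip
```

No output means every VIB named on the command line was found in the listing.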

Relevant logs to be checked are:

/var/log/esxupdate.log

/var/log/vmkernel.log

Syslog collectors such as Log Insight can be configured to raise alerts when certain messages are detected in the logs.

2. How to verify that the control plane is up between the host and the controller for each logical switch:

Verify logical network information and the control-plane connection for each logical switch:

esxcli network vswitch dvs vmware vxlan network list --vds-name <VDS_Name>

Verify the message bus TCP connection (vsfwd):

esxcli network ip connection list | grep 5671

Verify the controller TCP connection (netcpad):

esxcli network ip connection list | grep 1234

Verify the controller connection agent on the host (the script accepts status, start, stop, or restart):

/etc/init.d/netcpad status

Verify that the firewall process is running on the host (the script accepts the same arguments):

/etc/init.d/vShield-Stateful-Firewall status
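
The two connection checks on ports 5671 and 1234 can be made stricter than a plain grep, which also matches the port number when it appears as a local port or inside another field. The sketch below rests on an assumption: in the `esxcli network ip connection list` output, the foreign address is the fifth column and the state is the sixth. The function name `port_established` is hypothetical.

```shell
# port_established PORT: succeed if the "esxcli network ip connection list"
# output on standard input contains a TCP connection to remote PORT in the
# ESTABLISHED state.
# Assumption: foreign address is column 5 and connection state is column 6.
port_established() {
    awk -v port="$1" '
        $1 == "tcp" && $5 ~ (":" port "$") && $6 == "ESTABLISHED" { ok = 1 }
        END { exit !ok }
    '
}

# On an ESXi host:
#   esxcli network ip connection list | port_established 5671 ||
#       echo "no established message bus connection (vsfwd)"
#   esxcli network ip connection list | port_established 1234 ||
#       echo "no established controller connection (netcpad)"
```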

If VMs attached to a logical switch are running on this host, the output of the vxlan network list command should show controller connections: one for each logical switch that has an attached VM running on this host.

Check whether each controller connection shows “up” or “down”. A connection that is down warrants further debugging: check the logs on the host and, if needed, log in to the controllers.
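
To pick out the logical switches whose controller connection is down, the listing can be filtered. This is a rough sketch under two assumptions about the `esxcli network vswitch dvs vmware vxlan network list` output: the VXLAN ID is the first column of each row, and the connection state appears literally as "(up)" or "(down)". The function name `down_vnis` is hypothetical.

```shell
# down_vnis: read the "vxlan network list" output on standard input and
# print the first column (assumed to be the VXLAN ID) of every row whose
# controller connection is reported as "(down)".
down_vnis() {
    awk '/\(down\)/ { print $1 }'
}

# On an ESXi host:
#   esxcli network vswitch dvs vmware vxlan network list --vds-name <VDS_Name> | down_vnis
```

Empty output means no row in the listing was marked down.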

Relevant logs to be checked are the netcpa and vsfwd communication channel logs:

/var/log/netcpad.log

/var/log/vsfwd.log
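
A quick first pass over these logs might look like the sketch below. The function name `scan_logs` is hypothetical, and the search pattern in the usage line is an assumption rather than a list of messages these daemons are known to emit; adjust it to the messages seen in your environment.

```shell
# scan_logs PATTERN FILE...: for each readable FILE, print a header line and
# the last 20 lines matching PATTERN (case-insensitive extended regex).
# Unreadable or missing files are skipped silently.
scan_logs() {
    pattern=$1
    shift
    for f in "$@"; do
        [ -r "$f" ] || continue
        echo "== $f"
        grep -iE "$pattern" "$f" | tail -n 20
    done
}

# On an ESXi host (the pattern is only an illustrative assumption):
#   scan_logs 'error|fail|disconnect' /var/log/netcpad.log /var/log/vsfwd.log
```

The same helper can be pointed at /var/log/esxupdate.log and /var/log/vmkernel.log when checking VIB installation problems.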
