UDLD and its functionality
Priyanka Shyam
CCDE (Written) | CWNA | Cisco SCOR | CISCO SD-WAN | Technical Writer | Influencer | Multitasker | Considerate | Empathic | Excellent Communicator | Helpful
Hello Everyone,
I hope everyone is safe and healthy.
Today I want to discuss UDLD, a basic protocol that most of us know. Last week I worked on a UDLD-related issue and found myself confused, so I read up on it and reproduced the behavior in a lab to clear things up. :) I thought I would share what I learned with you all.
Let's first understand what UDLD is and why we use it.
If you have ever used fiber cables, you might have noticed that there are separate connectors for transmitting and receiving traffic. These are two physical strands: one transmits data and the other receives it. The fiber is usually plugged into an SFP, and the SFP is inserted into the switch.
If one of the strands (transmit or receive) fails, we have a unidirectional link failure, and this can cause spanning-tree loops. There are two mechanisms that can take care of this problem:
· LoopGuard
· UDLD ( Unidirectional Link Detection )
Loop guard and UDLD are two ways to keep a faulty fiber link from causing loops in the network. In short, loop guard is a spanning-tree optimization, while UDLD is a layer 1/2 protocol (unrelated to spanning-tree) that keeps a one-way link from misleading upper layer protocols into creating a loop. We will only discuss UDLD in this article.
Unidirectional Link Detection (UDLD) is a Cisco proprietary layer 2 protocol used to determine the physical status of a link. The purpose of Unidirectional Link Detection (UDLD) is to detect and deter issues that arise from Unidirectional Links. UDLD helps to prevent forwarding loops and blackholing of traffic by identifying and acting on logical one-way links that would otherwise go undetected.
There are two different modes of UDLD:
Normal mode
· Port state is marked as undetermined
· Port behaves according to its STP state
Aggressive mode
· UDLD attempts to re-establish the state of the port
· Port is put into the errdisable state if UDLD is unable to re-establish the port state
· Port is actually disabled
Let's understand this through a diagram:
In my topology, the interface Gi0/1 is a fiber interface where I am using fiber cable having the TX/RX connector. In my diagram, it’s shown as Gi0/1 on Sw2 and Sw3.
So when the fiber to Sw2's Rx port fails and UDLD is in aggressive mode, the port is put into the error-disabled state. The way UDLD works out that there is a unidirectional link (i.e. just one strand of the fiber is broken) is pretty cool.
Each switch sends out periodic Ethernet multicast UDLD hellos destined to 0100.0ccc.cccc, listing its own device ID, port ID, time-out value, and a number of other parameters (I will show these with an example below). When a switch receives this UDLD frame, it does two things: it caches the information from the neighbor, and it echoes the device ID and port ID it just received in its own UDLD hello back towards the originating switch. When the originating switch sees a UDLD frame come in carrying its own device ID and port ID, it knows a UDLD neighbor exists out of that interface. These multicast hellos are used to build and maintain the neighbor relationship, and they must be received before the time-out interval expires in order to keep the neighbor alive from a UDLD perspective.
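The echo exchange described above can be sketched as a small Python simulation. This is a simplified model, not the real UDLD packet format or state machine: the names `Hello`, `UdldPort`, `echoes`, and `exchange` are hypothetical, and timers are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hello:
    """Simplified UDLD hello: who sent it, plus the neighbors it has heard."""
    device_id: str
    port_id: str
    echoes: frozenset  # (device_id, port_id) pairs cached by the sender


@dataclass
class UdldPort:
    device_id: str
    port_id: str
    neighbors: set = field(default_factory=set)  # cache of heard neighbors
    state: str = "Unknown"

    def build_hello(self) -> Hello:
        # Advertise our own IDs and echo back every neighbor we have cached.
        return Hello(self.device_id, self.port_id, frozenset(self.neighbors))

    def receive(self, hello: Hello) -> None:
        # Step 1: cache the sender's identity.
        self.neighbors.add((hello.device_id, hello.port_id))
        # Step 2: if the sender echoed our own IDs, it can hear us too.
        if (self.device_id, self.port_id) in hello.echoes:
            self.state = "Bidirectional"


def exchange(a: UdldPort, b: UdldPort, a_to_b_ok: bool = True,
             b_to_a_ok: bool = True, rounds: int = 2) -> None:
    """Run a few hello rounds; a broken strand simply drops that direction."""
    for _ in range(rounds):
        if a_to_b_ok:
            b.receive(a.build_hello())
        if b_to_a_ok:
            a.receive(b.build_hello())
```

With both directions working, two rounds are enough for both ports to see their own IDs echoed back. If one direction is dropped (for example `b_to_a_ok=False`), neither port ever sees its own IDs in an incoming hello, so neither reaches the bidirectional state in this model.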
So in my topology, when the fiber is broken on the Sw3 Tx port, UDLD on Sw2 notices that it is no longer seeing UDLD frames coming back in on the Gi0/1 interface (frames that would normally list Sw3's device ID and port ID). When the UDLD time-out period expires, the switch transmits 8 UDLD frames, one per second, and if no reply is received the port goes into err-disabled. This is the default action of aggressive mode. In normal mode, the port just goes into the undetermined state, which is designed for "informational purposes". In practice that does little to stop a loop, so use aggressive mode to prevent loops.
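The timeout handling just described can be sketched in Python as follows. This is only an illustration of the decision logic, not Cisco's implementation: `on_neighbor_timeout` and the `send_probe` callback are hypothetical names, and the one-second pacing between frames is not modeled.

```python
from typing import Callable


def on_neighbor_timeout(mode: str, send_probe: Callable[[], bool],
                        probes: int = 8) -> str:
    """Return the resulting port state once the neighbor's UDLD info has aged out.

    send_probe is a hypothetical callback: it sends one UDLD frame and
    returns True if a reply came back.
    """
    if mode == "normal":
        # Normal mode only records the uncertainty; STP keeps control of the port.
        return "undetermined"
    if mode != "aggressive":
        raise ValueError(f"unknown UDLD mode: {mode!r}")
    # Aggressive mode: try to re-establish the neighbor before giving up.
    for _ in range(probes):
        if send_probe():
            return "bidirectional"
    return "err-disabled"
```

For example, `on_neighbor_timeout("aggressive", lambda: False)` models a neighbor that never answers and ends in the err-disabled state, while the same call with `"normal"` leaves the port undetermined.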
UDLD frames carry several parameters, such as the sender's own device ID, port ID, and time-out value.
I am using the topology below to show the different UDLD parameters.
I have enabled normal-mode UDLD in global configuration mode on all of the switches above.
F241.04.16-3850-1#conf t
F241.04.16-3850-1(config)#udld enable
If you want to use UDLD aggressive mode, use the command below:
F241.04.16-3850-1#conf t
F241.04.16-3850-1(config)#udld aggressive
Now I will run show udld neighbors on all the switches. In the output below, switch 1 shows its neighboring port IDs, which are simply the ports on switch 2: Te1/0/1 and Te1/0/2.
Switch 1
F241.04.16-3850-1#show udld neighbors
Port      Device Name    Device ID    Port ID    Neighbor State
----      -----------    ---------    -------    --------------
Te1/0/1   D4ADBDB49D80   1            Te1/0/1    Bidirectional
Te1/0/2   D4ADBDB49D80   1            Te1/0/2    Bidirectional
On switch 2 we can see the neighboring port IDs, Te1/0/1 and Te1/0/2, for both switch 1 and switch 3.
Switch 2
F241.04.16-3850-2#show udld neighbors
Port      Device Name    Device ID    Port ID    Neighbor State
----      -----------    ---------    -------    --------------
Te1/0/1   C064E43A1880   1            Te1/0/1    Bidirectional
Te1/0/2   C064E43A1880   1            Te1/0/2    Bidirectional
Te1/0/3   C064E41FD60    1            Te1/0/1    Bidirectional
Te1/0/4   C064E41FD60    1            Te1/0/2    Bidirectional
Switch 3
F241.04.16-3850-3#show udld neighbors
Port      Device Name    Device ID    Port ID    Neighbor State
----      -----------    ---------    -------    --------------
Te1/0/1   D4ADBDB49D80   1            Te1/0/3    Bidirectional
Te1/0/2   D4ADBDB49D80   1            Te1/0/4    Bidirectional
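If you collect this output from many switches, a small script can flag any port whose neighbor is not bidirectional. The Python sketch below is a rough example that assumes the five-column layout shown above and single-word neighbor states; `parse_udld_neighbors` and `not_bidirectional` are hypothetical helper names, and other platforms or software releases may format the table differently.

```python
SAMPLE = """\
F241.04.16-3850-3#show udld neighbors
Port      Device Name    Device ID    Port ID    Neighbor State
----      -----------    ---------    -------    --------------
Te1/0/1   D4ADBDB49D80   1            Te1/0/3    Bidirectional
Te1/0/2   D4ADBDB49D80   1            Te1/0/4    Bidirectional
"""


def parse_udld_neighbors(text: str) -> list:
    """Return one dict per neighbor row; prompt, header, and rule lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly five whitespace-separated fields and start
        # with an interface name (assumed here to contain a "/").
        if len(parts) != 5 or "/" not in parts[0]:
            continue
        port, device_name, device_id, peer_port, state = parts
        rows.append({
            "port": port,
            "device_name": device_name,
            "device_id": device_id,
            "peer_port": peer_port,
            "state": state,
        })
    return rows


def not_bidirectional(rows: list) -> list:
    """Local ports whose UDLD neighbor state is anything other than Bidirectional."""
    return [row["port"] for row in rows if row["state"] != "Bidirectional"]
```

Calling `not_bidirectional(parse_udld_neighbors(SAMPLE))` on the switch 3 output returns an empty list, since both neighbors are bidirectional.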
We can also see details such as the device ID, port ID, time-out value, and a number of other parameters using the command below.
F241.04.16-3850-3#show udld tenGigabitEthernet 1/0/1
Interface Te1/0/1
---
Port enable administrative configuration setting: Follows device default
Port enable operational state: Enabled
Current bidirectional state: Bidirectional
Current operational state: Advertisement - Single neighbor detected
Message interval: 15000 ms
Time out interval: 5000 ms
Port fast-hello configuration setting: Disabled
Port fast-hello interval: 0 ms
Port fast-hello operational state: Disabled
Neighbor fast-hello configuration setting: Disabled
Neighbor fast-hello interval: Unknown
    Entry 1
    ---
    Expiration time: 36400 ms
    Cache Device index: 1
    Current neighbor state: Bidirectional
    Device ID: D4ADBDB49D80
    Port ID: Te1/0/3
    Neighbor echo 1 device: C064E41FD60
    Neighbor echo 1 port: Te1/0/1
    TLV Message interval: 15 sec
    No TLV fast-hello interval
    TLV Time out interval: 5
    TLV CDP Device name: F241.04.16-3850-2.cisco.com
Now for a point that many people are not aware of: there are two methods of detecting a unidirectional link, explicit and implicit. Explicit detection always results in the port being err-disabled, regardless of the mode of operation (normal or aggressive). The difference between normal and aggressive mode lies only in how an implicit unidirectional link event is handled.
Everything below applies regardless of the mode of operation and corresponds to the explicit method of detection.
If the port does not see its own device/port ID in the incoming UDLD packets for a specific duration of time, the link is considered unidirectional.
This echo-algorithm allows detection of these issues:
· Link is up on both sides, however, packets are only received by one side.
· Wiring mistakes when receive and transmit fibers are not connected to the same port on the remote side.
Once the unidirectional link is detected by UDLD, the respective port is disabled and this message is printed on the console:
UDLD-3-DISABLE: Unidirectional link detected on port 1/2. Port disabled
A port shut down by UDLD remains disabled until it is manually re-enabled, or until the errdisable timeout expires (if configured).
As discussed above UDLD can operate in two modes: normal and aggressive.
In normal mode, if the link state of the port was determined to be bi-directional and the UDLD information times out, no action is taken by UDLD. The port state for UDLD is marked as undetermined. The port behaves according to its STP state.
In aggressive mode, if the link state of the port is determined to be bi-directional and the UDLD information times out while the link on the port is still up, UDLD tries to re-establish the state of the port. If not successful, the port is put into the errdisable state.
This distinction between the modes applies only to the case where there is a timeout and no UDLD information is received. That is implicit detection, and it is where the mode of operation matters.
Aging of UDLD information happens when the port that runs UDLD does not receive UDLD packets from the neighbor port for the duration of hold time. The hold time for the port is dictated by the remote port and depends on the message interval at the remote side. The shorter the message interval, the shorter the hold time and the faster the detection. Recent implementations of UDLD allow configuration of message interval.
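The relationship between the remote message interval and the local hold time can be sketched in Python as below. The multiplier of 3 is an assumption used for illustration (it is not stated in this article), so check the documentation for your platform before relying on it; `hold_time_s` is a hypothetical helper name.

```python
def hold_time_s(remote_message_interval_s: float, multiplier: float = 3.0) -> float:
    """Hold time the local port applies to a neighbor's UDLD information.

    The value is driven by the REMOTE side's message interval.
    multiplier=3.0 is an illustrative assumption, not a documented constant here.
    """
    if remote_message_interval_s <= 0:
        raise ValueError("message interval must be positive")
    return remote_message_interval_s * multiplier
```

Under that assumption, a remote message interval of 15 seconds gives a 45-second hold time, while lowering the interval to 7 seconds gives 21 seconds, which is why a shorter message interval means faster detection.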
UDLD Error Conditions
A UDLD error condition exists when the switch does not receive the expected information from its UDLD peer.
For this case, I will be using the topology below.
UDLD error conditions
· Empty-echo
· Transmit-Receive (Tx-Rx) Loop
· Uni-direction
· Neighbor mismatch
· Sudden cessation of UDLD frames
Empty Echo
This condition is present when Switch-A receives a UDLD frame from Switch-B without the expected echo of the Switch-A switch-ID and port-ID.
%ETHPORT-2-IF_DOWN_ERROR_DISABLED: Interface Ethernet1/2 is down (Error disabled. Reason: UDLD empty echo)
When an empty echo is detected, UDLD performs these actions:
· Normal mode: err-disable the port
· Aggressive mode: err-disable the port
Here are some possible causes for this condition
· The UDLD bi-directional relationship has timed out on Switch-B because it does not receive the UDLD frames from Switch-A.
· Switch-B received the UDLD frames from Switch-A but did not process them.
· Switch-A did not send the UDLD frames to Switch-B.
Tx-Rx Loop
This condition occurs when a UDLD frame is received on the same port from which it was transmitted.
When a Tx-Rx loop is detected, UDLD performs these actions:
· Normal mode: err-disable the port
· Aggressive mode: err-disable the port
These syslog messages are then generated:
%ETHPORT-2-IF_DOWN_ERROR_DISABLED: Interface Ethernet17/5 is down (Error disabled. Reason:UDLD Tx-Rx Loop)
Here are some possible causes for this condition:
· There might be incorrect wiring or a physical media issue.
· The intermediate devices reflect the frames back to the sending port.
Neighbor Mismatch
This condition is present when Port-A on Switch-A receives a frame from a port other than that with which it already formed a UDLD bi-directional relationship.
When a neighbor mismatch is detected, UDLD performs these actions:
· Normal mode: err-disable the port
· Aggressive mode: err-disable the port
These syslog messages are then generated:
%ETHPORT-2-IF_DOWN_ERROR_DISABLED: Interface Ethernet3/21 is down (Error disabled. Reason:UDLD Neighbor mismatch)
Here are some possible causes for this condition:
· The UDLD port in question is a member of a port-channel on which a member port has changed states.
· There is an intermediate device between the two ports that formed the bi-directional relationship.
Sudden Cessation of UDLD Frames
This condition is present when a port that has formed a bi-directional relationship does not receive a UDLD frame during the time-out interval (50 seconds by default).
When this condition is detected, UDLD performs these actions:
· Normal mode: UDLD marks the port as undetermined, and the port continues to function in accordance with its spanning-tree port state
· Aggressive mode: err-disable the port
Troubleshoot UDLD Error Conditions
If you encounter a port that UDLD has error-disabled, you can work through the steps below:
Since UDLD errors indicate physical layer faults, it is appropriate to troubleshoot at the physical layer. When UDLD error messages are encountered, consider these questions:
· Does the error persist if the Small Form-Factor Pluggable Transceiver (SFP) is replaced?
· Does the error persist if the cable is replaced?
· Does the error persist if the connection is moved to a different physical port on the switch?
Difference between UDLD and Loop guard
The key difference between UDLD and loop guard is that UDLD protects against mis-wiring of your fiber ports, or a physical wiring problem that would cause upper layer protocols like spanning-tree to break. Note, though, that UDLD is not a part of spanning-tree, nor does it play any part in a spanning-tree topology. It is merely there as a helper for spanning-tree, because spanning-tree cannot identify a layer 1 fault like this that would cause a loop in the network.
Loop guard, on the other hand, is a spanning-tree optimization whose function is to stop root or alternate ports from transitioning into the designated/forwarding state. A lot of the time loop guard kicks in when there is a physical layer problem, but it can also protect against spanning-tree misconfiguration or badly configured ACLs. For example, let's say someone accidentally went to the Gi0/1 interface on Sw2 and configured spanning-tree bpdufilter enable. The port would neither send nor receive BPDUs, and it would become designated and cause a loop. If loop guard had been pre-configured on the port, it would instead go into the loop-inconsistent state and be blocked. UDLD would be none the wiser, but loop guard would catch this problem.
The recommended best practice is to use both UDLD and loop guard together. It’s also recommended to make sure that you tune your UDLD timers to detect a layer 1 problem faster than spanning-tree can transition a port into a designated/forwarding state.
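To sanity-check that timer recommendation, you can compare your UDLD detection time against the time classic spanning-tree needs to move a blocked port to forwarding once it stops receiving BPDUs. The Python sketch below uses the classic 802.1D default timers (max age 20 s, forward delay 15 s); it does not apply to rapid spanning-tree, and the UDLD detection time is something you supply from your own measurement or platform documentation. The function names are hypothetical.

```python
def stp_blocking_to_forwarding_s(max_age_s: float = 20,
                                 forward_delay_s: float = 15) -> float:
    """Classic 802.1D: max age, then listening and learning (one forward delay each)."""
    return max_age_s + 2 * forward_delay_s


def udld_detects_first(udld_detection_s: float, max_age_s: float = 20,
                       forward_delay_s: float = 15) -> bool:
    """True if UDLD reacts before STP would put the port into forwarding."""
    return udld_detection_s < stp_blocking_to_forwarding_s(max_age_s, forward_delay_s)
```

With default timers the spanning-tree side works out to 50 seconds, so a UDLD detection time of 45 seconds is fast enough, while 60 seconds is not.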