Paramveer: Troubleshooting HSRP

Troubleshoot HSRP Case Studies

Case Study #1: HSRP Standby IP Address Is Reported as a Duplicate IP Address

These error messages can appear:


Oct 12 13:15:41: %STANDBY-3-DUPADDR: Duplicate address 10.25.0.1
  on Vlan25, sourced by 0000.0c07.ac19
Oct 13 16:25:41: %STANDBY-3-DUPADDR: Duplicate address 10.25.0.1
  on Vlan25, sourced by 0000.0c07.ac19
Oct 15 22:31:02: %STANDBY-3-DUPADDR: Duplicate address 10.25.0.1
  on Vlan25, sourced by 0000.0c07.ac19
Oct 15 22:41:01: %STANDBY-3-DUPADDR: Duplicate address 10.25.0.1
  on Vlan25, sourced by 0000.0c07.ac19

These error messages do not necessarily indicate an HSRP problem. Rather, the error messages indicate a possible Spanning Tree Protocol (STP) loop or router/switch configuration issue. The error messages are just symptoms of another problem.

In addition, these error messages do not prevent the proper operation of HSRP. The duplicate HSRP packet is ignored. These error messages are throttled at 30-second intervals. However, the network instability that causes the STANDBY-3-DUPADDR error messages for the HSRP address can also cause slow network performance and packet loss.

These error messages can appear:


Oct 15 22:41:01: %STANDBY-3-DUPADDR: Duplicate address 10.25.0.1
  on Vlan25, sourced by 0000.0c07.ac19

These messages specifically indicate that the router received a data packet that was sourced from the HSRP IP address on VLAN 25 with the MAC address 0000.0c07.ac19. Since the HSRP MAC address is 0000.0c07.ac19, either the router in question received its own packet back or both routers in the HSRP group went into the active state. Because the router received its own packet, the problem most likely is with the network rather than the router. A variety of problems can cause this behavior. Among the possible network problems that cause the error messages are:

Momentary STP loops
EtherChannel configuration issues
Duplicated frames
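
The reasoning above depends on recognizing 0000.0c07.ac19 as an HSRP virtual MAC address. HSRP version 1 virtual MAC addresses take the form 0000.0c07.acXX, where XX is the group number in hexadecimal. The following Python sketch decodes the group number from a MAC address in a log message. The function name is illustrative and is not part of any Cisco tool.

```python
HSRP_V1_PREFIX = "00000c07ac"  # first 10 hex digits of an HSRP version 1 virtual MAC


def hsrp_v1_group(mac):
    """Return the HSRP version 1 group number encoded in a virtual MAC, or None."""
    digits = mac.replace(".", "").replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or not digits.startswith(HSRP_V1_PREFIX):
        return None  # not an HSRP version 1 virtual MAC address
    return int(digits[-2:], 16)  # last byte is the group number


print(hsrp_v1_group("0000.0c07.ac19"))
```

For the address in the messages above, the last byte 0x19 decodes to group 25, which matches the Vlan25 interface in the log. An address that returns None, such as a burned-in address, points away from a dual-active HSRP condition.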

When you troubleshoot these error messages, see the troubleshooting steps in the HSRP section of this document. All the troubleshooting modules are applicable to this section, which includes modules on configuration. In addition, note any errors in the switch log and reference additional case studies as necessary.

You can use an access list in order to prevent the active router from receiving its own multicast hello packet. However, this is only a workaround for the error messages and actually hides the symptom of the problem. The workaround is to apply an extended inbound access list to the HSRP interfaces. The access list blocks all traffic that is sourced from the physical IP address and that is destined to the all-routers multicast address 224.0.0.2.


access-list 101 deny ip host 172.16.12.3 host 224.0.0.2
access-list 101 permit ip any any

interface ethernet 0
  ip address 172.16.12.3 255.255.255.0
  standby 1 ip 172.16.12.1
  ip access-group 101 in

Case Study #2: HSRP State Continuously Changes (Active, Standby, Speak) or %HSRP-6-STATECHANGE

These error messages can appear:


Jan 9 08:00:42.623: %STANDBY-6-STATECHANGE: Standby: 49:
  Vlan149 state Standby -> Active
Jan 9 08:00:56.011: %STANDBY-6-STATECHANGE: Standby: 49:
  Vlan149 state Active -> Speak
Jan 9 08:01:03.011: %STANDBY-6-STATECHANGE: Standby: 49:
  Vlan149 state Speak -> Standby
Jan 9 08:01:29.427: %STANDBY-6-STATECHANGE: Standby: 49:
  Vlan149 state Standby -> Active
Jan 9 08:01:36.808: %STANDBY-6-STATECHANGE: Standby: 49:
  Vlan149 state Active -> Speak
Jan 9 08:01:43.808: %STANDBY-6-STATECHANGE: Standby: 49:
  Vlan149 state Speak -> Standby

These error messages describe a situation in which a standby HSRP router did not receive three successive HSRP hello packets from its HSRP peer. The output shows that the standby router moves from the standby state to the active state. Shortly thereafter, the router returns to the standby state. Unless this error message occurs during the initial installation, an HSRP issue probably does not cause the error message. The error messages signify the loss of HSRP hellos between the peers. When you troubleshoot this issue, you must verify the communication between the HSRP peers. A random, momentary loss of data communication between the peers is the most common problem that results in these messages. HSRP state changes are often due to high CPU utilization. If the error message is due to high CPU utilization, put a sniffer on the network and trace the system that causes the high CPU utilization.

There are several possible causes for the loss of HSRP packets between the peers. The most common problems are physical layer problems, excessive network traffic caused by spanning tree issues, or excessive traffic on each VLAN. As with Case Study #1, all the troubleshooting modules are applicable to the resolution of HSRP state changes, particularly the Layer 3 HSRP Debugging module.
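
When the log is long, it can help to count how often the standby router takes over. The following Python sketch counts Standby -> Active transitions per interface and group. It assumes that each message has been joined onto a single line, and the function name and sample variable are illustrative only.

```python
import re

# Sample taken from the messages above, with each wrapped message joined onto one line.
SAMPLE_LOG = """\
Jan 9 08:00:42.623: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Standby -> Active
Jan 9 08:00:56.011: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Active -> Speak
Jan 9 08:01:03.011: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Speak -> Standby
Jan 9 08:01:29.427: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Standby -> Active
Jan 9 08:01:36.808: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Active -> Speak
Jan 9 08:01:43.808: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Speak -> Standby
"""

STATECHANGE = re.compile(
    r"%STANDBY-6-STATECHANGE: Standby: (?P<group>\d+): "
    r"(?P<interface>\S+) state (?P<old>\w+) -> (?P<new>\w+)"
)


def count_takeovers(log_text):
    """Count Standby -> Active transitions per (interface, group)."""
    counts = {}
    for line in log_text.splitlines():
        match = STATECHANGE.search(line)
        if match and (match["old"], match["new"]) == ("Standby", "Active"):
            key = (match["interface"], int(match["group"]))
            counts[key] = counts.get(key, 0) + 1
    return counts


print(count_takeovers(SAMPLE_LOG))
```

In the sample, group 49 on Vlan149 takes over twice in about a minute, which is consistent with repeated loss of hellos rather than a single planned failover.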

If the loss of HSRP packets between peers is due to excessive traffic on each VLAN, as mentioned, you can increase the SPD and hold queue sizes in order to overcome the input queue drop problem.

In order to increase the Selective Packet Discard (SPD) size, go to the configuration mode and execute these commands on the Cat6500 switches:


(config)# ip spd queue max-threshold 600

!--- Hidden Command

(config)# ip spd queue min-threshold 500

!--- Hidden Command

In order to increase the hold queue size, go to the VLAN interface mode and execute this command:

(config-if)# hold-queue 500 in

After you increase the SPD and hold queue sizes, you can clear the interface counters with the clear counter interface command.

Case Study #3: HSRP Does Not Recognize Peer

The router output in this section shows a router that is configured for HSRP but does not recognize its HSRP peers. In order for this to occur, the router must fail to receive HSRP hellos from the neighbor router. When you troubleshoot this issue, see the Verify Physical Layer Connectivity section and the Verify HSRP Router Configuration section of this document. If the physical layer connectivity is correct, check for the mismatched VTP modes.


Vlan8 - Group 8
Local state is Active, priority 110, may preempt
Hellotime 3 holdtime 10
Next hello sent in 00:00:01.168
Hot standby IP address is 10.1.2.2 configured
Active router is local
Standby router is unknown expired
Standby virtual mac address is 0000.0c07.ac08
5 state changes, last state change 00:05:03

Case Study #4: HSRP State Changes and Switch Reports SYS-4-P2_WARN: 1/Host <mac_address> Is Flapping Between Port <port_1> and Port <port_2> in Syslog

These error messages can appear:


2001 Jan 03 14:18:43 %SYS-4-P2_WARN: 1/Host 00:00:0c:14:9d:08
  is flapping between port 2/4 and port 2/3

In software version 5.5.2 and later for the Catalyst 4500/4000 and 2948G, the switch reports a host MAC address that moves if the host MAC address moves twice within 15 seconds. A common cause is an STP loop. The switch discards packets from this host for about 15 seconds in an effort to minimize the impact of an STP loop. If the MAC address that is reported to move between two ports is the HSRP virtual MAC address, the problem is most likely an issue in which both HSRP routers go into the active state.

If the MAC address that is reported is not the HSRP virtual MAC address, the issue can indicate the loop, duplication, or reflection of packets in the network. These types of conditions can contribute to HSRP problems. The most common causes for the move of MAC addresses are spanning tree problems or physical layer problems.

When you troubleshoot this error message, complete these steps:

Determine the correct source (port) of the MAC address that the error message reports.
Disconnect the port that must not source the host MAC address and check for HSRP stability.
Document the STP topology on each VLAN and check for STP failure.
Verify the port channel configuration.

An incorrect port channel configuration can cause the host MAC address to flap and generate these error messages. This is because of the load-balancing nature of port channeling.

Case Study #5: HSRP State Changes and Switch Reports RTD-1-ADDR_FLAP in Syslog

These error messages can appear:


*Mar 9 14:51:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 21 addrs per min
*Mar 9 14:52:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 22 addrs per min
*Mar 9 14:53:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 20 addrs per min
*Mar 9 14:54:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 21 addrs per min
*Mar 9 14:55:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 20 addrs per min
*Mar 9 14:56:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 22 addrs per min
*Mar 9 14:57:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7
  relearning 21 addrs per min

These error messages signify that a MAC address moves consistently between different ports. These error messages are only applicable on the Catalyst 2900XL and 3500XL switches. The messages can indicate that two or more HSRP routers have become active. The messages can also indicate the source of an STP loop, duplicated frames, or reflected packets.

In order to gather more information about the error messages, issue this debug command:


switch#debug ethernet-controller address
Ethernet Controller Addresses debugging is on

*Mar 9 08:06:06: Add address 0000.0c07.ac02, on port 35 vlan 2
*Mar 9 08:06:06: 0000.0c07.ac02 has moved from port 6 to port 35 in vlan 2
*Mar 9 08:06:07: Add address 0000.0c07.ac02, on port 6 vlan 2
*Mar 9 08:06:07: 0000.0c07.ac02 has moved from port 35 to port 6 in vlan 2
*Mar 9 08:06:08: Add address 0000.0c07.ac02, on port 35 vlan 2
*Mar 9 08:06:08: 0000.0c07.ac02 has moved from port 6 to port 35 in vlan 2
*Mar 9 08:06:10: Add address 0000.0c07.ac02, on port 6 vlan 2
*Mar 9 08:06:10: 0000.0c07.ac02 has moved from port 35 to port 6 in vlan 2
*Mar 9 08:06:11: Add address 0000.0c07.ac02, on port 35 vlan 2
*Mar 9 08:06:11: 0000.0c07.ac02 has moved from port 6 to port 35 in vlan 2
*Mar 9 08:06:12: %RTD-1-ADDR_FLAP: Fast Ethernet 0/7 relearning 20 addrs per min
*Mar 9 08:06:13: Add address 0000.0c07.ac02, on port 6 vlan 2
*Mar 9 08:06:13: 0000.0c07.ac02 has moved from port 35 to port 6 in vlan 2

The ports that the debug command references are off by one. For example, port 0 is Fast Ethernet 0/1. In this output, the MAC address 0000.0c07.ac02 flaps between debug ports 6 and 35. Debug port 6 corresponds to Fast Ethernet 0/7, which is the interface that the RTD-1-ADDR_FLAP message names.

The most common causes for the move of MAC addresses are spanning tree problems or physical layer problems.

When you troubleshoot this error message, complete these steps:

Determine the correct source (port) of the host MAC address.
Disconnect the port that should not source the host MAC address.
Document the STP topology on a per-VLAN basis and check for STP failure.
Verify the port channeling configuration.

An incorrect port channel configuration can cause the host MAC address to flap and generate these error messages. This is because of the load-balancing nature of port channeling.

Case Study #6: HSRP State Changes and Switch Reports MLS-4-MOVEOVERFLOW:Too many moves, stop MLS for 5 sec(20000000) in Syslog

These error messages can appear:


05/13/2000,08:55:10:MLS-4-MOVEOVERFLOW:Too many moves, stop MLS for 5 sec(20000000)
05/13/2000,08:55:15:MLS-4:Resume MLS after detecting too many moves

These messages indicate that the switch learns the same MAC address on two different ports. This message is only reported on Catalyst 5500/5000 switches. Issue these commands in order to gather additional information about the problem:

Note: The commands that this section mentions are not documented. You must enter them completely. The show mls notification command provides a table address (TA) value. The show looktable TA-value command returns a possible MAC address that you can trace to the root of the problem.


Switch (enable) show mls notification

1: (0004e8e6-000202ce) Noti Chg TA e8e6 OI 2ce (12/15) V 1

!--- This is the mod/port and VLAN. The MAC address is
!--- seen on this module 12, port 15 in VLAN 1.

2: (0004e8e6-000202cd) Noti Chg TA e8e6 OI 2cd (12/14) V 1

!--- This is the mod/port and VLAN. The next is seen on
!--- module 12, port 14 in VLAN 1.
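
If there are many notification lines, the fields that matter can be pulled out with a script. The following Python sketch extracts the table address, module/port, and VLAN from one line. Because the command is undocumented, the line layout is assumed to match the sample output above, and the function name is illustrative only.

```python
import re

# Layout assumed from the sample output: "... Chg TA <ta> OI <oi> (<mod>/<port>) V <vlan>"
NOTIFICATION = re.compile(
    r"Chg TA (?P<ta>[0-9a-f]+) OI (?P<oi>[0-9a-f]+) "
    r"\((?P<module>\d+)/(?P<port>\d+)\) V (?P<vlan>\d+)"
)


def parse_notification(line):
    """Return the table address, module, port, and VLAN from one notification line."""
    match = NOTIFICATION.search(line)
    if match is None:
        return None
    return {
        "table_address": match["ta"],
        "module": int(match["module"]),
        "port": int(match["port"]),
        "vlan": int(match["vlan"]),
    }


print(parse_notification("1: (0004e8e6-000202ce) Noti Chg TA e8e6 OI 2ce (12/15) V 1"))
```

Two lines that share a table address but report different ports, as in the sample, point to one MAC address that moves between those ports.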

Write down the four-digit/letter combination that appears after Chg TA in this command output. The show looktable command gives the MAC address that causes the MLS TOO MANY MOVES error message:


150S_CR(S2)> (enable) show looktable e8e6

Table address: 0xe8e6, Hash: 0x1d1c, Page: 6
Entry Data[3-0]: 0x000002cd 0x00800108 0x0008c790 0x215d0005, Entry Map [00]

Router-Xtag QOS SwGrp3 Port-Index
0 0 0x0 0x2cd

Fab AgeByte C-Mask L-Mask Static SwSc HwSc EnSc AL Trap R-Mac
0 0x01 0x0000 0x0000 0 0 0 0 0 0 0

MacAge Pri-In Modify Notify IPX-Sw IPX-Hw IPX-En Valid SwGrp2 Parity2
0 0 1 0 0 0 0 1 0x0 0

Entry-Mac-Address FID SwGrp1 Parity1
00-08-c7-90-21-5d 1 0x0 1

The entry MAC address 00-08-c7-90-21-5d is the MAC address that flaps between ports. You must know the MAC address in order to find the offending device. If the entry MAC address is the virtual HSRP MAC address, the issue can be that both HSRP routers have gone into the active state.

The most common causes for the move of MAC addresses are spanning tree problems or physical layer problems.

When you troubleshoot this error message, complete these steps:

Determine the correct source (port) of the host MAC address.
Disconnect the port that should not source the host MAC address.
Document the STP topology on a per-VLAN basis and check for STP failure.
Verify the port channeling configuration.

An incorrect port channel configuration can cause the host MAC address to flap and generate these error messages. This is because of the load-balancing nature of port channeling.

Disable PortFast on all of the ports that connect to devices other than a PC or IP phone in order to avoid bridging loops.

Case Study #7: HSRP Intermittent State Changes on Multicast Stub Network

There is a common cause for anomalous HSRP state changes on an HSRP router that is part of a multicast stub network. This common cause deals with the non-Reverse Path Forwarding (RPF) traffic that the non-designated router (non-DR) sees. The non-DR is the router that does not forward the multicast traffic stream.

IP multicast uses one router to forward data onto a LAN in redundant topologies. If multiple routers have interfaces onto a LAN or VLAN, only one router forwards the data. There is no load balancing for multicast traffic on LANs. All multicast traffic is always visible by every router on a LAN. This is also the case if Cisco Group Management Protocol (CGMP) or Internet Group Management Protocol (IGMP) snooping is configured. Both routers need to see the multicast traffic in order to make a forwarding decision.

This diagram provides an example. The red lines indicate multicast feed.

The redundant router, which is the router that does not forward the multicast traffic stream, sees this data on the outbound interface for the LAN. The redundant router must drop this traffic because the traffic arrived on the wrong interface and, therefore, fails the RPF check. This traffic is referred to as non-RPF traffic because it is reflected backward against the flow from the source. For this non-RPF traffic, there is usually no (*,G) or (S,G) state in the redundant router. Therefore, no hardware or software shortcuts can be created in order to drop the packet. The processor must examine each multicast packet individually. This requirement can cause the CPU on these routers to spike or run at a very high processing rate. Often, a high rate of multicast traffic on the redundant router causes HSRP to lose hello packets from its peer and change states.

Therefore, enable hardware access lists on Catalyst 6500 and 8500 routers that do not handle non-RPF traffic efficiently by default. The access lists prevent the CPU from processing the non-RPF traffic.

Do not attempt to work around this problem with a disablement of the IP Protocol Independent Multicast (PIM) on the redundant router interfaces. This configuration can have an undesirable impact on the redundant router.

On the 6500/8500 routers, there is an access list engine that enables filtering to take place at wire rate. You can use this feature to handle non-RPF traffic for sparse mode groups efficiently.

In software versions 6.2.1 and later, the system software automatically enables filtering so that the non-DR does not receive unnecessary non-RPF traffic. In earlier software versions, you need to configure access lists manually. In order to implement this solution for software versions that are earlier than 6.2.1, place an access list on the inbound interface of the stub network. The access list filters multicast traffic that did not originate from the stub network. The access list is pushed down to the hardware in the switch. This access list ensures that the CPU never sees the packet and allows the hardware to drop the non-RPF traffic.

For example, assume that you have two routers with two VLANs in common. You can expand this number of VLANs to as many VLANs as necessary. Router A is HSRP primary for VLAN 1 and secondary for VLAN 2. Router B is secondary for VLAN 1 and primary for VLAN 2. Give either Router A or Router B a higher IP address in order to make that router the DR. Be sure that only one router is the DR for all segments, as this example shows:


Router A     VLAN1 Physical IP Address     A.B.C.3
Router B     VLAN1 Physical IP Address     A.B.C.2
             VLAN1 HSRP Address            A.B.C.1

Router A     VLAN2 Physical IP Address     A.B.D.3
Router B     VLAN2 Physical IP Address     A.B.D.2
             VLAN2 HSRP Address            A.B.D.1

Place this access list on the non-DR router:


access-list 100 permit ip A.B.C.0 0.0.0.255 any
access-list 100 permit ip A.B.D.0 0.0.0.255 any

access-list 100 permit ip any 224.0.0.0 0.0.0.255
access-list 100 permit ip any 224.0.1.0 0.0.0.255

access-list 100 deny ip any 224.0.0.0 15.255.255.255

You should have one permit for each subnet that the two routers share. Other permits allow auto-rendezvous point (RP) and reserved groups to operate correctly.
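
To see how the entries in access list 100 interact, the first-match logic can be simulated. The following Python sketch evaluates a source and destination pair against the list. The subnets 192.0.2.0/24 and 198.51.100.0/24 are placeholder values that stand in for A.B.C.0 and A.B.D.0, and the function names are illustrative only.

```python
import ipaddress

ANY = ("0.0.0.0", "255.255.255.255")

# Access list 100 from above. 192.0.2.0 and 198.51.100.0 are placeholder
# subnets that stand in for A.B.C.0 and A.B.D.0.
ACL_100 = [
    ("permit", ("192.0.2.0", "0.0.0.255"), ANY),
    ("permit", ("198.51.100.0", "0.0.0.255"), ANY),
    ("permit", ANY, ("224.0.0.0", "0.0.0.255")),
    ("permit", ANY, ("224.0.1.0", "0.0.0.255")),
    ("deny", ANY, ("224.0.0.0", "15.255.255.255")),
]


def matches(address, base, wildcard):
    """Wildcard-mask match: bits that are set in the wildcard are ignored."""
    addr = int(ipaddress.IPv4Address(address))
    net = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (net & care)


def evaluate(acl, source, destination):
    """Return the action of the first matching entry, or the implicit deny."""
    for action, src, dst in acl:
        if matches(source, *src) and matches(destination, *dst):
            return action
    return "deny"


print(evaluate(ACL_100, "192.0.2.10", "239.1.1.1"))    # source on the stub network
print(evaluate(ACL_100, "203.0.113.5", "239.1.1.1"))   # non-RPF multicast traffic
print(evaluate(ACL_100, "203.0.113.5", "224.0.1.40"))  # group in the 224.0.1.0 range
```

Multicast traffic from a source on the shared subnets is permitted, multicast traffic from any other source is denied unless the group falls in 224.0.0.0/24 or 224.0.1.0/24, and the order of the entries matters because the first match wins.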

Issue these additional commands in order to apply the access control lists (ACLs) to each VLAN interface on the non-DR:

ip access-group 100 in
no ip redirects
no ip unreachables

You must run Catalyst software 5.4(3) or later in order for the ACLs to work in hybrid configuration.

The redundant router designs that this document discusses are externally redundant, which means that there are two physical 6500 routers. Do not use this workaround for internal redundancy, in which two route processors are in one box.

Case Study #8: Asymmetric Routing and HSRP (Excessive Flooding of Unicast Traffic in Network with Routers That Run HSRP)

With asymmetric routing, transmit and receive packets follow different paths between a host and the peer with which it communicates. This packet flow is a result of the configuration of load balancing between HSRP routers, based on HSRP priority, which sets each HSRP router to active or standby. This type of packet flow in a switching environment can result in excessive unknown unicast flooding. Also, Multilayer Switching (MLS) entries can be absent. Unknown unicast flooding occurs when the switch floods a unicast packet out of all ports. The switch floods the packet because there is no entry for the destination MAC address. This behavior does not break connectivity because packets are still forwarded. However, the behavior does account for the flood of extra packets on host ports. This case study examines the behavior of asymmetric routing and why unicast flooding results.

Symptoms of asymmetric routing include:

Excessive unicast packet flooding
Absent MLS entry for flows
Sniffer trace that shows that packets on the host port are not destined for the host
Increased network latency with L2-based packet rewrite engines, such as server load balancers, web cache devices, and network appliances. Examples include the Cisco LocalDirector and Cisco Cache Engine.
Dropped packets on connected hosts and workstations that cannot handle the additional unicast-flooding traffic load

The default ARP cache aging time on a router is four hours. The default aging time of the switch content-addressable memory (CAM) entry is five minutes. The ARP aging time of the host workstations is not significant for this discussion, but the example sets the ARP aging time to four hours.

This diagram illustrates this issue. This topology example includes Catalyst 6500s with Multilayer Switch Feature Cards (MSFCs) in each switch. Although this example uses MSFCs, you can use any router instead of the MSFC. Example routers that you can use include the Route Switch Module (RSM), Gigabit Switch Router (GSR), and Cisco 7500. The hosts are directly connected to ports on the switch. The switches are interconnected through a trunk which carries traffic for VLAN 1 and VLAN 2.

These outputs are excerpts from the configuration and the show standby command output of each MSFC:

MSFC1


interface Vlan 1
    mac-address 0003.6bf1.2a01
    ip address 10.1.1.2 255.255.255.0
    no ip redirects
    standby 1 ip 10.1.1.1
    standby 1 priority 110

interface Vlan 2
    mac-address 0003.6bf1.2a01
    ip address 10.1.2.2 255.255.255.0
    no ip redirects
    standby 2 ip 10.1.2.1

MSFC1#show standby
Vlan1 - Group 1
Local state is Active, priority 110
Hellotime 3 holdtime 10
Next hello sent in 00:00:00.696
Hot standby IP address is 10.1.1.1 configured
Active router is local
Standby router is 10.1.1.3 expires in 00:00:07
Standby virtual mac address is 0000.0c07.ac01
2 state changes, last state change 00:20:40

Vlan2 - Group 2
Local state is Standby, priority 100
Hellotime 3 holdtime 10
Next hello sent in 00:00:00.776
Hot standby IP address is 10.1.2.1 configured
Active router is 10.1.2.3 expires in 00:00:09, priority 110
Standby router is local
4 state changes, last state change 00:00:51

MSFC1#exit
Console> (enable)

MSFC2


interface Vlan 1
    mac-address 0003.6bf1.2a02
    ip address 10.1.1.3 255.255.255.0
    no ip redirects
    standby 1 ip 10.1.1.1

interface Vlan 2
    mac-address 0003.6bf1.2a02
    ip address 10.1.2.3 255.255.255.0
    no ip redirects
    standby 2 ip 10.1.2.1
    standby 2 priority 110

MSFC2#show standby
Vlan1 - Group 1
Local state is Standby, priority 100
Hellotime 3 holdtime 10
Next hello sent in 00:00:01.242
Hot standby IP address is 10.1.1.1 configured
Active router is 10.1.1.2 expires in 00:00:09, priority 110
Standby router is local
7 state changes, last state change 00:01:17

Vlan2 - Group 2
Local state is Active, priority 110
Hellotime 3 holdtime 10
Next hello sent in 00:00:00.924
Hot standby IP address is 10.1.2.1 configured
Active router is local
Standby router is 10.1.2.2 expires in 00:00:09
Standby virtual mac address is 0000.0c07.ac02
2 state changes, last state change 00:40:08

MSFC2#exit

On MSFC1, VLAN 1 is in the HSRP active state, and VLAN 2 is in the HSRP standby state. On MSFC2, VLAN 2 is in the HSRP active state, and VLAN 1 is in the HSRP standby state. The default gateway of each host is the respective standby IP address.

Initially, all caches are empty. Host A uses MSFC1 as its default gateway. Host B uses MSFC2.

ARP and MAC Address Tables Before Ping Is Initiated

Host A ARP Table	Switch 1 MAC Address Table MAC VLAN Port	MSFC1 ARP Table	MSFC2 ARP Table	Switch 2 MAC Address Table MAC VLAN Port	Host B ARP Table
	0003.6bf1.2a01 1 15/1			0003.6bf1.2a02 1 15/1
	0003.6bf1.2a01 2 15/1			0003.6bf1.2a02 2 15/1
	0000.0c07.ac01 1 15/1			0000.0c07.ac01 1 1/1
	0000.0c07.ac02 2 1/1			0000.0c07.ac02 2 15/1
	0003.6bf1.2a02 1 1/1			0003.6bf1.2a01 1 1/1
	0003.6bf1.2a02 2 1/1			0003.6bf1.2a01 2 1/1

Host A pings host B, which means that host A sends an ICMP echo packet. Because each host resides on a separate VLAN, host A forwards its packets that are destined for host B to its default gateway. In order for that process to occur, host A must send an ARP request in order to resolve the MAC address of its default gateway, 10.1.1.1.

ARP and MAC Address Tables After Host A Sends ARP for Default Gateway

Host A ARP Table	Switch 1 MAC Address Table MAC VLAN Port	MSFC1 ARP Table	MSFC2 ARP Table	Switch 2 MAC Address Table MAC VLAN Port	Host B ARP Table
10.1.1.1 : 0000.0c07.ac01	0000.0c00.0001 1 2/1	10.1.1.10 : 0000.0c00.0001

MSFC1 receives the packet, rewrites the packet, and forwards the packet to host B. In order to rewrite the packet, MSFC1 sends an ARP request for host B because the host resides off a directly connected interface. MSFC2 has yet to receive any packets in this flow. When MSFC1 receives the ARP reply from host B, both switches learn the source port that is associated with host B.

ARP and MAC Address Tables After Host A Sends Packet to Default Gateway and MSFC1 Sends ARP for Host B

Host A ARP Table	Switch 1 MAC Address Table MAC VLAN Port	MSFC1 ARP Table	MSFC2 ARP Table	Switch 2 MAC Address Table MAC VLAN Port	Host B ARP Table
10.1.1.1 : 0000.0c07.ac01	0000.0c00.0001 1 2/1	10.1.1.10 : 0000.0c00.0001		0000.0c00.0002 2 2/1	10.1.2.2 : 0003.6bf1.2a01
	0000.0c00.0002 2 1/1	10.1.2.10 : 0000.0c00.0002

Host B receives the echo packet from host A, through MSFC1. Host B must now send an echo reply to host A. Since host A resides on a different VLAN, host B forwards the reply through its default gateway, MSFC2. In order to forward the packet through MSFC2, host B must send an ARP request for its default gateway IP address, 10.1.2.1.

ARP and MAC Address Tables After Host B Sends ARP for Its Default Gateway

Host A ARP Table	Switch 1 MAC Address Table MAC VLAN Port	MSFC1 ARP Table	MSFC2 ARP Table	Switch 2 MAC Address Table MAC VLAN Port	Host B ARP Table
10.1.1.1 : 0000.0c07.ac01	0000.0c00.0001 1 2/1	10.1.1.10 : 0000.0c00.0001	10.1.2.10 : 0000.0c00.0002	0000.0c00.0002 2 2/1	10.1.2.2 : 0003.6bf1.2a01
	0000.0c00.0002 2 1/1	10.1.2.10 : 0000.0c00.0002			10.1.2.1 : 0000.0c07.ac02

Host B now forwards the echo reply packet to MSFC2. MSFC2 sends an ARP request for host A because it is directly connected on VLAN 1. Switch 2 populates its MAC address table with the MAC address of host B.

ARP and MAC Address Tables After Echo Packet Has Been Received by Host A

Host A ARP Table	Switch 1 MAC Address Table MAC VLAN Port	MSFC1 ARP Table	MSFC2 ARP Table	Switch 2 MAC Address Table MAC VLAN Port	Host B ARP Table
10.1.1.1 : 0000.0c07.ac01	0000.0c00.0001 1 2/1	10.1.1.10 : 0000.0c00.0001	10.1.2.10 : 0000.0c00.0002	0000.0c00.0002 2 2/1	10.1.2.2 : 0003.6bf1.2a01
10.1.1.3 : 0003.6bf1.2a0	0000.0c00.0002 2 1/1	10.1.2.10 : 0000.0c00.0002	10.1.1.10 : 0000.0c00.0001	0000.0c00.0001 1 1/1	10.1.2.1 : 0000.0c07.ac02

The echo reply reaches host A and the flow is complete.

Consequences of Asymmetric Routing

Consider the case of the continuous ping of host B by host A. Remember that host A sends the echo packet to MSFC1, and host B sends the echo reply to MSFC2, which is an asymmetric routing state. The only time that Switch 1 learns the source MAC of host B is when host B replies to an ARP request from MSFC1. This is because host B uses MSFC2 as its default gateway and does not send packets to MSFC1 and, consequently, Switch 1. Because the ARP timeout is four hours by default, MSFC1 does not send another ARP request for host B for four hours, but Switch 1 ages the MAC address of host B after five minutes by default. Switch 2 ages host A after five minutes. As a result, Switch 1 must treat any packet with a destination MAC of host B as an unknown unicast. The switch floods the packet that comes from host A and is destined for host B out all ports. In addition, because there is no MAC address entry for host B in Switch 1, there is no MLS entry either.

ARP and MAC Address Tables After 5 Minutes of Continuous Ping of Host B by Host A

Host A ARP Table	Switch 1 MAC Address Table MAC VLAN Port	MSFC1 ARP Table	MSFC2 ARP Table	Switch 2 MAC Address Table MAC VLAN Port	Host B ARP Table
10.1.1.1 : 0000.0c07.ac01	0000.0c00.0001 1 2/1	10.1.1.10 : 0000.0c00.0001	10.1.2.10 : 0000.0c00.0002	0000.0c00.0002 2 2/1	10.1.2.2 : 0003.6bf1.2a01
10.1.1.3 : 0003.6bf1.2a0		10.1.2.10 : 0000.0c00.0002	10.1.1.10 : 0000.0c00.0001		10.1.2.1 : 0000.0c07.ac02

The echo reply packets that come from host B experience the same issue after the MAC address entry for host A ages on Switch 2. Host B forwards the echo reply to MSFC2, which in turn routes the packet and sends it out on VLAN 1. The switch does not have an entry for host A in the MAC address table and must flood the packet out all ports in VLAN 1.

Asymmetric routing issues do not break connectivity. But, asymmetric routing can cause excessive unicast flooding and MLS entries that are missing. There are three configuration changes that can remedy this situation:

Adjust the MAC aging time on the respective switches to 14,400 seconds (four hours) or longer.
Change the ARP timeout on the routers to five minutes (300 seconds).
Change the MAC aging time and ARP timeout to the same timeout value.

The preferable method is to change the MAC aging time to 14,400 seconds. These are the configuration guidelines:

CatOS:
set cam agingtime vlan_aging_time_in_msec
Cisco IOS Software/2900XL/3500XL:
mac-address-table aging-time seconds [vlan vlan_id]
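
The effect of the timer changes above can be checked with a quick calculation. The following Python sketch estimates how long a switch has no MAC entry for the remote host in each ARP cycle. It assumes, as in this case study, that the switch relearns the host MAC address only when the host answers the ARP refresh from the router, and that the refresh happens exactly at the ARP timeout. The function name is illustrative only.

```python
def flooding_window(arp_timeout_s, mac_aging_s):
    """Seconds per ARP cycle in which the switch has no MAC entry for the host."""
    return max(0, arp_timeout_s - mac_aging_s)


DEFAULT_ARP_TIMEOUT = 4 * 60 * 60  # router ARP cache: four hours
DEFAULT_MAC_AGING = 5 * 60         # switch CAM entry: five minutes

scenarios = {
    "defaults": (DEFAULT_ARP_TIMEOUT, DEFAULT_MAC_AGING),
    "MAC aging raised to 14,400 s": (DEFAULT_ARP_TIMEOUT, 14400),
    "ARP timeout lowered to 300 s": (300, DEFAULT_MAC_AGING),
}

for name, (arp, mac) in scenarios.items():
    print(f"{name}: {flooding_window(arp, mac)} s of flooding per ARP cycle")
```

With the defaults, the switch floods for 14,100 of every 14,400 seconds. Either remedy brings the window to zero because the MAC entry no longer expires before the next ARP refresh.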

Case Study #9: HSRP Virtual IP Address Is Reported as a Different IP Address

The STANDBY-3-DIFFVIP1 error message occurs when there is interVLAN leakage because of bridging loops in the switch.

If you get this error message and there is interVLAN leakage because of bridging loops in the switch, complete these steps in order to resolve the error:

Identify the path that the packets should take between end nodes.

If there is a router on this path, complete these steps:
1. Troubleshoot the path from the first switch to the router.
2. Troubleshoot the path from the router to the second switch.
Connect to each switch on the path and check the status of the ports that are used on the path between end nodes.

Case Study #10: HSRP Causes MAC Violation on a Secure Port

When port security is configured on the switch ports that are connected to the HSRP enabled routers, it causes a MAC violation, since you cannot have the same secure MAC address on more than one interface. A security violation occurs on a secure port in one of these situations:

The maximum number of secure MAC addresses is added to the address table, and a station whose MAC address is not in the address table attempts to access the interface.
An address that is learned or configured on one secure interface is seen on another secure interface in the same VLAN.

By default, a port security violation causes the switch interface to become error-disabled and to shut down immediately, which blocks the HSRP status messages between the routers.
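
The two violation conditions can be illustrated with a toy model. The following Python sketch is not a Cisco implementation. The class name, the port names 2/1 and 2/2, and the limit of two secure MAC addresses per port are assumptions made for illustration, and the MAC addresses are borrowed from Case Study #8.

```python
class SecureVlan:
    """Toy model of port security on the secure ports of a single VLAN."""

    def __init__(self, max_per_port):
        self.max_per_port = max_per_port
        self.secure_macs = {}      # port -> set of secure MAC addresses
        self.err_disabled = set()  # ports shut down by a violation

    def receive(self, port, source_mac):
        """Process one frame; return 'forwarded', 'violation', or 'port down'."""
        if port in self.err_disabled:
            return "port down"
        macs = self.secure_macs.setdefault(port, set())
        if source_mac in macs:
            return "forwarded"
        on_other_port = any(
            source_mac in other
            for p, other in self.secure_macs.items()
            if p != port
        )
        if on_other_port or len(macs) >= self.max_per_port:
            self.err_disabled.add(port)  # default action: error-disable the port
            return "violation"
        macs.add(source_mac)
        return "forwarded"


vlan = SecureVlan(max_per_port=2)  # assumed limit: burned-in plus virtual MAC
VIRTUAL_MAC = "0000.0c07.ac01"
results = [
    vlan.receive("2/1", "0003.6bf1.2a01"),  # router A burned-in address
    vlan.receive("2/1", VIRTUAL_MAC),       # router A is HSRP active
    vlan.receive("2/2", "0003.6bf1.2a02"),  # router B burned-in address
    vlan.receive("2/2", VIRTUAL_MAC),       # router B takes over as active
    vlan.receive("2/2", "0003.6bf1.2a02"),  # router B hellos after the violation
]
print(results)
```

In this model, the virtual MAC address is already secured on the port of router A, so the first frame that router B sources from the virtual MAC address triggers a violation and takes the port of router B down, which also stops its hello packets.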

Workaround

Issue the standby use-bia command on the routers. This forces the routers to use a burned-in address for HSRP instead of the virtual MAC address.
Disable port security on the switch ports that connect to the HSRP enabled routers.

Case Study #11: %Interface Hardware Cannot Support Multiple Groups

If multiple HSRP groups are created on the interface, this error message is received:

%Interface hardware cannot support multiple groups

This error message is received due to the hardware limitation on some routers or switches. It is not possible to overcome the limitation by any software methods. The problem is that each HSRP group uses one additional MAC address on the interface, so the Ethernet MAC chip must support multiple programmable MAC addresses in order to enable several HSRP groups.

The workaround is to use the standby use-bia interface configuration command, which uses the Burned-In Address (BIA) of the interface as its virtual MAC address, instead of the preassigned MAC address.

Paramveer

Tuesday, 12 May 2015
