I'm trying to bring up a topology that contains several nodes. Until now, I had never tried to start all of the nodes at once. I already know that the hardware is limited.
Server specs:
Hypervisor: VMware ESXi, 6.5.0, 5969303
Model: UCSC-C220-M4S
Type of processor: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
Logical Processors: 32
NIC: 4
Total memory: 128 GB
Images used:
XRv9-k9full 7.6.1 x8
XRv-6.6.2 x2
CSR1000v 17.03.05 x4
NE40E V800R011C00SPC607B607 x15
vMX 18.4R1.8 x1
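To sanity-check whether the host is oversubscribed with all of these nodes running, a rough budget like the one below may help. It is only a sketch: the node counts and host totals come from the lists above, but the per-node vCPU and RAM figures are placeholder assumptions (not vendor requirements and not my actual node settings), and `NodeType` and `budget` are hypothetical names used only for illustration. Replace the placeholders with the values actually configured on each node.

```python
from dataclasses import dataclass


@dataclass
class NodeType:
    name: str
    count: int       # number of nodes of this type in the topology
    vcpus: int       # vCPUs per node (ASSUMED placeholder, edit me)
    ram_gb: float    # RAM per node in GB (ASSUMED placeholder, edit me)


# Host totals taken from the server specs above.
HOST_LOGICAL_CPUS = 32
HOST_RAM_GB = 128

# Counts come from the image list; vcpus/ram_gb are placeholders only.
NODES = [
    NodeType("XRv9k 7.6.1", 8, 4, 16),
    NodeType("XRv 6.6.2", 2, 1, 3),
    NodeType("CSR1000v 17.03.05", 4, 1, 4),
    NodeType("NE40E V800R011", 15, 2, 2),
    NodeType("vMX 18.4R1.8", 1, 4, 8),
]


def budget(nodes, host_cpus, host_ram_gb):
    """Sum the configured vCPUs and RAM and compare them with the host totals."""
    total_vcpus = sum(n.count * n.vcpus for n in nodes)
    total_ram = sum(n.count * n.ram_gb for n in nodes)
    return {
        "vcpus": total_vcpus,
        "ram_gb": total_ram,
        "cpu_ratio": total_vcpus / host_cpus,    # > 1 means vCPUs are oversubscribed
        "ram_ratio": total_ram / host_ram_gb,    # > 1 means RAM is overcommitted
    }


if __name__ == "__main__":
    result = budget(NODES, HOST_LOGICAL_CPUS, HOST_RAM_GB)
    print(f"vCPUs requested: {result['vcpus']} (ratio {result['cpu_ratio']:.2f})")
    print(f"RAM requested:   {result['ram_gb']} GB (ratio {result['ram_ratio']:.2f})")
```

A ratio above 1 does not prove anything by itself, since some vCPU oversubscription is normal in a lab, but a high CPU ratio would be consistent with the heartbeat failures and CPUHOG messages below. Note that the EVE VM may only be assigned part of the host's CPUs and memory, in which case those smaller numbers should be used as the host totals.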
The NE40, vMX and XRv nodes work flawlessly, but the XRv9 and CSR1000v nodes run into problems a couple of minutes after they have initialized. The XRv9 crashes, or appears to go down by itself, and the CSR displays CPU-related log messages and freezes for a few minutes before coming back.
XRv9
0/RP0/ADMIN0:May 11 18:10:25.150 UTC: vm_manager[3262]: %INFRA-VM_MANAGER-3-MSG_HEARTBEAT_FAILURE : VM default-sdr--1 failed to maintain heartbe
0/RP0/ADMIN0:May 11 18:10:25.169 UTC: sdr_mgr[3216]: %SM-SDR_MANAGER-3-MSG_VM_RELOAD_ON_HB_FAILURE : Info :SDR NM : VM Reload on HB failure, sdr
0/RP0/ADMIN0:May 11 18:10:25.170 UTC: sdr_mgr[3216]: %SM-SDR_MANAGER-3-MSG_VM_UNGRACEFUL_RELOAD_TOO_OFTEN : Info :sdr default-sdr vm_id 1 ungrac
[18:10:44.777] Sending KILL signal to processmgr..
[18:10:44.777] Sending KILL signal to ds..
PM disconnect successStopping OpenBSD Secure Shell server: sshdinitctl: Unknown instance:
The audit system is disabled
Stopping system message bus: dbus.
Stopping random number generator daemon.
Stopping system log daemon...0
Stopping kernel log daemon...0
Stopping internet superserver: xinetd.
Stopping crond: OK
Stopping rpcbind daemon...
done.
Libvirt not initialized for container instance
Deconfiguring network interfaces... done.
Sending all processes the KILL signal...
Unmounting remote filesystems...
Deactivating swap...
Unmounting local filesystems...
Connection closed by foreign host.
Wed May 11 18:11:24 UTC 2022 (/opt/cisco/hostos/bin/xr_con_telnet_wrapper.sh): XR console connection lost to port 9001
CSR1000v
*May 11 18:24:10.581: %PLATFORM-4-ELEMENT_WARNING: R0/0: smand: RP/0: 5-Minute Load Average value 9.49 exceeds warning level 8.00.
*May 11 18:24:44.445: %EVENTLIB-3-CPUHOG: R0/0: hman: undefined: 1311ms, Traceback=1#08ca21ba637c850b75436450ffff3b6d c:7FA1A2665000+37370 c:7FA1A2665000+15BC9C :564DD7383000+2CDCA :564DD7383000+2D518 :564DD7383000+49343 uipeer:7FA1ACCD2000+3F6A9 uipeer:7FA1ACCD2000+1ED06 evlib:7FA1AE2F7000+9145 evlib:7FA1AE2F7000+9A9C orchestrator_lib:7FA1A94CC000+CE31 orchestrator_lib:7FA1A94CC000+CDB4
*May 11 18:24:44.472: %EVENTLIB-3-CPUHOG: R0/0: hman: undefined: 1135ms, Traceback=1#08ca21ba637c850b75436450ffff3b6d c:7FA1A2665000+37370 c:7FA1A2665000+EACA4 c:7FA1A2665000+7BCFB c:7FA1A2665000+7BE9D c:7FA1A2665000+6FFA2 procmib_lib:7FA1A7581000+6472 :564DD7383000+4FAB4 evlib:7FA1AE2F7000+9145 evlib:7FA1AE2F7000+9A9C orchestrator_lib:7FA1A94CC000+CE31 orchestrator_lib:7FA1A94CC000+CDB4
*May 11 18:25:00.072: %EVENTLIB-3-CPUHOG: R0/0: smd: write asyncon 0x55df3a8908e8: 136ms, Traceback=1#aacc8f6f6ff3ee394cf2c4311553234a c:7F38368EF000+37370 pthread:7F3836AAF000+117FA bipc:7F384D54A000+5192 evutil:7F385A7E4000+9CD2 evlib:7F385B6CB000+8D8E evlib:7F385B6CB000+9A9C orchestrator_lib:7F385B4A7000+CE31 orchestrator_lib:7F385B4A7000+CDB4 luajit:7F3837461000+7C696 luajit:7F3837461000+35C44 luajit:7F3837461000+BFF9
Is it possible that the problems are related to the storage? (The EVE VM is located on a VMware datastore, not on local storage.)
I need to run some tests with IS-IS (migration from OSPF), the MVPN control plane and, if possible, SR-MPLS.
Regards