
ESXi 8.0 U3 becomes unresponsive when EVE runs

Posted: Sun Aug 03, 2025 3:33 pm
by mvanoverbeek
I just purchased EVE-NG Pro to help with customer labs I would like to build. The server installed properly, and I tested it with an Aruba AOS-CX switch, which works fine.
The problem is that my ESXi 8.0 Update 3 host becomes unreachable after a while. I suspect a cron job that runs every night, because the box is unreachable every morning when I wake up. This ONLY happens when EVE is running. When I shut down the EVE VM, the host runs fine and does not become unreachable overnight.

My specs are:
CPU: 16 CPUs x AMD Ryzen 9 5950X 16-Core Processor
Memory: 127.92 GB
Storage: 2 TB NVMe

I allocated to EVE:
Sockets: 6 with 2 CPUs per socket = 12 CPUs
RAM: 48 GB
Storage: 300 GB

Does this sound familiar to anyone? I would appreciate some guidance.

Re: ESXi 8.0 U3 becomes unresponsive when EVE runs

Posted: Sun Aug 03, 2025 8:24 pm
by rusty725
Check for duplicate IPs on your network.

Re: ESXi 8.0 U3 becomes unresponsive when EVE runs

Posted: Tue Sep 23, 2025 7:43 pm
by mvanoverbeek
Thanks for your reply. I don't think I have duplicate IP addresses in my network.
It also doesn't happen instantaneously, mostly after a few hours.
Today, for example, my ESXi host had been running stable for 41 days. I started EVE-NG and worked on a BGP setup with IOL plus some Aruba CX gear, and two hours in the whole deployment stopped working.
I ran with a ClearPass server, 2 Aruba Controllers, 2 Aruba Conductors, and an Ubuntu VM without any issues.
This makes it kind of unworkable.

Re: ESXi 8.0 U3 becomes unresponsive when EVE runs

Posted: Tue Sep 23, 2025 8:46 pm
by rusty725
Come to our chat at https://webchat.eve-ng.net and give me your AnyDesk ID so I can connect and check.

Re: ESXi 8.0 U3 becomes unresponsive when EVE runs

Posted: Mon Oct 06, 2025 12:00 pm
by mvanoverbeek
I can provide an update on this item after more troubleshooting. The ESXi logs kept showing the following messages:


vmkwarning: cpu20:2097728)WARNING: StorageDeviceIO: 201: Device t10.NVMe____TEAM_TM8FP6001T_________________________040A3339674CE000 performance has deteriorated. I/O latency increased from average value of 650 microseconds 2025-07-31T08:51:05.807Z Wa(180)
vmkwarning: cpu20:2097728)WARNING: to 16341 microseconds. 2025-07-31T13:10:41.969Z Wa(180)
vmkwarning: cpu20:2097728)WARNING: StorageDeviceIO: 201: Device t10.NVMe____TEAM_TM8FP6001T_________________________040A3339674CE000 performance has deteriorated. I/O latency increased from average value of 650 microseconds 2025-07-31T13:10:41.969Z Wa(180)
vmkwarning: cpu20:2097728)WARNING: to 16395 microseconds. 2025-07-31T13:11:02.004Z Wa(180)
vmkwarning: cpu20:2097728)WARNING: StorageDeviceIO: 201: Device t10.NVMe____TEAM_TM8FP6001T_________________________040A3339674
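
For anyone wanting to check their own host for the same pattern, here is a minimal Python sketch that pulls the average and spiked latency out of warnings like the ones above. It assumes the log text has been copied off the host into a string; the function name `latency_spikes` and the shortened device name in the sample are made up for illustration. It only looks at complete warnings, so a truncated message like the last one above is skipped.

```python
import re

# Matches a complete "performance has deteriorated" warning, even when the
# "to N microseconds" half was wrapped onto the next line of the paste.
PATTERN = re.compile(
    r"Device (?P<device>\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg>\d+) microseconds"
    r".*?to (?P<now>\d+) microseconds",
    re.DOTALL,
)


def latency_spikes(log_text):
    """Return (device, average_us, spiked_us, ratio) for each complete warning."""
    spikes = []
    for m in PATTERN.finditer(log_text):
        avg, now = int(m["avg"]), int(m["now"])
        spikes.append((m["device"], avg, now, round(now / avg, 1)))
    return spikes


# Sample in the same shape as the output above (device name shortened).
sample = (
    "vmkwarning: cpu20:2097728)WARNING: StorageDeviceIO: 201: Device t10.NVMe_EXAMPLE "
    "performance has deteriorated. I/O latency increased from average value of "
    "650 microseconds 2025-07-31T08:51:05.807Z Wa(180)\n"
    "vmkwarning: cpu20:2097728)WARNING: to 16341 microseconds. "
    "2025-07-31T13:10:41.969Z Wa(180)\n"
)

for device, avg, now, ratio in latency_spikes(sample):
    print(f"{device}: {avg} us -> {now} us ({ratio}x the average)")
```

With the values from the log above, the latency jumps from 650 to roughly 16,000 microseconds, which is about 25 times the average.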

From what I understand after some digging on William Lam's homelab page and other searches, I think the cheap TEAM NVMe drive has poor driver support.

I decided to purchase a 2 TB Samsung 990 Pro with heatsink to replace the two 1 TB TEAM NVMe SSDs I had. After this, everything worked fine, with no more lockups!