It was well over a year after HP announced their Gen 9 ProLiant line that we began using the new platform to build out VMware hosts. We wanted to leverage the latest v3 Xeon processors for our heaviest-hitting and most critical workloads. When adopting new hardware, we’ve come to expect a few bumps in the road, and this was definitely one of those cases. After adding these hosts to our vCenter and making sure all firmware and driver versions were on par with HP’s latest recipe, we noticed something not quite right in the VMkernel logs.
vmkernel: cpu7:33181)<3>bnx2x: [bnx2x_drv_ctl:16677(vmnic2)]Unknown cnic driver state: 16675
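If you want to see how often this message shows up, and on which vmnics, a small script can tally it from a copy of the VMkernel log. The following is only a rough sketch in Python: the regular expression is modeled on the single log line quoted above, and the function name `count_cnic_errors` and the second sample line are hypothetical. Other ESXi builds may format the message differently.

```python
import re
from collections import Counter

# Pattern modeled on the one log line quoted in this post (assumption:
# other driver or ESXi versions may word or space the message differently).
CNIC_STATE_RE = re.compile(
    r"bnx2x: \[bnx2x_drv_ctl:\d+\((?P<nic>vmnic\d+)\)\]"
    r"Unknown cnic driver state: (?P<state>\d+)"
)


def count_cnic_errors(lines):
    """Return a Counter keyed by (vmnic, state code) for matching lines."""
    counts = Counter()
    for line in lines:
        match = CNIC_STATE_RE.search(line)
        if match:
            key = (match.group("nic"), int(match.group("state")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        # The line from our hosts, joined onto one line.
        "vmkernel: cpu7:33181)<3>bnx2x: [bnx2x_drv_ctl:16677(vmnic2)]"
        "Unknown cnic driver state: 16675",
        # Hypothetical unrelated line, included to show it is ignored.
        "vmkernel: cpu3:33181)some unrelated message",
    ]
    for (nic, state), hits in count_cnic_errors(sample).items():
        print(f"{nic}: state {state} seen {hits} time(s)")
```

To run it against a real log, pass an open file handle instead of the sample list; the function only needs an iterable of lines. Grouping by state code as well as by vmnic keeps different codes separate if more than one appears.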
These hosts were configured with QLogic 536FLB 10Gb LOMs.
After a quick Google search, we found VMware KB2141475 (which has since been pulled), which stated that this error was caused by a version mismatch between the installed firmware and drivers. After double-checking our bnx2x driver and firmware versions, we verified they were in fact current as of the January 2016 HP recipe.
We opened HP case no. 4766218256, and after uninstalling the driver and even replacing the LOMs, we still didn’t have a resolution. HP escalated the case to engineering, who sent us this HP advisory.
According to HP engineering, this issue is believed to occur when network partitioning (NPAR) is enabled. NPAR allows us to split up our 10GbE adapter into four or more network adapters. However, after disabling NPAR the issue persisted on our hosts. According to HP, the error state will change (“unknown cnic driver state: 16675, 16767, 16768, etc.”) depending on whether the NICs are using Virtual Connect assigned MAC addresses or hardware MACs.
The good news is there’s a new firmware & driver combination available which has fixed this issue in our environment. The bad news is it’s not part of the latest HP recipe (Jan 2016 as of this writing). The links are available below if you still want to test them for yourself.
While working this case, we simultaneously tested the Emulex 650FLB 20Gb LOMs, which Optio Data graciously offered for testing. The 650s worked flawlessly under the approved recipe and have since replaced the QLogic boards as our new standard.