Migrating to Proxmox Virtual Environment (PVE) 8.3 - Part 1: 40 Gbit network cards.
Jorge Eduardo Delgado S.
Azure & AWS | VMware VCP-DC 6 | Proxmox VE | MCITP, MCP | Microsoft 365 admin | Fortinet, Meraki & Sonicwall firewalls & VPNs | SysOps | SRE | Design | VMware migration
Not all companies have big IT budgets, and few small businesses have an IT budget at all. So when the opportunity arises, you have to seize it.
Since the very early days of VMware I have seen it as a great technology because, as a SysAdmin, I know that a lot of servers sit idle for long periods of time.
Now, I have been tasked with refreshing our small data center hardware (network and servers). So I created my wishlist: many cores, SSDs, and good networking. Maybe HCI?
There are horror stories on the Internet about issues with HW, and I did work in phone tech support for a few years for a large brand's laptops. On the other hand (and in my own experience), as long as you have quality servers with decent specs and proper environments, it is rare to have issues.
Small business budget? Well... used hardware it is! Quality hardware. After a couple of months of research, I came up with very nice specs: 88 cores, 384 GB RAM, U.2 NVMe drives, and 40 Gbit networking. Per node. Oh my!
It is incredible what used HW you can get: gear that was the crème de la crème when launched, for a fraction of the original price.
We decided to part ways with VMware and, after researching, the best option we found was Proxmox VE.
So, I'll be posting not-so-common details of this journey. For now, let's talk about 40 Gbit networking.
I'll be using Ceph for storage, which needs as much bandwidth as possible for its cluster (replication) network as well as its public (client) network.
I found very LOW prices on Mellanox (now NVIDIA) ConnectX-3 dual-port 40 Gbit network / InfiniBand cards. So good that I got 3 for each node!
And good prices on Cisco Nexus 36-port 40/100 Gbit switches.
I was not expecting the whole setup to be 100% smooth. The first wrinkle was this: these cards were not recognized as network cards at first.
Given that these cards can be configured as InfiniBand or Ethernet, I had already found during my research that they might need some tweaking.
I chose not to use the NVIDIA-provided drivers, as they are outdated for the card model we have (you can use those if your adapter family is ConnectX-4 or newer). Instead, I'll use the mlx4_en driver included with PVE/Debian.
First, check if the driver is loaded:
root@pve-01:~# lsmod | grep mlx4
mlx4_ib is for InfiniBand, mlx4_en is for Ethernet. If you don't see the Ethernet one, read on.
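If you want to script that check, for example across several nodes, here is a minimal sketch. The helper name module_loaded is made up for illustration. It only reads lsmod-style text from standard input, so it can be tried without the hardware.

```shell
# Hypothetical helper (not part of any package): succeeds when the named
# module appears in the first column of lsmod-style output on stdin.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Intended use on a node (not run here):
#   lsmod | module_loaded mlx4_en || echo "mlx4_en is not loaded"
```

Keep in mind that while the ports are still in InfiniBand mode, a missing mlx4_en is the expected result, which is what the following steps address.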
Let's install the utilities to work with these cards:
root@pve-01:~# apt install mstflint
Now, list the PCI devices to get their addresses:
root@pve-01:~# lspci | grep -i mellanox
5e:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
86:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
d8:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
Now, let's put the PCI device address in a simple variable to simplify checking each card we have (change the address in the variable for each card you have):
root@pve-01:~# DEVADDR=d8:00.0
root@pve-01:~# mstconfig -d $DEVADDR q
Device #1:
----------
Device type: ConnectX3
Device: d8:00.0
Configurations: Next Boot
SRIOV_EN True(1)
NUM_OF_VFS 16
LINK_TYPE_P1 IB(1)
LINK_TYPE_P2 IB(1)
LOG_BAR_SIZE 3
BOOT_PKEY_P1 0
BOOT_PKEY_P2 0
BOOT_OPTION_ROM_EN_P1 True(1)
BOOT_VLAN_EN_P1 False(0)
BOOT_RETRY_CNT_P1 0
LEGACY_BOOT_PROTOCOL_P1 PXE(1)
BOOT_VLAN_P1 1
BOOT_OPTION_ROM_EN_P2 True(1)
BOOT_VLAN_EN_P2 False(0)
BOOT_RETRY_CNT_P2 0
LEGACY_BOOT_PROTOCOL_P2 PXE(1)
BOOT_VLAN_P2 1
IP_VER_P1 IPv4(0)
IP_VER_P2 IPv4(0)
CQ_TIMESTAMP True(1)
The LINK_TYPE_P1 and LINK_TYPE_P2 values show the mode set for each port. IB is InfiniBand, ETH is Ethernet. So, let's change the mode for the ports:
root@pve-01:~# DEVADDR=5e:00.0
root@pve-01:~# mstconfig -d $DEVADDR set LINK_TYPE_P1=ETH LINK_TYPE_P2=ETH
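With three cards per node, the same change has to be repeated for each PCI address. The sketch below loops over every Mellanox device and switches only the ports still reported as IB. The helper names (mlx_addrs, ports_in_ib, set_all_eth) are made up for illustration, and the parsing assumes the lspci and mstconfig output layouts shown above. As far as I know, mstconfig asks for confirmation on each set, so expect one prompt per port.

```shell
# Hypothetical helpers; the names are for illustration only.

# Print the PCI address (first field) of every Mellanox line in
# lspci-style output read from stdin.
mlx_addrs() {
    awk 'tolower($0) ~ /mellanox/ { print $1 }'
}

# Print the LINK_TYPE_* parameters still set to InfiniBand in
# "mstconfig -d <addr> q" output read from stdin.
ports_in_ib() {
    awk '$1 ~ /^LINK_TYPE_P[0-9]+$/ && $2 ~ /^IB/ { print $1 }'
}

# Switch every port that is still in IB mode to Ethernet, card by card.
# Nothing happens until this is called explicitly, as root, on the node.
set_all_eth() {
    for addr in $(lspci | mlx_addrs); do
        for param in $(mstconfig -d "$addr" q | ports_in_ib); do
            mstconfig -d "$addr" set "$param=ETH"
        done
    done
}
```

Running set_all_eth a second time should be harmless: ports already in ETH mode are skipped by ports_in_ib.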
Now reboot. This confirms the change is permanent at the firmware level and lets the system detect the adapter ports and load the correct kernel driver. After the reboot, check that the ports now show the correct type and that the new Ethernet ports have been detected (output trimmed for readability):
root@pve-01:~# ip link
1: lo:
2: eno3:
3: eno4:
4: eno1:
5: eno2:
6: enp94s0:
7: enp94s0d1:
8: enp134s0:
9: enp134s0d1:
10: enp216s0:
11: enp216s0d1:
Items 6-11 are the newly detected ports, two per card.
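To tell which interface belongs to which card, note that the predictable interface name carries the PCI bus and slot numbers in decimal, while lspci prints them in hexadecimal: 5e is 94, 86 is 134, and d8 is 216. The second port of each card gets the d1 suffix. A small sketch of that conversion follows. The function name pci_to_ifname is made up, and it assumes a short bus:slot.function address with function 0, as in the lspci output above.

```shell
# Hypothetical helper: derive the expected name of the first port from a
# short PCI address such as "5e:00.0" (no PCI domain prefix, function 0).
pci_to_ifname() {
    bus=${1%%:*}       # "5e" from "5e:00.0"
    rest=${1#*:}       # "00.0"
    slot=${rest%%.*}   # "00"
    # printf converts the hexadecimal fields to decimal.
    printf 'enp%ds%d\n' "0x$bus" "0x$slot"
}

pci_to_ifname 5e:00.0
pci_to_ifname 86:00.0
pci_to_ifname d8:00.0
```

With the three addresses from this node, the calls print enp94s0, enp134s0 and enp216s0, matching the ip link listing.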
Now, let's check if these adapters have the latest firmware. The latest version for my model (MCX354A-FCB) is 2.42.5000. You can find it at the URL:
Extract it, go into the resulting folder, and check the version on each card:
root@pve-01:~# DEVADDR=86:00.0
root@pve-01:~# mstflint -d $DEVADDR q
Image type: FS2
FW Version: 2.42.5000
FW Release Date: 5.9.2017
Product Version: 02.42.50.00
Rom Info: type=PXE version=3.4.752
Device ID: 4099
Description: Node Port1 Port2 Sys image
GUIDs: f452140300471130 f452140300471131 f452140300471132 f452140300471133
MACs: f45214471131 f45214471132
VSD:
PSID: MT_1090120019
Look at "FW Version". If it is lower than 2.42.5000 (in this case), proceed to upgrade it:
root@pve-01:~# DEVADDR=5e:00.0
root@pve-01:~# mstflint -d $DEVADDR -i fw-ConnectX3-rel-2_42_5000-MCX354A-FCB_A2-A5-FlexBoot-3.4.752.bin --use_fw burn
Reboot once again to make sure the configuration remains after flashing the firmware.
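For several cards, the version check that precedes the burn can be scripted. The sketch below pulls the "FW Version" value out of the query output and compares it with the target version using sort -V (GNU coreutils). The helper names fw_version and fw_is_older are made up for illustration, and the parsing assumes the mstflint output layout shown above.

```shell
# Hypothetical helpers; the names are for illustration only.

# Print the version from the "FW Version:" line of
# "mstflint -d <addr> q" output read from stdin.
fw_version() {
    awk '$1 == "FW" && $2 == "Version:" { print $3 }'
}

# Succeed when version $1 is strictly lower than version $2.
fw_is_older() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Intended use on a node (not run here):
#   cur=$(mstflint -d "$DEVADDR" q | fw_version)
#   fw_is_older "$cur" 2.42.5000 && echo "$DEVADDR needs the upgrade"
```

A version-aware sort is used because a plain string comparison would order 2.9.1000 after 2.42.5000.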
In the next article, we'll talk about testing the bandwidth.
Bye for now!