Lessons from the Great Server Migration of 2023
Photo Credit: Josh Sorenson. https://www.pexels.com/@joshsorenson/

Phew! Completed the great server migration of 2023. We've now consolidated 100+ VMs into 5 colocated servers running #AlmaLinux 9. Huge shout-out to my sysadmin Ramesh for all the help. ChatGPT might be the future of everything, but it's going to be a long time before an AI replaces a real human expert.

Here are some learnings that I found interesting.

Why not use a Bare Metal Hypervisor?

Why use a full OS like #AlmaLinux rather than a bare metal hypervisor like #VMware #ESXi?

Well... convenience and flexibility. We consolidated multiple roles into a single server, so the machines also serve as #networkattachedstorage (#NAS), #DNS, #NTP, and #OpenLDAP for single sign-on. One of the machines also acts as an #OpenVPN gateway. We could have run those services on a virtual machine, but given how heavily used they are, they benefit from not having a virtualisation layer in between to slow them down.

It is Easier to Recreate a VM than to Migrate it

While doing the great migration, we also standardised on the virtualisation technology. We were using a mixture of #VirtualBox, #KVM and #HyperV. We've now eliminated #VirtualBox from the list. While convenient to use on a laptop, it has limitations, particularly around performance, that make it unattractive as a virtualisation solution. #KVM is much more difficult to grok but, once understood, is extremely powerful.

In all cases, though, migrating a VM between machines with different CPU types was a huge pain, even when all the CPUs in question were #Intel #x86_64. Workloads running on a workstation-class CPU (even a Xeon W) need their XML configs rewritten when moving to servers running a Xeon E. Go figure... This is a case of #Intel fine-graining the segmentation of its markets. Apparently 'real' businesses never run VMs on a workstation.
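The post does not say which parts of the XML had to change. Assuming the trouble sat in the <cpu> element of a libvirt domain definition (my guess, not something stated above), a minimal Python sketch of such a rewrite might swap a pinned CPU model for libvirt's host-model mode, so the guest takes its CPU definition from whichever host it lands on. The function name and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def use_host_cpu_model(domain_xml: str) -> str:
    """Return domain XML whose <cpu> element is replaced by <cpu mode="host-model"/>."""
    root = ET.fromstring(domain_xml)
    # Drop any existing <cpu> element that pins a specific CPU model.
    for cpu in root.findall("cpu"):
        root.remove(cpu)
    # Ask libvirt to derive the guest CPU from the host the VM runs on.
    ET.SubElement(root, "cpu", {"mode": "host-model"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed domain definition, for illustration only.
sample = """<domain type='kvm'>
  <name>devbox</name>
  <cpu mode='custom' match='exact'>
    <model>Skylake-Client</model>
  </cpu>
</domain>"""

print(use_host_cpu_model(sample))
```

The result could be saved to a file and loaded with virsh define. Note that host-model trades an exactly reproducible guest CPU for portability, which may or may not suit a given workload.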

We found it easier to simply re-provision the VMs from scratch (upgrading the OS from #CentOS to #AlmaLinux in the process).

The process was repeatable and more importantly scriptable. That made the migration significantly easier.
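The post does not name the tooling behind the scripted re-provisioning. As one possible shape, here is a sketch that builds (but does not run) a virt-install command for an unattended, kickstart-driven install. The VM name, mirror URL and kickstart path are hypothetical, and the almalinux9 OS variant assumes the local osinfo database knows that name.

```python
import os
import shlex
from typing import List


def virt_install_command(name: str, memory_mib: int, vcpus: int,
                         disk_gib: int, install_url: str, kickstart: str) -> List[str]:
    """Build the argument list for an unattended virt-install run."""
    ks_name = os.path.basename(kickstart)
    return [
        "virt-install",
        "--name", name,
        "--memory", str(memory_mib),
        "--vcpus", str(vcpus),
        "--disk", f"size={disk_gib}",
        "--location", install_url,
        "--os-variant", "almalinux9",  # assumption: known to the local osinfo-db
        # Copy the kickstart file into the installer initrd and point the installer at it.
        "--initrd-inject", kickstart,
        "--extra-args", f"inst.ks=file:/{ks_name} console=ttyS0",
        "--graphics", "none",
        "--noautoconsole",
    ]


# Hypothetical values, for illustration only.
cmd = virt_install_command(
    name="web01", memory_mib=4096, vcpus=2, disk_gib=40,
    install_url="https://mirror.example.com/almalinux/9/BaseOS/x86_64/os/",
    kickstart="/srv/ks/web01.cfg",
)
print(shlex.join(cmd))
```

Keeping the per-VM differences in function arguments is what makes the process repeatable: the same script, fed a list of VM definitions, can rebuild every machine.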

Systemctl is the Root of All Evil

Well, Ramesh and I have agreed to disagree on this one. Apparently, services controlled by #systemd no longer write to /var/log/messages by default on Fedora. One has to use @#$%$%^ #journalctl. Which, get this, stores the logs in *binary* format!!!!!??? One can restore the original behaviour by changing the journald config, but that information was buried deep in a wiki on the Fedora website.
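The post does not say which journald setting was changed. One documented option in journald.conf is ForwardToSyslog=, which asks journald to pass messages on to a traditional syslog daemon. The sketch below assumes that is the setting in question and that a syslog daemon such as rsyslog is installed to write the text files. The helper name is hypothetical.

```python
def set_journald_option(conf_text: str, key: str, value: str) -> str:
    """Set key=value in journald.conf text, replacing an active or commented-out entry."""
    lines = conf_text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        # Stock files ship their defaults as comments, e.g. "#ForwardToSyslog=no".
        if line.lstrip("#").strip().startswith(f"{key}="):
            lines[i] = new_line
            break
    else:
        # Key not found: append it, assuming [Journal] is the only section in the file.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Trimmed, hypothetical stand-in for /etc/systemd/journald.conf.
stock = "[Journal]\n#Storage=auto\n#ForwardToSyslog=no\n"
print(set_journald_option(stock, "ForwardToSyslog", "yes"))
```

After writing the edited text back, journald has to be restarted (systemctl restart systemd-journald) for the change to take effect.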

Debugging an #SSH issue took unnecessarily long, given the difficulty of figuring out how to use journalctl.
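For anyone in the same spot, here is a small sketch of the kind of journalctl query that helps with a misbehaving SSH daemon. It assumes the unit is called sshd, as it is on Fedora and AlmaLinux; the function names and default time window are hypothetical.

```python
import subprocess
from typing import List, Optional


def journalctl_command(unit: str, since: str = "1 hour ago",
                       priority: Optional[str] = None) -> List[str]:
    """Build a journalctl command that prints one unit's recent log lines as plain text."""
    cmd = ["journalctl", "--unit", unit, "--since", since,
           "--no-pager", "--output", "short-iso"]
    if priority is not None:
        # Only show messages at this syslog priority or more severe, e.g. "warning".
        cmd += ["--priority", priority]
    return cmd


def read_unit_log(unit: str, **kwargs) -> str:
    """Run journalctl and return its output. Requires a systemd host."""
    result = subprocess.run(journalctl_command(unit, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout


print(" ".join(journalctl_command("sshd", priority="warning")))
```

The output is ordinary text, so it can still be piped through grep or saved to a file, which softens the binary-format complaint somewhat.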

Gigabit Ethernet Baby

Yeah, yeah, I know, everyone is using 10GbE now. But we were still on a 100 Mbps network. While upgrading, we also switched to Gigabit Ethernet routers throughout the network. It really has improved the experience of running virtual workstations.
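To put the tenfold jump in perspective, here is a back-of-the-envelope calculation of the best-case time to copy a disk image over each link. The 20 GB image size is a hypothetical figure, and the result ignores protocol overhead and disk speed.

```python
def transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Ideal time in seconds to move size_gb (decimal gigabytes) over a link_mbps link."""
    bits = size_gb * 8e9  # 1 GB = 8e9 bits
    return bits / (link_mbps * 1e6)  # 1 Mbps = 1e6 bits per second


for mbps in (100, 1000):
    minutes = transfer_seconds(20, mbps) / 60
    print(f"{mbps} Mbps: {minutes:.1f} min")
```

Under these assumptions the copy drops from roughly 27 minutes to under 3.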

Remote Developer Workstations

Now each of us has multiple dedicated virtual workstations (in addition to our physical laptop or desktop). This makes development much easier. We've settled on #Ubuntu 22.04 and #Fedora Workstation for virtual workstations and #AlmaLinux for our servers. We also have a bunch of #Debian11 VMs to make baseline Debian packages. Of course, we also have Windows, macOS and iOS devices.

Should You Not Run it all on the Cloud?

No. And go away.

More articles by Sumanth Vepa
