The long road to HCI - Where it started from
The year was 2001. Having been a storage guy for several years by that point, and having seen how unnecessarily complex and expensive storage and compute were (somewhat by design), I had a feeling that some simplification was in order. With VMware's GSX product, running on Linux, newly on the market, I thought it was time to do something about it. Working with some interesting friends and a finance guy, I created the RhinoMax converged platform, merging virtualization, online primary storage, nearline secondary storage, and a tape library along with a backup package into a single box. It worked really well, and we made it through our first beta (seen above in a customer DC in Houston). Unfortunately, the moral of the story is never take your financial backing from VPs at Enron and WorldCom. Then the dot-com bubble popped and the project got shelved. Back to the workaday.
Fast forward a couple of years - circa 2003 - and the need to converge and collapse out the stacks and the extra complexity raised its head again. I was at a tape library vendor at the time, and my CEO and the head of Advanced Engineering approached me looking for cool ideas for the next generation of tape libraries. I asked myself: why not pull the compute and disk storage directly into the library itself? It would radically reduce complexity and connectivity issues while making the library the centerpiece of the datacenter. Enter the I-Qip - Intel processors and primary storage moved directly into the library, right alongside both backup management and Hierarchical Storage Management (the original HSM acronym), maximizing internal primary storage efficiency by leveraging the inherent capacity strengths of local tape while largely eliminating storage protocols and the like. Again, it worked amazingly well, and at the internal SKO the teams loved seeing it in action. But at the end of the day, the company didn't want to be seen as competition to the server vendors of the day (the Dells and HPs of the time), so the I-Qip went the way of the RhinoMax One Box.
Jumping forward a few more years to July of 2011. After a stint with a storage management startup leveraging SNIA libraries, then a run at LeftHand Networks through its eventual sale to HP, I had joined a startup a couple of years prior that was focused on clustered, affordable storage (similar to LeftHand Networks), but with a converged spin - both block and file level storage. Very cool stuff, using Linux at its base on each node, with GPFS mapping storage across the entire cluster. Linux KVM had been out for several years by this point, and Red Hat had long since acquired its creator, Qumranet. Time for the converged bug to bite again, but in earnest this time. It struck me how much value could instantly be added to the storage platform by simply loading the KVM kernel modules into the running kernel on each node in the cluster, homing the qcow2 virtual hard drives directly on the GPFS-based filesystem (to inherit its fault tolerance), and enabling live migration of the resultant VMs between the nodes for high availability. We could also use VMM (virt-manager) as an interim GUI for VM management. By doing this, a SysAdmin would never need to deal with external connectivity to VMware again, and could eliminate the entire stack of legacy servers and the VMware licensing costs - "How about I make about half of that quote disappear" was the phrase I used in my first customer presentation a few months later. That July, at an All Hands meeting, I brought the subject up with my CEO and my CTO, talking about how doing so could instantly add massive value to the company's products. They were interested, but a bit guarded, and not much happened.
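(For anyone curious what that general approach boils down to in practice, here is a minimal sketch using standard KVM/libvirt tooling - not the original work, and every name in it is an illustrative assumption: the GPFS mount path, VM name, bridge, and sizes.)

# Sketch only: load the KVM modules on a cluster node, home a qcow2 disk on the
# shared GPFS mount so every node sees the same image, and define a guest with libvirt.
# Paths and names here (/gpfs/virt, vm01, br0) are illustrative, not the originals.
sudo modprobe kvm_intel                                  # Intel host; pulls in the kvm core module as a dependency
sudo qemu-img create -f qcow2 /gpfs/virt/vm01.qcow2 40G  # guest disk lives on the GPFS-backed filesystem
sudo virt-install \
  --name vm01 \
  --ram 4096 --vcpus 2 \
  --disk path=/gpfs/virt/vm01.qcow2,format=qcow2 \
  --network bridge=br0 \
  --cdrom /gpfs/virt/iso/install.iso \
  --graphics vnc,listen=0.0.0.0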
Fast forward to Thursday, October 20th, 2011. This time, I wasn't going to let the idea go - I just knew it was the right thing to do. I reached out to the kernel maintainer on the engineering team to get a kernel-specific build of the necessary kernel modules.
Friday, October 21st. The engineer/kernel maintainer for the team got back to me with the modules I wanted, but was curious what I was going to do with them. I told him I would show him the next week.
Saturday, October 22nd. Three of my five kids were down sick with the flu. Down hard with it. I spent the entire day and half the night getting them settled, and couldn't sleep afterward, so I went downstairs to my lab (later dubbed "The Lab of Doom" by a bunch of industry folks, and the name stuck). I decided to try to make this work - I really, really believed in it. I worked through the rest of the night and into Sunday. Sunday evening, I sent an email to the C-Team at the company that went something like this:
Hi Gents,
For several months I have been playing with the idea that there is no reason, with a fully clustered solution like ours, to go outside the box for a hypervisor. I have spoken to each of you in turn about it at various points, but most heavily this past July in Indy. With the heavyweights of the industry (EMC, Cisco, etc.) bringing similar but unclustered solutions to the market, I felt it was time to act. To that end, I have started the work, in my spare time this weekend, to get Kernel-based Virtual Machine (KVM, AKA Red Hat Virtualization) running on the nodes in our clusters alongside our stuff and homed on top of GPFS (/fs0/virt to be precise). I am happy to report that that is about 95% done - I have a couple of minor version mismatches to deal with on kvm-intel.ko, but all the shared libraries, daemons/services, and dependencies are now there, as is the virt core & GUI, & guess what – all our code continues to run beautifully. The virtualization piece really acts as I expected it would, in that it simply adds value quickly to our existing platform & does so very inexpensively for us (wouldn't hurt to add a bit of RAM). The cluster is happy & there is no effect on our running code! I hope to have a running VM on a running cluster later this week. Once I have the right versions of the kernel modules in place, it should only be a matter of a day till everything is up. I will then get the live migration piece running between nodes for the VMs. I settled on using the 10-gig M cluster as it makes 4 gigabit NICs available for my VM bridged NICs without impacting bond0/bond1 that the cluster uses. Likewise, I have found a way to pipe the virt-manager GUI out via the HTTP export of VNC & it works great.
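(For the curious: the live migration and console pieces described in that email map onto standard libvirt commands. A rough sketch, with illustrative VM and host names - not the original scripts:)

# With the qcow2 image visible from every node over GPFS, a running guest can be
# live-migrated between hosts and its console reached over VNC. "vm01" and "node2" are illustrative.
virsh migrate --live vm01 qemu+ssh://node2/system   # move the running VM to node2
virsh vncdisplay vm01                               # e.g. ":0", i.e. TCP port 5900 on the host

Tools such as noVNC/websockify can expose a VNC display like that over HTTP in a browser, which is in the same spirit as the web-based console mentioned in the email.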
Then I finally went to bed.
That Monday morning, I went to work resolving the kernel mismatch issues alongside the normal day-job stuff, got an updated set of kernel modules, and kept after it. By late that evening everything was ready, but the kids were still sick, so dad duty took precedence and I set it aside for the night.
The following day, the 25th of October, what would become Hyperconverged Infrastructure was born. I sent an email to the exec team saying simply "Vision realized - it works!" or something very similar, along with a screenshot of the first VM running on the cluster.
After the stir that email caused - the endless phone calls, me calling my CEO, jumping on a WebEx session to demonstrate it, essentially saying "Hold my beer and watch this sh*%" and then showing him firsthand what we had (lightning in a bottle) - things got very busy and very interesting very quickly. Within a matter of days, the company had adopted this approach as its primary direction moving forward, and the demonstrations to the analysts began - specifically with the Taneja Group. In that crazy long meeting, along with the live demo from my prototypes, Arun Taneja coined the term "Hyperconverged Infrastructure" to describe what we had (I still have the "receipts" from all of it). The term was literally coined to describe my prototype. Now that is really cool and heady - talk about leaving your mark on an industry.
There is so much more that went into launching what amounted to an entirely new category of computing, and sadly, the term Hyperconverged never got trademarked, so everyone else glommed onto it (and went from calling themselves "Server SAN" vendors to HCI vendors really, really quickly - you know who you are...). Many minds applied themselves to the concept, and new features, a new storage stack, and so much more rolled out at a ferocious pace.
There is much more to the story - another decade and a half's worth. That said, the HCI/Hyperconverged Infrastructure you all know and love - well, you can thank my kids and influenza for its existence, along with an idea I just couldn't let go of for a bit over a decade. And yes, I still have my original prototype running here in the Lab of Doom.