DriveScale updates enterprise clustering and scale-out for modern workload

JOHN ABBOTT 22 AUG 2017

Stateless servers linked to JBOD storage provide a more efficient cluster infrastructure that is better suited for scale-out application workloads such as Hadoop. DriveScale is partnering with hardware vendors and service providers to provide ‘software composable infrastructure’ from standard hardware elements.

DriveScale wants to bring hyperscale computing to mainstream enterprises, making clusters more efficient and more dynamic by decoupling the servers from the storage. Using the DriveScale System, which includes a SAS-to-Ethernet bridge made by Foxconn, customers can stop buying servers with storage and implement a ‘software composable infrastructure.’ Stateless servers are linked to a JBOD (just a bunch of disks) storage array, enabling any servers, or groups of servers within the cluster, to use the pooled storage as if the disks were local. DriveScale, which doesn’t sell servers or storage itself, believes that scale-out, rack-scale architectures are becoming more important in the datacenter because an increasing percentage of modern workloads (particularly Hadoop and big-data applications) are now written to run natively on commodity scale-out clusters.

THE 451 TAKE DriveScale is taking on the high-end sector of the marketplace, where workloads are demanding and require scale. Its take on composability may be less all-encompassing than some competitors’, but it’s a better fit for those looking to utilize a large cluster, with the ability to repurpose the infrastructure for different workloads very rapidly. Hadoop and HDFS are currently the main focus, but there’s also Cassandra, ScaleIO, Ceph, Scality, GlusterFS and Couchbase, to name a few, and support for flash and Kubernetes will broaden things out further over time. Those looking for traditional SAN services should not apply – and that’s because SANs actively get in the way of these rapidly growing scale-out technologies. Over time, it looks likely that traditional apps that currently require SAN storage services will become the clients of these new frameworks.

CONTEXT Sunnyvale, California-based DriveScale emerged in May 2016 after three years in stealth mode, raising $15m in series A funding led by Pelion Venture Partners and including Nautilus Venture Partners and Foxconn’s Ingrasys cloud infrastructure subsidiary. Ingrasys has also acted as hardware co-developer for DriveScale. The founders all worked at Sun Microsystems. Satya Nishtala (CTO) and Tom Lyon (chief scientist) were both engineers at Sun (Lyon was employee No. 8) and went on to play key roles at Nuova Systems, the Cisco spin-in that formed the basis of Cisco’s UCS converged systems business. Duane Northcutt (VP) invented the Sun Ray thin client, and also worked as VP of technology at Andy Bechtolsheim’s Kealia, which Sun acquired in 2004 for its x86 server designs. CEO Gene Banman, who joined in March 2014, has been a CEO three times before: at NetContinuum (sold to Barracuda Networks in September 2007), Zero Motorcycles and ClearPower Systems. Before that, he also worked at Sun Microsystems in senior management positions.

PRODUCTS Standard racks of commodity servers are common currency for hyperscale organizations such as Facebook, Google and Microsoft. Those companies, of course, have very specific workload requirements, complete control over their software stacks and virtually unlimited budgets. Traditional enterprises are now looking for a similar scale-out platform to run their modern workloads, such as Hadoop, Cassandra, Spark and Kubernetes. These new apps don’t fit the old model, where many virtualized apps are managed on each server with shared storage. Instead, they run on bare-metal servers and are written to run across clusters of hundreds or thousands of commodity scale-out servers with local storage. Old-style clusters, with closely coupled compute and storage in the same chassis, aren’t flexible enough and require trade-offs that affect utilization rates and the ability to scale resources up and down in response to demand. The DriveScale System is designed to enable independent scaling of compute and storage by disaggregating the two into pools that can be recomposed and bound together in any ratio required by an application.
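To make the recompose-and-bind idea concrete, here is a minimal sketch of the composition model in Python. The class and method names (ClusterComposer, compose, decompose) are hypothetical illustrations of the concept, not DriveScale’s actual software interface:

```python
# Hypothetical sketch of composing logical nodes from disaggregated pools.
# These names do not correspond to DriveScale's real API; they only
# illustrate the compute/storage binding model described above.
from dataclasses import dataclass, field


@dataclass
class LogicalNode:
    server_id: str                                  # diskless, stateless server
    drive_ids: list = field(default_factory=list)   # JBOD drives presented as if local


class ClusterComposer:
    def __init__(self, servers, jbod_drives):
        self.free_servers = list(servers)
        self.free_drives = list(jbod_drives)

    def compose(self, node_count, drives_per_node):
        """Bind pooled drives to stateless servers in the requested ratio."""
        nodes = []
        for _ in range(node_count):
            server = self.free_servers.pop()
            drives = [self.free_drives.pop() for _ in range(drives_per_node)]
            nodes.append(LogicalNode(server, drives))
        return nodes

    def decompose(self, nodes):
        """Return servers and drives to the pools so they can be re-bound."""
        for node in nodes:
            self.free_servers.append(node.server_id)
            self.free_drives.extend(node.drive_ids)


# Example: carve out a 10-node cluster with 10 drives per node, then release it
composer = ClusterComposer([f"srv{i}" for i in range(40)],
                           [f"disk{i}" for i in range(400)])
hadoop_cluster = composer.compose(node_count=10, drives_per_node=10)
composer.decompose(hadoop_cluster)   # resources return to the pool for the next workload
```

The point of the model is that the compute-to-storage ratio becomes a runtime parameter of each workload rather than a property fixed at purchase time, which is what allows the same rack to be repurposed rapidly from one cluster workload to another without physically moving drives.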

A DriveScale cluster is made up of diskless, stateless servers connected to JBOD storage arrays through the DriveScale Adapter, a 1U box made by Foxconn that includes four Ethernet-to-SAS modules. Each of those has two 12Gb four-lane SAS interfaces and two 10Gb Ethernet interfaces, and the chassis has dual redundant power supplies. A single chassis, with 80Gb of throughput, supports simultaneous access to 80 drives at performance levels equivalent to direct-attached storage, according to DriveScale. The nodes are 2U, 1U, 1/2U or 1/4U servers from multiple suppliers – Dell, HPE, Cisco, SuperMicro, Quanta, Foxconn or others. Each node runs a DriveScale Server Node Agent that handles inventory discovery and storage connection setup. Other software components include the DriveScale Management System, which installs in a VM via the Linux RPM package manager and takes care of cluster and node configuration, and the cloud-based DriveScale Central, for customer support, remote upgrades and remote licensing. One adapter per JBOD (two to four per rack) costs $10,000. Node licenses cost $2,000 per logical node per year (20-40 per rack), and drive licenses are $25 per drive per year (160-400 per rack). Adding those up, a rack with 30 nodes and 300 disks would cost $106,000 in the first year and $76,000 per year thereafter. DriveScale claims that this cost is more than made up for by higher utilization of the hardware investment and by the savings from managing compute and storage lifecycles independently.

STRATEGY DriveScale positions itself as a kind of VMware for clustering – whereas VMware tackled inefficiencies in servers, DriveScale does much the same for large clusters with hundreds or thousands of nodes and direct-attached storage. By consolidating clusters, the DriveScale System can improve resource utilization, make resources more adaptable to rapidly changing business requirements, save space and power, and reduce infrastructure and operational costs. It also separates compute from storage for procurement decisions and lifecycle management – customers no longer have to buy servers and storage at the same time. Sizable Hadoop and big-data implementations are the most immediate opportunities, but other emerging workloads, including social media, machine learning and artificial intelligence, are equally applicable. The company also has its eye on the ‘composable infrastructure’ trend, but says that composability doesn’t provide much value unless it’s done at datacenter scale – just a few servers or even a single rack doesn’t make much sense. DriveScale’s take on composable infrastructure doesn’t include the networking side, because its target customers typically use a flat network and don’t need load balancing or firewalls. But it scales to many more nodes and racks than the alternatives. Other than its Ethernet-to-SAS bridge, which it calls the DriveScale Adapter, DriveScale isn’t in the hardware business, and will sell in conjunction with hardware and service-provider partners, including Dell EMC, HPE and Cisco. Early customers include healthcare big-data specialist Clearsense and adtech giant AppNexus. Target market sectors include the tier of hyperscalers below Amazon, Google and Microsoft, service providers offering vertical expertise and products delivered as a service, and enterprise accounts with large big-data applications in IoT, machine learning and data analysis.
There are future plans to support flash storage and to provide a scalable persistent storage mechanism for containers, enabling Kubernetes containers to retain their disk assignments and continue to benefit from the performance of local storage when they are moved around.

COMPETITION A number of other startups have been looking at the scale-out opportunity. Liqid, which recently raised $10m in series A funding, uses a PCIe switch and PCIe fabric networking rather than standard Ethernet, which DriveScale says won’t address the 200-node-plus clusters it’s most interested in. PCIe is ubiquitous in networking, storage and compute hardware, but despite a number of efforts it hasn’t established itself as an enterprise fabric so far. However, Liqid goes beyond compute and storage to leverage and manage pools of networking and graphics-processing resources. Similarly, A3Cube, which puts more emphasis on artificial intelligence and deep-learning workloads, as well as big data and analytics, claims to integrate computing, in-memory caching, acceleration and I/O into a single massively parallel plug-and-play system. It also uses PCIe, but has built its own direct memory-to-memory networking capabilities, utilizing hardware-based shared memory. DriveScale takes the familiar ‘Ethernet always wins’ position and believes alternatives such as InfiniBand, Intel’s OmniPath, PCIe and RDMA will remain also-rans.
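On the bandwidth side, that position is easy to sanity-check against the adapter figures quoted in the products section. The short sketch below does the arithmetic; the per-drive share is an inference for illustration, not a number DriveScale publishes:

```python
# Back-of-envelope check on the DriveScale Adapter figures quoted above.
# The per-drive bandwidth is inferred here for illustration; the report
# only quotes the aggregate throughput and the drive count.
modules_per_chassis = 4        # Ethernet-to-SAS modules in the 1U adapter
ethernet_ports_per_module = 2  # 10Gb Ethernet interfaces per module
ethernet_gbps = 10

aggregate_gbps = modules_per_chassis * ethernet_ports_per_module * ethernet_gbps
drives = 80                    # simultaneous drives supported per chassis

per_drive_gbps = aggregate_gbps / drives
per_drive_mbps = per_drive_gbps * 1000 / 8   # decimal MB/s, ignoring protocol overhead

print(f"Aggregate Ethernet bandwidth: {aggregate_gbps} Gbps")                  # 80 Gbps
print(f"Per-drive share: {per_drive_gbps:.1f} Gbps = {per_drive_mbps:.0f} MB/s")
```

Roughly 1Gbps (about 125MB/s) per drive is in the same range as the sequential throughput of a spinning SAS disk, which is consistent with the claim of direct-attached-equivalent performance for the disk-heavy, scale-out workloads DriveScale targets.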

Cisco, HPE and Dell have all been using the term ‘composable,’ but require the use of their own servers and storage products rather than commodity servers and JBODs. Scalability is also typically confined to a single-rack chassis. Intel’s Rack Scale Architecture (now rebranded Rack Scale Design), first demonstrated in 2013, points toward a similar vision, where compute, storage, memory and I/O subsystems are procured as modular units for use as pooled resources at the rack, or even at the datacenter level. Intel has identified three phases of disaggregation: physical (shared power, cooling and rack management), fabric (rack fabric, optical interconnects, modular refresh) and subsystem aggregation (compute, storage, pooled memory, shared boot and shared BIOS). However, tying it into emerging technologies such as silicon photonics may have slowed early adoption. Intel recently open-sourced RSA as Rack Scale Design to encourage OEM participation; Ericsson (with the Hyperscale Datacenter System 8000), Quanta’s QCT division (with the Rackgo-X-RSD) and SuperMicro (with SuperMicro RSD) have signed up. Intel also has a related HPC-focused effort called the Scalable System Framework, which includes the Lustre file system, parallel processing tools and HPC Orchestrator software.

Startups such as Alluxio and Cancun Systems offer what are basically caching systems to boost the performance of HDFS. NVMe-over-fabric companies such as Excelero have some competitive overlap (particularly with DriveScale’s future flash support), but NVMe over fabrics is really an enabling technology that abstracts the infrastructure without putting it back together again. All of these technologies are part competitive, part candidates for partnership. Standardization efforts such as the Open Compute Project and emerging consortia such as Gen-Z and CCIX may similarly benefit DriveScale, although they aim to cover some areas beyond compute and storage that DriveScale itself may not tackle for several years. As distributed storage-class memory emerges, DriveScale could use it to provide a localized resource to the servers, while NVMe over fabrics, which improves the performance and reduces the latency of traditional storage drives and SSDs, is something it can take advantage of immediately.







