Active data, not passive storage.

Andrew Warfield

VP / Distinguished Engineer, Amazon

Published: August 19, 2015

TL;DR: Today, Coho is announcing support for running containerized applications within our enterprise storage system. This is the beginning of something really, really cool.

Now, the longer version.

For as long as I have been alive, my father’s basement has been a mess.

It is absolutely full of stuff. He keeps everything. When I was a kid, my parents renovated the houses that we lived in. For those of you who haven’t renovated a house before, tearing down walls results in a lot of garbage that needs to be taken to the dump for disposal. I remember a running joke that it was dangerous for my father to bring a load of things to the dump, for fear that he might come back home with more than he left with.

Now that I have kids of my own, this aspect of my father is a remarkable thing: every summer at around this time, I take my family to visit my (now retired) parents in rural Quebec. My children immediately disappear into my father’s workshop with a list of demands, only to return, hours later, with those ideas realized. This year, my oldest son surfaced with a work glove covered in LEDs with a push button switch on the thumb to turn them on. He excitedly explained to me that he was Tony Stark (Iron Man) before tearing around the house and yard blasting things with his glowing hand.

To a casual observer, my father’s messy basement is effectively a disorganized closet, full of random tchotchkes. To my children (and to me before them) it is a space full of unrealized opportunities.

This is exactly how I feel about the data that is stored in enterprise storage environments.

This week, Coho announced product support for launching container-based applications and services within our storage system. This is something that we have been working on for almost two years, and something that I’ve talked about in a couple of earlier blog posts. This article augments our latest announcement with a few additional details and talks about where this is heading.

Integrated container support within an enterprise storage system is not just a container orchestration layer. It is a way to extend the storage system to take richer advantage of the data that it already stores. Container support is the first step in an exciting transition from being an enterprise storage system, to being an active data platform. As this aspect of our system grows and evolves, I believe that it will provide our customers with an opportunity to realize new value from the data that they already store, and to build end-to-end applications that harness and expose that value.

Converging containers into enterprise storage as a means of adding active execution to stored data is like turning a basement full of stuff into objects of desire: it allows the opportunity to analyze, transform, and present data in new ways, even as new scalable software services, within the enterprise environment.

What have we announced today?

The next major release of Coho’s DataStream platform software will allow our customers to instantiate Docker-based applications within the storage system. Our customers can add container images to a Docker registry that runs within the platform, and these images may be composed into rich applications and microservices that are described using Google Kubernetes APIs.
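To make the idea of describing an application through Kubernetes APIs a little more concrete, here is a minimal Python sketch that builds a Kubernetes v1 Pod manifest for a single container. It is an illustration only: the registry address, the `s3-gateway` image name, and the `pod_manifest` helper are hypothetical and do not come from Coho's product, and the announcement does not say how DataStream exposes its registry or API endpoints.

```python
import json

# Hypothetical address of a Docker registry running inside the storage platform.
REGISTRY = "registry.example.internal:5000"


def pod_manifest(name: str, image: str, port: int) -> dict:
    """Build a minimal Kubernetes v1 Pod manifest for one container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    # The image reference points at the (assumed) in-platform registry.
                    "image": f"{REGISTRY}/{image}",
                    "ports": [{"containerPort": port}],
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical microservice: an S3-compatible gateway listening on port 8080.
    manifest = pod_manifest("s3-gateway", "s3-gateway:1.0", 8080)
    print(json.dumps(manifest, indent=2))
```

The resulting JSON is the kind of document a Kubernetes API server accepts when creating a Pod. A real deployment would add details this sketch leaves out, such as volumes that expose the stored data to the container.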

Microservice examples include new protocols (e.g. S3 API support), the ability to instantiate useful enterprise tools (e.g. Splunk Light), big data capabilities (Cloudera’s CDH5 Spark and MapReduce), as well as a couple of developer-facing examples. The latter include on-demand video transcoding and a live-search facility to find and extract documents from VDI environments. Where possible, these extensions will be released as open source software and will be available for our customer community to modify and improve for their own needs. We will be demonstrating these examples at VMworld this year.

From the beginning, Coho has been building a network-integrated, scalable enterprise storage system. Adding this support, which we’ve been calling “container convergence”, is a big step toward our vision of building a scalable data platform rather than just a storage system. To me, the distinction is that a storage system, whether it’s a box of LUNs or a loftier “big data lake”, is still just a closet. It passively stores and retrieves data over a handful of protocols for external application stacks.

The deep integration that Coho has been building for containers is the first step in making the storage system active. Containers are a vehicle for adding new functionality directly on top of data where it is stored, and for analysing, transforming, and presenting that data in powerful new ways.

Compute in a storage system is not the same as folding storage into a compute platform.

Coho’s primary customer use case today is in providing scalable, high-performance storage for virtualized environments. Despite having a similar hardware form factor to many existing “hyperconverged” products, we took the early decision not to host virtual machines on our appliances. Instead, we took a strong focus on delivering a network-converged rack-scale storage system that addressed the things that our customers expected from enterprise storage: data protection, performance, and absolute operational simplicity at any scale. The resulting system achieves incredible performance density, and our customers get real value from Coho storage by deploying considerably more compute than storage nodes to “balance” the performance and capacity needs within their environments.

Importantly though, as a file and object-based storage platform, we have been concerned from the outset with building a system that is about data and about collaboration. File-level semantics mean that multiple clients can use our storage system as a point of collaboration: data lives within a single shared platform. This might be multiple users interacting through home directories, or it might mean the need to quickly launch a Spark analytics job on a directory full of log files. Regardless, our whole intention is to provide a centralized data platform that allows our customers to gain new and exciting benefits from the data they own, over time, and in situ.

These two aspects: enterprise-class storage, and the collaborative nature of file and object-level APIs, are both very different from what we see in today’s existing “hyperconverged” products. Hyperconvergence is about packaging the entire virtualized software stack into a hardware appliance form factor: it solves a very important problem in terms of operational simplicity, especially at smaller scales. However, hyperconvergence doesn’t intrinsically lend itself to getting new value out of the data that it stores — instead, the storage component of a hyperconverged system is effectively an invisible SAN: VM data lives in VM file systems parked inside of some form of LUN. Hyperconvergence has solved a virtualization-specific storage problem (making storage invisible) in order to present the virtual machine as the core top-level abstraction. We believe in a more data-centric view of the world.

An active data platform

Converging containers into our storage system approaches this same design space from the opposite end of the spectrum: Rather than starting with VMs as the central abstraction in data center design, we are choosing to start with data itself. Coho has served VM images over NFS from the outset, but it’s those images and their data that we have always been concerned with. Container convergence means being able to quickly extend our system to make those VMs searchable, to integrate them with third-party log analytics or backup tools, and to augment customer environments with new data protocols such as S3. This support is hardly limited to VM images: as a scalable and general-purpose storage platform, container convergence is turning the traditionally boring role of enterprise storage into a platform that brings enterprise data to life.

When my kids disappear into the basement with a list of demands for their grandfather, I never know what they will resurface with — but I know they are going to be excited about whatever it is. As we move to release container support as a core facility within Coho’s DataStream software, I have an inkling of the same feeling about our customers: I’m very excited to see where this journey takes us.

Shriram Rajagopalan

Senior Staff TLM - Google Cloud. Co-creator of Istio

9 years ago

This is pretty cool! Two questions (probably tangential): on microservices architecture - are you using service proxies and client side load balancing (eg airbnb smartstack, netflix ribbon/eureka)? (This is the FT angle). For spark (may be a bit more naive), what kind of datastore model are you assuming? Hdfs or plain old file systems?

可可特李

Technologist | Entrepreneur | Consultant | Author

9 years ago

Andrew, that's fantastic. Is there a particular container management system integrated into Coho Coast?
