Surfing the Data Tsunami in IT Ops
It is a massive understatement to say that IT infrastructure has changed in recent years. Developers have been empowered, and they are choosing from a veritable smorgasbord of technologies. As a result, IT and DevOps teams have more options to work with: technology is cheaper, more powerful, more flexible and more scalable, and teams are getting more and better data about the systems they deploy.
But people have not changed. Big Data has created a nightmare scenario for the IT professionals tasked with keeping applications running. In the data center, it has translated into a tsunami of alerts and warnings about breakages and outages that has overwhelmed the ability of us mortals to investigate and prioritize.
As human beings, we’re only capable of dealing with so much complexity. And this issue, of course, will get worse: software systems are being componentized and containerized, and more endpoints (mobile and IoT devices) will become part of the overall systems. For instance, according to Business Insider Intelligence, by 2020 there will be nearly 35 billion devices generating 40,000 exabytes of data. And according to Cisco’s most recent Global Cloud Index, global data center IP traffic will triple in the next few years to almost 9 zettabytes annually.
For DevOps professionals, answering a question about the health of the product or service being provided to the customer is no longer a simple task. They’ve gone from using a single, monolithic monitoring tool to needing dozens of specialized monitoring tools spewing tens of thousands or hundreds of thousands of data points, status updates, warnings and alerts. The massive challenge is figuring out what is correlated and what’s not, what requires immediate attention and what doesn’t.
Whereas IT pros used to monitor a fairly stable stack of technologies provided by a few very large vendors, today’s hyperscale environments require hyperscale monitoring. A large company, for example, might have hundreds of small apps and microservices working together to power a customer service. Those apps might run on 10,000 machines across dozens of data centers around the world. Each of those apps and microservices needs to be monitored, as do the servers, the network, the database, and the end user experience.
And this isn’t an issue impacting just tech companies. Every company today runs on software and provides services via software. Just ask United Airlines what happens when its reservation system goes down (hint: the entire company, planes included, comes to a standstill). Or ask Starbucks about the chain reaction set off when its point-of-sale system crashes.
For the IT staff monitoring these systems, the solution lies in simplification: finding technology that can automate and prioritize incidents and cut response times across all of the technologies an enterprise uses. It also has to make it easy to understand what’s happening, what the root problem might be, and what the possible solutions are. Big Data will only get bigger, and people will continue to struggle to process it, understand it and react to it efficiently. There is a huge opportunity to bridge these two worlds.
We’re betting on Assaf Resnick and his team at Big Panda. They have the data science chops and the capabilities to make sense of the many systems, and the sheer volume of Big Data, at the core of today’s IT operations. Congratulations, Assaf, on today’s funding announcement. We’re looking forward to partnering with Big Panda.