What is Real-time Analytics?

If you've ever ordered food online and tracked it, you've used a data application with embedded real-time analytics. If you've ever used the Facebook or LinkedIn Newsfeed, you've seen real-time analytics in action.

Real-time analytics means fast queries on fresh data. Fast queries means sub-second responses. Fresh data means anything from a second old to a few hours old, depending on how often your data source is updated. As the world moves from static reports to interactive data applications, real-time analytics is becoming a driving force in many industries, from logistics and e-commerce to gaming, fintech, adtech, agtech and edtech, and in practically every industry that has been digitized.

Gartner defines real-time analytics as:

“The discipline that applies logic and mathematics to data to provide insights for making better decisions quickly.”

When thinking about data applications powered by real-time analytics, there are essentially two measures of latency: data latency and query latency. Data latency is how long it takes for new data to become available for querying; query latency is how long a query takes to return a result. The data latency requirements often depend on the data source and use case (they could be a few hours if the data is coming from third-party sources into your lake), but the query latency requirements are sub-second because these are interactive, user-facing data applications.
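To make the two measures concrete, here is a minimal Python sketch. The function names, the in-memory event list and the sample query are hypothetical illustrations, not the API of any particular product; a real system would measure these against its own ingestion and query paths.

```python
import time

def data_latency_seconds(event_time, queryable_time):
    """Data latency: delay between an event occurring and becoming queryable."""
    return queryable_time - event_time

def timed_query(query_fn, *args):
    """Query latency: wall-clock time taken to run a query and return its result."""
    start = time.perf_counter()
    result = query_fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Hypothetical events: (event_time, queryable_time, order_total).
# Times are in seconds on an arbitrary clock.
events = [
    (1000.0, 1002.0, 25.0),
    (1001.0, 1002.5, 40.0),
    (1003.0, 1010.0, 15.0),
]

def total_revenue(rows):
    """A stand-in for an analytical query over the ingested events."""
    return sum(total for _, _, total in rows)

# Worst-case freshness across the sample events.
worst_data_latency = max(data_latency_seconds(e, q) for e, q, _ in events)

# Result of the query plus how long it took to answer.
revenue, query_latency = timed_query(total_revenue, events)
```

In this sketch the two numbers are independent: data can be hours old while queries over it still return in well under a second, which is the combination described above for data arriving from third-party sources.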

Batch vs. real-time analytics

Batch analytics is high-latency analytics, where queries return results on data that is at least minutes old. In contrast, real-time analytics is optimized for low latency, ensuring that data is available for querying within seconds.

One use case for batch analytics is business intelligence reporting. Business intelligence uses historical data to report on business trends and answer strategic questions. In these scenarios, the goal is to use data to craft strategy, not to take immediate action. Real-time data would not generally change the result of the trend analysis, which makes this work better suited to batch analytics.

Batch analytics use cases like business intelligence, reporting and data science have less stringent latency requirements and can therefore tolerate ETL pipelines that homogenize and enrich data for analytics. In contrast, real-time use cases have low latency requirements and attempt to reduce or remove the need for ETL processes.

Many analytics systems, like Hadoop and data warehouses, were designed for batch analytics. Batch analytics systems process data in batches: data is collected over a period of time and then loaded into the system. Rather than running an “always on” system for data processing, they can restrict data processing to specific time intervals to reduce costs. Batching also helps with data compression, reducing the overall storage footprint and making periodic analytics on large-scale data economical.
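A rough way to see why interval-based loading implies high data latency is the following Python sketch. It assumes, purely for illustration, that loads run at fixed multiples of an interval and complete instantly; the function names are hypothetical, and real batch systems add processing time on top of the wait.

```python
import math

def batch_queryable_time(event_time, batch_interval):
    """Time of the first scheduled load at or after event_time.

    Assumes loads run at multiples of batch_interval and finish instantly.
    """
    return math.ceil(event_time / batch_interval) * batch_interval

def batch_data_latency(event_time, batch_interval):
    """How long an event waits before a scheduled load makes it queryable."""
    return batch_queryable_time(event_time, batch_interval) - event_time

HOUR = 3600  # seconds

# An event arriving 10 minutes after an hourly load waits 50 minutes
# (3000 seconds) for the next one.
latency = batch_data_latency(600, HOUR)
```

Shrinking the interval lowers this wait, but it also moves the system toward the “always on” processing that batching was meant to avoid.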

In contrast, systems designed for real-time analytics have native support for semi-structured data and other modern data formats, which lets them avoid ETL processes and achieve low data latency. They are also optimized for compute efficiency to reduce the resources required to constantly process incoming data and execute high-volume queries.

Companies like Facebook moved many features from batch to real-time analytics, including the display of content on the newsfeed. You can learn more about the journey in the tech talk: How We Scaled It: Facebook’s Online Data Infrastructure to 1B+ Users.

Why are companies accelerating real-time analytics adoption?

Increase revenue and user engagement

A Google study once found that a half-second delay hurts user satisfaction. Nobody wants to stare at a spinning wheel while the dashboard loads. Snappy, responsive experiences have been shown to increase user engagement metrics like daily active users. Embedded real-time analytics gives users a better experience: they don’t have to wait seconds to minutes for data or queries to load, and they can interact quickly with the data.

Increase operational efficiencies

Teams on the ground can slice and dice data for quick decision-making. With sub-second query latencies, users can ask several questions of the data and reach a decision in a matter of minutes. This makes users more productive, increasing the number of decisions they can make in a day.

Take timely actions

Teams can become more efficient by relying on applications for a subset of decision-making and focusing their attention on larger, strategic initiatives. Some use cases are inherently time-sensitive: catching security vulnerabilities, optimizing delivery routes or bidding on advertisements. If you waited minutes for the data to be processed and queryable, you would miss the window of time in which to make an impact.

Real-time Analytics Explained

In this Real-time Analytics Explainer Guide, we discuss the different use cases and types of data applications being built today. What data applications are you building? How are you using real-time analytics today?

Giovanni Tropeano

Marketing Lead | Demand Generation | Pipeline Growth

3 years ago

With the explosion of data converging with the hunger for apps among both consumers and businesses, the need for real-time insights from data is exploding. I was recently interviewing a guest for Rockset's podcast (Why Wait? The Rise of Real Time Analytics - check it out!) and he said something simple yet profound: "Real-time analytics is hard." He has been in data and infrastructure for 15-20 years at all the household-name big tech companies, and he boiled it down to, "it's hard." Real-time analytics is only just now becoming a widespread need, moving from "cobbled together" solutions to all-in-one offerings. The future of RTA is something to watch for sure.
