Operational & Analytics workloads - Part #2 CosmosDB & Azure Data Explorer
"Image Data Exploration (IDEX) II Work Station" by Ryan Somma is licensed under CC BY-SA 2.0

In the previous post, I tried to illustrate why you can make better business decisions if you get into the habit of thinking about business problems without separating transactional and analytical workloads (pedantic note: separation does not rule out distinction ;-)). In fact, in many instances they go hand in hand.

In this instalment, we will see a concrete example of how we can technically achieve such an "integrated view" by looking at the way Azure Cosmos DB and Azure Data Explorer (ADX) work together. First, a few words about ADX.

ADX is a big data analytical database designed for low-latency, near real-time analytics scenarios. It works as an append-only data exploration engine. Compared to traditional analytics solutions, here we have a system that ingests raw data (structured and unstructured) and lets you query it and explore patterns, trends, etc. It scales to terabytes of data in minutes, allowing rapid iterations of data exploration to discover relevant insights.

Let's look at the integration now.

Fundamentally, the key to achieving near real-time analytics is to couple the data analysis only loosely to the online transactional systems. For instance, picking up a scenario we alluded to in the previous post: customers buying products online are offered an ad hoc promotion as soon as they check out. In this case, the recommendation has to work as fast as possible without affecting transactional performance at all.

That's what we get with ADX: the ability to query fast-flowing data without affecting the OLTP system's performance AND to keep warm the data that is frequently accessed. Cold data can then easily be exported to a low-cost cold storage mechanism such as Azure Blob Storage.
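To make the warm/cold split concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea: in ADX itself, caching and retention are governed by policy settings rather than application code, and the function name `split_warm_cold`, the `ts` field and the 30-day window are all assumptions made up for this example.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple


def split_warm_cold(
    records: List[Dict],
    now: datetime,
    warm_window: timedelta,
) -> Tuple[List[Dict], List[Dict]]:
    """Split records into warm (recent) and cold (older than the window).

    Hypothetical helper: each record is assumed to carry a timezone-aware
    timestamp under the key "ts".
    """
    cutoff = now - warm_window
    warm = [r for r in records if r["ts"] >= cutoff]
    cold = [r for r in records if r["ts"] < cutoff]
    return warm, cold


if __name__ == "__main__":
    now = datetime(2021, 3, 1, tzinfo=timezone.utc)
    events = [
        {"id": "e1", "ts": now - timedelta(days=2)},
        {"id": "e2", "ts": now - timedelta(days=90)},
    ]
    # The 30-day window is an arbitrary value chosen for the illustration.
    warm, cold = split_warm_cold(events, now, timedelta(days=30))
    print([r["id"] for r in warm])  # kept close at hand for fast queries
    print([r["id"] for r in cold])  # candidates for export to cheap storage
```

The point of the sketch is simply that the decision is driven by age: recent data stays where queries are fast, and everything older becomes a candidate for export.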

On the other hand, Cosmos DB is an operational hot store where data can be kept for a few days. As soon as data changes, e.g. during the checkout process mentioned above, Cosmos DB flows this data to ADX. Let's take a look at the two prongs of the architecture.

The magic of this loosely coupled integration comes from a Cosmos DB built-in feature called Change Feed. This is a mechanism to listen to the changes occurring in a collection. For instance, a customer checks out, a change related to their profile occurs, and this change is logged and available for consumption. In fact, you can have an Azure Function, a serverless way to run code based on events, process the changes and push the data out. To move the data to ADX, Event Hubs can then be used to ingest all the events (the data changes) and make them available for the analytical platform, i.e. ADX, to analyse at lightning speed.
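The flow just described (writes land in the operational store, a function reads the change feed, and each change is forwarded to an event stream) can be sketched with a small in-memory simulation. This is not the Azure SDK and not the code from the lab: `Container`, `ChangeFeedForwarder`, the integer continuation position and the list standing in for Event Hubs are all hypothetical names invented to show the pattern.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Container:
    """Toy stand-in for a Cosmos DB container with an ordered change log."""
    items: Dict[str, dict] = field(default_factory=dict)
    changes: List[dict] = field(default_factory=list)

    def upsert(self, item: dict) -> None:
        # Every insert or update is stored and also appended to the change log.
        self.items[item["id"]] = item
        self.changes.append(dict(item))

    def read_changes(self, continuation: int) -> Tuple[List[dict], int]:
        # Return the changes recorded after `continuation` and the new position.
        return self.changes[continuation:], len(self.changes)


class ChangeFeedForwarder:
    """Plays the role of the function that reads changes and pushes them on."""

    def __init__(self, container: Container, send: Callable[[dict], None]) -> None:
        self.container = container
        self.send = send
        self.continuation = 0  # checkpoint: how far the feed has been read

    def poll(self) -> int:
        batch, next_position = self.container.read_changes(self.continuation)
        for change in batch:
            self.send(change)
        # Advance the checkpoint only after the whole batch was forwarded.
        self.continuation = next_position
        return len(batch)


if __name__ == "__main__":
    event_stream: List[dict] = []  # stand-in for Event Hubs
    orders = Container()
    forwarder = ChangeFeedForwarder(orders, event_stream.append)

    orders.upsert({"id": "order-1", "status": "checked_out"})
    orders.upsert({"id": "order-2", "status": "checked_out"})
    print(forwarder.poll())   # 2
    orders.upsert({"id": "order-1", "status": "paid"})
    print(forwarder.poll())   # 1
    print(len(event_stream))  # 3
```

Note that the transactional side (`upsert`) never waits for the analytical side: the forwarder reads the log at its own pace and remembers where it stopped, which is what keeps the two workloads loosely coupled.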

The whole reference architecture and many more details are discussed in this very useful lab on GitHub.

In the next and last post of this mini-series, we will look at more use cases and integration between CosmosDB and Azure Synapse.

In the meantime, enjoy Ignite!
