Lifting the Lid of Salesforce Genie's Architecture

Last week I wrote an article about my initial opinion and understanding of Salesforce Genie. Shortly after I published the article, I went to a session with the Salesforce product management team who explained in detail how Genie works and how it is embedded in the platform.

In this article, I will focus on summarising what I heard and will also attempt to compare how Genie differs from existing platform integration patterns.

Let's talk about architecture

Don't be fooled by the pretty picture at the top. Whilst I fully agree that this is a beautiful marketing slide, one part of it is important to note: the "Real-Time Genie Hyperscale Platform".

To better understand this, let's look back at how Salesforce is essentially built at its core. If you pay attention to the bottom "Transactional Data" layer of the diagram, you are looking at the traditional relational database that every Salesforce org shares with the other tenants of its instance. In that database, organisations store everything from Account, Contact and Opportunity records to custom object data. The key point to call out here is that the data is physically stored in that database, and hence you can run triggers, flows and process automation on the data with ease.

To make sense of all the data stored in the "Transactional Data" layer, Salesforce has its "Unified Metadata Dictionary", which describes the data structures and enforces data integrity (e.g. a number field must hold a number) when records are saved. With those two layers embedded at the core of the platform, Salesforce then adds a "Security & Access Control" layer to ensure data is only visible to the relevant users in a specific org, and its AI layer "Einstein", which can leverage AI models to drive insights (note that Einstein sits above the Security & Access layer and hence can only access data in your org).
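
To make the idea of metadata-driven integrity checks more concrete, here is a minimal sketch in Python. It is not how Salesforce implements its metadata dictionary; the `METADATA` structure and the `validate` function are hypothetical names invented purely for illustration.

```python
from numbers import Number

# Hypothetical, heavily simplified stand-in for a metadata dictionary:
# object name -> field name -> expected Python type.
METADATA = {
    "Account": {"Name": str, "AnnualRevenue": Number},
}


def validate(object_name: str, record: dict) -> list:
    """Return a list of integrity errors for the record (empty if it is valid)."""
    fields = METADATA[object_name]
    errors = []
    for field_name, value in record.items():
        expected = fields.get(field_name)
        if expected is None:
            errors.append(f"{field_name}: unknown field")
        elif isinstance(value, bool) or not isinstance(value, expected):
            # bool is rejected explicitly because Python treats it as a number.
            errors.append(f"{field_name}: expected {expected.__name__}")
    return errors


ok = validate("Account", {"Name": "Acme", "AnnualRevenue": 1000})
bad = validate("Account", {"Name": "Acme", "AnnualRevenue": "lots"})
```

Here `ok` is an empty list, while `bad` reports that the number field was given something that is not a number, which is the kind of check the article's "a number field must hold a number" example describes.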

The three layers above that, namely the core platform capabilities (IDAM, APIs, Automation, DX etc.), the "Lightning Design System" (the entire web component architecture and how it interacts with the Salesforce platform) and all "Applications" or "Clouds" (e.g. Sales, Service etc.), are the layers that we as consultants, architects and customers interact with daily.

Where Genie fits in

Why am I explaining all of this? Well, considering how deep down the "Transactional Data" layer sits within the overall platform architecture, it raises the question of where Genie's "Real-Time" data sits and what it is. To cut it short, Salesforce has embedded a new Data Lakehouse layer into its platform that sits next to the traditional data layer.

Data Lakehouse? I am glad you asked.

I am by no means a data lake expert, but essentially a Data Lakehouse architecture "is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data". Read more about the architecture concepts here.

So by adding a new Data Lakehouse layer, we now get the ability to store significantly larger volumes of data whilst maintaining some benefits of traditional relational databases, like ACID transactions. But that's not all: because the Lakehouse is embedded in the core of the platform and stores the data, we can now drive automation off the back of it. We can also start to run queries against the data, use data pipelines to prepare new datasets without the need for any ETL, and connect it to BI applications like CRM Analytics.

How to bring data into the Lakehouse?

To go even deeper into the architecture, the question that arises is how we get data into the Data Lakehouse and what tools will be available to support the import. There are essentially two ways:

  • Real-time (or near real-time, milliseconds to minutes): web and mobile applications perform real-time ingests, or MuleSoft ingests data via streaming.
  • Batch (minutes to an hour): batch ingestion from Salesforce applications like Marketing and Commerce, or from core products like Sales and Service. It is also possible to bring in data from third-party sources like S3.
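
The difference between the two ingestion paths above can be sketched roughly as follows. This is only an illustration under my own assumptions: the `Lakehouse` class and its `ingest_event` and `ingest_batch` methods are hypothetical and do not correspond to any actual Genie API.

```python
class Lakehouse:
    """Hypothetical in-memory stand-in for the lakehouse store."""

    def __init__(self):
        self.rows = []

    def ingest_event(self, event: dict) -> None:
        # Streaming path: each event is written as soon as it arrives.
        self.rows.append({**event, "path": "stream"})

    def ingest_batch(self, records: list) -> int:
        # Batch path: records collected elsewhere are loaded in one go.
        self.rows.extend({**record, "path": "batch"} for record in records)
        return len(records)


lake = Lakehouse()

# A single page view from a web application arrives and is stored immediately.
lake.ingest_event({"source": "web", "action": "page_view"})

# An export from a third-party store such as S3 is loaded periodically.
loaded = lake.ingest_batch([
    {"source": "s3", "order_id": 1},
    {"source": "s3", "order_id": 2},
])
```

The point of the sketch is the shape of the two calls: the streaming path handles one event at a time with low latency, whereas the batch path trades latency for loading many records per call.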

On top of all that, you can also bring your own data warehouse. If you happen to have, for example, an existing Snowflake deployment, you can mount your data lake/warehouse into Genie with zero copies. The benefit is that we can act on data that doesn't even reside in Salesforce's Lakehouse. I think that in itself is very beneficial for customers that have existing data lake deployments.
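
Conceptually, "zero copy" means holding a reference to the external data rather than a duplicate of it. The sketch below illustrates that idea only; `MountedTable` is a hypothetical name, and a plain Python list stands in for the external warehouse table.

```python
from typing import Callable, Iterable, Iterator


class MountedTable:
    """Hypothetical 'zero copy' table: it keeps a reference to the source, not the rows."""

    def __init__(self, fetch: Callable[[], Iterable[dict]]):
        self._fetch = fetch  # called at query time; no rows are cached here

    def query(self, predicate: Callable[[dict], bool]) -> Iterator[dict]:
        # Rows are read from the external source on every query.
        return (row for row in self._fetch() if predicate(row))


# Stand-in for a table that lives in an external warehouse (e.g. Snowflake).
warehouse_orders = [
    {"order_id": 1, "amount": 250},
    {"order_id": 2, "amount": 40},
]

orders = MountedTable(lambda: warehouse_orders)
large = list(orders.query(lambda row: row["amount"] > 100))

# A change in the warehouse is visible on the next query, because nothing was copied.
warehouse_orders.append({"order_id": 3, "amount": 900})
large_after = list(orders.query(lambda row: row["amount"] > 100))
```

The second query picks up the newly added order without any re-import step, which is the practical benefit of mounting data instead of copying it.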

How to get data out of the Lakehouse?

As you can imagine, given that the idea is to deal with very large data volumes (billions of rows), it is important to process data efficiently. To do so, Genie leverages Spark, which "is a general-purpose distributed processing system used for big data workloads. It has been deployed in every type of big data use case to detect patterns and provide real-time insight". Essentially, once the data has been processed, the output is a set of action triggers in the form of Platform Events, webhooks and journey invocations, which help us automate business processes and logic.
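
The "process data, then emit action triggers" flow can be pictured with a small sketch. It deliberately uses plain Python rather than Spark, so it says nothing about how Genie distributes the work; the `Rule` and `Action` types, the event name `High_Value_Cart__e` and the journey name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str      # "platform_event", "webhook" or "journey"
    target: str    # hypothetical event name, URL or journey name
    payload: dict


@dataclass
class Rule:
    condition: Callable[[dict], bool]
    kind: str
    target: str


def process(rows: list, rules: list) -> list:
    """Evaluate every rule against every row and collect the resulting actions."""
    return [
        Action(rule.kind, rule.target, row)
        for row in rows
        for rule in rules
        if rule.condition(row)
    ]


rules = [
    Rule(lambda r: r["cart_value"] > 500, "platform_event", "High_Value_Cart__e"),
    Rule(lambda r: r["abandoned"], "journey", "Cart Recovery"),
]
rows = [
    {"customer": "A", "cart_value": 800, "abandoned": False},
    {"customer": "B", "cart_value": 120, "abandoned": True},
]
actions = process(rows, rules)
```

With this input, customer A produces a platform event and customer B an invocation of a journey. In a real deployment the rows would be processed in a distributed fashion, and the actions would be handed to the platform rather than collected in a list.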

What is in it for you?

The fact that you can access large data volumes directly within Salesforce without the need for more connectors opens up quite a few opportunities. It's certainly not the holy grail for everything, but the ability to query and access large data volumes (LDV) in real time, natively in Salesforce, is pretty powerful. Equally, the ability to trigger flows when data changes, whilst maintaining CRUD and FLS, is great.
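
For readers less familiar with CRUD and FLS (object-level and field-level security), the following sketch shows the general principle of filtering a record down to what a user is allowed to read. It is my own simplification; `Permissions` and `visible_view` are hypothetical names, and the real platform's permission model is considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class Permissions:
    """Hypothetical per-user permissions: readable objects (CRUD) and fields (FLS)."""
    readable_objects: set = field(default_factory=set)
    readable_fields: dict = field(default_factory=dict)  # object name -> set of field names


def visible_view(perms: Permissions, object_name: str, record: dict):
    """Return only the fields the user may read, or None if the object is not readable."""
    if object_name not in perms.readable_objects:
        return None
    allowed = perms.readable_fields.get(object_name, set())
    return {name: value for name, value in record.items() if name in allowed}


perms = Permissions(
    readable_objects={"Order"},
    readable_fields={"Order": {"order_id", "status"}},
)
record = {"order_id": 1, "status": "Shipped", "margin": 0.42}

order_view = visible_view(perms, "Order", record)
invoice_view = visible_view(perms, "Invoice", record)
```

In this example the user sees the order's id and status but not its margin, and gets nothing at all for an object they have no read access to.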

Lastly, with this architecture, Salesforce opens the door to migrating Marketing Cloud and Commerce Cloud directly into Salesforce. In the past, it has always been difficult to integrate those platforms due to the volumes of data they generate.

Does this replace everything else?

Absolutely not. There are still many use cases for synchronising data into the transactional database in Salesforce (e.g. via ETL or MuleSoft) or for virtualising data entirely (e.g. via Salesforce Connect).

I hope this helped to provide greater detail. As I learn more I will update this article.

Priya Mishra

Management Consulting firm | Growth Hacking | Global B2B Conference | Brand Architecture | Business Experience |Business Process Automation | Software Solutions

2 years ago

Jannis, thanks for sharing!

Nidhi Gupta

17x Salesforce Certified Application and System Architect/Salesforce Solution Architect

2 years ago

Thank you for composing this article, very informative and helps understand the concept :)

Prolay Chaudhury

Salesforce Practice Lead | Lead Solution/Technical Architect | E2E Architectural Solution| Trailhead Ranger| Active Listener| Problem Solver| Learner| Blogger

2 years ago

Excellent explanation of Salesforce Genie architecture, Jannis Kearney Bott. My question is: how do I access the data lake as a developer? Do we create objects as we do custom objects on the traditional Salesforce database, using the UI or Metadata API, or are there separate APIs, like the REST or SOAP API, to bring data into the Salesforce data lake?

Anshul Verma

Executive IT Strategist | 20+ Years in Digital Transformation | CRM & Automation Expert | Writer

2 years ago

Thanks Jannis Kearney Bott. This is probably the best explanation of Genie so far. I believe there are obvious wins, given the automated lakehouse dev/maintenance/access for Salesforce data. I wonder what the key considerations for adopting it would be, given large enterprises often have more complex data platforms/architectures.

Chellappa Karimanoor Nagarajan

Solutions Architect at Asana

2 years ago

Thanks Jannis Kearney Bott for sharing this. The way you explained it is very nice. It answered some of the doubts I had from the Dreamforce Round Table Talks session.
