Analytics Shift towards new uniformity
Sibendu Nag
Analytics & Optimization | Digital Transformation Trainer & Speaker | PhD Scholar, Neuromarketing | LinkedIn Top Voice | Member-Indian Robotics Automation Council, Education Steering Committee, FTCCI | SG Global Community
Every now and then, we find ourselves surrounded by multiple clickstream datasets at work, and we often have to compile those datasets into presentations that are easier to understand and use. The technologies we use today for clickstream analytics do have some amazing BI interfaces; Adobe Analytics Workspace, for example, sets a high standard for data visualization. However, when the analysis spans multiple reports and sources, the non-trivial task of data uniformity arises. If the analysis relies on a single source of clickstream data, it is fine to let the BI interface pull and present the data. But what if you need a view into other sources such as offline datasets, mobile applications, kiosks, over-the-top (OTT) platforms, or other technologies like Salesforce, HubSpot, etc.? And we cannot ignore the significant limitations, especially around data prototyping and speed. We also need to understand that when multiple sets of reporting come from technologies like Adobe Analytics, Marketo or Salesforce, it is critical to build connectivity between all of these systems so that there is a single source of truth for the data.
Now, we all know that the clickstream data Adobe Analytics produces is raw and quite large. If we can map this data against other sources and explore it in depth, it will be of definite use, considering how many more meaningful inferences can be drawn from it. Almost 95% of the time, the explanation we hear is that clickstream data is too unwieldy to work with; to move past that same old excuse, we can definitely leverage Adobe's Data Feeds.
Today, importing data from a source into a destination is often considered one of the more complicated tasks; with a process in place, however, you can easily upload, transform and present a complex set of data to your data processing engine. Recently, I had to work out how to transfer Adobe Analytics data to Snowflake. The main challenge is that Adobe doesn't provide this out of the box, though you could use ETL tools such as Fivetran, Alooma, Stitch, etc. The question, however, is whether you want to go for a connector or manage this internally with some of the in-built processes in the organization. The idea here is to use AWS S3 as the intermediary, so that we can build a pipeline that goes from Adobe Analytics to S3 to Snowflake.
With this in view, I decided to create an approach that helps ingest data from the Analytics interface into AWS S3 and finally into the Snowflake environment. There are many ways to import data from Adobe Analytics into Snowflake, but I find the solution below more useful. Most importantly, you don't have to go for any connector and make the process dependent on it. Here are the process steps that can help design the entire flow from Adobe Analytics > S3 > Snowflake.
Basic Execution: Setting up the AWS S3 Bucket
The very first step is to log in to the AWS console and create an S3 bucket with a unique name, then configure its settings as per the requirements. Once the bucket is created, you can verify it by uploading and downloading files to and from the bucket.
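As a minimal sketch of this step (assuming AWS credentials are already configured locally, and with the bucket name and region below as placeholders), the bucket creation and a quick upload check could look like this in Python with boto3:

```python
# Minimal sketch: create the S3 bucket and verify access with a test upload.
# Assumes AWS credentials are already configured (e.g. via `aws configure`);
# the bucket name and region below are placeholders.
import boto3

REGION = "us-east-1"          # placeholder region
BUCKET = "aa-datafeed-demo"   # placeholder, must be globally unique

s3 = boto3.client("s3", region_name=REGION)

# us-east-1 does not accept a LocationConstraint; other regions require it.
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET)
else:
    s3.create_bucket(
        Bucket=BUCKET,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )

# Quick sanity check: upload a small test object and list the bucket contents.
s3.put_object(Bucket=BUCKET, Key="test/hello.txt", Body=b"hello from the feed setup")
for obj in s3.list_objects_v2(Bucket=BUCKET).get("Contents", []):
    print(obj["Key"], obj["Size"])
```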
Next, we set up an Adobe Analytics (AA) user on AWS with S3 permissions, which gives AA read and write access to S3. The same credentials will be required when setting up the automated feed from Adobe. When naming the user, we also need to check the box labelled “Programmatic access”, and we need the user credentials defined here, especially the Access Key and the Secret Access Key.
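The same user setup can also be scripted. The sketch below is one possible way to do it with boto3; the user name is a placeholder, and the broad AmazonS3FullAccess managed policy is used only for brevity (in practice you would scope the policy down to the feed bucket):

```python
# Minimal sketch: create a programmatic-access IAM user for the Adobe feed
# and attach S3 permissions. The user name is a placeholder, and the managed
# AmazonS3FullAccess policy is used only for illustration; a bucket-scoped
# policy is the better choice in practice.
import boto3

iam = boto3.client("iam")

USER = "adobe-analytics-feed"  # placeholder user name

iam.create_user(UserName=USER)

# "Programmatic access" means an access key pair instead of console login.
key = iam.create_access_key(UserName=USER)["AccessKey"]

# Grant S3 access (broad managed policy, for illustration only).
iam.attach_user_policy(
    UserName=USER,
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3FullAccess",
)

# These two values are what the Adobe data feed destination form asks for.
print("Access Key ID:    ", key["AccessKeyId"])
print("Secret Access Key:", key["SecretAccessKey"])
```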
Getting the data from Data Feeds
As a next step, we can proceed to the Adobe Analytics data feed, where we set up a recurring feed. Based on the requirements, we can select the frequency for the daily feed; we also need to be mindful of the size of the files.
The implementation starts from the Data Feed Manager in the AA interface (Analytics > Admin > Data Feeds), where we need to be doubly sure about the feed details: the report suite, the feed interval, the destination (the S3 bucket along with the access keys created above) and the columns to include.
Once all the information is provided, save the data feed. Transferring the first hour of data to S3 may take up to 30 minutes. Progress can be tracked in the feed's jobs, and once it completes, you can check the S3 bucket for the delivered files.
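To confirm what has landed, a quick check of the bucket might look like the sketch below. Adobe delivers the hit data as gzipped tab-separated files alongside lookup files, but the exact file names depend on how the feed was configured; the bucket name and prefix here are placeholders:

```python
# Minimal sketch: list what the data feed has delivered and preview the
# hit data. Bucket/prefix are placeholders; the feed delivers gzipped
# tab-separated hit-data files plus lookup files, with names that depend
# on the feed configuration.
import gzip
import io

import boto3

BUCKET = "aa-datafeed-demo"   # placeholder
PREFIX = ""                   # placeholder folder path inside the bucket

s3 = boto3.client("s3")

# List the delivered objects.
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
keys = [obj["Key"] for obj in resp.get("Contents", [])]
for key in keys:
    print(key)

# Preview the first few rows of the first gzipped TSV file found.
hit_keys = [k for k in keys if k.endswith(".tsv.gz")]
if hit_keys:
    body = s3.get_object(Bucket=BUCKET, Key=hit_keys[0])["Body"].read()
    with gzip.open(io.BytesIO(body), mode="rt", encoding="utf-8", errors="replace") as fh:
        for _, line in zip(range(3), fh):
            print(line.rstrip("\n").split("\t")[:10])  # first 10 columns only
```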
Once the S3 migration completes successfully and the files are populated, we can move on to the final step of loading this data into the warehouse.
Creating an external Stage
Now, as the next step, for bulk loading the data into Snowflake we can use the existing S3 bucket and folder paths. Broadly, the load involves creating a file format and an external stage that points at the bucket, and then copying the staged files into a target table, as sketched below.
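As a minimal, hedged sketch of that load (the connection parameters, object names, storage credentials and column list are all placeholders, and a real data-feed table has far more columns; a Snowflake storage integration is the cleaner production option), the stage and COPY could be set up with the Snowflake Python connector like this:

```python
# Minimal sketch: create a file format, an external stage over the S3 bucket,
# a target table, and run COPY INTO. All names, credentials and the column
# list are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.us-east-1",   # placeholder account identifier
    user="LOADER",                 # placeholder user
    password="********",           # placeholder
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="RAW",
)

statements = [
    # The data feed arrives as gzipped, tab-delimited files.
    """
    CREATE OR REPLACE FILE FORMAT aa_feed_format
      TYPE = CSV
      FIELD_DELIMITER = '\\t'
      COMPRESSION = GZIP
      EMPTY_FIELD_AS_NULL = TRUE
    """,
    # External stage pointing at the feed bucket (keys are placeholders).
    """
    CREATE OR REPLACE STAGE aa_feed_stage
      URL = 's3://aa-datafeed-demo/'
      CREDENTIALS = (AWS_KEY_ID = '<access_key>' AWS_SECRET_KEY = '<secret_key>')
      FILE_FORMAT = aa_feed_format
    """,
    # Target table trimmed to a handful of illustrative columns.
    """
    CREATE TABLE IF NOT EXISTS aa_hit_data (
      hit_time_gmt   VARCHAR,
      visid_high     VARCHAR,
      visid_low      VARCHAR,
      page_url       VARCHAR,
      page_event     VARCHAR
    )
    """,
    # Bulk load every gzipped TSV delivered by the feed.
    """
    COPY INTO aa_hit_data
      FROM @aa_feed_stage
      PATTERN = '.*hit_data.*[.]tsv[.]gz'
      ON_ERROR = 'CONTINUE'
    """,
]

cur = conn.cursor()
try:
    for stmt in statements:
        cur.execute(stmt)
finally:
    cur.close()
    conn.close()
```

Once the COPY completes, the loaded rows can be queried and modelled directly in Snowflake, which is what the rest of the flow builds on.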
This process of data migration, which we can term the "Analytics Shift", is meant to make data modelling easier and more efficient. Whatever the file size (rows and columns), businesses can now manage a shift towards reading and interpreting data more accurately, and in far less time. Moving from one of the most demanding analytics tools to Amazon Simple Storage Service (Amazon S3) is surely not a one-day exercise; it needs some understanding of the data transfer frequency, the actual data and lookup tables, and the file format. But once you are ready with the schema, specifically around server-side and client-side encryption, you are all set to access data that is business-oriented enough to handle any level of complexity and duplication. Now why wait? Start supercharging your business decision-making with automated reports and pertinent information that flows directly from your Snowflake environment into your BI tools.
-------------- Kick-start with the Integration Path ------------------