Log Federation in AWS

Alfred David

Tech Innovation Alchemist | AI-to-Blockchain Strategist | Building World-Class Engineering Teams | Future-First Leader

Published: September 10, 2016

This is a classic Big Data workflow, in which large volumes of data need to be moved from an enterprise to a cloud-based PaaS, where a data repository is created either for master data management or as a data warehouse. I’m using it to illustrate how logs can be managed on AWS.

This data flow architecture defines two components: EMR and Data Pipeline. EMR is used to split the large file into equal-sized chunks in a massively parallel process (MPP), utilizing the full power of EMR with the HDFS filesystem in the background.
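The arithmetic behind "equal-sized chunks" can be sketched in a few lines of Python. This is only an illustration of how byte ranges might be planned, not the EMR job itself; the function name `chunk_ranges` is hypothetical, and a real split would also need to align each boundary with the end of a record.

```python
from typing import List, Tuple


def chunk_ranges(total_bytes: int, chunk_count: int) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets, end exclusive, for near-equal chunks."""
    if total_bytes < 0 or chunk_count <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_count must be > 0")
    base, extra = divmod(total_bytes, chunk_count)
    ranges = []
    start = 0
    for i in range(chunk_count):
        # The first `extra` chunks take one additional byte each,
        # so chunk sizes never differ by more than one byte.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for start, end in chunk_ranges(total_bytes=10, chunk_count=3):
        print(start, end)
```

Each range could then be handed to a separate worker, which is the part EMR parallelises in the architecture described here.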

Data Pipeline is used to copy the chunks of data from the various S3 buckets into the respective tables of the Redshift snowflake/star schema. It also defines the Redshift 'unload' command, through which Redshift exception logs are copied to an S3 bucket for further action. The logs of both EMR and Data Pipeline are captured using CloudTrail, which delivers them to S3 buckets; each application and compute instance can be configured to log to a separate S3 bucket.
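As a rough sketch of the two Redshift statements involved, the helpers below render a COPY that loads chunks from an S3 prefix and an UNLOAD that writes a query result back to S3. The table name, bucket paths, role ARN and the `etl_exceptions` table are all hypothetical placeholders, and the CSV format is an assumption; the actual pipeline definition may look quite different.

```python
def build_copy(table: str, s3_prefix: str, iam_role: str) -> str:
    """Render a COPY statement that loads every object under s3_prefix."""
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV;"
    )


def build_unload(query: str, s3_prefix: str, iam_role: str) -> str:
    """Render an UNLOAD statement that writes a query result to S3."""
    if "'" in query:
        # Quotes inside the UNLOAD query text need escaping; this sketch
        # refuses such queries rather than guess at the escaping rules.
        raise ValueError("this sketch does not escape quotes inside the query")
    return f"UNLOAD ('{query}') TO '{s3_prefix}' IAM_ROLE '{iam_role}';"


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/ExampleRole"  # placeholder ARN
    print(build_copy("sales_fact", "s3://example-bucket/chunks/sales/", role))
    print(build_unload("SELECT * FROM etl_exceptions",
                       "s3://example-bucket/exceptions/", role))
```

Generating the statements as strings keeps the sketch free of any database driver; in practice they would be run by the pipeline activity or a SQL client.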

An SNS component listens to the S3 buckets and sends notifications as email messages to administrators or product owners. The event message can then be sent to a Lambda component, where any aggregation or business logic can be applied to derive collective log intelligence and actionable data.

The Lambda component can use Node.js-based JavaScript code, a Java/Scala class, or even a scripting language such as Python to implement the necessary logic.
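A minimal Python sketch of such a Lambda function is shown below, assuming the function is subscribed to the SNS topic and that the topic carries S3 event notifications. The aggregation chosen here (counting new log objects per bucket) is only an illustrative stand-in for real business logic.

```python
import json
from collections import Counter
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Count newly written log objects per bucket from SNS-delivered S3 events."""
    per_bucket: Counter = Counter()
    for sns_record in event.get("Records", []):
        # SNS wraps the S3 notification as a JSON string in Sns.Message.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        # Test notifications carry no "Records" key, so default to empty.
        for s3_record in s3_event.get("Records", []):
            per_bucket[s3_record["s3"]["bucket"]["name"]] += 1
    return dict(per_bucket)
```

The returned counts could be written to a metrics store or forwarded to a downstream tool; that step is left out because it depends on the target system.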

This aspect can be further customised so that log events are sent to third-party log management tools such as Splunk or ELK, which can be used for long-term detailed analysis of logs and report generation.

I’ve just highlighted, at a high level, how we would do this using AWS cloud components and how to manage the logs generated therein, so that any anomalies in the pipeline are captured, there is a mechanism for deeper analysis and retrospection, and there is a facility to monitor key logging events.

CloudTrail allows us to configure separate buckets for each of the app services and compute instance types. CloudTrail adds another dimension to the monitoring capabilities already offered by AWS; it does not change or replace logging features you might already be using, such as those for Amazon S3 or Amazon CloudFront subscriptions. Amazon CloudWatch focuses on performance monitoring and system health; CloudTrail focuses on API activity. While CloudTrail does not report on system performance or health, you can use CloudTrail in conjunction with CloudWatch Logs alarms to notify you about activities that you might be interested in. I’ve depicted this in the accompanying diagram.
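To make the "API activity" focus concrete, here is a small Python sketch that scans the text of one CloudTrail log file and counts the API calls that failed. It assumes the file has already been downloaded from S3 and decompressed (CloudTrail delivers gzip-compressed JSON); the function name `count_failed_calls` is hypothetical. A CloudWatch Logs metric filter with an alarm would play a similar role in a live setup.

```python
import json
from collections import Counter
from typing import Dict


def count_failed_calls(log_file_text: str) -> Dict[str, int]:
    """Count CloudTrail records that carry an errorCode, keyed by source:event."""
    records = json.loads(log_file_text).get("Records", [])
    failures: Counter = Counter()
    for record in records:
        # CloudTrail only includes errorCode when the API call failed.
        if "errorCode" in record:
            failures[f'{record["eventSource"]}:{record["eventName"]}'] += 1
    return dict(failures)
```

A non-empty result is the kind of signal that could trigger a notification to administrators.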

