Log Federation in AWS

This is a classic Big Data workflow, where large volumes of data need to be moved from an enterprise to a cloud-based PaaS and a data repository is created there, either for master data management or as a data warehouse. I'm using it to illustrate how logs can be managed on AWS.

This data flow architecture defines two components, Amazon EMR and AWS Data Pipeline. EMR is used to split the large file into equal-sized chunks using massively parallel processing (MPP), drawing on the full power of the EMR cluster with the HDFS filesystem in the background.
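The "equal-sized chunks" idea can be sketched with a little arithmetic. This is only an illustration of how split boundaries might be computed, not the EMR job itself; the function name `chunk_ranges` is hypothetical, and a real job would normally align each boundary to the next record delimiter so that no row is cut in half.

```python
from typing import List, Tuple


def chunk_ranges(total_bytes: int, num_chunks: int) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets, end exclusive, for near-equal chunks."""
    if total_bytes < 0 or num_chunks <= 0:
        raise ValueError("total_bytes must be >= 0 and num_chunks must be > 0")
    base, extra = divmod(total_bytes, num_chunks)
    ranges = []
    start = 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional byte,
        # so chunk sizes never differ by more than one byte.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Hypothetical 1 GiB input split across 8 parallel workers.
    for start, end in chunk_ranges(1024 ** 3, 8):
        print(start, end)
```

Each worker in the cluster would then read only its own byte range and write it out as a separate object.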

Data Pipeline is used to copy the chunks of data from the various S3 buckets into the respective tables of the Redshift star or snowflake schema. It also defines the Redshift UNLOAD command, through which Redshift exception logs are copied to an S3 bucket for further action. API activity from both EMR and Data Pipeline is logged using CloudTrail, which writes its logs to S3 buckets; each application and compute instance can be configured to log to a separate S3 bucket.
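As a rough sketch of the two Redshift statements involved, the helpers below build a COPY and an UNLOAD statement as strings. The table name, bucket paths and IAM role ARN are placeholders, and using the `stl_load_errors` system table as the source of "exception logs" is my assumption about what would be exported; the article does not say which query the pipeline runs.

```python
def build_copy_sql(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement that loads CSV chunks under an S3 prefix."""
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV;"
    )


def build_unload_sql(select_sql: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift UNLOAD statement that writes a query result to S3."""
    # UNLOAD takes the query as a quoted string, so single quotes
    # inside the query have to be doubled.
    escaped = select_sql.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}';"
    )


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"  # placeholder
    print(build_copy_sql("sales_fact", "s3://example-chunks/sales/", role))
    print(build_unload_sql(
        "select * from stl_load_errors where filename like '%sales%'",
        "s3://example-exceptions/redshift/",
        role,
    ))
```

These strings are built from trusted configuration values only; they are not meant for user-supplied input.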

An SNS topic receives event notifications from the S3 buckets and sends them as email messages to administrators or product owners. The event message can then be passed to a Lambda function, where aggregation or business logic can be applied to derive collective log intelligence and actionable data.
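The link between the S3 buckets and SNS is a bucket notification configuration. A minimal sketch of that configuration as a Python dictionary follows; the topic ARN and the `emr/` prefix are placeholders, and the helper name `topic_notification_config` is hypothetical. The resulting dictionary has the shape that boto3's `put_bucket_notification_configuration` accepts as its `NotificationConfiguration` argument, although the sketch itself makes no AWS calls.

```python
from typing import Any, Dict


def topic_notification_config(topic_arn: str, prefix: str = "") -> Dict[str, Any]:
    """Describe 'publish to this SNS topic whenever an object is created'."""
    config: Dict[str, Any] = {
        "TopicArn": topic_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optionally restrict notifications to keys under one prefix,
        # for example the log folder of a single application.
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"TopicConfigurations": [config]}


if __name__ == "__main__":
    print(topic_notification_config(
        "arn:aws:sns:us-east-1:123456789012:example-log-events",  # placeholder
        prefix="emr/",
    ))
```

The email delivery to administrators is then a matter of subscribing their addresses to the same topic.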

The Lambda function can be written in JavaScript on Node.js, as a Java or Scala class, or in a scripting language such as Python to implement the necessary logic.
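Taking the Python option, a minimal handler might look like the sketch below. It assumes the function is subscribed to the SNS topic and that each SNS message carries an S3 event notification as a JSON string; the aggregation shown (counting new log objects per bucket) is only a stand-in for whatever business logic is actually needed.

```python
import json
from collections import Counter
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, int]:
    """Count newly delivered log objects per S3 bucket."""
    counts: Counter = Counter()
    for sns_record in event.get("Records", []):
        # SNS delivers the original S3 notification as a JSON string.
        message = json.loads(sns_record["Sns"]["Message"])
        # S3 test messages carry no "Records" key, so default to empty.
        for s3_record in message.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            counts[bucket] += 1
    return dict(counts)
```

The per-bucket counts could then be compared against a threshold or forwarded to another service.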

This can be customised further by sending log events to third-party log management tools such as Splunk or ELK, which can then be used for long-term, detailed analysis of logs and for report generation.
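For the ELK case, one way to hand events over is the Elasticsearch bulk API, which takes newline-delimited JSON: an action line followed by the document itself, with a trailing newline at the end of the payload. The sketch below only builds that payload; the index name `pipeline-logs` and the sample event fields are assumptions, and sending the payload to a cluster is left out.

```python
import json
from typing import Dict, Iterable


def to_bulk_payload(events: Iterable[Dict], index: str) -> str:
    """Serialise log events as newline-delimited JSON for a bulk index request."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                         # document line
    # The bulk format requires the payload to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


if __name__ == "__main__":
    sample = [{"bucket": "emr-logs", "key": "steps/stderr.gz"}]  # placeholder event
    print(to_bulk_payload(sample, "pipeline-logs"), end="")
```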

I’ve only outlined, at a high level, how we would do this with AWS cloud components and how to manage the logs they generate, so that any anomalies in the pipeline are captured, there is a mechanism for deeper analysis and retrospection, and key logging events can be monitored.

CloudTrail allows us to configure separate buckets for each of the app services and compute instance types. CloudTrail adds another dimension to the monitoring capabilities already offered by AWS; it does not change or replace logging features you might already be using, such as those for Amazon S3 or Amazon CloudFront subscriptions. Amazon CloudWatch focuses on performance monitoring and system health; CloudTrail focuses on API activity. While CloudTrail does not report on system performance or health, you can use CloudTrail in conjunction with CloudWatch Logs alarms to notify you about activities that you might be interested in. I’ve depicted this below.
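Independently of the diagram, the API-activity view that CloudTrail provides can be sketched in a few lines. A CloudTrail log file is a JSON document with a top-level `Records` array; the function below counts calls per service and picks out the calls that failed. The sample records are invented for illustration, and in practice the file would first be downloaded from S3 and decompressed.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def summarize_trail(log_text: str) -> Tuple[Dict[str, int], List[dict]]:
    """Return per-service call counts and the records that carry an error code."""
    records = json.loads(log_text).get("Records", [])
    by_source = Counter(r.get("eventSource", "unknown") for r in records)
    # CloudTrail adds an "errorCode" field only when the API call failed.
    failures = [r for r in records if "errorCode" in r]
    return dict(by_source), failures


if __name__ == "__main__":
    # Invented sample records, not real log output.
    sample = json.dumps({"Records": [
        {"eventSource": "elasticmapreduce.amazonaws.com", "eventName": "RunJobFlow"},
        {"eventSource": "datapipeline.amazonaws.com", "eventName": "ActivatePipeline",
         "errorCode": "AccessDenied"},
    ]})
    counts, failures = summarize_trail(sample)
    print(counts)
    print([f["eventName"] for f in failures])
```

A failed call of this kind is the sort of activity a CloudWatch Logs alarm could be set up to notify on.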


