AWS Elasticsearch for Log Visualization

This is the sixth article in a series about implementing an AWS Serverless Web Application for global customers of a large enterprise. If you are lost, please read the first article.

One of the most interesting applications I have architected recently is a completely serverless web application hosted on Amazon Web Services. “Serverless”, in addition to being a cool technology, is a real breadwinner for me. Besides that, it has the potential to save my clients hundreds of thousands of dollars over the months to come. All this comes with the built-in high performance of DynamoDB, AWS Lambda, CloudFront and other cloud-native technologies. I could go on forever with my sales pitch for FastUp and Amazon Web Services, so I won’t get started.

Getting back to the topic: for a serious enterprise (think Fortune 1000), the serverless paradigm brings great advantages, but it also calls for a new way of thinking about compute, storage, content and peripheral concerns such as backup, log visualization and security. Today’s post is about log visualization. Once you go serverless, application and infrastructure logs end up in a variety of places, which makes it hard for developers and operators to locate and search them when responding to production incidents. Many third-party products, Splunk among them, provide complete log visualization services. While these products do offer superior features, they can be very expensive. Elasticsearch and Kibana, by contrast, are open source, come very close to the required functionality and are inexpensive to run (for our use case). Amazon provides Elasticsearch with Kibana as a service under its group of “Analytics” services. Both are very versatile products, and you should research their other capabilities independently. For this post, I will focus on how we used them for log visualization. Here is a diagram.

Fig. 1 Here is how our (mostly) Serverless stack sends log events to Elasticsearch via Kinesis Firehose and Lambda

Overview

On the left of the diagram are all the services we have implemented for our serverless stack. Note that there are a few EC2 instances; those run a legacy authentication solution for a product, which we will be replacing with AWS Cognito very soon. The one service that is not shown is S3, where we store our HTML and JavaScript as well as user-generated content. We don’t push S3 access logs into Elasticsearch simply because we don’t need them for day-to-day log visualization. We may need S3 access logs for forensics in case of a security breach. For that, we have a completely different solution.

End User Access Logs

The top two boxes on the left (in the diagram) are CloudFront and the Application Load Balancers. When an end user downloads a file from CloudFront or attempts to authenticate against our legacy authentication solution, these two services send access logs to an S3 bucket. CloudFront has a well-defined log format, and so does the ALB. That S3 bucket is configured to notify a couple of Python Lambda functions whenever new logs are written to it. Each Lambda function transforms the log messages into JSON and pushes them into a Kinesis Firehose Delivery Stream.
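As a rough illustration of the S3-triggered transformation described above, here is a minimal sketch for CloudFront standard logs only (ALB logs use a different, space-separated layout and would need their own parser). It is not our production code: the function names, the DELIVERY_STREAM environment variable and its default value are assumptions made for the example. It relies on CloudFront log files being gzip-compressed, with a "#Fields:" header line followed by tab-separated rows.

```python
import gzip
import json
import os
import urllib.parse

# Hypothetical stream name; in practice it would come from the function's configuration.
STREAM_NAME = os.environ.get("DELIVERY_STREAM", "log-delivery-stream")


def parse_cloudfront_log(text):
    """Yield one dict per entry from the text of a CloudFront standard log file."""
    fields = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            # The header names the columns, separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            # Data rows are tab-separated, in the same order as the header.
            yield dict(zip(fields, line.split("\t")))


def to_firehose_records(text):
    """Turn each log entry into a single-line JSON record for Firehose."""
    return [{"Data": json.dumps(entry).encode("utf-8")}
            for entry in parse_cloudfront_log(text)]


def handler(event, context):
    """Lambda entry point for S3 object-created notifications."""
    import boto3  # imported here so the helpers above can be tested without AWS

    s3 = boto3.client("s3")
    firehose = boto3.client("firehose")
    for notification in event["Records"]:
        bucket = notification["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = urllib.parse.unquote_plus(notification["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        records = to_firehose_records(gzip.decompress(body).decode("utf-8"))
        # put_record_batch accepts at most 500 records per call.
        for start in range(0, len(records), 500):
            firehose.put_record_batch(
                DeliveryStreamName=STREAM_NAME,
                Records=records[start:start + 500],
            )
```

The sketch ignores partial failures reported by put_record_batch; a real function would inspect the response and retry the rejected records.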

API Access Logs

The third box is the API Gateway, which writes its access and error logs into a CloudWatch Log Stream in a Log Group. The CloudWatch Log Group name corresponds to the API Gateway id and stage and hence does not change. This allows us to easily connect this stream to another Python Lambda function that transforms log messages to JSON and pushes them to the Kinesis Firehose Delivery Stream.
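When CloudWatch Logs invokes a subscribed Lambda function, the log events arrive base64-encoded and gzip-compressed under event["awslogs"]["data"]. The sketch below shows one way the decoding and reshaping step could look; the output field names (log_group, log_stream and so on) are assumptions chosen for the example, not the names we actually use.

```python
import base64
import gzip
import json


def decode_cloudwatch_event(event):
    """Decode the payload CloudWatch Logs hands to a subscribed Lambda function."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def to_documents(payload):
    """Flatten a decoded payload into one JSON-ready dict per log event."""
    if payload.get("messageType") != "DATA_MESSAGE":
        # Control messages carry no application log events.
        return []
    return [
        {
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
            "timestamp": log_event["timestamp"],  # epoch milliseconds
            "message": log_event["message"],
        }
        for log_event in payload["logEvents"]
    ]
```

Each resulting dict can then be serialized with json.dumps and sent to the delivery stream in the same way as the access log records.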

Application Event Log

Finally, the two bottom boxes on the left are all our compute (Lambda and EC2). Within our code, we send all log events as they arise into the Kinesis Firehose Delivery Stream using the AWS SDK for .NET. These log events are already in JSON format.
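Our application code does this with the .NET SDK; to stay consistent with the Lambda functions, the sketch below shows the same idea in Python using the Firehose put_record call. The helper names, the event fields (timestamp, level, message) and the stream name in the usage comment are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def build_log_record(level, message, **fields):
    """Build a Firehose record holding one JSON-formatted log event."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
    }
    event.update(fields)  # extra context such as a request id
    return {"Data": json.dumps(event).encode("utf-8")}


def send_log(firehose_client, stream_name, level, message, **fields):
    """Send a single log event to the delivery stream as it arises."""
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record=build_log_record(level, message, **fields),
    )


# Usage, assuming boto3 is installed and a stream named "app-logs" exists:
#   import boto3
#   send_log(boto3.client("firehose"), "app-logs", "ERROR",
#            "payment failed", order_id="123")
```

Passing the client in as a parameter keeps the helper easy to test with a stand-in object and lets a long-running process reuse one client.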

Delivery to Elasticsearch

The Kinesis Firehose Delivery Stream is configured to deliver all data into an AWS Elasticsearch cluster once it has 1 MB of data to deliver or a minute has passed, whichever comes first (aka buffering). This way, all logs reach Elasticsearch within about a minute to a minute and a half.
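In the Firehose API, these two thresholds are expressed as the destination's BufferingHints (SizeInMBs and IntervalInSeconds). The sketch below pairs those hints with a small function that mimics the "whichever comes first" rule. The function is only an illustration of the rule, not how Firehose is implemented, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
# Buffering settings as described above: 1 MB or 60 seconds.
BUFFERING_HINTS = {"SizeInMBs": 1, "IntervalInSeconds": 60}


def should_flush(buffered_bytes, seconds_since_last_flush, hints=BUFFERING_HINTS):
    """Return True when either buffering threshold has been reached."""
    size_reached = buffered_bytes >= hints["SizeInMBs"] * 1024 * 1024
    interval_elapsed = seconds_since_last_flush >= hints["IntervalInSeconds"]
    return size_reached or interval_elapsed
```

Under light traffic the interval is what triggers delivery, which is why the worst-case delay is governed by the one-minute setting rather than the size.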

Log Visualization

AWS Elasticsearch comes with a Kibana front end where our developers go to search for log events. It is very fast and feature-rich: developers can filter for the logs they are interested in, run free-text searches and build visualizations of selected metrics. Elasticsearch has a concept of index templates, which we have implemented so that logs are indexed the way we like and are sorted by log event time.
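To give a feel for what an index template looks like, here is a hypothetical one built as a Python dict. The index pattern and the field names are assumptions, not our actual template, and the body uses the typeless mapping style of Elasticsearch 7.x; older versions expect a mapping type name (and, before 6.0, a "template" key instead of "index_patterns"). Mapping the event time as a date field is what lets Kibana use it as the time field and sort by it.

```python
import json

# Hypothetical template applied to every index whose name matches "logs-*".
LOG_TEMPLATE = {
    "index_patterns": ["logs-*"],
    "mappings": {
        "properties": {
            "timestamp": {"type": "date"},   # log event time, used for sorting
            "level": {"type": "keyword"},    # exact-match filtering
            "message": {"type": "text"},     # free-text search
        }
    },
}

# The JSON body that would be sent to the cluster's _template endpoint.
template_body = json.dumps(LOG_TEMPLATE, indent=2)
```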

The only disadvantage of using AWS’ Elasticsearch/Kibana setup is that AWS does not yet support X-Pack, and hence we cannot do user authentication for the Kibana front end. That is not fun for log events that are security-related or sensitive in nature. For the majority of enterprise use cases, this should not be a big deal. Hopefully, AWS will release an X-Pack update for Elasticsearch. Or will they?

EDIT: As of today, AWS has released support for authentication via AWS Cognito.

As I mentioned previously, there are other alternatives to using AWS’ Elasticsearch. There may be supported solutions in the AWS Marketplace or third party services such as Splunk.
