Diving Deep into Amazon CloudWatch

Diving Deep into Amazon CloudWatch

Introduction

Management and monitoring the resources on AWS are important. It helps in making sure that applications perform as expected, they are running in the most cost-efficient manner, secure, and highly available.

You would also want to monitor the consumption of your application and identify technicalities that may affect performance and availability.

As a management routine, you also need to be able to audit your environment with access to information such as access patterns and identify anomalies that may indicate potential performance or security issues.

What is Amazon CloudWatch?

Amazon CloudWatch is a service that enables you to monitor your AWS resources and applications, running on AWS as well as on-premises in real time.

With CloudWatch, you can also collect resource and application metrics, logs, and events. These data can therefore help in analysis and identify trends.

CloudWatch can also be used to configure Alarms that monitor metrics. If the metrics breach certain thresholds for a specified period, an alarm is generated on which action can be taken to remediate.

The metrics can be either built-in or customized. Metrics exist in the region in which they are created, but you can also configure cross-account, cross-region dashboards in order to gain visibility of those metrics, logs, and alarms.

Amazon CloudWatch Use Cases

1. Infrastructure monitoring and troubleshooting: With key metrics that can be collected, potential issues and bottlenecks can be identified. A root analysis can be conducted based on the data and resolution can be made.

2. Proactive resource optimization: With alarms, metric values can be monitored and triggers can be set if breaches occur. An automatic remediation action can be defined, for instance, by configuring auto-scaling and terminating failed instances. When such happens, a notification can be sent to administrators and system operators.

3. Log analytics: With the help of metrics and logs, the information obtained can be used to address operational issues, potential security attacks, or application performance issues and take effective actions to remediate.

Dashboards

Dashboards are created on Amazon CloudWatch to allow visualization and monitoring of resources and the metrics that are essential. They are configured to provide insights on resources’ health across AWS accounts and Regions

This is an example of AWS CloudWatch Dashboared:


Alarms

An alarm can be configured to monitor a given resource metric such as CPU utilization of an EC2 instance. If the set threshold for a specified time is crossed, the alarm is triggered to make certain actions.

For example, an alarm can be triggered when the average CPU utilization of an EC2 instance goes above 80% for 15 minutes.

When an alarm is triggered, an automatic action such as a Simple Notification Service (SNS) notification is taken to respond to it.

Alarms can be in three states: OK, Alarm, or Insufficient Data.

  • OK—occurs when metric is within the range defined.
  • Alarm— occurs when a metric has breached a threshold for a period.
  • Insufficient data—occurs when the data needed to make the decision is missing or incomplete.

Actions

These are activities taken in response to an alarm.

Examples include:

  • Simple Notification Service (SNS) notification: You can send out automatic alerts to an administrator (application-to-person, or A2P) or push a notification to an application to take some action (application-to-application, or A2A).
  • Auto Scaling action: The EC2 Auto Scaling service can be triggered to add or remove an EC2 instance in response to an alarm state.
  • EC2 action: You can have an alarm trigger an EC2 action, such as stopping an EC2 instance, terminating it, restarting it, or recovering it.

CloudWatch Logs

Logs from both AWS sources, such as EC2 instances, CloudTrail logs, Route 53 DNS queries, and VPC logs, and non-AWS sources, such as your web application access logs and error logs, can be centrally collected and stored in Amazon CloudWatch.

These logs can be queried, analyzed, searched, and filtered for a specific pattern or error codes. They can also be visualized on the dashboard.

Logs do not expire and therefore can be kept indefinitely. However, the retention policy can be adjusted by choosing the retention period between 1 day and 10 years. The logs can be archived into an Amazon S3 using the Glacier classes for long-term storage.

CloudWatch Events

With events, rules can be created to continuously monitor AWS resources and then respond with an action when a given event occurs.

An event can be something like an EC2 instance entering the stop state because someone performed a shutdown operation on the instance or when an IAM user logs into the AWS Management console.

You define a rule to perform an action that is taken when a certain event occurs to what target or resource.

CloudWatch Events can help you respond to operational changes to complete workflows and tasks or take any corrective action if required. CloudWatch events can also be used to schedule automated actions that trigger at certain times to help repeatable day-to-day operations. You utilize standard rate and cron expressions for this.


Let’s do some Practice

  • Create an Amazon SNS notification
  • Configure a CloudWatch alarm
  • Stress test an EC2 instance
  • Confirm that an Amazon SNS email was sent
  • Create a CloudWatch dashboard

1. Configure Amazon SNS

In this task, we create an SNS topic and then subscribe to it with an email address.

Remember, Amazon SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.

Steps:

1. In the AWS Management Console, enter SNS in the search bar, and then choose Simple Notification Service.

2. On the left, choose the button, choose Topics, and then choose Create topic.

3. On the Create topic page in the Details section, you can configure the following options:

  • Type (standard or FIFO)
  • Name, e.g MyAlarm
  • Other optional features such as Encryption, Access policy, Tags etc

4. Choose Create topic.

5. On the MyAlarm details page, choose the Subscriptions tab, and then click Create subscription.

6. On the Create subscription page in the Details section, you can configure the following options:

  • Topic ARN: Leave the default option selected.
  • Protocol: From the dropdown list, choose Email.
  • Endpoint: Enter a valid email address that you can access.

7. Choose Create subscription.

In the Details section, the Status should be Pending confirmation. You should have received an AWS Notification — Subscription Confirmation email message at the email address that you provided in the previous step.

8. Open the email that you received with the Amazon SNS subscription notification, and choose Confirm subscription.

9. Go back to the AWS Management Console. In the left navigation pane, choose Subscriptions.

The Status should now be Confirmed.

In this task, you created an SNS topic and then created a subscription for the topic by using an email address. This topic is now able to send alerts to the email address that you associated with the Amazon SNS subscription.


2. Create a CloudWatch alarm

In this task, you view some metrics and logs stored within CloudWatch. You then create a CloudWatch alarm to initiate and send an email to your SNS topic if the Stress Test EC2 instance increases to more than 60 percent CPU utilization.

Note: CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), IT managers, and product owners. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, and optimize resource utilization. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events. You get a unified view of operational health and gain visibility of your AWS resources, applications, and services running on AWS and on premises.

Steps

1. In the AWS Management Console, enter Cloudwatch in the search bar, and then choose it.

2. In the left navigation pane, choose the Metrics dropdown list, and then choose All metrics

3. On the Metrics page, choose EC2, and choose Per-Instance Metrics.

From this page, you can view all the metrics being logged and the specific EC2 instance for the metrics.

4. Select the check box with CPUUtilization as the metric name for the EC2 instance of your choice. In this example Stress Test EC2 instance

5. In the left navigation pane, choose the Alarms dropdown list, and then choose All alarms.

You now create a metric alarm. A metric alarm watches a single CloudWatch metric or the result of a math expression based on CloudWatch metrics. The alarm performs one or more actions based on the value of the metric or expression relative to a threshold over a number of time periods. The action then sends a notification to the SNS topic that you created earlier.

6. Choose Create alarm.

7. Choose Select metric, choose EC2, and then choose Per-Instance Metrics.

8. Select the check box with CPUUtilization as the Metric name for the EC2 instance of your choice. In this case Stress Test instance name.

9. Choose Select metric.

10. On the Specify metric and conditions page, you can configure the following options:

Metric

  • Metric name: e.g STCPUUtilization
  • InstanceId: Leave the default option selected.
  • Statistic: e.g Average
  • Period: From the dropdown list, e.g., 5 minutes.

Conditions

  • Threshold type: Choose Static
  • Whenever CPUUtilization is…: Choose Greater > threshold.
  • than… Define the threshold value: Enter 60

11. Choose Next.

12. On the Configure actions page, configure the following options:

Notification

  • Alarm state trigger: Choose In alarm.
  • Select an SNS topic: Choose Select an existing SNS topic.
  • Send a notification to…: Choose the text box, and then choose MyAlarm.

13. Choose Next, and then configure the following options:

Name and description

  • Alarm name: Enter CPUUtilizationAlarm
  • Alarm description — optional: Enter CloudWatch alarm for Stress Test EC2 instance CPUUtilization

4. Choose Next

15. Review the Preview and create page, and then choose Create alarm.

In this task, you viewed some Amazon EC2 metrics within CloudWatch. You then created a CloudWatch alarm that initiates an In alarm state when the CPU utilization threshold exceeds 60 percent.


3. Test the Cloudwatch alarm

In this task, you log in to any of your EC2 instance and run a command that stresses the CPU load to 100 percent. This increase in CPU utilization activates the CloudWatch alarm, which causes Amazon SNS to send an email notification to the email address associated with the SNS topic.

  1. You can SSH into your instance and apply the following command to sress the instance, i.e. to manually increase CPU load of the instance.

Click on this link to learn how to SSH into an EC2 instance

Remember we created an alarm that would be triggered if the CPU utilization goes beyond 60%
sudo stress - cpu 10 -v - timeout 400s        

This command runs for 400 seconds, loads the CPU to 100 percent, and then decreases the CPU to 0 percent after the allotted time.

2. Navigate back to the AWS console where you have the CloudWatch Alarms page open.

3. Choose CPUUtilizationAlarm.

4. Monitor the graph while selecting the refresh button every 1 minute until the alarm status is In alarm.

5. Navigate to your email inbox for the email address that you used to configure the Amazon SNS subscription. You should see a new email notification from AWS Notifications.

In this task, you ran a command to load the EC2 instance to 100 percent for 400 seconds. This increase in CPU utilization activated the alarm to go into the In alarm state, and you confirmed the spike in the CPU utilization by viewing the CloudWatch graph. You also received a email notification alerting you of the In alarm state.


4. Create a CloudWatch dashboard

Remember CloudWatch dashboards are customizable home pages in the CloudWatch console that you can use to monitor your resources in a single view. With CloudWatch dashboards, you can even monitor resources that are spread across different Regions. You can use CloudWatch dashboards to create customized views of the metrics and alarms for your AWS resources.

1. Go to the CloudWatch section in the AWS console. In the left navigation pane, choose Dashboards.

2. Choose Create dashboard.

3. Give a name to your Dashboard

4. You can choose from a variety of widgets, i.e Line, Number, Pie etc. Select the most suitable one for you.

5. Click Next

6. A variety of metrics can be displayed on your dashboard. Select the most suitable.

In this case I will select EC2 and proceed with Per-Instance metrics

7. Click Next

8. Click Create widget

9. Choose Save dashboard.

Now you have created a quick access shortcut to view the CPUUtilization metric for the instance.

Consclusion

In this article, we were able gain a deeper understanding on CloudWatch. Apart from the theortical work, the article went further to dive deep into hands-on practicals on how to:

  • Create an Amazon SNS notification
  • Configure a Cloudwatch alarm
  • Stress test an EC2 instance
  • Confirm that an Amazon SNS email was sent
  • Create a CloudWatch dashboard

Amazon CloudWatch is more than just a monitoring tool — it’s your window into the health, performance, and efficiency of your AWS environment. By providing real-time insights, automated alerts, and actionable data, CloudWatch empowers you to proactively manage resources, optimize costs, and ensure seamless application performance. Whether you’re tracking metrics, setting up alarms, or visualizing logs, CloudWatch simplifies the complexity of cloud operations.

As cloud environments grow, tools like CloudWatch become indispensable for maintaining visibility and control. Embrace CloudWatch to transform raw data into meaningful insights, and take your cloud management to the next level. With CloudWatch, you’re not just monitoring your cloud — you’re mastering it.

要查看或添加评论,请登录

STEVEN ODHIAMBO的更多文章

社区洞察

其他会员也浏览了