AWS Certification : Troubleshooting and Monitoring EC2 instances using CloudWatch

Are you preparing for the AWS Certified SysOps Administrator – Associate certification exam? Are you ready to pass it? On this blog, we are writing a series of articles on topics covered in the AWS Certified SysOps Administrator – Associate exam. You can subscribe to receive further updates on this topic.

The SysOps Associate exam is generally considered the hardest exam at the associate level. We recommend passing both the Solutions Architect Associate and the Developer Associate certification exams before taking this one.

How to troubleshoot and monitor using CloudWatch?

The AWS Certified SysOps Administrator – Associate exam validates technical expertise in deployment, management, and operations on the AWS platform.

The AWS Certified SysOps Administrator – Associate Level exam validates the candidate’s ability to:

  • Deliver the stability and scalability needed by a business on AWS
  • Provision systems, services and deployment automation on AWS
  • Ensure data integrity and data security on AWS technology
  • Provide guidance on AWS best practices
  • Understand and monitor metrics on AWS

Figure #0.  Domains covered at the AWS Certified SysOps associate exam

You can download the AWS Certified SysOps Administrator – Associate Level Exam Blueprint for more details.

In this article, we explain the topic of monitoring the availability and performance of EC2 instances, as highlighted in the exam blueprint above.

Context

As part of managing and operating infrastructure on the AWS platform, a SysOps administrator must ensure the availability and performance of EC2 instances by monitoring them, troubleshooting any errors that occur, and preventing their recurrence.

To make this work easier, AWS offers a service called CloudWatch.

What is AWS CloudWatch?

CloudWatch is the monitoring service for AWS resources and the applications that run on AWS. With this service, you can collect and track metrics, collect and monitor log data, and set alarms to troubleshoot issues related to AWS resources.

You can use Amazon CloudWatch to gain system-wide visibility into resource utilization, application performance, and operational health. Metrics are provided automatically for many AWS services, including Amazon EC2 instances, EBS volumes, Elastic Load Balancers, Auto Scaling groups, RDS instances, DynamoDB tables, ElastiCache clusters, Redshift clusters, OpsWorks stacks, Route 53 health checks, SNS topics, SQS queues, SWF workflows, and Storage Gateways.

By default, each EC2 instance has basic monitoring enabled, which you can review in the Monitoring tab of the Amazon EC2 console.

With basic monitoring, data is collected at 5-minute intervals (one data point per interval) at no charge.
It covers basic metrics such as CPU utilization, disk read/write bytes and operations, network in/out bytes and packet counts, status checks, and CPU credit usage and balance. If you want to measure other things, such as memory (RAM) usage, you need to publish a custom metric to CloudWatch.
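
As an illustration of the custom metric idea, here is a minimal Python sketch for a Linux instance. It assumes the `cloudwatch` argument is a boto3 CloudWatch client (for example `boto3.client("cloudwatch")`), whose `put_metric_data` call is used. The namespace `Custom/System`, the metric name `MemoryUtilization`, and the function names are hypothetical choices made for this example, not names defined by AWS.

```python
def memory_used_percent(meminfo_text):
    """Compute the used-memory percentage from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are reported in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def build_memory_datum(instance_id, used_percent):
    """Build one data point in the shape expected by put_metric_data."""
    return {
        "MetricName": "MemoryUtilization",  # hypothetical metric name
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Value": used_percent,
        "Unit": "Percent",
    }


def publish_memory_metric(cloudwatch, instance_id, meminfo_text,
                          namespace="Custom/System"):
    """Publish memory usage as a custom metric; the namespace is an assumption."""
    datum = build_memory_datum(instance_id, memory_used_percent(meminfo_text))
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[datum])
    return datum
```

On the instance itself, the text would come from reading `/proc/meminfo`, and the function could be run on a schedule (for example from cron) so that a data point arrives every few minutes.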

In basic monitoring mode, the 5-minute data points are retained for 63 days.

Figure #1. Basic monitoring per each EC2 instance


You can also enable detailed monitoring. This mode carries an additional cost and collects data points at 1-minute intervals. One-minute data points are retained for 15 days, even if the EC2 instance has been terminated.
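
The difference between the two modes shows up in the period you request when reading a metric. The following Python sketch builds a CPU utilization query with a 300-second period for basic monitoring or a 60-second period for detailed monitoring. It assumes `cloudwatch` is a boto3 CloudWatch client and uses its `get_metric_statistics` call; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_query(instance_id, detailed=False, hours=1, now=None):
    """Build get_metric_statistics parameters for average CPU utilization."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        # 1-minute data exists only when detailed monitoring is enabled
        "Period": 60 if detailed else 300,
        "Statistics": ["Average"],
    }


def fetch_cpu_averages(cloudwatch, query):
    """Return (timestamp, average) pairs sorted by time.

    The service does not guarantee the order of data points, so sort them.
    """
    response = cloudwatch.get_metric_statistics(**query)
    points = sorted(response.get("Datapoints", []), key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Average"]) for p in points]
```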

How to enable detailed Monitoring?

To enable detailed monitoring for an existing EC2 instance using the AWS console, follow these steps:

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/ .
  2. In the navigation pane, choose Instances.
  3. Select the instance, choose Actions, CloudWatch Monitoring, Enable Detailed Monitoring.
  4. In the Enable Detailed Monitoring dialog box, choose Yes, Enable.
  5. Choose Close.
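
The same change can be scripted instead of clicked through. This is a minimal sketch that assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`) and uses its `monitor_instances` call; the function name and the instance IDs you pass in are your own.

```python
def enable_detailed_monitoring(ec2, instance_ids):
    """Enable detailed monitoring and return {instance_id: monitoring_state}."""
    response = ec2.monitor_instances(InstanceIds=list(instance_ids))
    return {
        item["InstanceId"]: item["Monitoring"]["State"]
        for item in response.get("InstanceMonitorings", [])
    }
```

The returned state may read `pending` right after the call and change to `enabled` shortly afterwards.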

To automate your duties, you can set alarms on any of your metrics to receive notifications or take other automated actions when a metric crosses a specified threshold. You can use alarms to detect and shut down Amazon EC2 instances that are unused or underutilized, which helps reduce cost. You can configure alarm actions to stop, terminate, reboot, or recover an Amazon EC2 instance when certain criteria are met, for example when CPU utilization is lower or higher than expected.

An alarm action can send a notification to an Amazon Simple Notification Service (SNS) topic (delivered by SMS, email, or an HTTP endpoint), trigger an Auto Scaling policy, or change the state of the instance (stop or terminate).
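
To make this concrete, here is a hedged Python sketch of an alarm that stops an instance whose average CPU utilization stays low. It assumes `cloudwatch` is a boto3 CloudWatch client and uses its `put_metric_alarm` call. The alarm name, the 5% threshold, and the one-hour window (12 periods of 5 minutes) are illustrative assumptions, not recommended values.

```python
def build_idle_stop_alarm(instance_id, region, threshold_percent=5.0,
                          periods=12, sns_topic_arn=None):
    """Build put_metric_alarm parameters that stop an idle instance."""
    # Built-in EC2 stop action; an SNS topic can be notified as well.
    actions = [f"arn:aws:automate:{region}:ec2:stop"]
    if sns_topic_arn:
        actions.append(sns_topic_arn)
    return {
        "AlarmName": f"idle-stop-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                 # matches basic monitoring granularity
        "EvaluationPeriods": periods,  # 12 x 5 minutes = 1 hour
        "Threshold": threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": actions,
    }


def create_alarm(cloudwatch, params):
    """Create or update the alarm described by params."""
    cloudwatch.put_metric_alarm(**params)
```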

Figure #2. Define a new alarm for CPU Utilization


About EC2 Instance Monitoring

With instance status monitoring, you can quickly determine whether Amazon EC2 has detected any problems that might prevent your instances from running applications. Amazon EC2 performs automated checks on every running EC2 instance to identify hardware and software issues.

Figure #3. Detecting status checks

You must differentiate system status checks from instance status checks: the first checks the underlying physical host, and the second checks the virtual machine (VM) itself. If a failure occurs in the hardware or in the virtual machine behind your EC2 instance, these checks help you troubleshoot the problem.
Problems such as loss of network connectivity, loss of system power, or software or hardware issues on the physical host are reflected by the system status checks. To resolve any of these issues, restart your VM by stopping and starting the instance (for an EBS-backed instance, this typically moves it to a healthy host).

Problems such as a corrupted file system, an incompatible kernel, misconfigured networking or startup configuration, or exhausted memory are reflected by the instance status checks. To resolve any of these issues, reboot the instance or make modifications in its operating system.
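
The distinction between the two checks can be expressed as a small triage helper. This sketch assumes `ec2` is a boto3 EC2 client and reads the `SystemStatus` and `InstanceStatus` fields returned by `describe_instance_status`. The function names and the wording of the suggestions are hypothetical and simply mirror the guidance above.

```python
def suggest_action(entry):
    """Map one InstanceStatuses entry to a troubleshooting suggestion."""
    system = entry.get("SystemStatus", {}).get("Status")
    instance = entry.get("InstanceStatus", {}).get("Status")
    if system == "impaired":
        # The problem is on the physical host side.
        return "system check failed: stop and start the instance"
    if instance == "impaired":
        # The problem is inside the virtual machine.
        return "instance check failed: reboot or fix the OS configuration"
    if system == "ok" and instance == "ok":
        return "healthy"
    return "no verdict yet"  # e.g. initializing or insufficient-data


def review_status(ec2, instance_ids):
    """Return {instance_id: suggestion} for the given instances."""
    response = ec2.describe_instance_status(
        InstanceIds=list(instance_ids),
        IncludeAllInstances=True,  # also report instances that are not running
    )
    return {
        entry["InstanceId"]: suggest_action(entry)
        for entry in response.get("InstanceStatuses", [])
    }
```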

You can also create status check alarms for existing EC2 instances to monitor instance status or system status.

If you, as a SysOps administrator, automate many operational actions, you can prevent issues and reduce the time needed to resolve them. Remember to always monitor your EC2 instances and automate alarm actions for them; doing so brings multiple benefits.
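
A status check alarm could look like the following Python sketch. It assumes `cloudwatch` is a boto3 CloudWatch client and uses the EC2 metrics `StatusCheckFailed_System` and `StatusCheckFailed_Instance` together with the built-in recover and reboot alarm actions. The alarm names and the two-period evaluation window are illustrative assumptions, and the recover action is only available on supported instance types.

```python
# check kind -> (metric name, built-in EC2 alarm action)
CHECKS = {
    "system": ("StatusCheckFailed_System", "recover"),
    "instance": ("StatusCheckFailed_Instance", "reboot"),
}


def build_status_check_alarm(instance_id, region, check="system"):
    """Build put_metric_alarm parameters for a failed status check."""
    metric_name, action = CHECKS[check]
    return {
        "AlarmName": f"{check}-check-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",   # the metric is 1 when the check failed
        "Period": 60,
        "EvaluationPeriods": 2,   # two consecutive failed minutes
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:{action}"],
    }


def create_status_check_alarm(cloudwatch, instance_id, region, check="system"):
    """Create or update the alarm and return the parameters that were sent."""
    params = build_status_check_alarm(instance_id, region, check)
    cloudwatch.put_metric_alarm(**params)
    return params
```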


Important Points to Remember for the AWS Certified SysOps Administrator – Associate Certification exam

  • CloudWatch is the monitoring service for AWS resources and the applications that run on AWS
  • By default, basic monitoring is enabled on your EC2 instance
  • In basic monitoring, data is captured every 5 minutes at no charge
  • In detailed monitoring, data is captured every 1 minute at additional cost
  • You can use alarms to detect and shut down Amazon EC2 instances that are unused or underutilized
  • You can simulate an alarm using the AWS CLI
  • The system status check verifies the underlying physical host
  • The instance status check verifies the virtual machine (VM) itself
  • You have to publish a custom metric for other measures such as RAM usage
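
The point about simulating an alarm can also be tried from Python rather than the AWS CLI. This sketch assumes `cloudwatch` is a boto3 CloudWatch client and uses its `set_alarm_state` and `describe_alarms` calls; the function names and the default reason text are hypothetical.

```python
def simulate_alarm(cloudwatch, alarm_name, reason="Manual test of alarm actions"):
    """Force an alarm into the ALARM state so its actions are triggered."""
    cloudwatch.set_alarm_state(
        AlarmName=alarm_name,
        StateValue="ALARM",
        StateReason=reason,
    )


def current_alarm_state(cloudwatch, alarm_name):
    """Return the alarm's state (OK, ALARM, INSUFFICIENT_DATA) or None."""
    response = cloudwatch.describe_alarms(AlarmNames=[alarm_name])
    alarms = response.get("MetricAlarms", [])
    return alarms[0]["StateValue"] if alarms else None
```

The forced state is temporary: the alarm returns to its real state at the next evaluation, so this is only a way to test that notifications and actions are wired up correctly.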

 

Glossary

Term | Brief description
Basic Monitoring | Data is available automatically in 5-minute periods at no charge.
Detailed Monitoring | Data is available in 1-minute periods for an additional cost. To get this level of data, you must specifically enable it for the instance. For the instances where you’ve enabled detailed monitoring, you can also get aggregated data across groups of similar instances.
Metric | A metric represents a time-ordered set of data points that are published to CloudWatch.
Custom metric | Your own metrics published to CloudWatch using the AWS CLI or an API. CloudWatch stores data about a metric as a series of data points.
Alarm | An alarm watches a single metric over a specified time period, and performs one or more specified actions, based on the value of the metric relative to a threshold over time.
Time stamp | Time stamps are dateTime objects, with the complete date plus hours, minutes, and seconds.


Table #1. Important Terms

Summary

In this article, we explained how to monitor and troubleshoot EC2 instances using CloudWatch. We described the AWS CloudWatch service, how to create alarms, and the relevant points about the service that need to be remembered for the exam.

If you are preparing for the AWS certification exams and are looking for help, please send us an email or call our customer support team.

