AWS Cloud Operations & Migrations Blog

Category: Management & Governance

Introducing AWS Audit Manager Common Controls Library

AWS Audit Manager introduced the AWS common controls library to help Governance, Risk and Compliance (GRC) teams efficiently map their enterprise controls into Audit Manager for evidence collection. The common controls library provides customers with a simpler way to collect evidence that supports overlapping controls across multiple compliance standards, streamlining the evidence collection process, reducing […]

Getting started with myApplications for Terraform-managed applications

AWS customers often operate hundreds of applications and have to monitor and manage individual resources to make sure their applications are available, secure, cost-optimized, and performing optimally. In this blog post, we will walk through how to use Terraform to create an application for use with myApplications, add resources to new and existing applications, and scale application management using Terraform.

Centralize observability with Amazon Managed Grafana Enterprise plugins

Observability is a critical aspect of maintaining the health and performance of any distributed system. Organizations rely on data from diverse sources, including AWS services as well as third-party independent software vendors (ISVs), to gain insights into their system’s health. Establishing secure connections to these diverse data sources enables visualization and analysis of observability data […]

Using Permissions to Unlock Resilience with AWS Resilience Hub

AWS customers come to AWS Resilience Hub for the ability to assess their application against their Recovery Time Objectives (RTO), the maximum acceptable time an application can be in a disrupted state, and Recovery Point Objectives (RPO), the maximum amount of data that can be lost due to disruption. Although customers come for the assessment […]

Use Amazon CloudWatch Contributor Insights for general analysis of Apache logs

Customers build, deploy, and maintain millions of web applications on AWS, and many customers deploy these applications using the Apache web application server. Web application performance is a key metric in modern enterprise applications. On AWS, customers use Amazon CloudWatch to monitor response times and uptime and to provide SLAs. Engineering teams that run large-scale applications […]

Get Disk Utilization of Your Fleet Using AWS Systems Manager Custom Inventory Types

Some of my customers need assistance while operating their Amazon Elastic Compute Cloud (Amazon EC2) infrastructure. They need to: Review the disk usage of various volumes/disks within an EC2 instance. To do it in a scalable way, one does not need to access the instance either through a Remote Desktop Protocol (RDP) session or use […]

Resiliency Journey: Exploring how AWS Resilience Hub and Migration Acceleration Program come together

In today’s rapidly evolving digital landscape, the cloud has become the backbone of innovation, scalability, and efficiency for businesses worldwide. As customers embark on their cloud migration journeys, whether the migration has been motivated by the intention of accelerating innovation, reducing operational and infrastructure costs, or exiting their on-premises data center, migrating to the cloud presents […]

Automate CloudWatch Dashboard creation for your AWS Elemental MediaPackage and AWS Elemental MediaLive

Monitoring the health and performance of your media services is critical to ensuring a seamless viewing experience for your customers. Amazon CloudWatch provides powerful monitoring capabilities for Amazon Web Services (AWS) resources. Setting up comprehensive dashboards can be a time-consuming process, especially for organizations managing a large number of resources across multiple Regions. The Automatic CloudWatch […]

Ten Ways to Improve Your AWS Operations

When I take my car in for service for a simple oil change, the technician often reads off a litany of other services my car needs that I had put off since the previous service (and maybe the service before that, too). I tend to wait for the “check engine” light to come on […]

Improve application reliability with effective SLOs

At AWS, we consider reliability as a capability of services to withstand major disruptions within acceptable degradation parameters and to recover within an acceptable timeframe. Service reliability goes beyond traditional disciplines, such as availability and performance, to achieve its goal. Components of a system or application will eventually fail over time. Like our CTO Werner Vogels […]