📊 New: Amazon CloudWatch Agent Now Supports Detailed EBS Performance Metrics (June 2025)
Latchu@DevOps · Jun 10, 2025
Good news for developers, SREs, and cloud engineers — Amazon CloudWatch Agent now supports collecting detailed performance statistics for EBS volumes attached to EC2 and EKS nodes.

This means you can finally monitor and troubleshoot your EBS storage like a pro — with visibility into NVMe-level metrics such as:

  • 🔁 IOPS (read/write operations)
  • 📦 Throughput (bytes read/written)
  • ⏱️ I/O wait time
  • 🎯 Queue depth

Let’s break it down with a real-world example.


🔧 Use Case: App is Slow, But CPU & RAM Look Fine?

You’re running a production web app on EC2 with a gp3 EBS volume.
The app gets sluggish during peak hours, but CloudWatch shows:

  • CPU: fine
  • Memory: fine
  • Network: fine

Now, thanks to the new update, you can collect EBS disk-level metrics and discover the real problem.


🧪 Step-by-Step Example

Step 1: Enable EBS Metrics in CloudWatch Agent

Update your amazon-cloudwatch-agent.json config:

{
  "metrics": {
    "metrics_collected": {
      "diskio": {
        "resources": ["*"],
        "measurement": [
          "reads", "writes", "read_bytes", "write_bytes",
          "io_time", "read_time", "write_time", "iops_in_progress"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}
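Optionally, you can also add append_dimensions next to metrics_collected so every datapoint is tagged with the instance it came from. A minimal sketch (the ${aws:...} placeholders are resolved by the agent on EC2; keep your existing metrics_collected block as-is):

{
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": { ... }
  }
}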

Then restart the agent:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -c file:/path/to/config.json -s
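If metrics don't show up, a quick sanity check is to confirm the agent is running and picked up the new config (paths below are the defaults on Amazon Linux; yours may differ):

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
sudo tail -n 50 /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log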

Step 2: View in CloudWatch

You'll now see custom metrics (published under the CWAgent namespace by default) such as:

  • read_time / write_time → time spent servicing reads and writes (how long the app waits for I/O)
  • iops_in_progress → how many I/O operations are in flight (queue depth)
  • io_time → total time the volume spends busy on I/O
  • read_bytes, write_bytes → data throughput
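To confirm they're flowing, you can list them with the AWS CLI. A small sketch, assuming the default CWAgent namespace and the agent's diskio_ metric-name prefix (run the first command alone if your names differ):

aws cloudwatch list-metrics --namespace CWAgent
aws cloudwatch list-metrics --namespace CWAgent --metric-name diskio_write_time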

Step 3: Analyze & Act

During peak load:

  • iops_in_progress = 22 (queue depth far above normal)
  • write_time spikes (I/O delays become noticeable)
  • write_bytes drops sharply

🧠 Root cause: EBS is bottlenecked. Time to provision more IOPS or switch from gp3 to io2.
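Here's a rough sketch of pulling one of these metrics over the peak window with the AWS CLI. The instance ID, device name, and dimension keys are placeholders; check your list-metrics output for the exact dimensions your agent publishes:

aws cloudwatch get-metric-statistics \
  --namespace CWAgent \
  --metric-name diskio_iops_in_progress \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 Name=name,Value=nvme1n1 \
  --start-time 2025-06-10T09:00:00Z \
  --end-time 2025-06-10T12:00:00Z \
  --period 300 \
  --statistics Average Maximum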


✅ Why This Matters

Benefit → Impact

  • Granular storage insights → understand app latency at the disk level
  • Real-time metrics → catch slowdowns before users do
  • Automation ready → build alarms & dashboards (see the example below)
  • Works with EC2 + EKS → great for both VMs and containers
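For example, a sketch of an alarm on sustained queue depth using the AWS CLI. The alarm name, threshold, instance ID, and SNS topic ARN are all placeholders to adapt to your environment:

aws cloudwatch put-metric-alarm \
  --alarm-name high-ebs-queue-depth \
  --namespace CWAgent \
  --metric-name diskio_iops_in_progress \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 15 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts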

🧾 TL;DR

🚀 CloudWatch Agent now supports:

  • NVMe-based EBS performance metrics
  • Queue depth, IOPS, throughput, and more
  • Alarms, dashboards, and smarter diagnostics

No more guessing — now you can see and solve storage bottlenecks confidently.


Have you started using EBS metrics in your monitoring stack?
Drop your setup or questions in the comments
