Articles by Tag #sre

Browse our collection of articles on various topics related to IT technologies. Dive in and explore something new!

A Very Deep Dive Into Docker Builds

Containers are everywhere. From Kubernetes for orchestrating deployments and simplifying operations to...

Nov 26 '24

DynamoDB: Query vs. Scan! Stop Burning Money by Using Scan in Production

How I Saved 50 Thousand Dollars with DynamoDB! I have already managed to save 50 thousand dollars, just...

Oct 28 '24

10 kubectl Plugins That Help Make You the Most Valuable Kubernetes Engineer in the Room

Kubernetes is insanely powerful and becomes much easier to manage when you extend kubectl with...

May 29

💡 Build Along with Me: A Beginner’s Guide to Creating a Student API Using Flask

Today on my journey to gaining DevOps Mastery, I Built a Student REST API with Flask. In this guide,...

Apr 12

How to Score 93% in the Prometheus Certified Associate Exam

Introduction: Passing technical certifications often feels daunting and intimidating. The...

May 7

AWS Observability Maturity Model - V2

Approximately six months ago, I introduced the first version of the AWS Observability Maturity Model....

Sep 14 '24

Troubleshoot Container OOM Kills with eBPF

So I had a little time on my hands a couple weeks ago and decided to explore how to use eBPF to...

Jun 16

Blue/Green and Canary on Kubernetes with Argo Rollouts [Lab Session]

What is Argo Rollouts? Argo Rollouts is a controller for Kubernetes and a set of CRDs that...

Aug 10

Mastering Kubernetes: Become a Pro in K8s Deployments

Kubernetes is a game-changer for managing containerized applications, but mastering it requires more...

Mar 2

No More Surprises: Get Notified on Terraform Deprecations

This is a submission for the Runner H "AI Agent Prompting" Challenge What I Built...

Jun 28

Detecting nginx worker leaks

A good way to understand how software systems work is through studying how they fail. It's why the...

Jun 3

Error Budgets in Practice: A Data-Driven Approach to Risk and Release Management

Why Error Budgets? CoinGecko offers API services to our customers. There are 2 types of...

Jan 20

Designing a fault-tolerant etcd cluster on AWS

Introduction: In this article, we are going to discuss a strongly consistent, distributed...

Nov 4 '24

How a Pod is Deleted - Behind the Scenes Breakdown

When we run kubectl delete pod, the confirmation message pops up saying the pod is deleted (if all...

Oct 25 '24

AIOps Powered by AWS: Developing Intelligent Alerting with CloudWatch & Built-In Capabilities

AIOps is no longer the next big thing — the journey has already started, and you need to get on board...

Jan 5

DevOps Made Simple: A Beginner’s Guide to Self-Healing Systems in DevOps

Introduction: In the fast-paced world of DevOps, system failures are inevitable. However,...

Mar 18

DevOps, SRE, or Platform Engineer? How to Know Which Role Fits You

If you’ve ever browsed tech job boards, you’ve probably seen titles like DevOps Engineer, Platform...

May 18

Introducing Botkube Fuse: The Platform Engineer’s Copilot

The daily grind of platform engineers entails constant switching between tools, handling...

Sep 3 '24

10 Open Source Tools for Observability Every DevOps Engineer Should Know

Hey friends! If you’re working with cloud systems or microservices, you know how important it is to...

May 21

6 Best Free OnCall Software in 2024, Open-Source and SaaS

Discover the 6 best free OnCall software in 2024, including open-source and SaaS solutions, to enhance your incident management process and ensure 24/7 reliability.

Aug 28 '24

🔁 Rollback in DevOps: Why Every Deployment Needs a Safety Net

Ever deployed code to production only to watch everything catch fire? You're not alone. Let's talk...

May 29

How to Write Effective Incident Post-Mortems: A Complete Guide

Learn how to write incident post-mortems that drive real improvements. Discover templates, best practices, and tips for better incident management.

Jul 16

Understanding the 0.6-Second Detection Time for Full Outages

If you’ve explored the widely-read workbook on Site Reliability Engineering (SRE), you might have...

Sep 14 '24

7 Kubernetes Security Best Practices in 2024

Kubernetes (K8S) has revolutionized software development, but managing such a complex system with...

Oct 29 '24

Kubernetes Node Affinity and Anti-Affinity: Scheduling Workloads Effectively

Kubernetes, a robust container orchestration system, empowers developers with advanced scheduling...

Jan 27

Some of the lesser-known ping types you should know

Normal Ping output: > ping 192.168.0.100 PING 192.168.0.100 (192.168.0.100): 56 data bytes 64...

Oct 25 '24

2x Faster, 40% less RAM: The Cloud Run stdout logging hack

Sometimes, the simplest solutions yield the most dramatic improvements. In a recent private project,...

Nov 24 '24

Bring third-party incidents into Better Stack

Connect IsDown to Better Stack and see third-party service outages as incidents in your existing on-call queue, with auto-resolve and live status updates.

May 5

Why Kubernetes No Longer Runs with Docker – Here’s the Reason

Let’s clear up a common confusion: Yes, Kubernetes can run without Docker. But to understand how, we...

May 14

🚀 Day 8: Mastering Shell Scripting in DevOps | Bash Challenge

Welcome back to Day 8 of the #90DaysOfDevOps Challenge! Today, we're diving deep into the basics of...

Nov 14 '24