Understanding Measures of Central Tendency in Data Science
Naomi Jepkorir



Publish Date: Jul 20

When you think of "mean", "median" or "mode", chances are your brain flashes back to a math class you didn’t think you'd ever use again. 😅

But here I am, knee-deep in datasets, and those three little words keep showing up. Not just as formulas, but as powerful tools that help tell the story behind the numbers.

This post is part of my continued journey into data science. After exploring tools like Excel and Power BI, I started digging into core concepts, and measures of central tendency are some of the first I’ve truly appreciated in the real world.

Let’s break it down in plain English 👇

What Are Measures of Central Tendency? 🤔

Measures of central tendency help us understand the “center” or “typical” value in a dataset. Basically, they summarize what’s "normal" in your data, and that's a huge help when you’re making sense of hundreds (or millions) of numbers.

The three most common ones are:

  1. Mean - the average value

  2. Median - the middle value

  3. Mode - the most frequently occurring value

They each tell you something slightly different, and choosing the right one depends on the situation.
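To make those three definitions concrete, here is a minimal sketch of how each one could be computed by hand. The function names and the `scores` list are made up for illustration; in practice you would normally reach for a library instead.

```python
from collections import Counter

def mean_by_hand(values):
    # Add everything up, then divide by how many values there are.
    return sum(values) / len(values)

def median_by_hand(values):
    # Sort first, then take the middle value
    # (or the average of the two middle values when the count is even).
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def mode_by_hand(values):
    # Count each value and return the one seen most often
    # (on a tie, the value that appeared first wins).
    return Counter(values).most_common(1)[0][0]

scores = [3, 7, 7, 2, 9]  # made-up numbers

print(mean_by_hand(scores))    # 5.6
print(median_by_hand(scores))  # 7
print(mode_by_hand(scores))    # 7
```

Notice that the mean (5.6) is not even a value in the list, while the median and mode always are for an odd-sized list like this one.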

Why Do They Matter in Data Science? 🎯

When you're working with data, you're usually trying to:

  • Understand trends

  • Compare groups

  • Make decisions

  • Build predictive models

Measures of central tendency give you a quick pulse check on your dataset. For example:

  • If you’re analyzing income data, the median is usually a better summary than the mean, because a few extreme values (like billionaires) pull the mean upward.

  • If you're reviewing customer ratings from 1 to 5 stars, the mode could show you the most common sentiment.

  • If your data is clean and roughly symmetric with no extreme outliers (for example, close to normally distributed), the mean gives a solid summary.
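The income point is easy to see with a small experiment. This is only a sketch with made-up incomes, where the last value plays the role of the billionaire:

```python
import statistics

# Made-up incomes; the last entry is an extreme outlier.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000, 10_000_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean:   {mean_income:,.0f}")    # mean:   1,700,000
print(f"median: {median_income:,.0f}")  # median: 42,500
```

One outlier drags the mean to 1.7 million, which describes nobody in the list, while the median stays close to what most people in this toy dataset actually earn.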

Real-World Examples 🔍

Here are a few situations where these measures pop up:

  1. 📈 Business Reporting
    Companies use the mean to summarize average sales, costs or customer satisfaction scores over time.

  2. 🏥 Healthcare
    Hospitals might use the median to report wait times, since a few extreme cases can skew the average.

  3. 🛍️ Retail and Marketing
    The mode helps track the most popular product sizes, colors or price points.
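The retail case is worth a quick sketch, because the mode is the only one of the three that works on non-numeric data like sizes or colors. The `sizes_sold` list below is hypothetical:

```python
import statistics
from collections import Counter

# Hypothetical shirt sizes sold in one day.
sizes_sold = ["M", "L", "M", "S", "M", "XL", "L"]

# statistics.mode works on text labels, not just numbers.
most_popular = statistics.mode(sizes_sold)
print(most_popular)  # M

# Counter shows the full ranking, not only the winner.
top_two = Counter(sizes_sold).most_common(2)
print(top_two)  # [('M', 3), ('L', 2)]
```

Trying to take the mean or median of these labels would not make sense, which is exactly why the mode is the go-to for categories.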

A Quick Python Example 🐍

If you’ve got a list of numbers, you can calculate all three super easily with Python's built-in statistics module:

import statistics

data = [1, 2, 2, 3, 4, 4, 4, 5, 6]

mean = statistics.mean(data)     # 31 / 9 ≈ 3.44
median = statistics.median(data) # 4 (the 5th of 9 sorted values)
mode = statistics.mode(data)     # 4 (appears three times)

print(mean, median, mode)


These tiny lines of code can give you a huge amount of insight.
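Two edge cases are worth knowing about: ties for the mode, and lists with an even number of values. The sketch below uses made-up numbers and assumes Python 3.8 or newer, where `statistics.multimode` is available and `statistics.mode` returns the first of several tied values instead of raising an error.

```python
import statistics

ratings = [5, 4, 5, 4, 3]  # made-up; 5 and 4 both appear twice

# mode returns only one value, the first of the tied ones.
first_mode = statistics.mode(ratings)
print(first_mode)  # 5

# multimode returns every tied value, in the order first seen.
all_modes = statistics.multimode(ratings)
print(all_modes)  # [5, 4]

# With an even number of values, the median is the average
# of the two middle values.
even_sized = [1, 2, 3, 4]
even_median = statistics.median(even_sized)
print(even_median)  # 2.5
```

If a dataset might have more than one mode, `multimode` is the safer call, since a single answer can hide the tie.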

My Reflection 💭

At first, I thought central tendency was just for passing stats exams. Now, I see it as one of the first things you should check when exploring a new dataset. It gives you a quick overview, helps spot data issues and sets the stage for deeper analysis or modeling.

Plus, it’s foundational. Whether you're in Excel, Python or SQL, you'll use these concepts everywhere.

If you're just getting started in data science like I am, don't overlook the basics. They’re called “central” for a reason. 😉
