10 Best Practices for API Rate Limiting
Arunangshu Das


If you’ve ever built an API for your SaaS, mobile app, or backend services, you’ve probably been hit by the consequences of not having proper rate limiting in place. Whether it's a rogue script flooding your endpoints or an unexpected spike from a viral app update, uncontrolled API traffic can cripple your systems.

And it doesn’t just end with downtime. You get:

  • Angry users.
  • Costly infrastructure bills.
  • A damaged brand reputation.

What is API Rate Limiting?

Rate limiting is the process of restricting the number of API requests a user or client can make in a given timeframe.

For example:

  • 100 requests per minute
  • 1000 requests per day
  • 10 login attempts every 15 minutes

This prevents abuse, ensures fair usage, and protects backend resources from overload.

It’s especially crucial when you're hosting on cloud platforms. Even if your infrastructure scales automatically, uncontrolled traffic means uncontrolled costs.

Let’s talk about how to do it right.

1. Set Smart and Granular Limits

Setting a flat limit for everyone is simple, but not smart.

Instead, tailor your limits:

  • Per user or API key
  • Per IP address
  • Per route (e.g., /login vs /get-profile)
  • Per subscription plan (Free, Pro, Enterprise)

Real-world tip: Limit high-impact endpoints like login or payment routes more strictly to avoid brute-force attacks or fraud.

Example:

const rateLimit = require('express-rate-limit');

// Assumes an auth middleware has already attached req.user with a `plan` field.
app.use('/api', rateLimit({
  windowMs: 60 * 1000, // 1-minute window
  max: (req, res) => (req.user?.plan === 'pro' ? 1000 : 100),
}));

When you’re on a scalable platform like Cloudways, this level of control allows you to allocate resources wisely without overpaying for infrastructure.

2. Use Sliding Window or Token Bucket Algorithms

Basic fixed-window rate limiting has loopholes. For instance, a user can send 100 requests at the end of one window and 100 more right after it resets, effectively doubling their rate at the boundary.

Better approaches:

  • Sliding Window Log – Checks request timestamps for more accurate control.
  • Leaky Bucket / Token Bucket – Allows short bursts but smooths out traffic over time.

These algorithms balance fairness and flexibility. You don’t frustrate users for minor spikes, but still guard against abuse.

Popular libraries like rate-limiter-flexible in Node.js support these strategies. They also integrate easily with Redis, making them fast and distributed.
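
As a concrete sketch, here's what a Redis-backed limiter with rate-limiter-flexible might look like, assuming ioredis as the client; the points and duration values are placeholders to adapt:

const Redis = require('ioredis');
const { RateLimiterRedis } = require('rate-limiter-flexible');

const redisClient = new Redis({ enableOfflineQueue: false });

const limiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'api',
  points: 100,      // allow 100 requests...
  duration: 60,     // ...per 60 seconds per key
  execEvenly: true, // spread allowed requests evenly over the window (leaky-bucket style)
});

app.use('/api', (req, res, next) => {
  limiter.consume(req.ip) // keyed by IP here; a user ID or API key works too
    .then(() => next())
    .catch(() => res.status(429).send('Too Many Requests'));
});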

3. Give Clients Feedback

Don’t just block requests—communicate!

Send proper HTTP headers:

  • X-RateLimit-Limit – total allowed
  • X-RateLimit-Remaining – requests left
  • X-RateLimit-Reset – time when the limit resets

Also, respond with:

429 Too Many Requests
Retry-After: 60

Good feedback helps devs adjust their apps, automate retries, and avoid frustration.

If you’re building public APIs or SDKs, adding this is a must.
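
Here's a hand-rolled sketch of that feedback loop in Express; the getQuota lookup is a hypothetical stand-in for however you track per-client state:

app.use((req, res, next) => {
  const { limit, remaining, resetAt } = getQuota(req); // hypothetical per-client quota lookup
  res.set('X-RateLimit-Limit', String(limit));
  res.set('X-RateLimit-Remaining', String(Math.max(0, remaining)));
  res.set('X-RateLimit-Reset', String(Math.floor(resetAt / 1000))); // epoch seconds
  if (remaining <= 0) {
    res.set('Retry-After', String(Math.ceil((resetAt - Date.now()) / 1000)));
    return res.status(429).json({ error: 'Too many requests' });
  }
  next();
});

If you use express-rate-limit, note that it can emit these for you: legacyHeaders enables the X-RateLimit-* family, and standardHeaders enables the newer draft RateLimit-* headers.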

4. Log and Monitor Rate Limit Activity

You don’t know what you don’t track.

Make sure every rate-limited request is logged. Track:

  • Endpoint
  • IP address or user ID
  • Timestamps
  • Count of rejected requests

Send them to your monitoring platform—Datadog, Sentry, or even a simple logging service. With Cloudways, integrating real-time log management tools is seamless and resource-efficient.

Pro tip: Set up alerts for suspicious patterns (like repeated 429s from a single IP or bot).
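
One way to wire this up, assuming a generic logger (Winston, Pino, or similar) is already configured; express-rate-limit's handler option gives you a hook at exactly the right moment:

// Log every rejected request with enough context to spot abuse patterns.
function logRateLimitHit(req) {
  logger.warn('rate_limit_exceeded', {
    endpoint: req.originalUrl,
    ip: req.ip,
    userId: req.user ? req.user.id : null,
    timestamp: new Date().toISOString(),
  });
}

app.use(rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  handler: (req, res) => { // called for every request over the limit
    logRateLimitHit(req);
    res.status(429).json({ error: 'Too many requests' });
  },
}));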

5. Whitelisting and Special Exceptions

Some systems or services shouldn’t be rate limited.

Examples:

  • Your internal monitoring services
  • Partner integrations
  • Admin dashboards
  • Critical cron jobs

Create a whitelist mechanism in your rate limiter:

if (isWhitelisted(req.user)) return next();

Be careful: Always log exceptions and review them regularly. Abuse can sneak in through whitelisted backdoors.
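
Fleshed out a little, with that audit logging baked in; the WHITELIST contents, the serviceId field, and the limiter variable are assumptions for illustration:

// Hypothetical identifiers for internal services that bypass limits.
const WHITELIST = new Set(['monitoring-bot', 'partner-sync', 'admin-dashboard']);

function isWhitelisted(user) {
  return Boolean(user) && WHITELIST.has(user.serviceId);
}

app.use('/api', (req, res, next) => {
  if (isWhitelisted(req.user)) {
    logger.info('rate_limit_bypass', { service: req.user.serviceId }); // audit every exception
    return next();
  }
  return limiter(req, res, next); // fall through to the normal rate limiter middleware
});

If you're on express-rate-limit, its skip option accepts exactly this kind of predicate.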

6. Geo-Based Rate Controls (Underused But Powerful)

Different geographies behave differently:

  • Some have slower networks
  • Some are prone to spam bots
  • Some countries have regulatory limits on how services can behave

Using IP geolocation data, you can fine-tune limits.

For example:

if (geo.country === 'IN') limit = 500;
else if (geo.country === 'RU') limit = 100;

Platforms like Cloudways allow geographic load balancing and edge caching, so combining this with API-level control gives you a huge performance edge.
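
To make the snippet above runnable, here's a version using the geoip-lite package; the country codes and limits are just placeholders:

const geoip = require('geoip-lite');

app.use('/api', (req, res, next) => {
  const geo = geoip.lookup(req.ip); // returns null for private or unknown IPs
  let limit = 200; // sensible default
  if (geo && geo.country === 'IN') limit = 500;
  else if (geo && geo.country === 'RU') limit = 100;
  req.rateLimitMax = limit; // read later by a limiter configured with max: (req) => req.rateLimitMax
  next();
});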

7. Rate Limit by Endpoint Sensitivity

Not all API endpoints are equal.

  • /get-products can be called often.
  • /create-order should be limited.
  • /delete-user needs extra control and logs.

Customize rate limits per route:

// Tighter limit on the sensitive route, looser on the read-heavy one
app.use('/create-order', rateLimit({ max: 10, windowMs: 60 * 1000 }));
app.use('/get-products', rateLimit({ max: 200, windowMs: 60 * 1000 }));

Fine-grained control = better user experience without compromising backend safety.

8. Implement User-Friendly Retry Behavior

Just blocking clients is harsh. Encourage retry logic:

  • Backoff delays (e.g., retry after 5s, then 10s, then 20s)
  • Retry headers (Retry-After)
  • JSON error responses like:
{
  "error": "Too many requests",
  "retry_after": 30
}

Make sure your SDKs or frontend apps gracefully handle 429 responses.
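
On the client side, a small wrapper with exponential backoff might look like this (assumes a fetch-capable runtime such as Node 18+ or the browser; the delays and retry count are arbitrary):

// Retry a request with exponential backoff whenever the server answers 429.
async function fetchWithBackoff(url, options = {}, retries = 3, baseDelayMs = 5000) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    // Prefer the server's Retry-After hint when it's present.
    const retryAfter = Number(res.headers.get('Retry-After'));
    const wait = retryAfter > 0 ? retryAfter * 1000 : baseDelayMs * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, wait));
  }
  throw new Error(`Still rate limited after ${retries} retries: ${url}`);
}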

If you’re running your backend behind a platform like Cloudways, you can use NGINX or Varnish at the edge to throttle traffic before it reaches your app, saving both CPU cycles and user frustration.

9. Distribute Rate Limiting Across Servers (Use Redis or Memcached)

If your API is hosted across multiple nodes, local memory won’t work for rate limiting—it won’t sync limits across instances.

Use a shared data store like:

  • Redis – Fast, atomic operations, ideal for rate limiting
  • Memcached – Simple key-value, works well for basic scenarios

🔧 Popular libraries like rate-limiter-flexible support Redis out of the box, and express-rate-limit can be wired to it through the rate-limit-redis store.

Cloudways makes it easy to plug in Redis or Memcached on your server with just a few clicks. You don’t have to manage them manually—let the platform do the heavy lifting.
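
For example, express-rate-limit can share its counters through Redis via the rate-limit-redis store. This sketch assumes rate-limit-redis v4 with node-redis v4; double-check the import style against the versions you install:

const { createClient } = require('redis');
const rateLimit = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis');

const client = createClient({ url: 'redis://localhost:6379' }); // assumed local Redis
client.connect(); // node-redis v4 clients must connect explicitly

app.use(rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  store: new RedisStore({
    // Every app instance talks to the same Redis, so limits hold across nodes.
    sendCommand: (...args) => client.sendCommand(args),
  }),
}));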

10. Scale Rate Limiting with Usage Tiers

If your API is part of a SaaS product, monetize usage with API tiers.

For example:

  • Free plan: 100 requests/day
  • Pro plan: 10,000 requests/day
  • Enterprise: Unlimited, with SLAs

Build logic that reads the user's plan and sets limits accordingly.

Also, expose usage stats in the user dashboard:

{
  "daily_limit": 10000,
  "used": 3871,
  "resets_in": "3h 24m"
}

This encourages upgrades and gives users transparency. Platforms like Cloudways can help you scale this quickly and securely—just deploy a Node.js backend with MongoDB or PostgreSQL, and you’re good to go.
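
A rough sketch of that usage endpoint, backed by a per-user daily counter in Redis; the key scheme, plan names, and client variable are all assumptions:

// Hypothetical per-plan daily quotas (enterprise capped only for display).
const PLAN_LIMITS = { free: 100, pro: 10000, enterprise: 1000000 };

app.get('/api/usage', async (req, res) => {
  const today = new Date().toISOString().slice(0, 10); // e.g. "2025-07-03"
  const used = Number(await client.get(`usage:${req.user.id}:${today}`)) || 0;
  const dailyLimit = PLAN_LIMITS[req.user.plan] ?? PLAN_LIMITS.free;
  const msLeft = new Date().setUTCHours(24, 0, 0, 0) - Date.now(); // until UTC midnight
  res.json({
    daily_limit: dailyLimit,
    used,
    resets_in: `${Math.floor(msLeft / 3600000)}h ${Math.floor((msLeft % 3600000) / 60000)}m`,
  });
});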

Bonus Tip: Combine Rate Limiting with API Gateway Rules

If your traffic is large or globally distributed, consider fronting your API with an API Gateway like:

  • NGINX
  • Cloudflare Workers
  • AWS API Gateway

Gateways offer:

  • Request inspection
  • Geo-based throttling
  • Custom rule sets
  • WAF (Web Application Firewall) integration

If you’re hosting on Cloudways, using Cloudflare in front of your servers is a common and easy-to-implement strategy. You get rate limiting, caching, and DDoS protection all in one shot.

Final Thoughts: Don’t Just Rate Limit—Do It Right

Rate limiting isn’t a checkbox—it’s a crucial layer in your architecture. When done right, it:

  • Keeps your services stable
  • Defends against spam and bots
  • Protects your infrastructure budget
  • Adds polish to your API UX

But to implement it well, you need:

  • Control
  • Observability
  • Scaling flexibility

That’s why I highly recommend testing and deploying your API infrastructure on a managed, developer-friendly cloud platform like Cloudways. With autoscaling, Redis integration, preconfigured NGINX, and secure deployments handled for you, adding smart rate limiting becomes a breeze, without the headaches of server management.

In fact, most of the production APIs I’ve built or helped scale are running smoothly today because we got rate limiting right from day one.

TL;DR — 10 Best Practices Recap

  1. Set smart per-user/IP/route limits
  2. Use advanced algorithms like token bucket
  3. Give clear feedback with headers and status codes
  4. Log and monitor rejected requests
  5. Whitelist internal and partner apps
  6. Use geo-based limits for smarter control
  7. Tailor limits by endpoint sensitivity
  8. Support retries and backoff mechanisms
  9. Use distributed stores like Redis
  10. Scale limits by user plan and expose usage stats

And if you’re still deploying on shared hosting or traditional VPS, it might be time to give platforms like Cloudways a spin. Once you experience the freedom of painless scaling and integrated monitoring, you won’t want to go back.

You may also like:

  1. 5 Benefits of Using Worker Threads in Node.js

  2. 7 Best Practices for Sanitizing Input in Node.js

  3. 5 AI Developer Tools to Double Your Coding Speed

  4. 10 Essential Steps to Organize Node.js Projects on Cloudways

  5. What is GeoIP Rate-Limiting in Node.js on Cloudways?

  6. 6 Common Misconceptions About Node.js Event Loop

  7. Deploy a Node.js App on Cloudways in 10 Minutes

  8. 5 Reasons to Deep Copy Request Payloads in Node.js

  9. 5 Essential Tips for Managing Complex Objects in JavaScript

  10. 7 API Best Practices Every Backend Developer Should Follow


Share your experiences in the comments, and let’s discuss how to tackle them!

Follow me on LinkedIn
