How I Host a Bot in 45,000 Discord Servers For Free
Randall @mistval



Publish Date: Jul 12

When I talk to other developers about my Discord bot, they tend to be surprised that I operate it for free using widely available free tiers on cloud platforms, despite it being in nearly 45,000 servers.

So I thought I would write an article detailing how that works. I'll reveal where I host it, early technical decisions that made the bot lean, optimizations I've made, and what I could do better.

Where it's hosted 🖥

Of course in order to host a bot for free, one needs, drum roll please 🥁, free hosting.

The bot is hosted in Oracle Cloud. Their free tier allows you to create a VPS with up to 24 GB of RAM and 4 vCPUs, which is pretty wild compared to most other free tiers at similar providers, which offer more like 1 GB. Note that only ARM cores are available for this.

I'm currently using 18 GB of RAM for the bot server and 3 vCPUs. Not only is this sufficient, it's actually pretty overkill, and I expect I could scale the bot to at least 150k servers without any changes.

Although ARM cores might sound like a negative, the benchmarks I ran before moving the bot here actually showed the Oracle ARM cores outperforming Intel and AMD cores at Digital Ocean and Vultr (even their "high frequency" and "high performance" offerings), at least on the SQLite and image rendering benchmarks I used.

The ARM cores also look very cost-effective even if you're paying. I've been pretty impressed with Oracle's offering, honestly. Downtime has been minimal as well: I only remember one extended outage of around 4 hours in the 2 years the bot has been there.

One caveat: if you browse Reddit and such, you'll find some stories of Oracle randomly shutting down free tier users. I actually upgraded to a paid account, and I pay for a small amount (like $2/month) of object storage for personal file backups using Duplicati, so that might get me onto the nice list. In any event, I haven't had any problems personally.

Cloud Run Functions ☁️

I don't strictly need to do this anymore, since the free Oracle instance is so overkill, but a couple of the bot's features are deployed as Google Cloud Run functions (Google's answer to AWS Lambda).

Both of the features deployed this way use a few hundred megabytes of RAM to hold dictionaries in memory. By deploying those features as Cloud Run functions, the main bot server doesn't need to pony up the RAM for that. That all gets spun up on a separate server somewhere else in the magical Google cloud, and the main bot server just makes an HTTP request to the Cloud Run function.

So I basically use Cloud Run functions as extra free RAM for these memory-intensive features (and free CPU cycles too, although that's less critical since I've always had more CPU headroom). These features can feel more sluggish sometimes due to the phenomenon of cold starts but that's not a deal breaker.

Google Cloud has a very generous free tier that easily covers all of my usage of Cloud Run functions.

Logging and Error Reporting ⚠️

Logs are sent to Google Cloud Logging, and error level logs automatically flow to Google Cloud Error Reporting from there. Once again, this is all covered under the free tier, with room to spare.

Additionally, a separate bot in my support server allows moderators to trigger PagerDuty, which will call my cell phone and wake me up from my beauty sleep if any exceptionally serious situation arises. Fortunately, this has never needed to be used. As you may have guessed, this is covered under PagerDuty's free tier.
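The mechanics of that are simple: the support-server bot just fires an event at PagerDuty. Here's a hedged sketch using PagerDuty's Events API v2 (the routing key and field values are placeholders, not my actual configuration):

```javascript
// Builds a PagerDuty Events API v2 "trigger" event (placeholder values).
function buildPagerDutyEvent(routingKey, summary) {
  return {
    routing_key: routingKey,
    event_action: 'trigger',
    payload: {
      summary,
      source: 'support-server-bot', // hypothetical identifier
      severity: 'critical',
    },
  };
}

// Sends the event; PagerDuty then pages according to the service's
// escalation policy (e.g. calling my phone).
async function pageMe(routingKey, summary) {
  const res = await fetch('https://events.pagerduty.com/v2/enqueue', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildPagerDutyEvent(routingKey, summary)),
  });
  if (!res.ok) throw new Error(`PagerDuty returned ${res.status}`);
}
```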

Language and Runtime </>

The bot is built with JavaScript and runs on Node.js. While JavaScript is not a particularly fast language in general, much of the bot's heavy lifting plays to the strengths of Node.js: it happens in highly optimized native code within V8, the engine powering Node.js.

For example, JSON parsing (which the bot needs to perform for every event it receives from Discord) is implemented in C++ inside V8 and is very fast. The cost of then throwing the parsed object over the fence into JavaScript land is relatively minimal, and in most cases the work JavaScript code then does on it is minimal as well (we'll get more into that later).

Node.js probably isn't the optimal choice here, and a bot written in C++ or Rust would likely perform better (depending on the bot library), but Node.js is likely going to beat other popular choices including Python, Java, and C#, and may even give Go a good run for its money. (Sorry, I don't have benchmarks, that's just my educated guess)

Library 📕

Choosing a bot library is just as important as, and maybe even more important than, choosing a language.

Background: The Discord API consists of a websocket API (for receiving events in real time) and a REST API (for performing actions). Neither are trivial to use, especially the former. Fortunately, there are community-maintained libraries for most popular programming languages to simplify interactions with the Discord APIs.

Bot developers almost always choose one of these libraries rather than trying to reinvent the wheel. When I made this choice way back in 2016, there were two main options for Node.js: discord.js and Eris.

discord.js was known for being more feature-rich, having great documentation, and having a welcoming community, while Eris was known for being much faster and leaner, having a somewhat more gatekeepy community, and being used by most of the popular Node.js bots that needed its vertical scalability.

At the time, I didn't have any ambitions of scaling my bot to tens of thousands of servers, but a lot of the bots I knew and respected used Eris, and I wasn't intimidated by its reputation since I wasn't a beginner. So I ended up choosing Eris, which was definitely the right choice in hindsight.

Nowadays, discord.js has improved its performance to an extent, while Eris has fragmented due to the main maintainers apparently moving on. My bot is using Dysnomia now which is a fork of Eris that aims to carry the torch onward, but that's a pretty niche setup by this point.

Library Optimizations ➕

The nature of the Discord API requires keeping a lot of information in a cache. For example, after your bot connects to the websocket API and an event happens involving user A, Discord will send your bot miscellaneous information about user A such as their username, avatar, account creation date, and more. Later, when another event involving user A occurs, Discord might not send you that extra information, under the assumption that you cached it previously.

This cache is managed by whichever bot library you chose and is generally kept in the bot process's memory. For bots in thousands or tens of thousands of servers, it can easily grow into the multi-gigabyte range.

How a bot's library manages this cache data is a major factor in how well the bot scales. Eris (and by extension Dysnomia) is relatively efficient at this compared to other bot libraries. On the other hand, its cache customization options aren't very comprehensive. One option it does provide is allowing you to customize the message cache size, and this is a pretty significant one. I set this to zero in my bot, because my bot never needs to access previously-sent messages. If I remember correctly, the default is to cache the most recent 100 messages in each channel. For a bot in 45,000 servers, that's easily millions of messages and may add up to multiple gigabytes of memory.

Another important optimization is to only enable Gateway Intents that you actually need. If at all possible, avoid the PRESENCE_UPDATE intent like the plague. Having that intent enabled can massively increase the number of events your bot receives (you'll receive an event any time anybody's online status changes in any server). Enabling PRESENCE_UPDATE actually requires approval from Discord now if your bot is in more than 100 servers, so this is a decision most developers don't need to grapple with - Discord will force it off for you unless you really need it.
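Both of these optimizations boil down to client options. Here's a hedged sketch of what that looks like with Eris-style options (the option names reflect Eris's API as I remember it; Dysnomia nests gateway options a bit differently, so check your library's docs):

```javascript
// Hypothetical Eris-style client options illustrating both optimizations.
const clientOptions = {
  // Don't cache any previously-sent messages; this bot never reads them back.
  messageLimit: 0,
  // Only the intents the bot actually needs -- notably, no presence intent.
  intents: ['guilds', 'guildMessages', 'messageContent'],
};
```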

The Hot Path 🔥

The bot receives an event for every message sent in any server it's present in. That averages to around 100-200 events per second.

At least 99.9% of those messages aren't intended for the bot, so the bot just has to look at the message, decide "this isn't for me", and do nothing. The faster we can do that, the better. Here's some of the key code involved in that:

```javascript
// Method on the bot's command manager, called for every incoming message.
processInput(bot, msg) {
  // DM channels have no guild; fall back to the channel ID as the settings key.
  let serverId = msg.channel.guild ? msg.channel.guild.id : msg.channel.id;
  // Synchronous, in-memory lookup -- no database round trip on the hot path.
  let prefixes = this.persistence_.getPrefixesForServer(serverId);
  let msgContent = msg.content;

  // Normalize the first full-width (ideographic) space to a regular space.
  msgContent = msgContent.replace('\u3000', ' ');

  // Extract the first token of the message.
  let spaceIndex = msgContent.indexOf(' ');
  let commandText = '';
  if (spaceIndex === -1) {
    commandText = msgContent;
  } else {
    commandText = msgContent.substring(0, spaceIndex);
  }

  commandText = commandText.toLowerCase();

  // Compare the first token against every prefix + alias combination.
  for (let prefix of prefixes) {
    for (let command of this.commands_) {
      for (let alias of command.aliases) {
        const prefixedAlias = prefix + alias;
        if (commandText === prefixedAlias) {
          return this.executeCommand_(bot, msg, command, msgContent, spaceIndex, prefix);
        }
      }
    }
  }

  // Not a command for this bot; do nothing.
  return false;
}
```

This code actually isn't very micro-optimized. Looking at it now, I can imagine how we could do better (mainly by avoiding the creation of new strings). But here's the important bit that we're getting right:

```javascript
let prefixes = this.persistence_.getPrefixesForServer(serverId);
```

This is fetching the custom command prefixes for the current Discord server. If you don't know what that means, consider that by default all the bot's commands are prefixed/namespaced with k!. For example: k!help, k!about, etc. But what if there's another bot that uses the same prefix, and it also has a k!help command? Fortunately, the bot allows server admins to change the k! prefix to something else to disambiguate (or simply because they prefer a different prefix).

This is a per-server setting that's saved in the database, but this code here to fetch it is synchronous, and that's the key optimization. All command prefixes are cached in process memory to avoid needing any inter-process communication in this hot path. Making hundreds of calls per second to a database or cache instance, while possible, would add significant load and would concretely impact scalability.
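A minimal sketch of that idea (hypothetical names, not my actual persistence layer): load every server's prefixes from the database once at startup, then serve hot-path reads synchronously from a plain in-memory Map.

```javascript
const DEFAULT_PREFIXES = ['k!'];

// Hypothetical in-memory prefix cache backing the synchronous hot-path read.
class PrefixCache {
  constructor() {
    this.prefixesByServerId_ = new Map();
  }

  // Called at startup (and whenever an admin changes a prefix) to refresh
  // the in-memory copy from database rows.
  load(rows) {
    for (const { serverId, prefixes } of rows) {
      this.prefixesByServerId_.set(serverId, prefixes);
    }
  }

  // Synchronous lookup: no await, no IPC, no database round trip.
  getPrefixesForServer(serverId) {
    return this.prefixesByServerId_.get(serverId) || DEFAULT_PREFIXES;
  }
}
```

The trade-off is holding all prefixes in RAM, but prefixes are tiny compared to the library's entity caches, so it's a cheap win.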

Not all bots can avoid database queries in their hot paths. If you have a bot that grants users XP for every message they send, then you have to make a database query, there's no way around it (although you could batch many XP increments together and flush them to the database in groups).
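That batching approach could look something like this (a hypothetical sketch; my bot doesn't actually track XP): accumulate increments in memory and hand them to a single bulk write on an interval.

```javascript
// Hypothetical XP batcher: many award() calls become one bulk DB write.
class XpBatcher {
  constructor(flushFn, intervalMs = 30000) {
    this.pending_ = new Map();
    this.flushFn_ = flushFn; // e.g. one bulkWrite / bulk UPDATE against the DB
    this.timer_ = setInterval(() => this.flush(), intervalMs);
    this.timer_.unref(); // don't keep the process alive just for this timer
  }

  // Hot path: pure in-memory accumulation, no I/O.
  award(userId, xp) {
    this.pending_.set(userId, (this.pending_.get(userId) || 0) + xp);
  }

  // Cold path: hand the accumulated batch to the database writer.
  flush() {
    if (this.pending_.size === 0) return;
    const batch = this.pending_;
    this.pending_ = new Map();
    return this.flushFn_(batch);
  }
}
```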

What Could I Do Better? 👀

The main choice I made that doesn't align so well with keeping the bot lean is using MongoDB as the database. Checking top on the server as I write this, sorted by memory usage, mongod is using a bit over 3 gigabytes of resident memory, an amount similar to the bot process itself. Now to be fair, I could reduce that by setting --wiredTigerCacheSizeGB a lot lower, and any performance degradation would probably be pretty minimal. But since the server has over 9 gigabytes of memory available, with over 3 gigabytes completely free, there's just no need.
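For reference, that cap can also be set in mongod.conf (a config fragment; the 1 GB figure is purely illustrative, not a recommendation):

```yaml
# mongod.conf -- cap WiredTiger's internal cache (illustrative size)
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
```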

I could have used PostgreSQL instead, and it likely would have given me a bit more RAM headroom and a lot more CPU headroom. But the real resource-miser move would have been SQLite, which would have been sufficient for this bot. SQLite is often dismissed as a toy database, and that alleged toy-ness is often assumed to mean poor performance as well. While it's a nuanced topic, that's a broadly unfair assessment, as SQLite avoids overhead from...

  1. Inter-process communication (this one's big)
  2. Access control
  3. Row-level locking
  4. ...more

SQLite doesn't have a lot of these big-boy features, which gives it a performance boost for low-concurrency access patterns (which is the case for my bot).

That said, I'm comfortable with the decision to use Mongo in hindsight. It was an opportunity to learn a new database, it fits on the instance, I like the official tools (MongoDB Compass), and it gets me where I need to go. But it would be on the chopping block if I needed to claw back more memory and CPU headroom.

Conclusion

Welp that's how I do it! It's a combination of utilizing the great free hosting options available, choosing fast dependencies, and making a few targeted optimizations. Feel free to comment if you have any questions or comments.
