How NodeJS Made Me a Masochist: Building a Real-Time Web App in C++ (Part 1)
Mustafa Siddiqui (@mush1e) - May 31, 2025

Or: How I went from "just use Express.js" to "let me implement TCP sockets from scratch because I have apparently lost my mind"

The Descent Into Madness

Picture this: You're a college student who's built a few web apps with Node.js and React. Life is good. npm install express solves all your problems. CORS? There's a middleware for that. WebSockets? Just npm install socket.io and boom, real-time magic happens.

But then, like an idiot, I asked the question that ruined everything: "But how does this actually work?"

That innocent question spiraled into what I'm now calling "see-plus-plus" - a real-time ASCII webcam streaming server built entirely in C++ with hand-rolled WebSockets, because apparently I enjoy pain.

The Moment Everything Changed

It started when I was building a simple chat app for a class project. I copy-pasted the usual Node.js setup:

const express = require('express');
const http = require('http');
const socketIO = require('socket.io');

const app = express();
const server = http.createServer(app);
const io = socketIO(server);

io.on('connection', (socket) => {
  console.log('User connected');
  // Magic happens here somehow???
});

I just stared at that code. What the hell is happening in http.createServer()? How does socket.io know when someone connects? What even IS a socket?

This wasn't the first time I'd used this pattern, but something about that moment made me realize how much I was taking for granted. I was treating these powerful abstractions like black boxes, trusting that they would work without understanding the mechanisms underneath.

My professor probably expected me to just submit the chat app and move on. Instead, I went down a rabbit hole that's consumed my entire semester and possibly my sanity.

First Stop: The Uncomfortable Truth

I realized I had no clue how the internet actually works. Sure, I knew HTTP was a protocol and TCP was something underneath it, but I couldn't explain how my browser talks to a server if my life depended on it. I knew packets traveled across networks, but what was in those packets? I understood that servers listened on ports, but what did "listening" actually mean at the operating system level?

This knowledge gap felt profound and embarrassing. I'd been building web applications for two years, but I couldn't explain the fundamental mechanisms that made them possible. It was like being a chef who could follow recipes perfectly but had no idea what heat actually does to food.

So I did what any reasonable person would do: I decided to build the entire web stack from scratch. If I couldn't understand it by reading about it, maybe I could understand it by implementing it myself.

"How hard could it be?" - Famous last words

The TCP Layer: Where Reality Hit Hard

My first mission was simple: create a server that could accept connections and send messages back and forth. No frameworks, no libraries, just raw C++ and Berkeley sockets.

Here's what I thought would be easy:

// Step 1: Create socket (this should be simple, right?)
int server_socket = socket(AF_INET, SOCK_STREAM, 0);

Narrator: It was not simple.

What followed was a crash course in everything I didn't know I didn't know. Every single parameter in that function call represented concepts I'd never encountered. AF_INET isn't just a random constant - it literally means "hey kernel, we're doing IPv4 stuff here." The choice of address family determines how addresses are formatted and what kinds of endpoints can communicate.

SOCK_STREAM means TCP - reliable, ordered delivery with error correction. The alternative, SOCK_DGRAM, gives you UDP - fire-and-forget packets with no guarantees. This seemingly simple choice represents fundamentally different approaches to network communication.

Then came network byte order. Different computer architectures store multi-byte numbers differently - some put the most significant byte first, others put it last. Network protocols standardize on big-endian, so functions like htons() exist to translate between your machine's byte order and the network's expected format.
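
To make the byte-order stuff concrete, here's roughly what creating and binding the listening socket looks like. This is a standalone sketch with a hard-coded port, not a copy-paste from my actual server:

#include <arpa/inet.h>   // htons, htonl
#include <netinet/in.h>  // sockaddr_in, INADDR_ANY
#include <sys/socket.h>  // socket, bind, listen
#include <cstdio>        // perror

int main() {
    int server_socket = socket(AF_INET, SOCK_STREAM, 0);  // IPv4 + TCP

    sockaddr_in addr{};
    addr.sin_family      = AF_INET;              // Must match the socket's address family
    addr.sin_addr.s_addr = htonl(INADDR_ANY);    // Any local interface (host -> network byte order)
    addr.sin_port        = htons(8080);          // Skip htons() and you bind to a byte-swapped port

    if (bind(server_socket, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }
    listen(server_socket, SOMAXCONN);            // Tell the kernel to start queueing connections
    return 0;
}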

My first attempt crashed with a segmentation fault. My second attempt bound to the wrong port because I'd forgotten the byte order conversion. My third attempt worked once, then refused to restart because of some "Address already in use" error.

That's when I learned about SO_REUSEADDR:

// This little flag saved my sanity during development
int opt = 1;
if (setsockopt(server_socket, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) < 0) {
    perror("Why does everything hate me");
    return false;
}

The "Address already in use" error happens because TCP connections don't disappear immediately when you close them. They enter a "TIME_WAIT" state for a few minutes to ensure delayed packets don't interfere with new connections. This is great for network reliability, but terrible when you're restarting your server every thirty seconds during development.

Threading: Opening Pandora's Box

Once I had a basic server accepting connections, I hit the next wall: handling multiple clients simultaneously. My initial approach was embarrassingly naive:

// Accept connection (blocks until a client actually connects)
sockaddr_in client_addr{};
socklen_t addr_len = sizeof(client_addr);
int client_socket = accept(server_socket, reinterpret_cast<sockaddr*>(&client_addr), &addr_len);

// Handle client (BLOCKING - only one client at a time)
handle_client(client_socket);

// Accept next connection... eventually

This meant my server could only talk to one person at a time. It was like having a restaurant with one waiter who had to completely finish serving the first customer before even acknowledging anyone else existed.

The problem is that handle_client() is a blocking operation. It sits there waiting for the client to send data, and if the client never sends anything, the entire server is stuck.

Enter threading:

// Spawn a thread for each client
std::thread client_thread(&Server::handle_client_threaded, this, client_socket, client_addr);
client_thread.detach(); // YOLO - thread manages its own lifetime

This worked great for my first few tests with two or three concurrent connections. But I quickly realized I had no idea when threads finished, how many were running, or how to shut down gracefully. My server was like a party host who kept inviting people over but lost track of who was there.

The detach() call was particularly problematic. It tells the thread "manage your own lifetime, I don't want to hear from you again." This seems convenient, but it also means you lose all control over the thread.

The Great Thread Management Saga

The solution involved learning about atomic operations and thread lifetime management:

std::atomic<int> active_clients{0};
const int MAX_CLIENTS = 10;

// In the accept loop
if (active_clients.load() >= MAX_CLIENTS) {
    std::cout << "Sorry, we're full. Come back later." << std::endl;
    close(client_socket);
    continue;
}

active_clients.fetch_add(1);  // Atomic increment - thread-safe

std::atomic<int> solved the thread counting problem: the accept loop increments the counter, each client thread decrements it on the way out, and the atomic operations keep that count consistent even when those threads touch it at the same time.
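
For the curious, here's roughly what the per-client work looks like with the counter released on the way out. I've sketched it as a free function with placeholder echo behavior - the real thing is a Server member with more going on:

#include <atomic>
#include <sys/socket.h>
#include <unistd.h>

// Sketch of the per-client thread body: read, respond, and always give the slot back
void handle_client_threaded(int client_sock, std::atomic<int>& active_clients) {
    char buffer[4096];

    while (true) {
        ssize_t bytes_recv = recv(client_sock, buffer, sizeof(buffer) - 1, 0);
        if (bytes_recv <= 0) {
            break;                                 // Disconnect, error, or timeout - we're done
        }
        buffer[bytes_recv] = '\0';
        send(client_sock, buffer, bytes_recv, 0);  // Placeholder: echo the data back
    }

    close(client_sock);
    active_clients.fetch_sub(1);                   // Atomic decrement - frees a connection slot
}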

But counting threads was only half the problem. The bigger challenge was cleanup - how do you wait for all threads to finish when shutting down?

std::vector<std::thread> client_threads;  // Keep track of all spawned threads

// During shutdown
void Server::await_all() {
    std::cout << "Waiting for all client threads to finish..." << std::endl;

    for (auto& thread : client_threads) {
        if (thread.joinable()) {    // Quick check: can we wait for this thread?
            thread.join();          // Actually wait (this is the blocking part)
        }
    }

    client_threads.clear();
}

I spent an embarrassing amount of time thinking joinable() was the blocking call. It's not - it's just asking "is this thread in a state where I can wait for it?" The actual waiting happens in join().

The Zombie Connection Problem

Just when I thought I had threading figured out, I discovered zombie connections - clients that connect but never send data, just sitting there consuming server resources like digital parasites.

Picture this: you've got 10 connection slots, and some malicious script connects 5 times but never sends anything. Now you can only serve 5 real users because the other slots are occupied by ghosts.

The solution involved socket timeouts:

// Set a timeout on recv() operations
struct timeval timeout;
timeout.tv_sec = 30;  // 30 seconds to send something or get kicked
timeout.tv_usec = 0;

setsockopt(client_sock, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));

// In the receive loop
ssize_t bytes_recv = recv(client_sock, buffer, sizeof(buffer)-1, 0);

if (bytes_recv <= 0) {
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        std::cout << "Client timed out - bye zombie!" << std::endl;
    }
    break;  // Either way, this connection is done
}

Socket timeouts transform blocking operations into time-limited operations. When recv() times out, it returns -1 and sets errno to EAGAIN, giving you a chance to clean up silent connections.

The Architectural Revelation: Why My Approach Was Fundamentally Flawed

At this point, I felt pretty confident about my threading skills. I had connection limits, timeout handling, and proper cleanup. But then I did some math that made my stomach drop.

Let's say I want to support 1000 concurrent connections. With my thread-per-connection model, that means 1000 threads, each with its own 8MB stack. That's 8GB of RAM just for thread stacks, before I even start processing data.

But memory wasn't the only problem. Context switching between 1000 threads creates significant CPU overhead, and most of those threads are idle at any given moment. It's like hiring 1000 personal assistants to sit by 1000 different phones, each waiting for their specific phone to ring.

The epiphany hit me: what if I could move from "one thread per connection" to "one thread per task"? Instead of threads sitting around waiting for data, I could have a small pool of worker threads that only activate when there's actual work to do.

This model would require fundamentally different connection handling - monitoring all connections simultaneously and dispatching work only when data actually arrives. The operating system provides mechanisms for exactly this, like select() and epoll, that let you monitor thousands of connections with a single system call.
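
I haven't built this yet, but here's a minimal sketch of the idea using select(): one thread watches every socket and only touches the ones that are actually readable. (epoll and kqueue are the scalable versions of the same concept.)

#include <algorithm>
#include <sys/select.h>
#include <vector>

// One pass of an event loop: watch every socket, act only on the ones that are ready.
// Assumes server_socket is already listening and client_sockets holds accepted fds.
void poll_once(int server_socket, std::vector<int>& client_sockets) {
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(server_socket, &read_fds);
    int max_fd = server_socket;

    for (int fd : client_sockets) {
        FD_SET(fd, &read_fds);
        max_fd = std::max(max_fd, fd);
    }

    // A single blocking call monitors every connection at once
    if (select(max_fd + 1, &read_fds, nullptr, nullptr, nullptr) <= 0) {
        return;
    }

    if (FD_ISSET(server_socket, &read_fds)) {
        // A new connection is waiting - accept() will not block here
    }
    for (int fd : client_sockets) {
        if (FD_ISSET(fd, &read_fds)) {
            // This client sent data - recv() will not block here
        }
    }
}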

This is apparently how high-performance servers actually work, using patterns called "event loops" and "reactor patterns." My frustration with idle threads was leading me toward the same architectural solutions that power nginx and Node.js itself.

But implementing this properly would require a complete redesign - a perfect topic for Part 2.

Signal Handlers: The Art of Graceful Shutdown

Before tackling that architectural overhaul, I needed to solve a more immediate problem: handling Ctrl+C gracefully. The naive approach is to just kill the process, but this is catastrophic when you have active connections and threads running.

Consider what happens during ungraceful shutdown: threads are terminated mid-execution, file descriptors remain open, and any buffered data is lost forever. From the client's perspective, the server simply vanishes.

Signal handlers provide a way to catch shutdown requests and respond properly:

// Global pointer because signal handlers can't access class members directly
Server* Server::instance = nullptr;

void Server::signal_handler(int signal) {
    std::cout << "Received signal " << signal << ", shutting down gracefully..." << std::endl;

    if (instance != nullptr) {
        instance->shutdown_flag = true;      // Stop accepting new connections
        close(instance->server_socket);      // Break out of accept() loop
        instance->await_all();               // Wait for client threads
        instance->cleanup();                 // Clean up resources
    }

    exit(0);
}

// Register the handler
Server::Server() {
    instance = this;  // Set global pointer - ugly but necessary
    signal(SIGINT, signal_handler);   // Ctrl+C
    signal(SIGTERM, signal_handler);  // Kill command
}

The global pointer is necessary because signal handlers can only be static functions - they can't access instance variables directly. The shutdown sequence stops accepting new connections, closes the listening socket to break out of the accept loop, then waits for existing threads to finish their work.

This creates a much better experience: clients get proper connection close messages instead of abrupt disconnections, and all resources are properly cleaned up.

What I've Learned So Far

Building just the TCP foundation has taught me more about how computers actually work than any class I've taken. Each problem revealed layers of complexity I never knew existed.

The journey from "just create a socket" to a fully functional multi-threaded server has been humbling. What seemed like a few function calls turned into an exploration of operating systems, network protocols, concurrent programming, and resource management.

I now understand why multithreading is hard - race conditions and resource management aren't just theoretical problems, they're daily reality. Network programming fundamentals like byte order and address families were completely foreign six months ago, but now I understand how they enable communication across networks.

Most importantly, I've learned about scalability limitations and why architectural patterns exist. My thread-per-connection model works fine for dozens of connections, but it fundamentally cannot scale to thousands. This isn't a bug - it's a limitation of the architectural approach that led me to understand why event-driven architectures exist.

Why Am I Doing This to Myself?

Good question. I could have built a chat app with Socket.io in an afternoon. Instead, I'm building the internet from scratch like some kind of digital masochist.

But here's the thing - every time I understand a new piece of the puzzle, everything else starts making sense. When I finally implement HTTP parsing, I'll understand exactly what Express.js does. When I get WebSockets working, I'll know why Socket.io exists and what problems it solves.

I'm not just learning to use tools - I'm learning how the tools work. This deep understanding changes how I think about system design, performance, and debugging. Every convenience feature in Express.js represents hundreds of lines of careful systems programming.

There's also something deeply satisfying about building systems from first principles. Each working component represents understanding that I've internalized, not just code that functions.

What's Next

In Part 2, I'll tackle the massive architectural shift from thread-per-connection to event-driven programming. This isn't just an optimization - it's a fundamental change in how the server manages concurrent connections. I'll explore building an event loop that can handle thousands of connections with just a handful of threads.

Once I have the event-driven foundation working, I'll implement HTTP request parsing - taking raw bytes from the socket and turning them into structured data:

GET /index.html HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0...

This will involve building a state machine that can parse HTTP requests incrementally as data arrives, handling edge cases like partial requests and malformed headers.
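
Just to give a flavor of the shape I have in mind, here's a very rough sketch of an incremental parser. The names and details are placeholders - the real version is Part 2 material:

#include <cstddef>
#include <string>

// Rough shape of an incremental HTTP parser - just a sketch, not the final design
enum class ParseState { RequestLine, Headers, Done, Error };

struct HttpParser {
    ParseState state = ParseState::RequestLine;
    std::string buffer;   // Everything received so far - possibly a partial request

    // Feed whatever recv() handed us; only advance once a complete "\r\n"-terminated line exists
    void feed(const char* data, std::size_t len) {
        buffer.append(data, len);

        std::size_t line_end;
        while (state != ParseState::Done && state != ParseState::Error &&
               (line_end = buffer.find("\r\n")) != std::string::npos) {
            std::string line = buffer.substr(0, line_end);
            buffer.erase(0, line_end + 2);

            if (state == ParseState::RequestLine) {
                // "GET /index.html HTTP/1.1" - split on spaces, validate method and version
                state = ParseState::Headers;
            } else {
                if (line.empty()) {
                    state = ParseState::Done;   // Blank line ends the headers (ignoring bodies for now)
                } else {
                    // "Host: localhost:8080" - split on the first ':' and store the pair
                }
            }
        }
    }
};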

Spoiler alert: It's going to involve state machines, I/O multiplexing, buffer management, and probably several more existential crises about why I didn't just use a library.


If you're following along or building something similar, I'd love to hear about it! You can find my code on GitHub at mush1e/see-plus-plus, and I'll be documenting the entire journey as I build this thing from the ground up.

Also, if you're a recruiter reading this and thinking "this person clearly makes questionable life choices," you're absolutely right. But I promise those questionable choices come with a burning desire for a deep understanding of how computers actually work. Hit me up - I'm looking for internships/entry-level positions where I can channel this chaos into something productive.

Comments (8)

  • SamuraiX[13~] - May 31, 2025

    This sounds like a medieval torture session lol but hey, I love c++ so mad respect for you, hope you achieve what you are after

  • Nathan Tarbert - May 31, 2025

    this is super relatable, honestly i get stuck wanting to know the details too - makes me wonder tho, you think pushing for this deep level of understanding actually gives you an edge long-term or does it just slow down progress?

    • Mustafa Siddiqui - May 31, 2025

      I think it definitely gives you an edge, yeah — but honestly, I don’t think “progress” is the main point. The real win is just building stuff you find genuinely cool. That’s what keeps dev fun and keeps your curiosity alive.

  • Dotallio - May 31, 2025

    This is such a breath of fresh air - I felt every single 'how does this even work' moment you described. Can't wait to see how your approach changes with event-driven programming in Part 2!

    • Mustafa Siddiqui - May 31, 2025

      Thank you so much!! I’m so glad! Part 2 should be out in a few days :))

      (furiously working on the code rn)

  • Parag Nandy Roy - May 31, 2025

    Absolutely love the deep dive....
