Why experienced developers never use regex for email validation
Nik L.

Publish Date: Dec 5 '24

The Problem No One Talks About

Let's be real: email validation sounds simple, but it's a technical trap that catches even experienced developers.

What's Really Going On?

Imagine you're building a sign-up form. Your first instinct? Throw a regex at the email field. Bad move.

Actual Valid Weird Emails

# These are ALL technically valid emails!
valid_emails = [
    r'"J. R. \"Bob\" Dobbs"@example.com',  # raw string keeps the backslashes
    'admin@mailserver1',
    'user+tag@gmail.com',
    'postmaster@[123.123.123.123]'
]

Most hand-written email regexes would reject at least one of these.

Why?

Email standards are wild.

Most developers would be surprised to learn that all of these are technically valid email addresses according to RFC 5322. The specification allows:

  • Quoted local parts
  • Comments within parentheses
  • Nested comments
  • Special characters in local parts
  • Multiple domain labels
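
To make this concrete, here is a minimal sketch that runs the four addresses listed earlier through a commonly seen style of pattern. The regex and the name COMMON_EMAIL_REGEX are assumptions chosen for illustration, not taken from any particular codebase.

```python
import re

# Hypothetical "typical" pattern, chosen for illustration only
COMMON_EMAIL_REGEX = re.compile(r'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')

valid_emails = [
    r'"J. R. \"Bob\" Dobbs"@example.com',  # quoted local part
    'admin@mailserver1',                   # single-label domain
    'user+tag@gmail.com',                  # plus addressing
    'postmaster@[123.123.123.123]',        # IP address literal as domain
]

# Collect every address the pattern refuses
rejected = [email for email in valid_emails if not COMMON_EMAIL_REGEX.match(email)]
for email in rejected:
    print("rejected:", email)
```

Only the plus-addressed Gmail address gets through; the quoted local part, the single-label domain, and the IP literal are all turned away.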

The Hidden Costs of Bad Validation

1. Losing Real Users

A strict regex might reject perfectly good email addresses. Imagine turning away a potential customer because their email looks "weird", like having:

  • Plus addressing (user+tags@gmail.com)
  • Unconventional domain structures
  • International character sets
  • Legitimate but complex naming conventions
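
As a rough illustration of the cases above, the sketch below uses a deliberately over-strict pattern. STRICT_REGEX and the sample addresses are hypothetical, and whether a non-ASCII local part can actually be delivered depends on the mail servers involved.

```python
import re

# Hypothetical over-strict pattern, written for illustration only
STRICT_REGEX = re.compile(r'^[a-z0-9._-]+@[a-z0-9-]+\.[a-z]{2,}$')

candidates = [
    'user@example.com',         # plain address
    'user+tags@gmail.com',      # plus addressing
    'user@mail.example.co.uk',  # several domain labels
    'josé@example.com',         # non-ASCII local part
]

# Map each address to whether the strict pattern accepts it
results = {email: bool(STRICT_REGEX.match(email)) for email in candidates}
for email, accepted in results.items():
    print(f"{email}: {'accepted' if accepted else 'rejected'}")
```

Only the plain address passes; the other three are rejected before the user ever gets a chance to confirm them.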

Your product team would be unhappy, and the sales team even more so.

2. ReDoS Attacks

Regex engines using backtracking are susceptible to Regex Denial of Service (ReDoS) attacks.

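import re

def dangerous_regex_check(user_input):
    # Nested quantifiers make this pattern backtrack exponentially
    evil_pattern = r'^(a+)+b$'
    return re.match(evil_pattern, user_input)

# About 30 characters that almost match can stall a backtracking engine;
# the trailing 'c' forces the match to fail only after every split is tried
malicious_input = 'a' * 30 + 'c'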

Attackers can craft inputs that make your validation function grind to a halt.
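
The blow-up shows when an input almost matches and then fails at the very end. The sketch below times Python's built-in re module on short inputs of that kind; the helper name and the chosen lengths are assumptions, kept small so the script finishes quickly.

```python
import re
import time

EVIL_PATTERN = re.compile(r'^(a+)+b$')

def time_failed_match(n):
    """Return the seconds spent failing to match n 'a's followed by 'c'."""
    text = 'a' * n + 'c'  # never matches, so the engine tries every split
    start = time.perf_counter()
    EVIL_PATTERN.match(text)
    return time.perf_counter() - start

# Short lengths only: each extra character roughly doubles the work
timings = {n: time_failed_match(n) for n in (10, 14, 18, 20)}
for n, seconds in timings.items():
    print(f"{n} chars: {seconds:.4f}s")
```

The exact numbers depend on the machine, but the growth from one length to the next is the point: a few more characters push the same check from milliseconds toward minutes.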

A Smarter Approach

Basic Validation That Actually Works

def smart_email_check(email):
    """Quick and dirty email sanity check"""
    # bool() so callers get True/False rather than the email string or None
    return bool(
        email and
        '@' in email and
        len(email) <= 254  # Email length limit
    )
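
If you want a slightly firmer check without reaching for a regex, one option is to split on the last '@' and require something on both sides. This is a sketch under that assumption; basic_shape_ok is a hypothetical name, and it still accepts plenty of addresses that cannot receive mail.

```python
def basic_shape_ok(email):
    """Check only that the address has a non-empty local part and domain."""
    if not isinstance(email, str) or len(email) > 254:
        return False
    # Split on the LAST '@': a quoted local part may itself contain '@'
    local, separator, domain = email.rpartition('@')
    return bool(separator and local and domain)

samples = [
    r'"J. R. \"Bob\" Dobbs"@example.com',
    'admin@mailserver1',
    'user+tag@gmail.com',
    'postmaster@[123.123.123.123]',
    'no-at-sign',
    '@example.com',
    'user@',
]
for sample in samples:
    print(sample, basic_shape_ok(sample))
```

All four of the unusual but valid addresses pass, while the three malformed ones are rejected.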

The Real Solution: Verification

  1. Basic syntax check
  2. Send a verification link
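  3. Let the user prove the email works

import secrets

def validate_email(email):
    if not smart_email_check(email):
        return False

    # Send verification token; send_verification_email is a placeholder
    # for your own mail-sending code
    token = secrets.token_urlsafe(32)
    send_verification_email(email, token)

    # True means the verification email was sent, not that the address is confirmed
    return True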
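
The other half of the flow, confirming the token when the user clicks the link, could look roughly like this. It is a minimal sketch: the in-memory _pending store, the 24-hour lifetime, and the start_verification and confirm names are all assumptions, and a real application would persist tokens in a database.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 24 * 60 * 60  # assumed lifetime; pick what suits your product

# Hypothetical in-memory store: token -> (email, expiry timestamp)
_pending = {}

def start_verification(email, send_email, now=time.time):
    """Create a single-use token, remember it, and hand it to send_email."""
    token = secrets.token_urlsafe(32)
    _pending[token] = (email, now() + TOKEN_TTL_SECONDS)
    send_email(email, token)
    return token

def confirm(token, now=time.time):
    """Return the verified email, or None if the token is unknown or expired."""
    entry = _pending.pop(token, None)  # pop makes each token single-use
    if entry is None:
        return None
    email, expires_at = entry
    if now() > expires_at:
        return None
    return email
```

Passing send_email and now as parameters keeps the sketch testable: a test can capture the outgoing message in a list and fake the clock instead of waiting a day.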

Pro Tools for Real Developers

Instead of writing your own regex, use tested libraries:

  • Python: email-validator
  • JavaScript: validator.js
  • Java: Apache Commons Validator

A Better Validation Class

from email_validator import validate_email, EmailNotValidError

class EmailValidator:
    @staticmethod
    def validate(email):
        """
        Smart email validation
        - Quick syntax check
        - Verify deliverability
        """
        try:
            # Use a smart library; check_deliverability=True
            # also does a DNS lookup on the domain
            validate_email(
                email,
                check_deliverability=True
            )
            return True
        except EmailNotValidError:
            return False

The Bottom Line

Email validation isn't about creating an unbreakable fortress. It's about:

  • Letting real users in
  • Keeping your system safe
  • Not making things complicated

Key Takeaways

  1. Forget complex regex
  2. Use proven libraries
  3. Send verification emails
  4. Be user-friendly

Developers who get this right save themselves countless headaches.

Want me to break down any part of this further?

Btw, I'm working on an unlimited context tool, where you can use your preferred LLM without needing to give the context again and again.
Do check this out; it's completely free for developers.

🛠️ Features of Pieces

Feature: What It Does
✂️ Smart Snippet Capture: Automatically saves code snippets from IDEs, browsers, or text to a repository.
🔍 Contextual Search: Allows instant retrieval of code snippets using metadata and AI-enhanced tags.
🌐 Offline Support: Provides full functionality without internet, ensuring privacy and security.
🤖 AI-Driven Context: Suggests relevant snippets based on context, programming language, and usage.
IDE Integration: Offers personalized code autocompletion through plugins for VS Code and JetBrains.



Comments 15 total

  • Ben Sinclair, Dec 6, 2024

    Bug spotted: the "Basic Validation That Actually Works" example here will fail on admin@mailserver1 which you previously recognised as valid.

    • Nik L., Dec 6, 2024

      thanks, that's true for single label domain, have amended.

  • Samuel Rouse, Dec 6, 2024

    A really simple validator is to use the built-in field validation of the email input type.

    export const checkEmail = (emailString, multiple=false) => {
      const el = document.createElement('input');
      el.type = 'email';
      el.multiple = multiple;
      el.value = emailString;
      return el.checkValidity();
    };

    If you plan to call it frequently you could persist the DOM element.

    const emailElement = document.createElement('input');
    emailElement.type = 'email';
    
    export const checkEmail = (emailString, multiple=false) => {
      emailElement.multiple = multiple;
      emailElement.value = emailString;
      return emailElement.checkValidity();
    };
  • Vadim, Dec 7, 2024

    In a browser, use input type="email"

    • Eugene, Dec 7, 2024

      It’s only client-side validation, which can be easily bypassed. Server-side validation is always required.

      • Nik L., Dec 7, 2024

        that's the ideal case: a basic client-side validation coupled with a validation from the server side.

      • Vadim, Dec 7, 2024

        Of course, nobody argues about that )

  • Eugene, Dec 7, 2024

    Indeed, libraries are also written by ppl and might use the same techniques listed above. Typically, checking that the domain name is valid is simple and enough to send a confirmation email (in most cases); everything else is on the user.

    • Nik L., Dec 7, 2024

      yes, definitely for the client-side validation.

  • Alan, Dec 7, 2024

    We use client-side and server-side validation without duplicating code, using trpc, zod and an npm workspace: Zod lives in a shared package, so it can be used by both the client and the server.

    github.com/alan345/TER

  • felix, Dec 9, 2024

    HAHAHAHA i tried to make an account for this site
    also, subgenius reference ;) i thought everyone had forgotten already

    • felix, Dec 9, 2024

      srsly though the email is 76 characters and once i tried to create an account and the form max length was 75 (removing the html limit worked lol)

  • steve, Dec 11, 2024

    Devs who don't understand regex or email validation may have this problem. Either way, it's handled on the backend anyway. Front-end validation is just to save an API request and improve the UI experience. Untested and yes problems arise. My tip: don't use 1 regex. Break it up to reduce complexity. Or keep it simple and forgiving. /[^@]+\@[^\@.]+(.[^\@]+)*/. Or better yet just use zod or yup.

  • Bernardo Euler, Dec 22, 2024

    Nice article
