How to Validate Email Addresses With Regex (and When Not To)
Every developer has written an email validator at some point. Most of us grabbed a regex off Stack Overflow, pasted it in, and called it done. A week later someone filed a bug because their work email — something like [email protected] — got rejected at signup. Sound familiar?
Here's the uncomfortable truth: email validation with regex is a minefield, and the stricter your pattern, the more legitimate users you're quietly turning away. This guide gives you a practical regex that catches obvious garbage, walks through exactly why email syntax is stranger than you think, and then makes the case for why a confirmation email should be your real validation strategy.
What Makes Email Validation Hard
The formal specification for email addresses lives in RFC 5321 and RFC 5322. If you've never read them, don't — they'll ruin your day. The short version is that email addresses support a lot of things most developers don't expect:
- Plus addressing: [email protected] is completely valid and widely used for filtering.
- Dots in the local part: [email protected], but also [email protected] is technically valid per the spec (though most providers reject it in practice).
- Quoted strings: "spaces allowed here"@example.com is valid. Yes, really.
- IP address domains: user@[192.168.1.1] is valid.
- New TLDs: [email protected], [email protected], [email protected].
- Country-code subdomains: [email protected], [email protected].
- Internationalized domains: addresses in non-ASCII scripts are increasingly common.
Any regex strict enough to validate all of these correctly becomes incomprehensible. And if you're not validating all of them, you're drawing an arbitrary line and calling it a rule.
The Practical Regex (Reasonably Permissive)
Here's the pattern I actually use in production. It's permissive enough to accept real-world addresses but strict enough to reject things that are obviously not emails:
/^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/
That's it. I know — it looks almost too simple. Let's break it down:
- ^[^\s@]+ is the local part: one or more characters that are not whitespace or @.
- @ is the literal at-sign.
- [^\s@]+ is the domain: one or more characters that are not whitespace or @.
- \. is a literal dot (not a regex wildcard).
- [^\s@]{2,}$ is the TLD: at least two characters, no whitespace or @.
This catches the most common mistakes: missing the @ sign entirely, no domain, no TLD, or embedded spaces. It rejects plainaddress (no @) and user@ (empty domain). It accepts [email protected], which is correct behavior.
Here's how you'd use it in JavaScript:
function isEmailFormatValid(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/.test(email.trim());
}

// Quick tests
isEmailFormatValid("[email protected]"); // true
isEmailFormatValid("[email protected]"); // true
isEmailFormatValid("notanemail"); // false
isEmailFormatValid("missing@tld"); // false
isEmailFormatValid("spaces in @domain.com"); // false
And in Python, if that's your context:
import re

EMAIL_PATTERN = re.compile(r'^[^\s@]+@[^\s@]+\.[^\s@]{2,}$')

def is_email_format_valid(email: str) -> bool:
    return bool(EMAIL_PATTERN.match(email.strip()))
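To see where this pattern draws its line, it can help to run it against some of the less common forms mentioned earlier. The following is a minimal sketch in JavaScript. The quoted-string and IP-literal samples come from the list above; the other addresses are made up for illustration.

```javascript
// The permissive pattern from this guide, applied to a few edge cases.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/;

function isEmailFormatValid(email) {
  return EMAIL_PATTERN.test(email.trim());
}

// Illustrative inputs (hypothetical addresses, not real mailboxes).
const samples = [
  "first.last@mail.bigcorp.com",        // subdomain: accepted
  "user@[192.168.1.1]",                 // IP-literal domain: accepted
  '"spaces allowed here"@example.com',  // quoted local part: rejected (whitespace)
  "user@localhost",                     // no dot in the domain: rejected
  "a@b.c",                              // one-character TLD: rejected
];

for (const sample of samples) {
  console.log(sample, "->", isEmailFormatValid(sample));
}
```

The quoted-string form is valid per the spec but rejected here. That is the trade-off this pattern makes in exchange for staying readable.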
What Happens When You Go Stricter
The temptation to write a "better" pattern is real. You start wondering: should I validate that the TLD is in a known list? Should I require at least one dot in the domain? Should I ban consecutive dots?
Every one of those decisions costs you real users.
Consider what strict patterns commonly reject:
- New TLDs: A pattern like [a-z]{2,4} rejects .photography, .solutions, .international. There are over 1,500 valid TLDs. Any hardcoded list goes stale immediately.
- Subdomains: Enterprise users often have @mail.bigcorp.com or @us.company.org. A pattern requiring exactly one dot in the domain breaks these.
- Plus signs: Gmail users who filter their email with + tags get rejected constantly by overzealous validators. It's one of the most common bugs reported against signup forms.
- Long local parts: The spec allows up to 64 characters before the @. Some corporate email systems generate long addresses automatically.
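As a rough illustration of these failure modes, the sketch below compares a stricter pattern against the permissive one. The strict pattern is a made-up example of the kind that often ends up in signup forms, not one taken from any particular library, and the addresses are hypothetical.

```javascript
// Hypothetical "strict" pattern: limited local-part characters,
// exactly one dot in the domain, and a 2-4 letter TLD.
const STRICT_PATTERN = /^[a-z0-9._-]+@[a-z0-9-]+\.[a-z]{2,4}$/i;
// The permissive pattern from earlier in this guide.
const PERMISSIVE_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/;

// Made-up but well-formed addresses, one per failure mode.
const validAddresses = [
  "jane+news@example.com",     // plus addressing
  "jane@example.photography",  // long TLD
  "jane@mail.bigcorp.com",     // subdomain
];

function compare(address) {
  return {
    address,
    strict: STRICT_PATTERN.test(address),
    permissive: PERMISSIVE_PATTERN.test(address),
  };
}

const results = validAddresses.map(compare);
for (const result of results) {
  console.log(result);
}
```

All three addresses fail the strict pattern and pass the permissive one.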
Every rejected valid address is a lost signup. More insidiously, the user often doesn't understand why — they know their email is real — so they leave frustrated rather than correcting a typo.
What Regex Cannot Validate
Even if your regex is perfect, it can't tell you:
- Whether the domain actually exists and has MX records.
- Whether the specific mailbox exists on that domain.
- Whether the person entering the email actually controls that inbox.
- Whether it's a temporary/throwaway address from a disposable email service.
The first two require DNS lookups and SMTP probing (and SMTP probing is unreliable, since many servers refuse it or return misleading responses). Spotting disposable addresses takes a curated blocklist or a third-party service like ZeroBounce or Hunter.io, and proving that someone controls an inbox takes a message sent to it.
None of this is something a local regex can touch.
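For the domain check specifically, a server-side DNS lookup is one option. The sketch below assumes Node.js and uses its built-in dns module. The names extractDomain and domainHasMxRecords, and the injectable resolveMx parameter, are illustrative choices rather than an established API. It only shows whether a domain publishes MX records, which says nothing about whether a particular mailbox exists.

```javascript
const dns = require("node:dns").promises;

// Take everything after the last "@" and lowercase it.
function extractDomain(email) {
  return email.slice(email.lastIndexOf("@") + 1).toLowerCase();
}

// Resolves to true when the domain publishes at least one MX record.
// The resolver is injectable so the lookup can be stubbed in tests.
async function domainHasMxRecords(
  email,
  resolveMx = (domain) => dns.resolveMx(domain)
) {
  try {
    const records = await resolveMx(extractDomain(email));
    return records.length > 0;
  } catch (err) {
    // The lookup failed (for example ENOTFOUND, ENODATA, or a network
    // problem). Treated as "no MX" in this sketch; a real system may
    // prefer to retry or let the address through on transient errors.
    // Note that a domain without MX records can still receive mail via
    // its address record, so a false result here is a hint, not proof.
    return false;
  }
}
```

A call such as domainHasMxRecords("someone@example.com") returns a promise, so this fits better in a background job or a server-side handler than in a keystroke-level form check.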
The Better Strategy: Send the Confirmation Email
Here's the move that actually works: use regex only to catch obvious garbage (typos, missing @, etc.), then send a confirmation email and let the inbox do the real validation.
A confirmation email proves three things at once:
- The domain exists and accepts mail.
- The mailbox exists.
- The person controls it.
No regex can claim all three. The tradeoff is friction — you add a step to your onboarding flow. But that friction is useful. It filters out bots, garbage data, and users who give fake addresses to avoid marketing. If someone won't confirm an email, they weren't going to be an engaged user anyway.
Implement it like this conceptually:
// 1. Light regex check on form submit: reject obvious garbage fast
if (!isEmailFormatValid(email)) {
  showError("That doesn't look like a valid email address.");
  return;
}

// 2. Store the user as unverified, generate a token
const token = crypto.randomUUID();
await db.users.create({ email, verified: false, verificationToken: token });

// 3. Send confirmation email with a link containing the token
await sendEmail({
  to: email,
  subject: "Confirm your email",
  body: `Click here to verify: https://yourapp.com/verify?token=${token}`
});

// 4. On link click (in the handler for the /verify link), mark as verified
await db.users.update({ verified: true }, { where: { verificationToken: token } });
You've now validated that email address better than any regex ever could.
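The four steps can be exercised end to end without a real database or mail server. The sketch below is an in-memory stand-in under stated assumptions: the Map replaces the db.users calls, the outbox array replaces sendEmail, and the function names startSignup and confirmEmail are hypothetical. It assumes Node.js for randomUUID.

```javascript
const { randomUUID } = require("node:crypto");

const users = new Map(); // stand-in for db.users, keyed by token
const outbox = [];       // stand-in for sendEmail

function isEmailFormatValid(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/.test(email.trim());
}

// Steps 1-3: format check, store the unverified user, "send" the link.
function startSignup(rawEmail) {
  const email = rawEmail.trim();
  if (!isEmailFormatValid(email)) {
    return { ok: false, error: "That doesn't look like a valid email address." };
  }
  const token = randomUUID();
  users.set(token, { email, verified: false });
  outbox.push({
    to: email,
    subject: "Confirm your email",
    body: `Click here to verify: https://yourapp.com/verify?token=${token}`,
  });
  // The token is returned only so this sketch can be exercised;
  // in a real flow it would travel solely through the email.
  return { ok: true, token };
}

// Step 4: runs when the link is clicked. Unknown tokens are rejected.
function confirmEmail(token) {
  const user = users.get(token);
  if (!user) {
    return false;
  }
  user.verified = true;
  return true;
}
```

A production version would likely also expire tokens and invalidate them after use; those details are left out here to keep the flow visible.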
One Real-World Gotcha: Trimming Whitespace
Before you run any validation, always .trim() the input. Users copy-paste emails from documents and pick up a trailing space constantly. It's one of those bugs that doesn't show up in testing because developers don't paste with trailing spaces — real users do. A leading space will break even the most permissive regex, and it's such an easy fix that there's no reason not to handle it.
const cleanedEmail = email.trim().toLowerCase();
if (!isEmailFormatValid(cleanedEmail)) { ... }
Lowercasing is a judgment call: the local part of an email is case-sensitive per spec, but in practice the major email providers treat it as case-insensitive. Lowercasing keeps [email protected] and [email protected] from being registered as two different users.
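One way to apply both steps consistently is to normalize once, at the boundary, and use the normalized value everywhere afterwards. A small sketch: normalizeEmail and registerEmail are hypothetical helpers, the Set stands in for a real user table, and the addresses are made up.

```javascript
// Trim surrounding whitespace and lowercase the whole address.
// Lowercasing the local part is a policy choice, not a spec requirement.
function normalizeEmail(rawEmail) {
  return rawEmail.trim().toLowerCase();
}

// Using the normalized form as the lookup key means differently cased
// or padded copies of one address collapse into a single entry.
const registered = new Set();

function registerEmail(rawEmail) {
  const email = normalizeEmail(rawEmail);
  if (registered.has(email)) {
    return false; // already signed up
  }
  registered.add(email);
  return true;
}

console.log(registerEmail("Jane@Example.com"));    // true
console.log(registerEmail("  jane@example.com ")); // false
```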
When Stricter Validation Is Justified
There are cases where tighter checks make sense:
- Internal corporate tools: If your app is only for employees at yourcompany.com, validating against that specific domain is reasonable and useful.
- High-abuse surfaces: If you're seeing a flood of fake signups, adding a check against a known disposable-domain blocklist (maintained by a library, not written by hand) can help.
- Bulk import sanitization: When cleaning a CSV of 50,000 imported contacts, a stricter pass to remove obvious garbage before attempted sends is worthwhile — you're not blocking users, you're pre-cleaning data.
In all these cases, you're applying extra logic on top of basic format validation, not instead of the confirmation step.
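These three cases can be layered on top of the format check as plain domain comparisons. In the sketch below, the function names and the two blocklist entries are placeholders; a real blocklist would come from a maintained library or dataset rather than a hand-written Set.

```javascript
function isEmailFormatValid(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/.test(email.trim());
}

// Everything after the last "@", lowercased for comparison.
function domainOf(email) {
  return email.trim().toLowerCase().split("@").pop();
}

// Internal tools: accept only addresses at one specific domain.
function isCompanyAddress(email, companyDomain = "yourcompany.com") {
  return isEmailFormatValid(email) && domainOf(email) === companyDomain;
}

// High-abuse surfaces: reject domains on a disposable-address blocklist.
// Placeholder entries; load these from a maintained source in practice.
const DISPOSABLE_DOMAINS = new Set(["disposable.example", "throwaway.example"]);

function isDisposableAddress(email) {
  return DISPOSABLE_DOMAINS.has(domainOf(email));
}

// Bulk import: keep only rows that pass the format check and blocklist.
function cleanImport(rows) {
  return rows.filter(
    (row) => isEmailFormatValid(row) && !isDisposableAddress(row)
  );
}
```

Comparing the full domain with === rather than a suffix or substring match avoids accepting look-alikes such as yourcompany.com.evil.example.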
The Takeaway
Use a simple, permissive regex to catch the obvious garbage — missing @, no TLD, embedded spaces. Reject clearly broken input early with a friendly error. Then send a confirmation email and let it do the real work.
Don't write a 200-character regex to enforce rules that the email specification itself makes optional. Don't hardcode TLD lists that will go stale. Don't assume that because an address looks wrong to you, it actually is — the spec is weirder than your intuition.
The best email validator you have is the inbox itself. Use it.