MCP Security is Broken: Here's How to Fix It

TL;DR: Attackers are stealing convo history via MCP servers—let's stop that. OWASP ranks prompt injection as the top threat. This post shares practical steps to protect your systems.

Read Part 1 if you missed the carnage: Click Here

Trail of Bits Research Findings

Trail of Bits dropped a bomb & MCP servers are getting wrecked by these attacks:

Line Jumping attacks¹ - malicious servers inject prompts through tool descriptions. Your AI can be tricked before you even start interacting with it.
Conversation history theft² - servers can steal your full conversation history without you noticing
ANSI terminal code attacks³ - escape sequences hide malicious instructions. Your terminal can show false or misleading information due to hidden instructions.
Insecure credential storage⁴ - API keys sitting in plaintext with world-readable permissions. This leaves sensitive data exposed.

The Security Gap

The OWASP Top 10 for Large Language Model Applications (2025)⁵ puts prompt injection at #1. Meanwhile, most security teams are still treating AI like it's another web app.

Your monitoring tools won't blink, API calls, auth, and response times all look normal during a breach. The breach often goes undetected until it's too late.

Cost-Based Attack Vectors

Trail of Bits found in their cloud infrastructure research⁶ that AI systems can produce insecure cloud setup code, leading to unexpectedly high costs.

Their report pointed out:

AI tools sometimes hard-code credentials, creating security risks
"Random" passwords that are actually predictable LLM outputs
Infrastructure code that spins up expensive resources with zero limits

Here's how attackers weaponize this:

Find AI tools connected to expensive cloud services
Craft natural language requests that maximize resource consumption
Exploit AI's tendency to blindly follow requests to bypass traditional security controls
Costs can skyrocket due to infrastructure overuse, even though logs might look normal

Effective Defense Strategies

Based on OWASP recommendations and documented security research, here's what works in production:

1. Never Give Production Creds to AI

Don't be an idiot, never hand AI your prod keys; use a sandboxed account with zero power.

// Unsafe: Directly embedding production credentials
const DATABASE_URL =
  "postgresql://admin:password@prod-db:5432/main"

// Safe: Using a restricted account with limited access
const DATABASE_URL =
  "postgresql://readonly_ai:limited@replica:5432/public_data"

If your AI needs full admin rights, it's time to rethink your setup.

2. Resource Limits and Constraints

Traditional rate limiting is useless against AI. You need cost-based limits and hard resource constraints:

# docker-compose.yml - Actual protection
services:
  mcp-tool:
    image: your-tool:latest
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
    environment:
      - MAX_COST_PER_HOUR=10.00
      - MAX_REQUESTS_PER_MINUTE=5

3. Semantic Attack Detection

Traditional logging misses semantic attacks completely. Keep an eye out for signs of prompt injection attempts:

function catchInjectionAttempts(
  request: string,
): [boolean, string | null] {
  // Based on OWASP LLM Top 10 indicators and CVE database<sup><a id="ref-9" href="#footnote-9">9</a></sup>
  const suspiciousShit = [
    /ignore.*previous.*instructions/i,
    /system.*prompt.*override/i,
    /execute.*as.*admin/i,
    /delete.*from.*table/i,
    /show.*credentials/i,
  ]

  for (const pattern of suspiciousShit) {
    if (pattern.test(request.toLowerCase())) {
      return [true, `Injection attempt: ${pattern.source}`]
    }
  }

  return [false, null]
}

4. Semantic Input Validation

The NIST AI Risk Management Framework⁷ recommends semantic analysis for AI inputs. Basic pattern matching catches most documented attack vectors:

class PromptInjectionFilter {
  private redFlags: RegExp[]

  constructor() {
    // Patterns from documented CVEs and research<sup><a id="ref-10" href="#footnote-10">10</a></sup><sup><a id="ref-11" href="#footnote-11">11</a></sup><sup><a id="ref-12" href="#footnote-12">12</a></sup>
    this.redFlags = [
      /ignore.*instructions/i,
      /new.*role.*system/i,
      /pretend.*you.*are/i,
      /override.*safety/i,
      /jailbreak.*mode/i,
    ]
  }

  isSafe(userInput: string): boolean {
    for (const pattern of this.redFlags) {
      if (pattern.test(userInput.toLowerCase())) {
        return false
      }
    }
    return true
  }
}

5. Cost-Aware Rate Limiting

Traditional rate limiting counts requests. AI systems need cost-aware limiting:

class RateLimitExceeded extends Error {
  constructor(message: string) {
    super(message)
    this.name = "RateLimitExceeded"
  }
}

class CostAwareRateLimit {
  private maxCost: number
  private currentCost: number
  private resetTime: number

  constructor(maxCostPerHour: number = 50.0) {
    this.maxCost = maxCostPerHour
    this.currentCost = 0.0
    this.resetTime = Date.now() + 3600000 // 1 hour in milliseconds
  }

  checkRequest(estimatedCost: number): void {
    if (Date.now() > this.resetTime) {
      this.currentCost = 0.0
      this.resetTime = Date.now() + 3600000
    }

    if (this.currentCost + estimatedCost > this.maxCost) {
      throw new RateLimitExceeded("Cost limit exceeded")
    }

    this.currentCost += estimatedCost
  }
}

Attack Detection and Monitoring

OWASP and cloud giants agree, these metrics catch AI attacks:

Resource consumption weirdness:

Compute usage spikes way above baseline
Unusual data access patterns
Cross-service API call increases
Geographic request anomalies

Behavioral red flags:

Requests containing system keywords
Permission escalation attempts
Tools accessing new data sources
Cost per request increases

if (($(echo "$current_hour_cost > ($average_daily_cost * 0.3)" | bc -l))); then immediate_alert "Cost anomaly detected" fi

Updated Authentication Requirements (MCP 2025-06-18)

The latest MCP specification now mandates proper OAuth implementation:

// Required: OAuth Resource Server pattern
class MCPServer {
  private authConfig: OAuth2ResourceServer

  constructor() {
    this.authConfig = {
      // Now required by spec
      resourceServer: "https://your-auth-server.com",
      requiredScopes: [
        "mcp:tools:read",
        "mcp:tools:execute",
      ],
      tokenValidation: "RFC8707", // Resource Indicators required
    }
  }

  async validateRequest(
    request: MCPRequest,
  ): Promise<boolean> {
    // Resource Indicators prevent token theft attacks
    const token = this.extractToken(request)
    return await this.validateWithResourceIndicators(token)
  }
}

This addresses some authentication issues but doesn't solve tool description injection.

Industry Security Recommendations

Security pros at OWASP and NIST keep hammering this: no prod creds in AI, period.

OWASP Top 10 for LLMs (2025)⁸:

LLM01: Prompt Injection - #1 threat
LLM02: Insecure Output Handling
LLM03: Training Data Poisoning
LLM04: Model Denial of Service

NIST AI Risk Management Framework⁷:

Treat AI systems as high-risk components
Implement continuous monitoring
Use defense-in-depth strategies
Plan for novel attack vectors

The Bottom Line

We're building systems that run commands based on natural language and connect to live infrastructure. The risks are well-known, the methods of attack are out there, and researchers are constantly finding new exploits.

Fix this now, or enjoy the breach headlines later.

Read Part 1 if you missed the carnage: Click Here

Pankaj Singh @pankaj_singh_1022ee93e755