MCP Security is Broken: Here's How to Fix It
Pankaj Singh

Pankaj Singh @pankaj_singh_1022ee93e755

About: Deeply fascinated by the transformative potential of AI. Love to talk about the advancements happening in this field. #AI Research

Joined:
Jun 6, 2025

MCP Security is Broken: Here's How to Fix It

Publish Date: Jun 28
122 0

TL;DR: Attackers are stealing convo history via MCP servers—let's stop that. OWASP ranks prompt injection as the top threat. This post shares practical steps to protect your systems.

broken

Read Part 1 if you missed the carnage: Click Here

Trail of Bits Research Findings

Trail of Bits dropped a bomb & MCP servers are getting wrecked by these attacks:

  • Line Jumping attacks1 - malicious servers inject prompts through tool descriptions. Your AI can be tricked before you even start interacting with it.
  • Conversation history theft2 - servers can steal your full conversation history without you noticing
  • ANSI terminal code attacks3 - escape sequences hide malicious instructions. Your terminal can show false or misleading information due to hidden instructions.
  • Insecure credential storage4 - API keys sitting in plaintext with world-readable permissions. This leaves sensitive data exposed.

The Security Gap

security gap

The OWASP Top 10 for Large Language Model Applications (2025)5 puts prompt injection at #1. Meanwhile, most security teams are still treating AI like it's another web app.

Your monitoring tools won't blink, API calls, auth, and response times all look normal during a breach. The breach often goes undetected until it's too late.

Cost-Based Attack Vectors

attackvectors

Trail of Bits found in their cloud infrastructure research6 that AI systems can produce insecure cloud setup code, leading to unexpectedly high costs.

Their report pointed out:

  • AI tools sometimes hard-code credentials, creating security risks
  • "Random" passwords that are actually predictable LLM outputs
  • Infrastructure code that spins up expensive resources with zero limits

Here's how attackers weaponize this:

  1. Find AI tools connected to expensive cloud services
  2. Craft natural language requests that maximize resource consumption
  3. Exploit AI's tendency to blindly follow requests to bypass traditional security controls
  4. Costs can skyrocket due to infrastructure overuse, even though logs might look normal

Effective Defense Strategies

defense strategies

Based on OWASP recommendations and documented security research, here's what works in production:

1. Never Give Production Creds to AI

Don't be an idiot, never hand AI your prod keys; use a sandboxed account with zero power.

// Unsafe: Directly embedding production credentials
const DATABASE_URL =
  "postgresql://admin:password@prod-db:5432/main"

// Safe: Using a restricted account with limited access
const DATABASE_URL =
  "postgresql://readonly_ai:limited@replica:5432/public_data"
Enter fullscreen mode Exit fullscreen mode

If your AI needs full admin rights, it's time to rethink your setup.

2. Resource Limits and Constraints

Traditional rate limiting is useless against AI. You need cost-based limits and hard resource constraints:

# docker-compose.yml - Actual protection
services:
  mcp-tool:
    image: your-tool:latest
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
    environment:
      - MAX_COST_PER_HOUR=10.00
      - MAX_REQUESTS_PER_MINUTE=5
Enter fullscreen mode Exit fullscreen mode

3. Semantic Attack Detection

Traditional logging misses semantic attacks completely. Keep an eye out for signs of prompt injection attempts:

function catchInjectionAttempts(
  request: string,
): [boolean, string | null] {
  // Based on OWASP LLM Top 10 indicators and CVE database<sup><a id="ref-9" href="#footnote-9">9</a></sup>
  const suspiciousShit = [
    /ignore.*previous.*instructions/i,
    /system.*prompt.*override/i,
    /execute.*as.*admin/i,
    /delete.*from.*table/i,
    /show.*credentials/i,
  ]

  for (const pattern of suspiciousShit) {
    if (pattern.test(request.toLowerCase())) {
      return [true, `Injection attempt: ${pattern.source}`]
    }
  }

  return [false, null]
}
Enter fullscreen mode Exit fullscreen mode

4. Semantic Input Validation

The NIST AI Risk Management Framework7 recommends semantic analysis for AI inputs. Basic pattern matching catches most documented attack vectors:

class PromptInjectionFilter {
  private redFlags: RegExp[]

  constructor() {
    // Patterns from documented CVEs and research<sup><a id="ref-10" href="#footnote-10">10</a></sup><sup><a id="ref-11" href="#footnote-11">11</a></sup><sup><a id="ref-12" href="#footnote-12">12</a></sup>
    this.redFlags = [
      /ignore.*instructions/i,
      /new.*role.*system/i,
      /pretend.*you.*are/i,
      /override.*safety/i,
      /jailbreak.*mode/i,
    ]
  }

  isSafe(userInput: string): boolean {
    for (const pattern of this.redFlags) {
      if (pattern.test(userInput.toLowerCase())) {
        return false
      }
    }
    return true
  }
}
Enter fullscreen mode Exit fullscreen mode

5. Cost-Aware Rate Limiting

Traditional rate limiting counts requests. AI systems need cost-aware limiting:

class RateLimitExceeded extends Error {
  constructor(message: string) {
    super(message)
    this.name = "RateLimitExceeded"
  }
}

class CostAwareRateLimit {
  private maxCost: number
  private currentCost: number
  private resetTime: number

  constructor(maxCostPerHour: number = 50.0) {
    this.maxCost = maxCostPerHour
    this.currentCost = 0.0
    this.resetTime = Date.now() + 3600000 // 1 hour in milliseconds
  }

  checkRequest(estimatedCost: number): void {
    if (Date.now() > this.resetTime) {
      this.currentCost = 0.0
      this.resetTime = Date.now() + 3600000
    }

    if (this.currentCost + estimatedCost > this.maxCost) {
      throw new RateLimitExceeded("Cost limit exceeded")
    }

    this.currentCost += estimatedCost
  }
}
Enter fullscreen mode Exit fullscreen mode

Attack Detection and Monitoring

attackdetectionmonitoring

OWASP and cloud giants agree, these metrics catch AI attacks:

Resource consumption weirdness:

  • Compute usage spikes way above baseline
  • Unusual data access patterns
  • Cross-service API call increases
  • Geographic request anomalies

Behavioral red flags:

  • Requests containing system keywords
  • Permission escalation attempts
  • Tools accessing new data sources
  • Cost per request increases

if (($(echo "$current_hour_cost > ($average_daily_cost * 0.3)" | bc -l))); then
immediate_alert "Cost anomaly detected"
fi

Updated Authentication Requirements (MCP 2025-06-18)

authentication

The latest MCP specification now mandates proper OAuth implementation:

// Required: OAuth Resource Server pattern
class MCPServer {
  private authConfig: OAuth2ResourceServer

  constructor() {
    this.authConfig = {
      // Now required by spec
      resourceServer: "https://your-auth-server.com",
      requiredScopes: [
        "mcp:tools:read",
        "mcp:tools:execute",
      ],
      tokenValidation: "RFC8707", // Resource Indicators required
    }
  }

  async validateRequest(
    request: MCPRequest,
  ): Promise<boolean> {
    // Resource Indicators prevent token theft attacks
    const token = this.extractToken(request)
    return await this.validateWithResourceIndicators(token)
  }
}
Enter fullscreen mode Exit fullscreen mode

This addresses some authentication issues but doesn't solve tool description injection.

Industry Security Recommendations

industry security

Security pros at OWASP and NIST keep hammering this: no prod creds in AI, period.

OWASP Top 10 for LLMs (2025)8:

  1. LLM01: Prompt Injection - #1 threat
  2. LLM02: Insecure Output Handling
  3. LLM03: Training Data Poisoning
  4. LLM04: Model Denial of Service

NIST AI Risk Management Framework7:

  • Treat AI systems as high-risk components
  • Implement continuous monitoring
  • Use defense-in-depth strategies
  • Plan for novel attack vectors

Phew

The Bottom Line

We're building systems that run commands based on natural language and connect to live infrastructure. The risks are well-known, the methods of attack are out there, and researchers are constantly finding new exploits.

Fix this now, or enjoy the breach headlines later.

Read Part 1 if you missed the carnage: Click Here

Footnotes


  1. Trail of Bits. "Jumping the Line: How MCP servers can attack you before you ever use them." (April 21, 2025) 

  2. Trail of Bits. "How MCP servers can steal your conversation history." (April 23, 2025) 

  3. Trail of Bits. "Deceiving users with ANSI terminal codes in MCP." (April 29, 2025) 

  4. Trail of Bits. "Insecure credential storage plagues MCP." (April 30, 2025) 

  5. OWASP. "Top 10 for Large Language Model Applications (2025)" 

  6. Trail of Bits. "Provisioning cloud infrastructure the wrong way, but faster." (August 27, 2024) 

  7. NIST. "AI Risk Management Framework (AI RMF 1.0)" 

  8. OWASP. "Top 10 for LLMs (2025)" 

Comments 0 total

    Add comment