Letting Playwright MCP Explore your site and Write your Tests
Debbie O'Brien

Publish Date: Jun 18

What if your tests could write themselves — just by using your app like a real user?

In this post, we explore how the Playwright MCP (Model Context Protocol) in Agent Mode can autonomously navigate your app, discover key functionality, and generate runnable tests — no manual scripting required.

We’ll walk through a live demo of generating and running a test against a Movies app, highlighting how the MCP uncovers edge cases, builds coverage, and even surfaces bugs you might miss.

🔧 Setting the Stage
For this demo, the Playwright MCP server runs locally. It is configured in a file called mcp.json inside my project's .vscode folder:

{
    "servers": {
        "playwright": {
            "command": "npx",
            "args": [
                "@playwright/mcp@latest"
            ]
        }
    }
}

I’ve prepared a simple prompt file named generate_tests.prompt.md, which lives in the .github folder:

---
tools: ['playwright']
mode: 'agent'
---

- You are a playwright test generator.
- You are given a scenario and you need to generate a playwright test for it.
- DO NOT generate test code based on the scenario alone. 
- DO run steps one by one using the tools provided by the Playwright MCP.
- When asked to explore a website:
  1. Navigate to the specified URL
  2. Explore 1 key functionality of the site and when finished close the browser.
  3. Implement a Playwright TypeScript test that uses @playwright/test based on message history using Playwright's best practices including role based locators, auto retrying assertions and with no added timeouts unless necessary as Playwright has built in retries and autowaiting if the correct locators and assertions are used.
- Save generated test file in the tests directory
- Execute the test file and iterate until the test passes
- Include appropriate assertions to verify the expected behavior
- Structure tests properly with descriptive test titles and comments

Then, in VS Code, I switch to Agent Mode, make sure my prompt is added to the context, and simply type:

Explore https://debs-obrien.github.io/playwright-movies-app

Agent mode uses the Playwright MCP to navigate to the site and use the browser to explore the app like a real user.

🧠 Goal: Let the agent freely navigate, discover functionality, and generate tests automatically based on its interactions.

🧪 Exploration Begins
Once the agent starts exploring, the first thing it tries is the search feature. It types “Star Wars” into the search bar — and immediately, we uncover a bug.

The search results show “Star Wars”, but the movie title returned is “Kill”. That’s clearly wrong.

This is an edge case I hadn’t noticed in manual testing. I’d previously searched terms like Garfield, Deadpool, and Avengers — and everything worked fine. But now, thanks to the agent’s autonomous behavior, I’ve uncovered a regression.

✅ Result: The agent discovered a search issue — something I’d missed entirely.

🌓 Theme Toggling and UI Coverage
Next, the agent toggles the app’s theme switch — switching between dark and light mode. It verifies that the toggle works, clicks through navigation links, and continues its exploratory crawl.

After wrapping up the interactions, the agent summarizes its findings:

  • Homepage
  • Search functionality
  • Movie details page
  • Theme toggle
  • Navigation

From that list, it selects search functionality as the focus for the test it will generate.

🎯 Note: You can tell the agent how many tests you want. In this case, I requested just one for the demo.

🧾 Test Generation & Execution
The agent generates a full Playwright test file based on the interactions. It even fixes a lint error automatically before running the test.

Here’s the test it generated:

import { test, expect } from '@playwright/test';

test.describe('Movie search', () => {
  test('Search for a movie by title', async ({ page }) => {
    // Navigate to the movies app
    await page.goto('https://debs-obrien.github.io/playwright-movies-app');

    // Click on the search button to activate the search input
    await page.getByRole('search').click();

    // Type 'Star Wars' into the search input and press Enter
    const searchTerm = 'Star Wars';
    await page.getByRole('textbox', { name: 'Search Input' }).fill(searchTerm);
    await page.getByRole('textbox', { name: 'Search Input' }).press('Enter');

    // Verify we're on the search results page with correct title
    await expect(page).toHaveTitle(`${searchTerm} - Search Results`);

    // Verify the search results heading contains the search term
    await expect(page.getByRole('heading', { level: 1 })).toHaveText(searchTerm);
    await expect(page.getByRole('heading', { name: 'search results', level: 2 })).toBeVisible();

    // Verify that search results are displayed
    await expect(page.getByRole('list', { name: 'movies' })).toBeVisible();

    // Click on a movie from search results
    const firstMovie = page.getByRole('list', { name: 'movies' }).getByRole('link').first();
    const movieTitleElement = firstMovie.getByRole('heading', { level: 2 });
    const movieTitle = await movieTitleElement.textContent() || '';
    await firstMovie.click();

    // Verify that the movie details page is loaded with the correct title
    await expect(page.getByRole('heading', { level: 1 })).toHaveText(movieTitle);

    // Verify movie details sections are present
    await expect(page.getByText('The Synopsis')).toBeVisible();
    await expect(page.getByText('The Cast')).toBeVisible();

    // Verify recommended movies section is present
    await expect(page.getByRole('heading', { name: 'Recommended Movies' })).toBeVisible();

    // Go back to search results
    await page.getByRole('button', { name: 'Back' }).click();

    // Verify we're back on the search results page
    await expect(page.getByRole('heading', { level: 1 })).toHaveText(searchTerm);
  });
});

Once generated, it opens a terminal and runs the test. It passes ✅.
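
Note that the generated test checks the page title and headings, but it does not check that each result actually relates to the search term, so a mismatch like the “Kill” result described earlier would not make it fail. One way to tighten this is sketched below. `findMismatchedTitles` is a hypothetical helper, not part of Playwright or of the generated test, and it assumes that every result title should contain the search term, which may be too strict if the search also matches on other fields.

```typescript
// Hypothetical helper (an assumption for illustration, not a Playwright API):
// returns the result titles that do NOT contain the search term, ignoring case
// and surrounding whitespace in the term.
function findMismatchedTitles(searchTerm: string, titles: string[]): string[] {
  const needle = searchTerm.trim().toLowerCase();
  return titles.filter((title) => !title.toLowerCase().includes(needle));
}

// Example with a made-up result list that mirrors the bug from the demo.
const mismatches = findMismatchedTitles('Star Wars', ['Star Wars: A New Hope', 'Kill']);
console.log(mismatches); // the 'Kill' entry is reported as a mismatch
```

Inside the test, the titles could be collected with something like `await page.getByRole('list', { name: 'movies' }).getByRole('heading', { level: 2 }).allTextContents()` and then asserted with `expect(findMismatchedTitles(searchTerm, titles)).toEqual([])`. Since `allTextContents()` does not auto-wait, it would belong after the visibility assertion on the movies list.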

We then open the Trace Viewer in VS Code to visually inspect the steps taken:

  • It searched for Star Wars.
  • Clicked through results like Deadpool.
  • Verified titles on the movie details page.

It’s a full cycle: exploration → generation → execution → review.
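
Trace Viewer needs a recorded trace to show. The article does not say how the trace was recorded in this demo, so the following is only a minimal sketch of one common way to do it: a playwright.config.ts that records a trace on every run. The `testDir` value is an assumption based on the prompt's instruction to save tests in the tests directory.

```typescript
// playwright.config.ts (a minimal sketch, not the author's actual config)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Assumed location: the prompt asks the agent to save tests in the tests directory
  testDir: './tests',
  use: {
    // Record a trace for every run so there is always something to inspect.
    // 'on-first-retry' and 'retain-on-failure' are lighter alternatives.
    trace: 'on',
  },
});
```

With a config like this, a recorded trace can also be opened outside VS Code with `npx playwright show-trace` followed by the path to the trace file. For a one-off run, `npx playwright test --trace on` has a similar effect without changing the config.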

💡 Why This Matters
This might seem like magic — but it’s a real example of AI-assisted development.

Here’s what’s powerful about this approach:

  • It caught a real bug I hadn’t seen.
  • It saved me time writing boilerplate.
  • It provided test coverage ideas based on actual usage paths.
  • It produced runnable code I can commit right away or extend into more tests.

You can iterate, refine the prompt, increase test count, or tell the agent to explore different areas. It’s like pairing with an AI-powered tester that never gets tired.

🚀 Try It Yourself
If you're building modern apps and want better test coverage without writing everything by hand, this is your sign to give the Playwright MCP a try.

Just point it at your app, give it a prompt, and let it explore.
You’ll be surprised what it finds and how quickly you can go from zero tests to real coverage. Test out different models and see what works best for you. For this demo I used Claude 3.7 Sonnet.

Check out the video demo:

🧪 Happy testing, and let the bots write your tests. Let me know in the comments what you think, and whether you tried it on your own site and had some success. The agent may behave a little differently depending on the model, its version, and so on.

Tip: In settings.json in my .vscode folder, I add this setting so I don't have to click Continue each time the agent wants to use a tool. It's great for demos.

{
    "chat.tools.autoApprove": true
}

Comments 26 total

  • aakash paliwal, Jun 19, 2025

    Thanks, Nice one !

  • Moses-Morris, Jun 19, 2025

    Great 😊 one. This will save time when testing.

  • Aditya, Jun 19, 2025

    Cool project. I wanna try it out. Can you please share GitHub link

    • Aditya, Jun 19, 2025

      Also, I'm also a tech writer myself and dev.to mod. I liked your article and I'll promote this article for better reach to audience.

    • Debbie O'Brien, Jun 20, 2025

      ohh dont have one but will try create one

  • Tarun Varshney, Jun 20, 2025

    Is any data sent outside the local machine by playwright. Copilot would send obviously.

    • Debbie O'Brien, Jun 20, 2025

      no playwright doesnt send or store any data

  • Tommy, Jun 20, 2025

    Next time you ask chatgpt to create an article for you, at least change the formatting and delete the icons so it's not sooo obvious...

    • Debbie O'Brien, Jun 20, 2025

      well actually my process is to record a video and upload my video and ask copilot to create a transcript and then ask chatgpt to create the blog post based on my transcript. this saves tons of time cause i really dont have so much time to share all the cool things I do so thanks for the feedback, will ask chatgpt to keep that in mind for the next blog post I create based on my transcript.

      • Aditya, Jun 20, 2025

        Exactly for this purpose, I have created myself MCP server to write dev.to articles for me. You can check out my repo - github.com/extinctsion/mcp-py-devto . You can use the MCP tool to generate unpublished articles on dev.to and tweak that already written articles according to your need. It is indeed a game changer for me!

  • Pratik sharma, Jun 20, 2025

    This is so great

    • anurag arora, Jun 26, 2025

      @biomathcode When i try to run command " npm @playwright/mcp" it got stuck, can you tell me how to resolve this ?

  • marinsky roma, Jun 20, 2025

    It's a nice concept, but it is not working so smoothly, fortunately, and not

    • Debbie O'Brien, Jun 21, 2025

      It takes lots of tweaking the prompt and the LLMs change too. I think I did this with claude 3.7 but if you try another model you will get different results. Its all still in exploration stages thats for sure but yes its a fun concept indeed and I am using it daily on all my projects. The more I use it the better my prompts get

      • marinsky roma, Jun 21, 2025

        I've tried using it too with Claude 3.7 and understand that the result can be different from time to time.
        I'm seeing MCP perhaps only for agents development, not for Agentic IDEs usage to implement automated tests autonomously. Due to the nature of LLM and the way it works, it is now mostly (IMO) generating searches for "//input", except for "//body" which has text, and less frequent searches by role, with valuable assertions. Too often, I have to remove useless code from LLM's output

        Now, I mostly prefer to use it only for prototyping, brainstorming, and rephrasing or explaining documentation and code, but never for end-to-end development.
        I believe that it's true for every experienced engineer

        • Debbie O'Brien, Jun 21, 2025

          We still always need a human in the loop thats for sure but it gets you off to a great starting point and of course so much can be and should be improved in this area. Thanks for the feedback and for trying it out

  • Nate Liu, Jun 20, 2025

    Hey, thanks for sharing. I am curious about the exploration part: how does it know what to explore? Did you at least provide some initial hints, such as that this is a movie database website, that I’m a movie lover, and that I usually use this website to do this and that?

    I mean, depending on the user role, the “key functionalities” might differ.

    • Debbie O'Brien, Jun 21, 2025

      It takes a page snapshot so it can see what's on the page. I find it explores pretty much in the order you would follow if you were using the site without a mouse, tabbing along: first the search field, then the theme toggler, then login, etc. You can be more precise and tell it to ignore certain areas or focus on others, or give it some existing tests and ask it to explore and find tests that have not been written yet. Maybe I will do that as my next post.

  • Marek Sirkovský, Jun 21, 2025

    Followed the tutorial(Sonnet 3.5), but got a lot of "I apologize" and the final one was:
    "I apologize, but it seems that the website at debs-obrien.github.io/playwright-m... might be either temporarily unavailable or has been significantly changed. The site is not responding as expected, and the elements we're trying to interact with are not present on the page."

    • Debbie O'Brien, Jun 21, 2025

      Sorry it seems too many people read my blog post and started trying it out so I am guessing my azure function for the movies database times out. Thanks for reporting. Next time will try keeping it a simpler demo. Give it a try again though and let me know

      • Marek Sirkovský, Jun 22, 2025

        Thanks a lot. I used a different model(gtp4-1) today, and it seems to be working. LLM created a simple test, but it tests something useful, though.
        Nice. Thanks for your article!

  • Calvin Szeto, Jun 24, 2025

    I’m trying to make this work for my team, and this is such a great start!

    My struggle is with everything that happens after the agent writes the initial test (e.g. in your prompt, the instruction “Execute the test file and iterate until the test passes”).

    With Playwright MCP, the agent has access to the accessibility tree and the network requests, which are critical to debugging and fixing a test the way a human would, particularly for choosing the right selectors and mocking network requests correctly (the latter may not be applicable for everyone; my team mocks API requests in order to test different scenarios).

    I find that when the agent needs to actually get tests passing, it does a loop of analyzing screenshots and writing “debug” tests to output logs, but neither of these do a good job at giving it the information that the MCP does. It ends up writing very complex selectors or not getting the test correct at all.

    It feels like I’m soo close with the amazing tools that the MCP provides, but I can’t actually incorporate it into my testing process! 😞 In the meantime, I can at least use it to scaffold out some initial tests like this demo does, and work from there.

    Thank you for a fantastic demo! Really excited to see where this project goes from here!

    • Debbie O'Brien, Jun 27, 2025

      thanks for the great feedback. yes its still all a testing game literally. playing around with the prompts and seeing what works for you but even if it brings a little bit of value then that is better than nothing. but for sure things will continue to improve. we are only getting started in this space

  • anurag arora, Jun 26, 2025

    when i try to run MCP with : " npx @playwright/mcp" it got stuck, i am using windows system.. and playwright
    Version 1.53.1

    can someone help me to fix this ?
