
How to Set Up Automated E2E Testing in GitHub Actions: A Complete Guide

Step-by-step guide to running Playwright end-to-end tests in GitHub Actions. Covers workflow configuration, parallel test execution, artifact uploads, caching, and integrating AI testing tools for automated test generation.

There is a specific kind of dread that comes with merging a pull request when you have no automated tests. You click the button, watch the deploy, and then just... hope. Hope that the login flow still works. Hope that the checkout page did not break. Hope that your refactor of the API layer did not silently destroy the settings page.

Automated end-to-end testing in your CI/CD pipeline eliminates that dread entirely. When every pull request triggers a full Playwright test suite against your application, you catch regressions before they reach production. Broken deploys become rare. Code reviews become faster because reviewers can trust the green checkmark. And you sleep better at night.

This is a complete, step-by-step guide to setting up a production-ready GitHub Actions Playwright workflow. We will cover everything from a basic configuration to advanced patterns like parallel sharding, multi-browser matrices, artifact uploads, and secret management. By the end, you will have a robust CI/CD testing pipeline that runs your E2E suite on every push and pull request.


Prerequisites

Before we start, make sure you have:

  • A project with Playwright installed (npm init playwright@latest)
  • A GitHub repository
  • Basic familiarity with YAML syntax

If you do not have any Playwright tests yet, don't worry. We will cover how AI tools can generate them for you in a later section.


Step 1: A Basic GitHub Actions Playwright Workflow

Let's start with the simplest possible workflow that runs Playwright tests on every push. Create a file at .github/workflows/e2e-tests.yml in your repository:

name: E2E Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

      - name: Run Playwright tests
        run: npx playwright test

This workflow does four things: checks out your code, installs Node.js and your project dependencies, installs the browser binaries Playwright needs, and runs your test suite. The --with-deps flag is important because it also installs the system-level dependencies (like shared libraries) that Chromium, Firefox, and WebKit require on Ubuntu.

This is a solid starting point, but there is a lot we can improve. Let's build it up step by step.


Step 2: Caching Dependencies for Faster Builds

Installing node_modules and Playwright browsers from scratch on every run is slow. A typical Playwright browser installation can take 60-90 seconds. Caching cuts this down dramatically.

name: E2E Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Cache Playwright browsers
        id: playwright-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install Playwright browsers
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps

      - name: Install Playwright system deps
        if: steps.playwright-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps

      - name: Run Playwright tests
        run: npx playwright test

There are two caching layers here. The actions/setup-node action has built-in caching for the npm download cache (it does not cache node_modules itself, but it makes npm ci much faster). For Playwright browsers, we use actions/cache to store the browser binaries in ~/.cache/ms-playwright. The cache key is based on your lockfile, so browsers are re-downloaded only when your dependencies change.

Notice the conditional logic: if the browser cache exists, we skip the full browser install and only install system dependencies (which is much faster). This best practice can shave a minute or more off your GitHub Actions testing workflow.
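
One refinement worth considering: a key based on the whole lockfile invalidates the browser cache whenever any dependency changes, not just Playwright. A sketch of a tighter key derived from the installed Playwright version (this assumes @playwright/test is resolvable from the repo root):

```yaml
      - name: Get installed Playwright version
        id: playwright-version
        run: echo "version=$(node -p "require('@playwright/test/package.json').version")" >> "$GITHUB_OUTPUT"

      - name: Cache Playwright browsers
        id: playwright-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          # browsers only change when the Playwright version changes
          key: playwright-${{ runner.os }}-${{ steps.playwright-version.outputs.version }}
```

With this key, bumping an unrelated dependency no longer triggers a 60-90 second browser re-download.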


Step 3: Uploading Test Artifacts on Failure

When a test fails in CI, you need to know why. Playwright can generate screenshots, videos, and trace files on failure. The key is uploading these as GitHub Actions artifacts so you can download and inspect them.

First, configure Playwright to capture artifacts. In your playwright.config.ts:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  retries: process.env.CI ? 2 : 0,
  use: {
    baseURL: 'http://localhost:3000',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
    trace: 'retain-on-failure',
  },
  reporter: [
    ['html', { open: 'never' }],
    ['list'],
  ],
});

Then add an artifact upload step to your workflow:

      - name: Run Playwright tests
        run: npx playwright test

      - name: Upload test artifacts
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: |
            playwright-report/
            test-results/
          retention-days: 14

The if: ${{ !cancelled() }} condition is critical. Without it, the upload step only runs when all previous steps succeed, so you would never get artifacts from failed runs, which is exactly when you need them most. With the condition, artifacts are uploaded whether tests pass or fail, but skipped if the workflow is manually cancelled.

Setting retention-days: 14 keeps your artifact storage under control. Adjust this based on how quickly your team investigates failures.


Step 4: Running Tests in Parallel with Sharding

As your test suite grows, running everything sequentially gets slow. Playwright has built-in support for sharding, which splits your tests across multiple parallel jobs. This is one of the most impactful optimizations for a continuous testing pipeline.

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Cache Playwright browsers
        id: playwright-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install Playwright browsers
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps

      - name: Install Playwright system deps
        if: steps.playwright-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps

      - name: Run Playwright tests (shard ${{ matrix.shard }})
        run: npx playwright test --shard=${{ matrix.shard }}

      - name: Upload test artifacts
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report-${{ strategy.job-index }}
          path: |
            playwright-report/
            test-results/
          retention-days: 14

This creates four parallel jobs, each running a quarter of your test suite. A few important details:

  • fail-fast: false ensures all shards run to completion even if one fails. You want to see all failures, not just the first one.
  • Each shard uploads its own artifacts with a unique name (playwright-report-0, playwright-report-1, etc.) to avoid conflicts.
  • Playwright automatically distributes tests across shards. You do not need to manually assign tests to shards.

With four shards, a 12-minute test suite finishes in roughly three minutes (per-job setup overhead keeps the speedup a bit below linear). The tradeoff is consuming more GitHub Actions minutes, but for most teams, faster feedback is worth the cost.
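
One caveat with sharding: each shard produces its own HTML report, so you end up with four separate artifacts to download. Playwright's blob reporter plus the merge-reports command can combine them into a single report in a follow-up job. A sketch, assuming the shard job is named e2e-tests as above:

```yaml
      # in each shard job: emit a machine-readable blob report
      - name: Run Playwright tests (shard ${{ matrix.shard }})
        run: npx playwright test --shard=${{ matrix.shard }} --reporter=blob

      - name: Upload blob report
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: blob-report-${{ strategy.job-index }}
          path: blob-report/

  # separate job that waits for all shards, then merges their reports
  merge-reports:
    if: ${{ !cancelled() }}
    needs: [e2e-tests]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
      - name: Download all blob reports
        uses: actions/download-artifact@v4
        with:
          pattern: blob-report-*
          path: all-blob-reports
          merge-multiple: true
      - name: Merge into a single HTML report
        run: npx playwright merge-reports --reporter html ./all-blob-reports
      - name: Upload merged report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report-merged
          path: playwright-report/
```

The result is one playwright-report-merged artifact covering every shard, which is much easier to triage than four partial reports.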


Step 5: Controlling When Tests Run

You probably do not want to run your full E2E suite on every single commit to every branch. Here are practical patterns for controlling when your GitHub Actions test workflow triggers.

Run on PRs targeting main, and on pushes to main:

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

Run only when relevant files change:

on:
  pull_request:
    branches: [main]
    paths:
      - 'src/**'
      - 'tests/**'
      - 'playwright.config.ts'
      - 'package.json'
      - '.github/workflows/e2e-tests.yml'

Run the full suite on main, but only smoke tests on PRs:

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    steps:
      # ... setup steps ...

      - name: Run smoke tests (PR)
        if: github.event_name == 'pull_request'
        run: npx playwright test --grep @smoke

      - name: Run full test suite (main)
        if: github.event_name == 'push'
        run: npx playwright test

This uses Playwright's --grep flag with test tags. Tag your critical path tests with @smoke in their test titles:

test('user can log in and view dashboard @smoke', async ({ page }) => {
  // ...
});

This is a practical approach for larger test suites. Smoke tests on PRs give fast feedback (under two minutes), while the full suite on main ensures nothing slips through.
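
If you are on Playwright 1.42 or newer, tags can also be passed as a structured option instead of being embedded in the title; --grep @smoke matches both forms. A sketch:

```typescript
import { test, expect } from '@playwright/test';

// Equivalent to putting @smoke in the title, but keeps titles clean
test('user can log in and view dashboard', { tag: '@smoke' }, async ({ page }) => {
  await page.goto('/login');
  // ...
  await expect(page).toHaveURL(/\/dashboard/);
});
```

The tag option also shows up as a filter in the HTML report, which title-embedded tags do as well.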


Step 6: Multi-Browser Testing with a Matrix Strategy

Playwright supports Chromium, Firefox, and WebKit. Running your tests across all three catches browser-specific bugs. The matrix strategy makes this straightforward:

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        browser: [chromium, firefox, webkit]

    steps:
      # ... setup and install steps ...

      - name: Run Playwright tests (${{ matrix.browser }})
        run: npx playwright test --project=${{ matrix.browser }}

      - name: Upload test artifacts
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report-${{ matrix.browser }}
          path: |
            playwright-report/
            test-results/
          retention-days: 14

For this to work, your playwright.config.ts needs matching project definitions:

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },
  ],
});

You can combine the browser matrix with sharding for maximum parallelism, though this multiplies your CI usage (3 browsers times 4 shards equals 12 parallel jobs).
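
If you do combine them, the matrix simply gets both axes and GitHub Actions runs the cross product. A sketch of the combined strategy:

```yaml
    strategy:
      fail-fast: false
      matrix:
        browser: [chromium, firefox, webkit]
        shard: [1/4, 2/4, 3/4, 4/4]

    steps:
      # ... setup and install steps ...

      - name: Run tests (${{ matrix.browser }}, shard ${{ matrix.shard }})
        run: npx playwright test --project=${{ matrix.browser }} --shard=${{ matrix.shard }}
```

Artifact names then need both values to stay unique, for example playwright-report-${{ matrix.browser }}-${{ strategy.job-index }}.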


Step 7: Handling Environment Variables and Secrets

Real-world E2E tests often need to authenticate against your application. Maybe you are testing a login flow, or your tests run against a staging environment that requires API keys. GitHub Actions secrets make this secure.

First, add your secrets in your GitHub repository settings under Settings, then Secrets and variables, then Actions (the same page has a Variables tab for non-sensitive values like staging URLs). Then reference them in your workflow:

      - name: Run Playwright tests
        run: npx playwright test
        env:
          BASE_URL: ${{ vars.STAGING_URL }}
          TEST_USERNAME: ${{ secrets.TEST_USERNAME }}
          TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}
          API_KEY: ${{ secrets.STAGING_API_KEY }}

In your Playwright config, use these environment variables:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
  },
});

And in your tests:

test('authenticated user can access dashboard', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.TEST_USERNAME!);
  await page.getByLabel('Password').fill(process.env.TEST_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();

  await expect(page).toHaveURL(/\/dashboard/);
  await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
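
One gotcha with the non-null assertions (!) above: if a secret is missing, the test proceeds with undefined and fails later with a confusing selector error. A small hypothetical helper (requireEnv is not part of Playwright, just a sketch) that fails fast with a clear message instead:

```typescript
// Hypothetical helper, not part of Playwright: read a required
// environment variable and fail loudly if it was never set, e.g.
// when a secret is missing from the workflow's env block.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// usage in a test file:
// await page.getByLabel('Email').fill(requireEnv('TEST_USERNAME'));
```

A missing TEST_PASSWORD then fails the run immediately with a message that names the variable, rather than a timeout on the login form.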

A few best practices for secrets in your CI/CD testing pipeline:

  • Never hardcode credentials in test files or config. Always use environment variables.
  • Use separate test accounts with limited permissions, not real user accounts.
  • Rotate credentials regularly and use GitHub's secret scanning to catch leaks.
  • Use vars for non-sensitive values (like staging URLs) and secrets for sensitive ones (like passwords). Variables are visible in logs while secrets are masked.


Step 8: Starting Your Application in CI

If your tests run against a local development server rather than a deployed staging environment, you need to start your app as part of the workflow. Playwright has a built-in webServer option for this:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: {
    command: 'npm run start',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
    timeout: 120000,
  },
  use: {
    baseURL: 'http://localhost:3000',
  },
});

The webServer config tells Playwright to start your app before running tests and wait until the URL responds. Setting reuseExistingServer: !process.env.CI means it will start a fresh server in CI but reuse your running dev server locally.

For applications that need a build step, chain the commands:

webServer: {
  command: 'npm run build && npm run start',
  url: 'http://localhost:3000',
  reuseExistingServer: !process.env.CI,
  timeout: 180000,
},


The Complete Production-Ready Workflow

Here is the full workflow combining everything we have covered, in a practical, battle-tested configuration you can adapt for your project:

name: E2E Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

concurrency:
  group: e2e-${{ github.ref }}
  cancel-in-progress: true

jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    strategy:
      fail-fast: false
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Cache Playwright browsers
        id: playwright-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install Playwright browsers
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps

      - name: Install Playwright system deps
        if: steps.playwright-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps

      - name: Run Playwright tests (shard ${{ matrix.shard }})
        run: npx playwright test --shard=${{ matrix.shard }}
        env:
          BASE_URL: ${{ vars.STAGING_URL }}
          TEST_USERNAME: ${{ secrets.TEST_USERNAME }}
          TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}

      - name: Upload test artifacts
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report-${{ strategy.job-index }}
          path: |
            playwright-report/
            test-results/
          retention-days: 14

Notice the concurrency block at the top. This cancels in-progress test runs when a new commit is pushed to the same branch. Without this, pushing a quick fix after a failing test run would queue up two full runs, wasting CI minutes.


Automating Test Creation with AI

Everything we have covered so far assumes you already have Playwright tests to run. But here is the uncomfortable truth: writing comprehensive E2E tests is time-consuming, and most teams under-invest in test coverage because of it. You set up a beautiful CI pipeline but only have a handful of tests in it.

This is where AI-powered test generation changes the equation. Instead of manually writing every test, you can use AI tools to generate Playwright tests automatically, then run them in the exact GitHub Actions pipeline we just built.

Plaintest takes this approach to its logical conclusion. You point it at your application URL, and it autonomously explores every page, form, button, and user flow. It then generates real, exportable Playwright test code based on what it discovers. These are not abstract test descriptions or pseudo-code. They are actual Playwright tests with proper selectors, assertions, and error handling that you can drop directly into your tests/ directory and run in the exact CI pipeline we just built.

The workflow looks like this:

  1. Connect your project to Plaintest and let the AI explore your application
  2. Review the generated tests and export them to your repository
  3. Your existing GitHub Actions workflow picks them up and runs them on every PR

This solves the cold-start problem that kills most testing initiatives. Instead of spending days writing your initial test suite, you get a comprehensive baseline in minutes. You can then refine and extend those tests as your application evolves.

Plaintest also detects issues during exploration, including JavaScript exceptions, network failures, accessibility violations, and visual regressions, giving you a broader quality signal beyond just pass/fail test results. For teams practicing continuous testing, this combination of automated test generation and CI execution creates a feedback loop where new code is always validated against a comprehensive, AI-maintained test suite.


Troubleshooting Common Issues

Even with a solid configuration, you will hit issues. Here are the most common problems and their solutions.

Tests pass locally but fail in CI. This is almost always caused by timing. CI runners are slower than your local machine, so elements take longer to appear. Increase timeouts in your Playwright config and use expect with auto-waiting instead of explicit waits. Also check if your application behaves differently based on screen size, since CI runs headless browsers at a default viewport.

Browser installation fails. This usually means system dependencies are missing. Make sure you are using --with-deps when installing browsers. If you are using a custom Docker image, you may need to install dependencies manually. Check the Playwright Docker docs for the required packages.

Flaky tests. Some tests pass sometimes and fail other times. The retries: 2 option in your Playwright config helps, but the real fix is making tests more resilient. Use role-based selectors (getByRole, getByLabel) instead of CSS selectors. Wait for specific conditions instead of using fixed delays. Isolate test data so tests do not interfere with each other.
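
To make the "resilient selectors, no fixed delays" advice concrete, here is the same step sketched both ways (the page structure and selectors are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('saving settings shows a confirmation', async ({ page }) => {
  await page.goto('/settings');

  // Brittle: a fixed delay guesses how long the save takes,
  // so it flakes on slow CI runners.
  // await page.click('.btn-primary');
  // await page.waitForTimeout(2000);

  // Resilient: role-based locator plus a web-first assertion that
  // retries automatically until the confirmation appears (or times out).
  await page.getByRole('button', { name: 'Save' }).click();
  await expect(page.getByRole('status')).toContainText(/saved/i);
});
```

The web-first assertion absorbs timing variance instead of hard-coding it, which is why it survives slow runners where the waitForTimeout version does not.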

Out of disk space. Video recordings and traces consume significant disk space. If you are running many tests with video enabled, you may hit the runner's storage limit. Use retain-on-failure instead of on to only save recordings from failed tests.

Secrets not available in fork PRs. For security reasons, GitHub does not expose secrets to workflows triggered by pull requests from forks. If you accept external contributions, you will need a separate workflow that runs without authentication or uses a manual approval step.
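
One common pattern is to skip the secret-dependent job entirely for fork PRs, so the run does not fail with empty credentials. A sketch of the job-level guard:

```yaml
jobs:
  e2e-tests:
    # run for pushes and same-repo PRs; skip fork PRs, which get no secrets
    if: >
      github.event_name == 'push' ||
      github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
```

You can pair this with a second, unauthenticated smoke-test job that does run for forks, so external contributors still get some signal.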


Summary

Setting up automated E2E testing in GitHub Actions is one of the highest-leverage investments you can make in your development process. Here is what we covered:

  • Basic workflow with Playwright browser installation and test execution
  • Dependency caching for node_modules and Playwright browsers to speed up runs
  • Artifact uploads so you can debug failures with screenshots, videos, and traces
  • Parallel sharding to cut test execution time proportionally
  • Trigger controls to run smoke tests on PRs and full suites on main
  • Multi-browser testing with matrix strategies for Chromium, Firefox, and WebKit
  • Secret management for authenticated test scenarios
  • Application startup in CI with Playwright's webServer config

The patterns in this guide are battle-tested and scale from small side projects to large production applications. Start with the basic workflow, add caching and artifacts, then layer on sharding and multi-browser testing as your suite grows.

And if writing the tests themselves is the bottleneck, consider using an AI testing tool like Plaintest to generate your initial test suite. The best CI pipeline in the world is useless without tests to run in it.

Now go set up that workflow and stop deploying on hope.