TestDriver.ai

Overview

TestDriver is a next-generation autonomous AI agent for end-to-end testing of web and desktop applications.

TestDriver isn't like any test framework you've used before - it's more like your own QA employee with their own development environment.

  1. Tell TestDriver what to do in natural language

  2. TestDriver looks at the screen and uses mouse and keyboard emulation to accomplish the goal

TestDriver is "selectorless" testing. It doesn't use selectors or static analysis.

Advantages

TestDriver uses AI vision and hardware emulation to simulate a real user on their own computer. This has three main advantages:

  • Easier set up: No need to add test IDs or craft complex selectors

  • Less Maintenance: Tests don't break when code changes

  • More Power: TestDriver can test any application and control any OS setting

Just tell TestDriver what to do

Use our CLI to tell TestDriver what to do, like so:

> open google chrome
> navigate to airbnb.com
> search for destinations in austin tx
> click check in
> select august 8

Possibilities

As you can imagine, a specialized QA agent with its own computer is extremely powerful. TestDriver can:

  • Test any user flow on any website in any browser

  • Clone, build, and test any desktop app

  • Render multiple browser windows and popups like 3rd party auth

  • Test <canvas>, <iframe>, and <video> tags with ease

  • Use file selectors to upload files to the browser

  • Resize the browser

  • Test chrome extensions

  • Test integrations between applications

The problem with the current approach to end-to-end testing

End-to-end testing is commonly described as the most expensive and time-consuming test method. Right now we write end-to-end tests using complex selectors that are tightly coupled with the code.

You've probably seen selectors like this:

const e = await page.$('div[class="product-card"] >> text="Add to Cart" >> nth=2');

This tight coupling means developers need to spend time to understand the codebase and maintain the tests every time the code changes. And code is always changing!

End-to-end is about users, not code

In end-to-end testing the business priority is usability. All that really matters is that the user can accomplish the goal.

TestDriver uses human language to define test requirements. Then our simulated software tester figures out how to accomplish those goals.

| Old and Busted (Selectors) | New Hotness (TestDriver) |
|---|---|
| div[class="product-card"] >> text="Add to Cart" >> nth=2 | buy the 2nd product |

These high-level instructions are easier to create and maintain because they are loosely coupled to the codebase. We're describing a high-level goal, not a low-level interaction.

The tests will continue to work even when the junior developer changes .product-card to .product.card or the designers change Add to Cart to Buy Now. The concepts remain the same, so the AI will adapt.

How exactly does this work?

TestDriver's AI is a fine-tuned model developed over the course of more than a year with the help of computer vision experts, custom research tooling, and a few million dollars in funding (thanks VCs).

TestDriver uses a combination of reinforcement learning and computer vision. The context from successful test executions informs future executions. Here's an example of the context our model considers when locating a text match:

| Context | What is it? | Touchpoint |
|---|---|---|
| Prompt | Desired outcome | User Input |
| Screenshot | Image of computer desktop | Runtime |
| OCR | All possible text found on screen | Runtime |
| Text Similarity | Closely matching text | Runtime |
| Redraw | Visual difference between previous and current desktop screenshots | Runtime |
| Network | The current network activity compared to a baseline | Runtime |
| Execution History | Previous test steps | Runtime |
| System Information | Platform, Display Size, etc. | Runtime |
| Mouse Position | X, Y coordinates of mouse | Runtime |
| Description | An elaborate description of the target element including its position and function | Past Execution |
| Text | The exact text value clicked | Past Execution |
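To make the OCR and Text Similarity rows more concrete, here is a minimal Python sketch of ranking on-screen strings against a target label. It is an illustration only and not TestDriver's implementation: the helper name `closest_text`, the cutoff value, and the sample strings are all hypothetical, and it relies on the standard library's `difflib` rather than whatever matching the model actually performs.

```python
from difflib import SequenceMatcher


def closest_text(target, ocr_strings, cutoff=0.6):
    """Return (best_match, score) for the OCR string most similar to target.

    Hypothetical helper: compares case-insensitively and returns (None, 0.0)
    when no candidate scores at or above the cutoff.
    """
    best, best_score = None, 0.0
    for candidate in ocr_strings:
        score = SequenceMatcher(None, target.lower(), candidate.lower()).ratio()
        if score >= cutoff and score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Sample strings standing in for the text that OCR found on screen.
on_screen = ["Home", "Products", "Add to Basket", "Checkout"]
match, score = closest_text("Add to Cart", on_screen)
print(match, round(score, 2))
```

String similarity alone only covers small wording changes; a rename such as Add to Cart to Buy Now would need the other context in the table (description, screenshot, history) to resolve.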


Quickstart

Instantly Generate and Deploy UI Tests

With TestDriver’s AI-powered QA Agents, creating and deploying robust UI tests has never been faster or easier. Here's how to get up and running in no time.

Generate your first test suite

The fastest way to create your first set of tests is by running the demo. TestDriver will automatically log into your application, generate a dozen UI tests, and push them directly to your GitHub repository.

Follow the guide here to generate your first test suite.

Write your own tests with the Computer-Use agent

Getting started with TestDriver is simple. Just download the TestDriver Agent to enable test generation directly from your environment.

Need Help? Watch the Setup Video

Follow along with our step-by-step YouTube setup guide to see how easy it is to install the agent, connect your app, and start generating tests—no guesswork needed.

Set Up Your First TestDriver Project

To get started, install the testdriverai package globally using Yarn. This package enables you to generate tests effortlessly using natural language commands.

yarn global add testdriverai@beta

Next, create a new directory for your test project and initialize it with testdriverai init.

mkdir testdriver-first-project
cd testdriver-first-project
testdriverai init

This will launch the interactive setup process:

Beginning setup...
✔ Use TestDriver Runners? (Recommended) … yes
? API KEY (from https://app.testdriver.ai/team) ›

TestDriver Runners are the fastest way to get started, but you'll need a TestDriver Pro plan. See Getting an API Key.

Every TestDriver Pro plan comes with $100 of free credits

You can use your own computer as a runner for free. See Local Agent Setup

How To Get an API Key

Paste your API key into the prompt when asked, then follow the remaining steps in the setup to complete your project configuration.

Build tests with the Computer-Use Agent

Now comes the fun part—building your tests! To get started, trigger the Computer-Use Agent by running:

testdriverai

This launches TestDriver in interactive mode. The agent will begin listening for commands and automatically generate test steps based on your input. All generated commands will be saved to testdriver/testdriver.yaml

Give Instructions in Plain English

Now, just tell TestDriver what you want it to do. For our example, try click new task.

> click new task

What Happens Next

TestDriver will analyze your screen, interpret your instruction, and generate a full test script for you. It can see your UI, control the mouse and keyboard, and automate complex workflows—all through simple commands.

See the YAML TestDriver generated? That's our own schema. You can learn more about it in Test Steps.

> click new task

    thinking...

    To accomplish the goal of clicking "New Task" on the
    webpage, we need to focus on the browser window and then
    click the "New Task" button.

    Here is the YAML code to achieve this:

    commands:
      - command: focus-application
        name: Google Chrome
      - command: hover-text
        text: New Task
        description: button to create a new task
        action: click

command='focus-application' name='Google Chrome'
command='hover-text' text='New Task' description='button to create a new task' action='click'

    thinking...

    Based on the provided coordinates and text, the element
    that matches "New Task" with the description "button to
    create a new task" is at coordinates (864, 328).

    Here is the YAML code to click on that element:

    commands:
      - command: click
        x: 864
        y: 328
        click: single
        button: left

command='click' x=864 y=328 click='single' button='left'

Keep going!

You can continue instructing TestDriver with natural commands. It will visually inspect your app and generate the next steps for you. Try:

> enter a task title
> enter a task summary
> click create task
> delete the task

Test It Locally Before Deployment

Before pushing to GitHub, it’s a good idea to validate your test locally. After you’ve saved the test using /save, run:

testdriverai run testdriver/testdriver.yaml

Deploy tests to GitHub

Subscribe to TestDriver Pro in the TestDriver Dashboard to access your API key. Every Pro plan comes with $100 of free credits to help you hit the ground running.

If something didn't work, you can use /undo to remove all of the test steps added since the last prompt.

Ready to ship your tests? Check out our CI/CD integration guides to learn how to deploy tests to GitHub and automate them in your pipeline.

Generate a Test Suite

TestDriver Test Generation will explore your app and generate tests.

Test Generation works best with publicly accessible web apps.

The fastest way to get started with TestDriver

TestDriver will crawl your app and generate tests. It interacts with each element and continues to come up with new ideas.

TestDriver can generate thousands of tests automatically. Tests are opened as Pull Requests. All you need to do is review the created tests and merge them into your test suite!

Demo

The easiest way to generate tests for your app is to use the form below. This will generate up to 10 tests for you in a new GitHub repository.

Reviewing Pull Requests

TestDriver will generate tests for your app. Each pull request description will contain:

  1. The Interactive Commands used to generate the test (description)

  2. The test result

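The pull request will also contain the generated Test Steps. Here is an example of a generated test: test-contact-us-link by github-actions[bot], Pull Request #6 in testdriverbot/pubnubcom-1729121039116 on GitHub.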

Debugging Pull Requests

Click on the Dashcam preview to see a video of the AI completing the test as well as the AI logs, console logs, and network requests. For more on reviewing test results, see Debugging Test Runs.

Making Changes

TestDriver may try multiple times to accomplish a goal.

Not every test will complete the objective entirely, but the steps that do complete successfully make great regression tests and it usually only takes a little modification to make a successful test!

We recommend:

  1. Deleting the test steps that did not complete successfully.

  2. Modifying the test steps if required. See Test Steps for details on each TestDriver command

You can clone this repository and run the tests locally using the TestDriver CLI.

Deploying Your Test (Requires Upgrade)

This repository is set up to automatically run any merged regression tests! You'll need an API key to do so, and you can only get one by upgrading your account to Pro or higher.

Where to go from here?

  • Read the GitHub Actions guide for more information on integrating with GitHub.

  • Scan the Action Output to learn how to send emails and Slack notifications on test failures

  • Check out the Interactive Commands for details on the high level commands TestDriver uses to generate tests or learn about regression tests in Test Steps

  • Follow the Local Agent Setup guide to create and debug tests on your local machine

Getting an API Key

How to get a TestDriver API key.

Every TestDriver Pro plan comes with $100 in usage credits!

Deploying your TestDriver tests to CI/CD requires an API key!

  1. Upgrade your account to TestDriver Pro (Every upgrade comes with $100 in free usage credits!)

  2. Copy the API key from your developer portal

GitHub Actions

Run tests on your local computer, sandbox, or on hosted runners.

Running in CI/CD

CI/CD pipelines enhance software quality and release velocity. By adding TestDriver to your CI/CD pipeline, you can catch defects early, prevent regressions, and gain rapid feedback on code changes.

The TestDriver GitHub action will execute tests when you mention it in a comment, on a schedule, or anywhere else in your CI/CD workflow.

Continuous Integration

TestDriver can be integrated with GitHub Actions to automatically run tests whenever code is pushed to a repository. This ensures that new changes do not break existing functionality.

Automated Code Reviews

TestDriver can be configured to run specific test suites when a pull request is created. This ensures that any code changes meet the project's quality standards before being merged.

Increased Test Coverage

TestDriver can automatically generate and run test cases across multiple environments, providing detailed feedback on test coverage and potential issues.

Improved Developer Experience

Developers get immediate feedback on the success or failure of their changes within the pull request itself, reducing the time spent on manual code reviews.

You also get a video recording to review (with logs and network requests) on Dashcam.

Once you've acquired an API key, create a new GitHub secret named TESTDRIVER_API_KEY and use the key as the value. See Getting an API Key.

Head over to https://app.testdriver.ai to get an API key.

Our GitHub Action allows you to run the tests you designed on your local computer in a virtual machine as part of your CI/CD workflow. See GitHub Action Setup.

Comparison

TestDriver vs Playwright vs Selenium

Application Support

TestDriver operates a full desktop environment, so it can run any application.

Application
TestDriver.ai
Playwright
Selenium

Web Apps

Desktop Apps

Chrome Extensions

Testing Features

TestDriver is AI first.

Feature
TestDriver.ai
Playwright
Selenium

Test Generation

Adaptive Testing

Visual Assertions

Self Healing

Application Switching

GitHub Actions

Team Dashboard

Team Collaboration

Test Coverage

TestDriver has more coverage than selector-based frameworks.

Feature
TestDriver.ai
Playwright
Selenium

Browser Viewport

Browser App

Operating System

PDFs

File System

Push Notifications

Image Content

Video Content

<iframe>

<canvas>

<video>

Debugging Features

Feature
TestDriver.ai
Playwright
Selenium

AI Summary

Video Replay

Browser Logs

Desktop Logs

Network Requests

Team Dashboard

Team Collaboration

Web Browser Support

TestDriver is browser agnostic and supports any version of any browser.

Feature
TestDriver.ai
Playwright
Selenium

Chrome

Firefox

Webkit

IE

Edge

Opera

Safari

Operating System Support

TestDriver currently supports Mac and Windows!

Feature
TestDriver.ai
Playwright
Selenium

Windows

Mac

Linux

GitHub Action Setup

Installing the GitHub action

Get your API Key

You'll need a paid account to use the TestDriver GitHub action

To execute your TestDriver actions on our VMs, you'll need to add your API key as a GitHub secret. If you don't see an API key, you'll need to upgrade your account.

Create your workflow

Now it's time to create your first TestDriver workflow.

In .github/workflows/testdriver.yml add the following code.

If you used testdriverai init to create your TestDriver project, these files will already be in your repository.

Notice that on line 23 we're invoking the agent. In this case we're using the /run command to execute our file from our local directory.

You can also use commands like /explore, use variables, or supply your prompts dynamically from earlier steps. A common workflow is to wait for staging to deploy before executing the test.

Output

How it works

  1. The GitHub action is triggered by the conditions supplied under on

  2. The key value is used to authenticate you

  3. An ephemeral virtual machine is spawned on our infrastructure

  4. The code from the current branch is cloned onto the VM

  5. prompt is parsed as a markdown list

  6. Each list item from prompt is fed into TestDriver one by one

  7. TestDriver summarizes the test and sets its exit code depending on whether the test passed or failed

  8. The VM is destroyed and all data is wiped
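Steps 5 and 6 can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the action's actual parser: the `parse_prompt` helper and its regular expression are hypothetical and cover just simple numbered or bulleted list items.

```python
import re

# Matches "1. text", "1) text", "- text", "* text", or "+ text".
# This is an assumed subset of markdown list syntax, chosen for illustration.
LIST_ITEM = re.compile(r"^\s*(?:\d+[.)]|[-*+])\s+(.*\S)\s*$")


def parse_prompt(prompt):
    """Split a markdown-list prompt into individual instructions, in order."""
    steps = []
    for line in prompt.splitlines():
        found = LIST_ITEM.match(line)
        if found:
            steps.append(found.group(1))
    return steps


prompt = """
1. open youtube
2. find a cat video
3. quit the browser
4. /summarize
"""

steps = parse_prompt(prompt)
for step in steps:
    # A real runner would hand each step to the agent here; this sketch only prints it.
    print(step)
```

Because each item is executed in order, a prompt can mix plain-language instructions with slash commands such as /run or /summarize.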

Deploy the test

Save the file, make a new branch, push to your repository, and create a new pull request.

This will trigger a new TestDriver execution.

Debugging features are powered by Dashcam.io.

TestDriver Cloud Testing is performed via our GitHub action. You can learn more by visiting the marketplace page.

Log in to the team page and copy your API key from there.

Paste the API key as a new GitHub secret named TESTDRIVER_API_KEY.

See Action Output for output definitions and code samples.

Dashcam begins recording

If supplied, the prerun shell script runs

Dashcam ends recording
name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  pull_request: # run on every PR event
  schedule:
    - cron: '0 * * * *' # run every hour
  push:
    branches:
      - main # run on merge to the main branch
  workflow_dispatch:

jobs:
  test:
    name: "TestDriver"
    runs-on: ubuntu-latest
    steps:
      - uses: dashcamio/testdriver@main
        with:
          version: "v5.0.7"
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          os: linux
          prompt: |
            1. /run testdriver/testdriver.yaml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
git checkout -b testdriver
git commit -am 'add testdriver github action'
git push origin testdriver
gh pr create --web

Prerun Scripts

How to provision the VM and build your app before running TestDriver

Prerun scripts are shell commands (PowerShell on Windows, Bash on Mac) executed on a TestDriver VM before each test within a CI/CD pipeline. Their primary purpose is to establish the state of the machine, which ensures it is consistent before every test execution.

You can configure a prerun script to install necessary dependencies, build your application, set specific configurations, and more.

This step speeds up environment setup, prepares the machine for the test suite, and prevents test failures caused by environment inconsistencies, which keeps builds reproducible.

Example

This is an example of how to download Arc Browser and use it instead of Chrome.

# permissions and other setup here

jobs:
  test:
    name: "TestDriver"
    runs-on: ubuntu-latest
    steps:    
    # Download an exe for this test
    - uses: testdriverai/action@main
      with:
        prerun: |
          Get-NetIPAddress -AddressFamily IPv6
          # URL for the Arc browser installer
          $installerUrl = "https://releases.arc.net/windows/ArcInstaller.exe"
          # Location to save the installer
          $installerPath = "$env:USERPROFILE\Downloads\ArcInstaller.exe"
          # Download the Arc browser installer
          Write-Host "Downloading Arc browser installer..."
          Invoke-WebRequest -Uri $installerUrl -OutFile $installerPath
          # Check if the download was successful
          if (Test-Path $installerPath) {
              Write-Host "Download successful. Running the installer..."
              Start-Process -FilePath $installerPath -ArgumentList '/silent' -Wait
              Start-Sleep -Seconds 10
          } else {
              Write-Host "Failed to download the Arc browser installer."
          }

FAQ

🔧 Product Capabilities

  • What is TestDriver? TestDriver is an AI-powered testing platform that simulates user interactions to automate end-to-end testing for web, desktop, and mobile applications.

  • How does TestDriver work? It interprets high-level prompts, interacts with interfaces like a user would, and verifies expected outcomes using assertions and visual validation.

  • What platforms does TestDriver support? TestDriver supports Windows, Mac, Linux desktop apps, web browsers, and mobile interfaces (via emulator or device farm).

  • Can it be used for exploratory testing? Yes. TestDriver can autonomously navigate the application to discover potential issues or generate new test cases.

  • Can it test desktop applications? Yes. It supports testing native desktop applications by simulating mouse and keyboard input and identifying UI elements.

  • Can it test mobile apps? Yes, via mobile emulators or integration with device farms.


🤖 Test Creation and Generation

  • Can TestDriver generate tests automatically? Yes, it explores the app and creates test cases based on UI flows and user interactions.

  • Can I create tests from natural language prompts? Yes. You can write high-level instructions in plain language, and TestDriver will interpret and build tests from them.

  • Can it generate tests from user stories or documentation? Yes. It can use minimal descriptions to produce complete test cases.

  • Can it turn recorded user sessions into tests? Yes, in supported environments, TestDriver can generate test steps from interaction logs or screen recordings.


🛠️ Test Maintenance and Resilience

  • What happens when the UI changes? TestDriver adapts using AI—if a button or label changes, it can often infer the correct action without breaking.

  • Do I need to rewrite tests often? No. TestDriver reduces maintenance by handling common UI changes automatically.

  • How does it handle flaky tests? It retries failed actions, assigns confidence scores, and logs inconsistencies so you can investigate root causes.

  • How are tests updated over time? You can regenerate them using updated prompts or manually edit the test specs.


🚨 Failures, Debugging, and Feedback

  • How does TestDriver report test failures? It provides detailed logs, screenshots, console output, and visual diffs.

  • What happens when a test fails? It stops execution, flags the failing step, and provides context for debugging.

  • Can I view why a test failed? Yes. You can view step-by-step logs, network traffic, DOM state, and video playback of the test run.

  • Can it automatically retry failed actions? Yes. You can configure retry behavior for individual steps or full tests.


🚀 Performance and Parallelism

  • Can I run tests in parallel? Yes. TestDriver supports parallel execution using multiple VMs or containers.

  • Can I track performance metrics during testing? Yes. It can log CPU, memory, load times, and frame rates to help catch performance regressions.


🔍 Advanced Testing Features

  • Can it validate non-deterministic output? Yes. It uses AI assertions to verify outcomes even when outputs vary (e.g., generated text or dynamic UIs).

  • Can it test workflows with variable inputs? Yes. It supports data-driven tests using parameterized inputs.

  • Can it test file uploads and downloads? Yes. TestDriver can interact with file pickers and validate uploaded/downloaded content.

  • Can it generate tests for PDFs or document output? Yes. It can open and verify generated files for expected text or formatting.

  • Can I trigger tests based on pull requests or merges? Yes. You can integrate TestDriver with your CI to trigger runs via GitHub Actions or other CI/CD tools.


🧩 Integration and Setup

  • Does it integrate with CI/CD tools? Yes. TestDriver integrates with pipelines like GitHub Actions, GitLab CI, and CircleCI.

  • Can I integrate TestDriver with Jira, Slack, etc.? Yes. You can receive alerts and sync test results with third-party tools via API/webhooks.

  • Does it support cloud and local environments? Yes. You can run tests locally or in the cloud using ephemeral VMs for clean state testing.

  • Does it work with existing test frameworks? It can complement or convert some existing test cases into its format, though full conversion depends on compatibility.


📊 Test Coverage and Effectiveness

  • How does TestDriver measure test coverage? It tracks UI paths, element interaction frequency, and application state changes to infer coverage.

  • Can it suggest missing test scenarios? Yes. Based on interaction patterns and user behavior, it can propose additional test cases.

  • Can it analyze test stability over time? Yes. You can view trends in pass/fail rates and test execution consistency.


🔒 Security and Compliance

  • Is it safe to test sensitive data? Yes. TestDriver supports variable obfuscation, secure containers, and test data isolation.

  • Can I avoid using production data in tests? Yes. You can configure mock data, sanitize logs, and use test-specific environments.


🧠 AI Behavior and Prompting

  • How does the AI understand what to test? It uses language models to interpret your goals, element names, and interface cues to perform tasks.

  • Can I adjust how the AI interprets my prompt? Yes. You can rewrite prompts, add constraints, or review and tweak auto-generated steps.

  • Can I see what the AI is doing behind the scenes? Yes. You can inspect the resolved steps, see element matches, and adjust test flows before execution.


📦 Use Cases and Scenarios

  • Can I use TestDriver to test new features? Yes. It’s great for validating changes, ensuring no regressions, and verifying rollout configurations.

  • Can it test seasonal or time-based behaviors? Yes. You can schedule tests or run them under specific date/time settings to verify time-sensitive logic.


Environment Config

How to change the operating system platform.

To configure the operating system, supply the os param to the GitHub action. The currently available operating systems are windows and mac.

Mac execution is only available to Enterprise customers.

| Key | Version | Instance Type | Architecture |
|---|---|---|---|
| windows | Windows Server 2022 Base 10 | t2.large | 64-bit (x86) |
| mac | macOS Sonoma | mac1.metal | x86_64_mac |

Prerun files

Note that prerun files are executed on these machines themselves. You must use PowerShell on Windows and Bash on Mac.

Testing multiple browsers and operating systems

name: Test action

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review, labeled, unlabeled]

jobs:
  test-action:
    name: Test action
    runs-on: ubuntu-latest
    strategy:
      matrix:
        os: [windows, mac]
        browser: [chrome, firefox]
    steps:
      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '16'

      - name: Install Dashcam Chrome
        run: |
          npm init -y
          npm install dashcam-chrome

      - uses: replayableio/testdriver-action@main
        with:
          prompt: |
            1. open youtube
            2. find a cat video
            3. quit the browser
            4. /summarize
          os: ${{ matrix.os }}
          prerun: |
            if [ "${{ matrix.browser }}" == "chrome" ]; then
              if [ "${{ matrix.os }}" == "windows" ]; then
                # Install Google Chrome on Windows
                $ProgressPreference = 'SilentlyContinue'
                Invoke-WebRequest -Uri "https://dl.google.com/chrome/install/latest/chrome_installer.exe" -OutFile "$env:TEMP\chrome_installer.exe"
                Start-Process -FilePath "$env:TEMP\chrome_installer.exe" -ArgumentList "/silent", "/install" -Wait
              else
                # Install Google Chrome on macOS
                brew install --cask google-chrome
              fi
            else
              if [ "${{ matrix.os }}" == "windows" ]; then
                # Install Firefox on Windows
                $ProgressPreference = 'SilentlyContinue'
                Invoke-WebRequest -Uri "https://download.mozilla.org/?product=firefox-latest&os=win64&lang=en-US" -OutFile "$env:TEMP\firefox_installer.exe"
                Start-Process -FilePath "$env:TEMP\firefox_installer.exe" -ArgumentList "/S" -Wait
              else
                # Install Firefox on macOS
                brew install --cask firefox
              fi
            fi
            if [ "${{ matrix.browser }}" == "chrome" ]; then
              if [ "${{ matrix.os }}" == "windows" ]; then
                Start-Process "C:/Program Files/Google/Chrome/Application/chrome.exe" -ArgumentList "--start-maximized","--load-extension=$(pwd)/node_modules/dashcam-chrome/build" &
              else
                /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --start-maximized --load-extension=$(pwd)/node_modules/dashcam-chrome/build &
              fi
            else
              if [ "${{ matrix.os }}" == "windows" ]; then
                Start-Process "C:/Program Files/Mozilla Firefox/firefox.exe" -ArgumentList "--start-maximized" &
              else
                /Applications/Firefox.app/Contents/MacOS/firefox --start-maximized &
              fi
            fi
          key: ${{ secrets.TESTDRIVER_API_KEY }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
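The strategy.matrix block in the workflow above expands into one job per combination of os and browser, four in total. The Python sketch below only illustrates that expansion; GitHub Actions performs it for you, and the `expand` helper is a hypothetical name used for this example.

```python
from itertools import product

# Same values as the matrix in the workflow above.
matrix = {"os": ["windows", "mac"], "browser": ["chrome", "firefox"]}


def expand(matrix):
    """Return one dict per combination of matrix values (the Cartesian product)."""
    keys = list(matrix)
    return [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]


jobs = expand(matrix)
for job in jobs:
    print(job)
```

Each resulting job sees its own values through the matrix context, which is how the workflow picks the operating system and the browser to install.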

Local Agent Setup

TestDriver's local agent runs tests on your computer.

You can use your own computer as a TestDriver runner. TestDriver will 'alt-tab' between the terminal and your computer.

Just tell TestDriver what to test on your local machine and TestDriver will look at the screen, generate some commands to control the computer, and execute them.

Extra Setup

Running tests on your own machine requires a bit of extra setup. You'll need to install some dependencies and give the terminal permission to see your screen and access your file system.

NodeJS

You'll need NodeJS version 20 to get started. Windows users will need a couple extra tools.

Install TestDriver via NPM

npm install testdriverai -g

Set up the project

Run testdriverai init in a new folder.

testdriverai init

This will walk you through setting up a local project.

Set up your test environment

Before we get started, let's set up your machine to collaborate with TestDriver.

Display

TestDriver isn't like any framework you've used before. TestDriver makes decisions based on what it can see on your display!

TestDriver only knows about what it can see on your primary display!

For now, set up your environment with a browser window and your terminal side by side like so:

When you enter commands into TestDriver, the current terminal window will minimize and the focus-application command will bring Chrome or other applications to the foreground.

Application State

The application you want to test should be visible before you run the testdriverai command.

For our example, make a new incognito window in Chrome and load our test webpage:

https://testdriverai.github.io/example-react-todo/

Make sure to reset the test state!

Storing Secrets

How to securely store and use usernames, passwords, and other secrets within TestDriver

You'll likely want TestDriver to log in to your app as a test user, but you wouldn't want to expose that password to the world.

Open your test file (.yml) in a code editor and replace your secrets with ${TD_YOUR_SECRET}.

TestDriver will only parse and mask secrets that begin with TD_.
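As a rough illustration of the TD_ rule, the Python sketch below substitutes ${TD_*} placeholders from an environment mapping and masks their values before logging. It is an assumption-laden sketch, not TestDriver's code: `resolve_secrets`, `mask_secrets`, the `****` mask, and the sample values are all hypothetical.

```python
import re

# Only placeholders whose name starts with TD_ are treated as secrets.
SECRET = re.compile(r"\$\{(TD_[A-Za-z0-9_]+)\}")


def resolve_secrets(text, env):
    """Replace ${TD_*} placeholders with values from env; leave everything else untouched."""
    def substitute(found):
        name = found.group(1)
        # Unknown TD_ names are left as written rather than replaced with an empty string.
        return env.get(name, found.group(0))
    return SECRET.sub(substitute, text)


def mask_secrets(text, env):
    """Hide the values of TD_* variables before the text is written to a log."""
    for name, value in env.items():
        if name.startswith("TD_") and value:
            text = text.replace(value, "****")
    return text


# Sample environment; in CI these values would come from repository secrets.
env = {"TD_PASSWORD": "hunter2", "HOME": "/home/ci"}
step = "type ${TD_PASSWORD} then open ${HOME}"
resolved = resolve_secrets(step, env)
print(mask_secrets(resolved, env))
```

The ${HOME} placeholder is left alone because it lacks the TD_ prefix, which mirrors the rule stated above.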

Add the secrets to your GitHub repository

First, configure the secrets within your GitHub repository.

Follow the guide here for detailed instructions on how to add a secret to your GitHub repo or organization.

Action Output

You can chain GitHub actions together to create awesome workflows.

The TestDriver action outputs the following variables. You can chain multiple actions together to post TestDriver results as comments, send an email on failure, or upload them to 3rd party test reporting software.

Example

Here's an example of creating a comment on the PR after every execution.

Test Generation

Generate tests with natural language prompts.

Pricing

TestDriver Test Runners are billed per minute they run.

All TestDriver Pro Plans start with $100 in free credits!

Local Runners

Hosted Linux Runners

Hosted Windows Runners

If you're building a desktop app, need more coverage, more power, or want to test more complex flows, we recommend using our hosted Windows Runners.

Click here to install the GitHub action in your repository.

The following example will run the same test using 4 different configurations by utilizing the GitHub matrix strategy.

Report issues to the testdriverai/testdriverai repo.

Install tools with Chocolatey

Install Python & Visual Studio Build Tools

choco install python visualstudio2022-workload-vctools -y

Install NodeJS

You will also need NodeJS if you don't have it yet.

choco install nodejs-lts --version="20.17.0"

Set Execution Policy

Open a new terminal with admin privileges and execute the following command:

Set-ExecutionPolicy RemoteSigned -Scope CurrentUser

This gives TestDriver the right to execute its scripts and is only valid for the current user.

Install tools with WinGet

You can install winget if you haven't already by downloading the App installer from Microsoft store.

Once it's downloaded, you need to add it to the system PATH environment variable. It's installed here:

C:\Users\<your-username>\AppData\Local\Microsoft\WindowsApps 

Don't forget to replace it with your actual username.

Install Python

winget install python

Install Visual Studio Build Tools

winget install --id=Microsoft.VisualStudio.2022.Community --override "--add Microsoft.VisualStudio.Workload.VCTools" --accept-package-agreements --accept-source-agreements

Install NodeJS

winget install --id=OpenJS.NodeJS.LTS --version="20.17.0"

Set Execution Policy

Open a new terminal with admin privileges and execute the following command:

Set-ExecutionPolicy RemoteSigned -Scope CurrentUser

This gives TestDriver the right to execute its scripts and is only valid for the current user.

Install NodeJS

brew install nvm

Install NodeJS 20

nvm install 20.17.0

Install via NPM. This will make testdriverai available as a global command.

Thankfully TestDriver provides a way to securely use secrets. Secrets will be masked from all test output, including logs.

Replace the hardcoded secrets within your Test Steps

Here is an example of Test Steps with the secret syntax.

Supply the secrets to the GitHub Actions

When using the GitHub Actions to spawn tests, supply your TD_SECRET within the env: value provided to the GitHub action.

This workflow will generate tests using the exploratory prompts found in the prompt value of the GitHub Actions configuration.

The prompt value takes a markdown list as input.

Runner
Local
Sandbox
Hosted
Custom

Follow the Local Agent Setup to run on your own machine for free. You'll use your own computer to create and run tests.

We recommend building and running your tests with our hosted Linux runners. You can run multiple tests in parallel and deploy them to CI/CD with our GitHub Actions.

Install NVM using brew:

the GitHub matrix strategy
the testdriverai/testdriverai repo
Chocolatey
WinGet
App installer from Microsoft store
NVM
brew
testdriverai
Prompting
version: 4.1.35
steps:
  - prompt: sign in with username and password
    commands:
      - command: focus-application
        name: Google Chrome
      - command: hover-text
        text: Email or phone
        description: email input field
        action: click
      - command: type
        text: ${TD_USERNAME}
      - command: hover-text
        text: Next
        description: next button after entering email
        action: click
      - command: hover-text
        text: Password
        description: password input field
        action: click
      - command: type
        text: ${TD_PASSWORD}
env:
    TD_USERNAME: ${{ secrets.TD_USERNAME }}
    TD_PASSWORD: ${{ secrets.TD_PASSWORD }}
permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write
  
jobs:
  test:
    name: "TestDriver"
    runs-on: ubuntu-latest
    steps:
      - uses: testdriverai/action@main
        with:
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: | 
            1. /run tests/signin.yml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
          TD_USERNAME: ${{ secrets.TD_USERNAME }}
          TD_PASSWORD: ${{ secrets.TD_PASSWORD }}

| Output Variable | Description |
| --- | --- |
| summary | Contains the TestDriver AI text summary result of the action execution. |
| link | Link to the Dashcam dash. See Debugging Test Runs. |
| markdown | Contains the markdown-formatted shareable link. This includes a screenshot of the desktop! |
| success | Indicates whether the action passed successfully (true or false). |
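
To make the chaining concrete, here is a minimal sketch of a later step reading these outputs through the standard GitHub Actions steps context. The step id testdriver, the RESULT_* variable names, and the test file path are assumptions chosen for illustration.

```yaml
steps:
  - uses: testdriverai/action@main
    id: testdriver # hypothetical id; later steps use it to look up outputs
    with:
      key: ${{ secrets.TESTDRIVER_API_KEY }}
      prompt: |
        1. /run testdriver/test.yml
  - name: Print TestDriver results
    if: ${{ always() }} # run even when the test step fails
    env:
      RESULT_SUCCESS: ${{ steps.testdriver.outputs.success }}
      RESULT_LINK: ${{ steps.testdriver.outputs.link }}
      RESULT_SUMMARY: ${{ steps.testdriver.outputs.summary }}
    run: |
      echo "Passed: $RESULT_SUCCESS"
      echo "Replay: $RESULT_LINK"
      echo "$RESULT_SUMMARY"
```

Passing the outputs through env: rather than interpolating them directly into the script keeps multi-line summaries from breaking the shell command.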

name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  pull_request:

jobs:
  test:
    name: "TestDriver"
    runs-on: ubuntu-latest
    steps:
      - uses: dashcamio/testdriver@main
        # The id belongs on the step so later steps can read its outputs
        id: run-testdriver
        with:
          version: v4.0.0
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: |
            1. /run /Users/ec2-user/actions-runner/_work/testdriver/testdriver/.testdriver/test.yml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
      - name: Create comment on PR
        if: ${{ always() }}
        uses: peter-evans/create-or-update-comment@v3
        with:
          # The workflow runs on pull_request, so the PR number comes from the event
          issue-number: ${{ github.event.pull_request.number }}
          body: |
            ${{ steps.run-testdriver.outputs.summary }}
            ${{ steps.run-testdriver.outputs.markdown }}
name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write
  
on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:
  schedule:
    - cron: "0 0 * * *"

jobs:
  test:
    name: "TestDriver"
    runs-on: ubuntu-latest
    steps:
      - uses: testdriverai/action@main
        with:
          key: ${{secrets.TESTDRIVER_API_KEY}}
          prompt: |
            1. Search for cat pictures
            2. Download the first image to the desktop
            3. Assert the cat picture is saved to the desktop
          prerun: |
            cd $env:TEMP
            npm init -y
            npm install dashcam-chrome
            Start-Process "C:/Program Files/Google/Chrome/Application/chrome.exe" -ArgumentList "--start-maximized", "--load-extension=$(pwd)/node_modules/dashcam-chrome/build", "${{ env.WEBSITE_URL }}"
            exit
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
          WEBSITE_URL: "https://example.com" # Define the website URL here

Optimizing Performance

Optimize your tests to save serious time!

While TestDriver is incredibly smart, using AI matching methods all the time can be slow. Nobody wants to block developers from merging while waiting for tests to run!

Here are some tips for improving TestDriver performance.

Use Parallel Testing

As covered in the previous section, use the run command and Parallel Testing to split your tests into multiple files and run them in parallel.
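
A minimal sketch of such a split, assuming a hypothetical shared setup file testdriver/login.yml and a hypothetical feature file that reuses it through the run command:

```yaml
# testdriver/checkout.yml (hypothetical file name)
version: 4.0.0
steps:
  - prompt: log in using the shared setup file
    commands:
      - command: run
        file: testdriver/login.yml # hypothetical; relative to the root of the git repo
  - prompt: assert the cart is visible
    commands:
      - command: assert
        expect: the shopping cart is visible
```

Each feature file written this way can be given to its own job, so the files run side by side instead of one after another.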

Prefer the turbo matching method over ai

The most common actions like hover-text, wait-for-text, and scroll-until-text use an optimized matching algorithm (turbo) by default.

This algorithm uses text similarity to quickly compute the most similar text to what appears in your yml. This is about 40% faster than the ai method!
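
As a sketch of how the two methods can sit side by side, the fragment below leaves one hover-text on the default matcher and opts a second one into ai. The method values turbo and ai come from the hover-text reference; the texts and descriptions are illustrative assumptions.

```yaml
commands:
  # No method given: the default text-similarity matcher (turbo) is used
  - command: hover-text
    text: Sign Up
    description: link in the header
    action: click
  # Opt in to the slower ai method only where the default picks the wrong element
  - command: hover-text
    text: Sign Up
    description: button at the bottom of the pricing table
    action: click
    method: ai
```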

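Use `async` Asserts

The assert command accepts an async: true property, which makes the assertion non-blocking: the test continues while the assertion is evaluated, so it costs almost no time. Async assertions still fail the test if they do not pass.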
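
A short sketch of a non-blocking assertion followed immediately by the next action; the expectation and button text are illustrative assumptions.

```yaml
commands:
  - command: assert
    expect: the welcome banner is visible
    async: true # do not wait for the result before moving on
  - command: hover-text
    text: Settings
    description: settings link in the sidebar
    action: click
```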

Examples

In this section you will find example workflows for a variety of use cases

  • Test Generation

  • Parallel Testing

  • Importing Tests

  • Desktop Apps

  • Secure Log In

markdown list

Price

Free

$0.05/minute

$0.08/minute

Support

Community

Email

Email & Chat

White-Glove

Linux Sandbox

✅

✅

✅

GitHub Action Setup

✅

✅

Debugging Test Runs

✅

✅

Linux Runners

✅

✅

Windows Runners

✅

✅

Mac Runners

✅

Monitoring Performance

✅

Debugging Test Runs

Debug your test executions with Dashcam!

Dashcam captures the video of the desktop test execution, the browser logs, network requests, prerun script output, and of course the TestDriver test execution. You can play it all back in the TestDriver dashboard!

Dashcam dashes are transmitted over HTTPS, encrypted at rest, and have "Google-style" sharing permissions. Sensitive data in logs is masked.

When a test finishes, a Dashcam clip (a dash for short) is posted to GitHub automatically. This enables developers to see a test artifact and collaborate directly from within GitHub.

Desktop Apps

Download a software installer and run it in a prerun script

This workflow's prerun script downloads an installer, then installs the software before running the tests

This workflow:

  • Runs on each push to the "main" branch & every day at midnight

  • Downloads an installer from the provided URL

  • Runs the installer and installs the software

  • Runs the provided test

    • Functionalities like 'automatic test matrix population' can be added to this workflow

Monitoring Performance

TestDriver's Performance Agent alerts you to software quality issues and helps you identify the root cause.

Weekly Email Reports

The TestDriver Performance Agent will email you weekly with high-level summaries of your software quality. Get alerted to new bugs and detect feature regressions.

Every email includes:

  • a single performance score

  • a high level summary of test failures

  • test passing rates per test file

Dashboard

Also included is a high level summary of test passing rates and coverage by feature.

Performance

Measure test completion duration over time.

Secure Log In

Secrets can be passed into the workflow to not expose sensitive credentials

This workflow maps TD_USERNAME and TD_PASSWORD in env: to existing repository secrets of the same name. These variables can then be used in any YAML passed into the workflow. Secrets must begin with TD_ for TestDriver to be able to see them.

Example YAML steps using secrets could look like:

This workflow:

  • Runs on each push to the "main" branch & every day at midnight

  • Defines variables that point to repository secrets

  • Runs the test provided

Every TestDriver CI test is recorded with Dashcam, the first TestDriver product we built to help debug remote test executions.

You can learn more about Dashcam on the Dashcam documentation page.

Contact Us
Dashcam
the Dashcam documentation page
name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write
  
on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:
  schedule:
    - cron: "0 0 * * *"

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: testdriverai/action@main
        with:
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: |
            1. /run testdriver/test.yml
          prerun: |
            # Check IPv6 addresses (optional)
            Get-NetIPAddress -AddressFamily IPv6

            # URL for the installer
            $installerUrl = "https://example.com/windows/ExampleInstaller.exe"
            # Location to save the installer
            $installerPath = "$env:USERPROFILE\Downloads\ExampleInstaller.exe"

            # Download the installer
            Write-Host "Downloading installer..."
            Invoke-WebRequest -Uri $installerUrl -OutFile $installerPath

            # Check if the download was successful
            if (Test-Path $installerPath) {
                Write-Host "Download successful. Running the installer..."
                Start-Process -FilePath $installerPath -ArgumentList '/silent' -Wait
                Start-Sleep -Seconds 10
            } else {
                Write-Host "Failed to download the installer."
            }

        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
steps:
  - prompt: input email ${TD_USERNAME} then press continue
    commands:
      - command: hover-text
        text: Email address
        description: email input field label
        action: click
      - command: type
        text: ${TD_USERNAME}
      - command: hover-text
        text: CONTINUE
        description: continue button below the email input
        action: click
  - prompt: input password ${TD_PASSWORD} then click continue
    commands:
      - command: hover-text
        text: Password
        description: password input field label
        action: click
      - command: type
        text: ${TD_PASSWORD}
      - command: hover-text
        text: CONTINUE
        description: continue button below the password input
        action: click
name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:
  schedule:
    - cron: "0 0 * * *"

jobs:
  test:
    name: "TestDriver"
    runs-on: ubuntu-latest
    steps:
      - name: Run TestDriver.ai Action
        uses: testdriverai/action@main
        with:
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: |
            1. /run testdriver/test.yml
          prerun: |
            cd $env:TEMP
            npm init -y
            npm install dashcam-chrome
            Start-Process "C:/Program Files/Google/Chrome/Application/chrome.exe" -ArgumentList "--start-maximized", "--load-extension=$(pwd)/node_modules/dashcam-chrome/build", "${{ env.WEBSITE_URL }}"
            exit
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
          WEBSITE_URL: "example.com"
          TD_USERNAME: ${{ secrets.TD_USERNAME }}
          TD_PASSWORD: ${{ secrets.TD_PASSWORD }}

Test Steps

All the valid commands and their properties.

TestDriver handles generating and maintaining tests for the most part. However, if you'd like to edit tests or gain a better understanding of what's going on, you can find all of the commands in this section.

As for YML format, here is an example of a valid yml file:

version: 4.0.0
steps:
  - prompt: enter fiber.google.com in url
    commands:
      - command: focus-application
        name: Google Chrome
      - command: hover-text
        text: Search Google or type a URL
        description: main google search
        action: click
      - command: type
        text: fiber.google.com
      - command: press-keys
        keys:
          - enter
  - prompt: enter a fake address and check availability
    commands:
      - command: focus-application
        name: Google Chrome
      - command: hover-text
        text: Enter your address
        description: address input field
        action: click
      - command: type
        text: 123 Fake Street
      - command: hover-text
        text: ZIP
        description: ZIP code input field
        action: click
      - command: type
        text: 12345
      - command: hover-text
        text: Check availability
        description: check availability button
        action: click
  - prompt: assert a family appears on screen
    commands:
      - command: focus-application
        name: Google Chrome
      - command: assert
        expect: a family appears on screen

exec

Execute cli commands on the runner

| Argument | Type | Description |
| --- | --- | --- |
| cli | string | The cli commands to run. |
| silent | boolean | Log the output? |
| output | variable | Define the name of a variable for output |

Example

version: 4.2.10
session: 67a3fdacd06c99b9c179b566
steps:
  - prompt: exec
    commands:
      - command: exec
        cli: pwd
        silent: false
        output: my_var
  - prompt: type ls
    commands:
      - command: type
        text: ${OUTPUT.my_var}

Example Output

❯ node index.js run testdriver/testdriver.yml
Spawning GUI...
Howdy! I'm TestDriver v4.2.13
Working on /Users/ianjennings/Development/testdriverai/testdriver/testdriver.yml

This is beta software!
Join our Discord for help
https://discord.com/invite/cWDFW8DzPm

running /Users/ianjennings/Development/testdriverai/testdriver/testdriver.yml...

exec
command='exec' cli='pwd' output='my_var'
/Users/ianjennings/Development/testdriverai


type ls
command='type' text='/Users/ianjennings/Development/testdriverai
'
/Users/ianjennings/Development/testdriverai

Prompting

Executes tasks based on user input using natural language processing. This command is invoked when the user input does not start with a / command. The system interprets the input and attempts to carry out the task specified by the user.

Example Usage

Prompting Tips

The agent is selecting the wrong thing!

This is the most common issue encountered with our agent. Here are some possible reasons.

You're asking the AI to interact with elements it cannot see

TestDriver uses the context from your prompt and the computer screen to make a decision of what commands to run. You should only prompt the AI to interact with elements it can currently see.

Incorrect Prompt

A common example of this is interacting with a dropdown. We often see users prompt the agent to interact with a dropdown and choose a state.

Recommended prompts

Instead, simply treat these as two separate prompts. This allows the UI to render and gives the AI the opportunity to parse the new screen data.
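
In a saved test file, the two-prompt approach might look like the sketch below. The on-screen texts Options and Edit and their descriptions are assumptions for illustration; each prompt gets its own step so the dropdown has rendered before the second step looks for its item.

```yaml
steps:
  - prompt: click on options
    commands:
      - command: hover-text
        text: Options
        description: options dropdown button
        action: click
  - prompt: select edit
    commands:
      - command: hover-text
        text: Edit
        description: edit item in the open options dropdown
        action: click
```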

You're asking the AI to click on elements it does not understand

The TestDriver agent relies on visual understanding, not functional. Like any user, the AI does not understand what the function of a button will be. It can only guess.

Incorrect Prompt

Correct Prompts

Describing Images Properly

If you're uncertain of how to describe an icon, simply ask ChatGPT-4o what it would call it, and use that as your input.

The AI cannot find small images

Small, isolated images smaller than 15x15px appear like "noise" to the AI and may not be clickable. However, you can use the match-image command to select these using manually made screenshots.

No matter what I do, TestDriver will not select my element.

The AI has trouble selecting some specific elements, like empty gray boxes, some substrings, or conditions where there is a lot of similar text close together.

If that's the case, you can always fall back to match-image. In our experience, a test suite of 10 tests typically requires only a single screenshot.
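
A minimal sketch of that fallback, assuming a manually captured screenshot named plus-icon.png (a hypothetical file) has been saved under the platform-specific screenshots folder. The argument name path follows the match-image argument table.

```yaml
commands:
  - command: match-image
    # Hypothetical file stored at testdriver/screenshots/<platform>/plus-icon.png;
    # the testdriver/screenshots/<platform>/ prefix is not included in the value.
    path: plus-icon.png
    action: click
```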

> click sign up 
    
    thinking...

    To accomplish the goal of clicking "Sign Up," we need to
    focus on the Google Chrome application and then click on
    the "Sign Up" button.

    Here are the steps:

    1. Focus the Google Chrome application.
    2. Click on the "Sign Up" button.

    commands:
      - command: focus-application
        name: Google Chrome
      - command: hover-text
        text: Sign Up
        description: button in the header
        action: click

command='focus-application' name='Google Chrome'
command='hover-text' text='Sign Up' description='button in the header' action='click'
match-image

focus-application

Focuses an application by name.

| Argument | Type | Description |
| --- | --- | --- |
| name | string | The name of the application to focus. |

Example Usage

command: focus-application
name: Google Chrome

30x30 Promotion

Our team will build and deploy 30 tests in 30 days free!

We know our customers need QA coverage yesterday. While other QA services offer coverage delivered months later for hundreds of thousands of dollars, we'll set you up for free during our 30-day trial!

Pricing

We think you're going to love TestDriver, which is why for a limited time our team will build 30 custom tests for you as part of our risk-free, 30-day trial.

Promotion Details

  • Test any publicly available desktop app, chrome extension, mobile app, or website

  • Choose from 250 AI generated tests within 7 days.

  • Run tests 3 - 5 times every day.

  • AI Quality reports delivered to your email.

  • AI fixes and maintains tests.

  • Integrates with GitHub.

How does it work?

  1. Our support will work with you to create tests for any features TestDriver might have missed.

  2. Our support team will help train your engineers to create additional tests

Onboarding Process

Contract Details

  • Get 30 custom tests free during a 30-day trial when subscribing to $995/m "30x30" plan

  • The tests are yours! You are free to modify, duplicate, or distribute tests however you wish

  • Payment method is required to begin trial

  • Contract renews monthly

  • Cancel any time, for any reason

  • Credits do not "roll over." Unused credits expire every month

Limited time offer! Don't miss your chance! Schedule an onboarding call now.

Afterward, plans start at just $995/m and include 12,000 runner minutes per month. That's enough minutes to run those 30 tests 2 - 5 times per day. See for more.

Full access to our enterprise test dashboards

See for more.

Book an Onboarding call so we can learn about the specific flows you want to test

We'll set up to explore your app and generate 100s of tests.

Your new tests will be deployed to our .

Additional usage is billed at standard rates

Quickly deploy AI QA tests.

Safeguard your most important user flows with generated tests.

Increase your total coverage

Test flows never possible before with our powerful computer-use agent.

Spend less time on maintenance

TestDriver tests automatically repair themselves.

| Service | Timeline | Description |
| --- | --- | --- |
| Custom Onboarding | First 7 Days | Get set up quickly with custom workflows and Prerun Scripts developed by the TestDriver team. |
| AI Test Generation | First 7 Days | Instant coverage. We'll generate hundreds of tests for you to choose from. |
| Custom Test Creation | First 30 Days | Custom test scenarios. Our team will work with you to create tests for any features TestDriver might have missed. |
| Training | First 30 Days | Our support team trains your engineers on best practices for creating and maintaining TestDriver tests. |
| Test Execution | Recurring | Seamless deployment. Our hosted test runners execute your tests on a schedule or via GitHub Actions. |

Generate a Test Suite
GitHub Actions
Promotion Details
Contract Details

Parallel Testing

The best way to run tests twice as fast is to split them in half!

Rather than execute your tests sequentially, you can make use of the run command to share common setup and teardown test plans.

Then, simply parallelize your test executions by calling the TestDriver action multiple times as part of different jobs.

name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  pull_request: # run on every PR event
  schedule:
    - cron: '0 * * * *' # run every hour
  push:
    branches:
      - main # run on merge to the main branch
  workflow_dispatch:

jobs:
  test1:
    name: "TestDriver Test 1"
    runs-on: ubuntu-latest
    steps:
      - uses: dashcamio/testdriver@main
        # Action inputs must be nested under with:
        with:
          version: v4.0.0
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: |
            1. /run /Users/ec2-user/actions-runner/_work/testdriver/testdriver/.testdriver/test-1.yml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"

  test2:
    name: "TestDriver Test 2"
    runs-on: ubuntu-latest
    steps:
      - uses: dashcamio/testdriver@main
        # Action inputs must be nested under with:
        with:
          version: v4.0.0
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: |
            1. /run /Users/ec2-user/actions-runner/_work/testdriver/testdriver/.testdriver/test-2.yml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"

Here's an example of testing a matrix of files.

name: TestDriver.ai / Run / Regressions

on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

jobs:
  gather-test-files:
    name: Setup Test Matrix (./testdriver/*.yml)
    runs-on: ubuntu-latest
    outputs:
      test_files: ${{ steps.test_list.outputs.files }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v2
        with:
          ref: ${{ github.event.ref }}
      - name: Find all test files and extract filenames
        id: test_list
        run: |
          FILES=$(ls ./testdriver/*.yml)
          FILENAMES=$(basename -a $FILES)
          FILES_JSON=$(echo "$FILENAMES" | jq -R -s -c 'split("\n")[:-1]')
          echo "files=$FILES_JSON" >> "$GITHUB_OUTPUT"

  test:
    needs: gather-test-files
    runs-on: ubuntu-latest
    strategy:
      matrix:
        test: ${{ fromJson(needs.gather-test-files.outputs.test_files) }}
      fail-fast: false
    name: ${{ matrix.test }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v2
        with:
          ref: ${{ github.event.ref }}

      - name: Display filename being tested
        run: |
          echo "Running job for file: ${{ matrix.test }}"
      
      - uses: testdriverai/action@main
        with:
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: 1. /run testdriver/${{ matrix.test }} 
          prerun: |
            cd $env:TEMP
            npm init -y
            npm install dashcam-chrome
            Start-Process "C:/Program Files/Google/Chrome/Application/chrome.exe" -ArgumentList "--start-maximized", "--load-extension=$(pwd)/node_modules/dashcam-chrome/build", "${{ env.WEBSITE_URL }}"
            exit
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
          WEBSITE_URL: "https://example.com" # URL opened by the prerun script
Schedule an onboarding call now
enterprise test dashboards
Book an Onboarding
standard rates

Importing Tests

Automatically populate tests from a defined directory into the workflow

This workflow creates a matrix of tests dynamically based off of the YAML files available in a directory (in this case, and most, the /testdriver directory)

This workflow:

  • Runs on each push to the "main" branch & every day at midnight

  • Dynamically gathers .yml files from the /testdriver directory to create a matrix of tests

  • Runs the matrix of tests in parallel

    • Each test in the matrix will download the Arc browser installer

name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:
  schedule:
    - cron: "0 0 * * *" 

jobs:
  gather-test-files:
    name: Gather Test Files
    runs-on: ubuntu-latest
    outputs:
      test_files: ${{ steps.test_list.outputs.files }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v2

      - name: Find all test files and extract filenames
        id: test_list
        run: |
          FILES=$(ls ./testdriver/*.yml)
          FILENAMES=$(basename -a $FILES)
          FILES_JSON=$(echo "$FILENAMES" | jq -R -s -c 'split("\n")[:-1]')
          echo "files=$FILES_JSON" >> "$GITHUB_OUTPUT"

  test:
    needs: gather-test-files
    runs-on: ubuntu-latest
    strategy:
      matrix:
        test: ${{ fromJson(needs.gather-test-files.outputs.test_files) }}
      fail-fast: false
    name: ${{ matrix.test }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v2

      - name: Display filename being tested
        run: |
          echo "Running job for file: ${{ matrix.test }}"
      
      - uses: testdriverai/action@main
        with:
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: | 
            1. /run testdriver/${{ matrix.test }} 
          prerun: |
            cd $env:TEMP
            npm init -y
            npm install dashcam-chrome
            Start-Process "C:/Program Files/Google/Chrome/Application/chrome.exe" -ArgumentList "--start-maximized", "--load-extension=$(pwd)/node_modules/dashcam-chrome/build", "${{ env.WEBSITE_URL }}"
            exit
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"
          WEBSITE_URL: "example.com"
> Click on 'options' and select 'edit'

> click on options

> select edit

> click on the "new task icon"

> click on the "plus icon"

Your desktop should look like this

run

Embeds and runs another file within the current script. This is useful for reusing common sequences of commands or modularizing your scripts.

| Argument | Type | Description |
| --- | --- | --- |
| file | string | The path to the file to embed and run. Should be relative to the root of your git repo. |

Example Usage

command: run
file: path/to/another-script.yaml

hover-image

Move the mouse to an image matching a description. This can also handle clicking.

| Argument | Type | Description |
| --- | --- | --- |
| description | string | A description of the image and what it represents. Do not include the image itself here. |
| action | string | The action to take when the image is found. Available actions are: click, right-click, double-click, hover. |

Example Usage

command: hover-image
description: search icon in the webpage content
action: click

assert

Asserts that the expectation is true using vision.

| Argument | Type | Description |
| --- | --- | --- |
| expect | string | The condition to check. This should be a string that describes what you see on the screen. |
| async | boolean | Whether to continue without waiting for the assertion to pass. Async assertions will still cause test failures. Default is false. |

Example Usage

command: assert
expect: the video is playing

match-image

Finds an image on the screen and performs an action (e.g., click or hover) at its center.

This command is useful for interacting with elements that the AI has trouble locating. The testdriverai package will take a screenshot of the desktop and search for the location of the image within the screenshot.

Screenshots should be stored in testdriver/screenshots/(mac/linux/windows)/PATH.png. TestDriver will dynamically resolve images based on the current platform.

The screenshot template matching logic looks for the most similar image within the screenshot rather than an exact match. If the best match is not above ~80% similarity, it searches additional scales; if no scale produces a match, the command fails.

To create screenshots from remote tests, download the video of the test and open it "full" or "actual" size within your computer. Then use a screenshot tool like Cleanshot X to create a screenshot of the target element. Do the best you can to center the clickable element within the screenshot.

| Argument | Type | Description |
| --- | --- | --- |
| path | string | The path to the image file that needs to be matched on the screen. Do not include testdriver/screenshots/*/ |
| action | string | The action to perform when the image is found. Available actions are: click or hover. The AI will click the center of the image. |

Example Usage

command: match-image
path: button.png
action: click

press-keys

Types a keyboard combination.

| Argument | Type | Description |
| --- | --- | --- |
| keys | yml array of strings | A list of keys to press together. |

Example Usage

command: press-keys
keys: [command, space]

type

Types a string using keyboard emulation.

| Argument | Type | Description |
| --- | --- | --- |
| text | string | The text string to type. |

Example Usage

command: type
text: Hello World

if

If the condition is true, runs the commands in the block. Otherwise, runs the commands in the else block.

| Argument | Type | Description |
| --- | --- | --- |
| condition | string | The condition to evaluate. |
| then | list of commands | The commands to run if the condition is true. |
| else | list of commands | The commands to run if the condition is false. |

Example Usage

command: if
condition: the active window is "Google Chrome"
then:
  - command: hover-text
    text: Search Google or type a URL
    description: main google search
    action: click
  - command: type
    text: monster trucks
    description: search for monster trucks
else:
  - command: focus-application
    name: Google Chrome

remember

Remembers a string value for later use.

Values are only remembered for a single session.

| Argument | Type | Description |
| --- | --- | --- |
| description | string | The key of the memory value to store. |
| value | string | The value of the memory to store. |

Example Usage

command: remember
description: My dog's name
value: Roofus

hover-text

Hovers over text matching the description. This can also handle clicking text.

Argument
Type
Description

text

string

The text to find on the screen. The longer and more unique, the better.

description

string

A description of the text and what it represents. The actual text itself should not be included here.

action

string

The action to take when the text is found. Available actions are: click, right-click, double-click, hover.

method

enum

The matching algorithm to use. Possible values are turbo (default) and ai.

Example Usage

command: hover-text
text: Sign Up
description: link in the header
action: click
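
The optional arguments can be combined with the example above. The following is a sketch only: it uses the right-click action and the ai matching method exactly as named in the argument table, and the text and description values are placeholders.

```yaml
# Sketch: same target as above, but right-click it and use the "ai" matcher
# instead of the default "turbo". Values are illustrative placeholders.
command: hover-text
text: Sign Up
description: link in the header
action: right-click
method: ai
```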

Interactive Commands

Commands you can run in our interactive terminal

This is a list of commands you can run when you see the > prompt after running testdriverai. Click any of the subpages to learn more. We recommend reading them in this order:

  1. Prompting

  2. /assert

  3. /undo

  4. /save

  5. /run

  6. /generate

Parallel Testing

Test multiple URLs in parallel

This workflow uses parallel testing to test several URLs in a short amount of time

This workflow:

  • Runs on each push to the "main" branch & every day at midnight

  • Defines a matrix of tests to be run on individual, non-sequential, ephemeral VMs

  • Runs the matrix of tests

    • Each test in the matrix will have Dashcam installed, and Chrome opened to the respective URL defined in the URL value in the matrix

name: TestDriver.ai

permissions:
  actions: read
  contents: read
  statuses: write
  pull-requests: write

on:
  push:
    branches: ["main"]
  pull_request:
  workflow_dispatch:
  schedule:
    - cron: "0 0 * * *"

jobs:
  testdriver_matrix:
    name: "TestDriver Matrix"
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        test:
          - name: "Functions testdriver"
            file: "testdriver/functions.yml"
            url: "https://example.com"
          - name: "Chat testdriver"
            file: "testdriver/chat_test.yml"
            url: "https://www.example.com/demos/chat/"
          - name: "P2P dynamic payment testdriver"
            file: "testdriver/p2p_dynamic_payment.yml"
            url: "https://www.example.com/demos/fintech/"
          - name: "Dynamic matchmaking testdriver"
            file: "testdriver/skill_based_matchmaking.yml"
            url: "https://www.example.com/demos/skill-based-matchmaking-dashboard/"
          - name: "Support search testdriver"
            file: "testdriver/support_search.yml"
            url: "https://support.example.com/hc/en-us"
          - name: "Ask AI testdriver"
            file: "testdriver/ask_ai.yml"
            url: "https://www.example.com"
          - name: "Edit settings testdriver"
            file: "testdriver/edit_account_settings.yml"
            url: "https://www.example.com"
          - name: "Debug console testdriver"
            file: "testdriver/debug_console.yml"
            url: "https://www.example.com"
    steps:
      - uses: testdriverai/action@main
        with:
          key: ${{ secrets.TESTDRIVER_API_KEY }}
          prompt: |
            1. /run ${{ matrix.test.file }}
          prerun: |
            cd $env:TEMP
            npm init -y
            npm install dashcam-chrome
            Start-Process "C:/Program Files/Google/Chrome/Application/chrome.exe" -ArgumentList "--start-maximized", "--load-extension=$(pwd)/node_modules/dashcam-chrome/build", "${{ matrix.test.url }}"
            exit
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FORCE_COLOR: "3"

scroll-until-image

Scrolls until the specified image is found.

Arguments

Argument
Type
Description

description

string

A description of the image and what it represents.

direction

string

Available directions are: up, down, left, right.

distance

number

How many pixels to scroll before giving up. Default is 1200

Example Usage

command: scroll-until-image
description: Submit at the bottom of the form
direction: down
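
If the target sits far down a long page, the distance argument from the table above can raise the scroll limit. This is a sketch, and the value 2400 is an arbitrary illustration rather than a recommended setting.

```yaml
# Sketch: keep scrolling for up to 2400 pixels (default is 1200) before giving up.
command: scroll-until-image
description: Submit at the bottom of the form
direction: down
distance: 2400
```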

scroll

Scrolls up or down using the mouse wheel.

Argument
Type
Description

direction

string

Available directions are: up, down, left, right.

amount

number

Number of pixels to scroll

Example Usage

command: scroll
direction: down
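
The example above omits the amount argument. A sketch that sets it explicitly might look like this, where 300 is an arbitrary illustrative pixel count:

```yaml
# Sketch: scroll down by an explicit number of pixels.
command: scroll
direction: down
amount: 300
```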

/generate

/generate will look at the display and generate lists of exploratory prompts that can be used to generate tests. Each exploratory test looks like a simple markdown file with a list inside.

Check out this test-search-function.md for example:

1. Click on the search icon.
2. Type "real-time chat" into the search bar.
3. Assert that search results are relevant and displayed.

TestDriver will generate 10 of these files every time the /generate command is called. The files will be stored in ./testdriver/generate/*.md.

Generate is an experimental feature that instructs testdriverai to come up with its own prompts! The /generate command is used in our Generate a Test Suite demo.

The best way to use the /generate command is to generate regression tests using our GitHub action.

Every test will be run in parallel, and the resulting YML can be merged into a regression test!

wait

Waits for a specified number of milliseconds before continuing.

Arguments

Argument
Type
Description

timeout

number

The duration in milliseconds to wait.

Example Usage

command: wait
timeout: 5000

CLI

  • testdriverai [file]

  • testdriverai init

  • testdriverai run [file]

/undo

Remove the last generated command.

If TestDriver doesn't do what you expect, you can easily remove newly generated commands with the /undo command. You can undo as many times as you like.

For example, given the following test:

- step:
    - command: scroll-until-text
      text: Add to cart
- step:
    - command: hover-text
      text: Add to cart
      action: click

Calling /undo will cause the last part to be undone and look like this:

- step:
    - command: scroll-until-text
      text: Add to cart

testdriverai init

Set up a local project.

Run the init command to trigger the testdriverai interactive setup.

testdriverai init

TestDriver will walk you through .env customization and clone sample workflow files to deploy tests. See GitHub Actions.

Spawning GUI...
Howdy! I'm TestDriver v4.2.7
Working on /Users/ianjennings/demo-setup/testdriver/testdriver.yml

This is beta software!
Join our Discord for help
https://discord.com/invite/cWDFW8DzPm

Warning! TestDriver sends screenshots of the desktop to our API.
https://docs.testdriver.ai/security-and-privacy/agent

Welcome to the Testdriver Setup!

This is a preview of the Testdriver.ai
Please report any issues in our Discord server:
https://discord.com/invite/cWDFW8DzPm

Beginning setup...

✔ Enable desktop notifications? … yes
✔ Minimize terminal app? … yes
✔ Enable text to speech narration? … yes
✔ Send anonymous analytics? … yes
✔ Where should we append these values? … .env

Writing .env...

Downloading latest workflow files...

Writing .github
Writing .github/workflows
Writing .github/workflows/testdriver.yml

Testdriver setup complete!

Create a new test by running:
testdriverai testdriver/test.yml

Agent

While open source, the TestDriver agent does send data to remote machines.

Source

The TestDriver agent is open source and available on NPM. You can browse the source to see all the data collected and how everything works.

API

The TestDriver agent does not contain any AI models within it. Instead, it uploads desktop context to our API which uses that context to make decisions about what actions to perform.

Desktop Context Collected

During execution the TestDriver agent uploads the following information to our API

  • User input prompts

  • The active window and other windows that may be open (including application and window titles)

  • System information

  • The mouse position

  • Screenshots of the desktop

With the exception of desktop screenshots, desktop context is persisted into our database.

Desktop screenshots are uploaded to our server but are not persisted in our database.

Desktop Screenshots

TestDriver frequently takes screenshots of the desktop to provide our AI with decision-making context. You will not be prompted. Desktop screenshots are uploaded to our API for processing but are not persisted.

The TestDriver Agent will only take screenshots of the primary display. For complete privacy, we recommend running TestDriver within a virtual machine on your desktop.

TestDriver cannot operate without visual context. Do not install TestDriver if you do not want it to capture images of the desktop.

Active Window

Information about the open windows on the desktop is reported by the active-window module.

System Information

Information about the computer system running TestDriver is reported by the systeminformation module.

User Prompts

The prompts you input to TestDriver are uploaded to our API and persisted in a database. We store this data to provide our AI with a history of context.

Additional Analytics

When running testdriverai init, you'll be asked if you'd like to share additional analytics. Sharing usage analytics is opt-in; this extra data will not be collected unless explicitly set in your environment.

If you would like to disable additional analytics, you can set TD_ANALYTICS within your environment.

TD_ANALYTICS=false

Rate Limiting and Other Restrictions

While the TestDriver Agent is free, we do reserve the right to rate limit or restrict usage by IP address for any reason.

Our API makes use of OpenAI models behind the scenes. You can learn more about OpenAI and privacy in their privacy center.

wait-for-text

Waits until the specified text is seen on the screen.

Argument
Type
Description

text

string

The text to find on the screen.

timeout

number

How many milliseconds to wait for text to appear. Default is 5000

method

enum

The matching algorithm to use. Possible values are ai (default) and turbo.

Example Usage

command: wait-for-text
text: Copyright 2024
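
For slow-loading pages, the timeout and method arguments from the table above can be set explicitly. The following is a sketch, and the 10000 ms timeout is an arbitrary illustrative value.

```yaml
# Sketch: wait up to 10 seconds (default is 5000 ms) and use the "turbo"
# matcher instead of the default "ai".
command: wait-for-text
text: Copyright 2024
timeout: 10000
method: turbo
```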

/assert

Ensure some criteria is true within your test.

Use the /assert command to generate an assertion. This will take a screenshot and use it to identify some criteria that confirm the task was completed.

/assert No error message is displayed

This will "assert" that there's no error message, just like a user would see. The generated command will look like this:

- command: assert
  expect: There is no error message

When TestDriver runs this test, it will look at the screen and verify that the value of expect is true. If it is not true, the test will fail and exit immediately.

Many asserts can slow down a test. Use async: true to speed things up.

- command: assert
  expect: There is no error message
  async: true

Action

The TestDriver action is open source. You can find the source below.

Ephemeral Virtual Machine Runners

TestDriver tests are executed on private virtual machines managed by Amazon EC2. These VMs are ephemeral meaning they only exist for the lifetime of the test execution. After the test completes, the VM is destroyed and the hard disk is wiped.

Secrets

Any secrets supplied within Prerun Scripts or prompts will be transmitted over SSL to our API. Secrets supplied to agent prompts will be persisted (see Agent), but prerun scripts are not persisted.

If your workflow is blocked by secret sharing, please contact us.

A common workflow is to use Prerun Scripts to access a private staging website via basic auth. This will allow you to securely log into staging without persisting sensitive data on our servers.

Production

Testing production is the best starting point!

Testing production resources does not require any private information from your team. Simply provide the tests to TestDriver and point toward your publicly available endpoints. TestDriver does not need access to any private information.

Staging

Depending on your implementation, TestDriver may need secure information to test staging environments. See Secrets above for more information on securely implementing tests on staging.

Development

TestDriver can clone feature branches and build code on its virtual machines using a workflow similar to GitHub runners.

In order for TestDriver to test development branches of private codebases, it's necessary to supply a GitHub Token within the GitHub action. This personal access token is used to clone the codebase on the VM.

env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

We recommend storing any private information as a secret in your GitHub repository. Learn more about storing secrets here.

This token is transmitted over SSL and is not persisted. Learn more about managing the privacy of GitHub access tokens here.

wait-for-image

Waits until the specified image is seen on the screen.

Argument
Type
Description

description

string

A description of the image.

timeout

number

How many milliseconds to wait for image to appear. Default is 5000

Example Usage

command: wait-for-image
description: trash icon
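
The timeout argument from the table above can be added when the image may take longer to appear. This is a sketch, with 10000 ms chosen only for illustration.

```yaml
# Sketch: wait up to 10 seconds (default is 5000 ms) for the image to appear.
command: wait-for-image
description: trash icon
timeout: 10000
```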

/save

Saves the current state of the test script to a file.

This command generates a YAML file with the history of executed commands and tasks.

> /save

  saving...

  Current test script:

  version: 4.0.0
  steps:
    - prompt: navigate to fiber.google.com
      commands:
        - command: focus-application
          name: Google Chrome
        - command: hover-text
          text: Search Google or type a URL
          description: main google search
          action: click
        - command: type
          text: fiber.google.com
        - command: press-keys
          keys:
            - enter
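
A saved file can presumably be extended by hand with further steps that follow the same layout. The sketch below assumes the prompt/commands structure shown in the output above. The second step is hypothetical: its prompt wording and the "Google Fiber" text are placeholders, not output from a real session.

```yaml
version: 4.0.0
steps:
  - prompt: navigate to fiber.google.com
    commands:
      - command: focus-application
        name: Google Chrome
      - command: type
        text: fiber.google.com
      - command: press-keys
        keys:
          - enter
  # Hypothetical follow-up step added by hand (not generated output).
  - prompt: confirm the page loaded without errors
    commands:
      - command: wait-for-text
        text: Google Fiber   # placeholder text assumed to appear on the page
        timeout: 10000
      - command: assert
        expect: There is no error message
```

Only commands and arguments documented in the reference pages are used here. Whether hand-edited steps run exactly like generated ones is an assumption worth checking with /run.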

scroll-until-text

Scrolls until the specified text is found on screen.

Argument
Type
Description

text

string

The text to find on screen. The longer and more unique, the better.

direction

string

Available directions are: up, down, left, right.

distance

number

How many pixels to scroll before giving up. Default is 1200

method

enum

The matching algorithm to use. Possible values are ai (default) and turbo.

Example Usage

command: scroll-until-text
text: Sign Up
direction: down
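
The optional distance and method arguments can be added to the same command. This is a sketch, with 2400 as an arbitrary illustrative distance.

```yaml
# Sketch: scroll for up to 2400 pixels (default is 1200) and use the "turbo"
# matcher instead of the default "ai".
command: scroll-until-text
text: Sign Up
direction: down
distance: 2400
method: turbo
```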

/run

Run a test from a file.

To run a test you've previously created, use the /run command.

testdriverai
> /run helloworld.yml

TestDriver will play the test plan back, performing each command in order.

This command will exit the program upon execution. Any failures will be output and the program will exit with code 1.

Dashboard

Security and privacy in the TestDriver web UI.

Dashcam and TestDriver share the same API and web application back end. This web application includes the following privacy and security features:

Feature
Description

SSL

All data is transmitted over HTTPS

OAuth

Users may only authenticate via OAuth provided by Auth0

Team Management

Individual team members may be added or removed by administrators only.

Role-Based Access Control

The first user to create a team is the administrator. Administrators are the only users who can see the API key and manage team settings. Administrators cannot be removed. All other users are normal members.

API Key Rotation

The team API key can be rotated. We recommend rotating your API key every 90 days.

Secret Masking

Test replay logs and network requests are parsed for secrets like credit card numbers, emails, passwords, and other keys. Found secrets are overwritten with asterisks ****

Encrypted At Rest

Test replays and logs are stored securely on Amazon S3 and encrypted at rest. Test results are only available via temporary signed URLs. Signed URLs are only generated for team users.

Tests that execute via our GitHub Action are recorded and reported via Dashcam (another application developed by TestDriver). You can find more information in the Dashcam docs.

GitHub - testdriverai/action

testdriverai run [file]

Run a test file.

Runs the specified file.

The command will return exit code 0 if the test is successful and 1 if it fails.

Screen Recording Permissions (Mac Only)

If you're on a Mac, you might see an error related to "screen capture permissions" when getting started. This means that the application invoking the `testdriverai` command does not have permission to capture the screen.

You must enable screen capture permissions for this application.

This might be a little confusing at first. It's not testdriverai that needs the permissions, it's the terminal or IDE that's calling testdriverai that needs them. This is probably:

  • Mac Terminal

  • iTerm

  • VS Code

  • Or something similar

Enable Screen Capture Permissions for TestDriver

To enable screen capture permissions, do the following.

  • Open System Settings (or System Preferences on older versions).

  • Navigate to Privacy & Security > Screen Recording.

  • Find and enable permissions for Terminal (or any terminal-based app like iTerm2, if you're using it).

  • If the desired application is not in the list, click the "+" symbol below it. In the Finder window that opens, find the app you are currently using under Applications and click Open.

  • You may need to restart the terminal for the changes to take effect.

Privacy & Security > Screen & System Audio Recording
Click the toggle button to enable screen capture permissions.

testdriverai [file]

Launch the interactive agent

This is the core testdriver command that will launch the testdriver agent.

testdriverai

Supply a file argument to specify the output location:

testdriverai path/to/file.yml

This defaults to testdriver/testdriver.yml