Introduction

TestDriver is a next-generation autonomous AI agent for end-to-end testing of web and desktop applications.

TestDriver isn't like any test framework you've used before - it's more like your own QA employee with their own development environment.

  1. Tell TestDriver what to do in natural language

  2. TestDriver looks at the screen and uses mouse and keyboard emulation to accomplish the goal

TestDriver is black-box testing. It doesn't use selectors or static analysis.

Advantages

TestDriver uses AI vision and hardware emulation to simulate a real user on their own computer. This has three main advantages:

  • Easier setup: No need to add test IDs or craft complex selectors

  • Less maintenance: Tests don't break when the code changes, as long as what the user sees and does stays the same

  • More power: TestDriver can test any application and control any OS setting

Just tell TestDriver what to do

Use our CLI to tell TestDriver what to do, like so:

> open google chrome
> navigate to airbnb.com
> search for destinations in austin tx
> click check in
> select august 8

Possibilities

As you can imagine, a specialized QA agent with its own computer is extremely powerful. TestDriver can:

  • Test any user flow on any website in any browser

  • Clone, build, and test any desktop app

  • Render multiple browser windows and popups like 3rd party auth

  • Test <canvas>, <iframe>, and <video> tags with ease

  • Use file selectors to upload files to the browser

  • Resize the browser

  • Test Chrome extensions

  • Test integrations between applications

The problem with the current approach to end-to-end testing

End-to-end testing is commonly described as the most expensive and time-consuming test method. Today, end-to-end tests are written with complex selectors that are tightly coupled to the code.

You've probably seen selectors like this:

const e = await page.$('div[class="product-card"] >> text="Add to Cart" >> nth=2');

This tight coupling means developers have to spend time understanding the codebase and updating the tests every time the code changes. And code is always changing!

End-to-end is about users, not code

In end-to-end testing, the business priority is usability. All that really matters is that the user can accomplish the goal.

TestDriver uses human language to define test requirements. Then our simulated software tester figures out how to accomplish those goals.

Old and Busted (Selectors): div[class="product-card"] >> text="Add to Cart" >> nth=2

New Hotness (TestDriver): buy the 2nd product

These high-level instructions are easier to create and maintain because they are only loosely coupled to the codebase. We're describing a high-level goal, not a low-level interaction.

The tests will continue to work even when a junior developer changes .product-card to .product.card or the designers change "Add to Cart" to "Buy Now". The concepts remain the same, so the AI will adapt.
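To make the coupling concrete, here is a minimal sketch of why the selector breaks while the instruction does not. The page data and the matchBySelector helper are hypothetical stand-ins written for this illustration; they are not TestDriver or Playwright APIs.

```javascript
// Hypothetical stand-ins for the "Add to Cart" buttons rendered on a page.
const pageBefore = [
  { className: "product-card", text: "Add to Cart" },
  { className: "product-card", text: "Add to Cart" },
];

// The same page after the class is split in two and the button is relabelled.
const pageAfter = [
  { className: "product card", text: "Buy Now" },
  { className: "product card", text: "Buy Now" },
];

// Mimics an exact-match selector: the class attribute and the text
// must both match character for character.
function matchBySelector(elements, className, text) {
  return elements.filter(
    (el) => el.className === className && el.text === text
  );
}

// The selector-based test is written against implementation details.
const foundBefore = matchBySelector(pageBefore, "product-card", "Add to Cart");
const foundAfter = matchBySelector(pageAfter, "product-card", "Add to Cart");

// The goal-based instruction never mentions classes or labels,
// so there is nothing in it to update after the change.
const instruction = "buy the 2nd product";

console.log(`selector matches before: ${foundBefore.length}`);
console.log(`selector matches after: ${foundAfter.length}`);
console.log(`instruction before and after: "${instruction}"`);
```

The selector finds both buttons before the change and none afterwards, even though a user would still see two products they can buy.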

How exactly does this work?

TestDriver's AI is a fine-tuned model developed over the course of more than a year with the help of computer vision experts, custom research tooling, and a few million dollars in funding (thanks VCs).

TestDriver uses a combination of reinforcement learning and computer vision. The context from successful test executions informs future executions. Here's an example of the context our model considers when locating a text match:

Context | What is it? | Touchpoint
Prompt | Desired outcome | User Input
Screenshot | Image of computer desktop | Runtime
OCR | All possible text found on screen | Runtime
Text Similarity | Closely matching text | Runtime
Redraw | Visual difference between previous and current desktop screenshots | Runtime
Network | The current network activity compared to a baseline | Runtime
Execution History | Previous test steps | Runtime
System Information | Platform, Display Size, etc. | Runtime
Mouse Position | X, Y coordinates of mouse | Runtime
Description | An elaborate description of the target element including its position and function | Past Execution
Text | The exact text value clicked | Past Execution
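As a rough illustration of how the OCR, Text Similarity, and Text entries above could fit together, the sketch below ranks on-screen text by how closely it matches the text clicked in a past execution. This is an assumption for illustration only: the source does not say which similarity measure TestDriver uses, so Levenshtein edit distance is swapped in here, and the ocrResults data and all function names are hypothetical.

```javascript
// Levenshtein edit distance between two strings (row-by-row dynamic programming).
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Case-insensitive similarity in [0, 1]; 1 means the strings are identical.
function similarity(a, b) {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - editDistance(x, y) / longest;
}

// Hypothetical OCR output: text found on screen with its position.
const ocrResults = [
  { text: "Add to Cart", x: 120, y: 340 },
  { text: "Add to Wishlist", x: 120, y: 380 },
  { text: "Checkout", x: 600, y: 40 },
];

// Return the OCR candidate that most closely matches the target text.
function closestText(target, candidates) {
  return candidates
    .map((c) => ({ ...c, score: similarity(target, c.text) }))
    .sort((p, q) => q.score - p.score)[0];
}

// "Add to Crt" simulates a slightly misread label from a past execution.
const match = closestText("Add to Crt", ocrResults);
console.log(match.text, match.x, match.y, match.score.toFixed(2));
```

A real system would presumably weigh this score alongside the other context entries, such as the screenshot and the element description, rather than relying on text alone.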
