TestDriver uses AI vision together with keyboard and mouse control to automate end-to-end testing. TestDriver is selectorless, meaning it isn’t aware of the underlying code structure.

Easier Setup

No need to craft complex selectors.

More Power

TestDriver can test anything a user can do.

Less Maintenance

Tests don’t break when code changes.

TestDriver is different from other computer-use agents in that it produces a YAML test script that increases the speed and repeatability of testing.

Selectorless testing

Unlike traditional frameworks (for example, Selenium, Playwright), TestDriver doesn’t rely on CSS selectors or static analysis. Instead, tests are described in plain English, such as:

> Open Google Chrome and search for "testdriver"

This means that you can write tests without worrying about the underlying code structure:

  • Test any user flow on any website in any browser
  • Clone, build, and test any desktop app
  • Render multiple browser windows and popups, such as third-party auth
  • Test <canvas>, <iframe>, and <video> tags with ease
  • Use file selectors to upload files to the browser
  • Resize the browser
  • Test Chrome extensions
  • Test integrations between applications

The problem with the current approach to end-to-end testing

End-to-end testing is commonly described as the most expensive and time-consuming test method. Today, end-to-end tests are written with complex selectors that are tightly coupled to the code.

const e = await page.$('div[class="product-card"] >> text="Add to Cart" >> nth=2');

This tight coupling means developers need to spend time to understand the codebase and maintain the tests every time the code changes. And code is always changing!
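
To make the coupling concrete, here is a minimal sketch in plain JavaScript. It does not use Playwright or a real DOM: `renderPage` and `findByClassAndText` are hypothetical stand-ins for a rendered page and a selector engine, introduced only for illustration.

```javascript
// Hypothetical stand-in for a rendered page: each entry is one element.
const renderPage = (cardClass) => [
  { className: cardClass, text: "Add to Cart" },
  { className: cardClass, text: "Add to Cart" },
  { className: cardClass, text: "Add to Cart" },
];

// Hypothetical selector lookup: filter on class name and text,
// then take the nth match (0-based, as with nth=2 above).
function findByClassAndText(elements, className, text, nth) {
  const matches = elements.filter(
    (el) => el.className === className && el.text === text
  );
  return matches[nth] ?? null;
}

const pageBefore = renderPage("product-card");
const pageAfter = renderPage("product"); // a refactor renames the class

const foundBefore = findByClassAndText(pageBefore, "product-card", "Add to Cart", 2);
const foundAfter = findByClassAndText(pageAfter, "product-card", "Add to Cart", 2);

console.log(foundBefore !== null); // the selector matches the original markup
console.log(foundAfter !== null); // the same selector finds nothing after the rename
```

Nothing about the page changed from the user's point of view, yet the lookup fails because the test encodes the class name.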

End-to-end is about users, not code

In end-to-end testing, the business priority is usability. All that really matters is that the user can accomplish the goal.

TestDriver uses human language to define test requirements. Then our simulated software tester figures out how to accomplish those goals.

| Old and Busted (Selectors) | New Hotness (TestDriver) |
| --- | --- |
| div[class="product-card"] >> text="Add to Cart" >> nth=2 | buy the 2nd product |

These high-level instructions are easier to create and maintain because they’re loosely coupled to the codebase. We’re describing a high-level goal, not a low-level interaction.

The tests will continue to work even when the junior developer changes .product-card to .product.card or the designers change Add to Cart to Buy Now. The concepts remain the same, so the AI will adapt.

How exactly does this work?

TestDriver uses a combination of reinforcement learning and computer vision. The context from successful test executions informs future executions. Here’s an example of the context our model considers when locating a text match:

| Context | What is it? | Touchpoint |
| --- | --- | --- |
| Prompt | Desired outcome | User Input |
| Screenshot | Image of computer desktop | Runtime |
| OCR | All possible text found on screen | Runtime |
| Text Similarity | Closely matching text | Runtime |
| Redraw | Visual difference between previous and current desktop screenshots | Runtime |
| Network | The current network activity compared to a baseline | Runtime |
| Execution History | Previous test steps | Runtime |
| System Information | Platform, display size, etc. | Runtime |
| Mouse Position | X, Y coordinates of mouse | Runtime |
| Description | An elaborate description of the target element, including its position and function | Past Execution |
| Text | The exact text value clicked | Past Execution |
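
As a rough illustration of how the OCR, Text Similarity, and Text rows could fit together, the sketch below picks the on-screen text closest to the text clicked in a past execution. It is an assumption for illustration, not TestDriver's implementation: the source does not say which similarity measure is used, so Levenshtein edit distance is swapped in here, and the `context` object, the `closestText` helper, and the 0.6 threshold are all hypothetical.

```javascript
// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for prefixes (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for prefixes (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity in [0, 1]; 1 means identical after case-folding.
function similarity(a, b) {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - editDistance(x, y) / longest;
}

// Return the best-scoring OCR candidate, or null if none reaches the threshold.
function closestText(target, ocrTexts, threshold = 0.6) {
  let best = null;
  for (const text of ocrTexts) {
    const score = similarity(target, text);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { text, score };
    }
  }
  return best;
}

// Hypothetical context record mirroring a few rows of the table above.
const context = {
  prompt: "buy the 2nd product",
  ocr: ["Products", "Add to cart", "Checkout"],
  pastExecution: { text: "Add to Cart" },
};

const match = closestText(context.pastExecution.text, context.ocr);
console.log(match);
```

A purely textual measure like this only covers small drifts such as casing or punctuation. A label change from Add to Cart to Buy Now would need the other context rows, such as the screenshot and the element description.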