Prompting

Any input that does not start with a / command is treated as a natural-language prompt. TestDriver interprets the prompt and attempts to carry out the task it describes.

Example Usage

> click sign up 
    
    thinking...

    To accomplish the goal of clicking "Sign Up," we need to
    focus on the Google Chrome application and then click on
    the "Sign Up" button.

    Here are the steps:

    1. Focus the Google Chrome application.
    2. Click on the "Sign Up" button.

    commands:
      - command: focus-application
        name: Google Chrome
      - command: hover-text
        text: Sign Up
        description: button in the header
        action: click

command='focus-application' name='Google Chrome'
command='hover-text' text='Sign Up' description='button in the header' action='click'

Prompting Tips

The agent is selecting the wrong thing!

This is the most common issue encountered with our agent. Here are some possible reasons.

You're asking the AI to interact with elements it cannot see

TestDriver uses the context from your prompt and the current contents of the screen to decide which commands to run. You should only prompt the AI to interact with elements it can currently see.

Incorrect Prompt

A common example is a dropdown. Users often prompt the agent to open a dropdown and choose a state in a single prompt, but the list of states is not on screen until the dropdown is open.

Recommended prompts

Instead, split the interaction into two separate prompts: one to open the dropdown and one to choose the option. This allows the UI to render and gives the AI the opportunity to parse the new screen data.
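As a rough sketch of why splitting helps, here is what the two prompts "click on options" and "select edit" might each generate, written in the same command format as the Example Usage output. The text and description values are hypothetical; the agent chooses its own values based on what is on screen at the time.

```yaml
# Prompt 1: > click on options
# Only the "Options" control is visible at this point.
commands:
  - command: hover-text
    text: Options
    description: options menu button   # hypothetical description
    action: click
---
# Prompt 2: > select edit
# Issued after the menu has rendered, so "Edit" is now on screen.
commands:
  - command: hover-text
    text: Edit
    description: item in the open options menu   # hypothetical description
    action: click
```

Each prompt is resolved against a fresh view of the screen, so the second step only targets text that actually exists when it runs.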

You're asking the AI to click on elements it does not understand

The TestDriver agent relies on visual understanding, not functional understanding. Like a first-time user, the AI does not know what a button does. It can only guess from what the button looks like.

Incorrect Prompt

Correct Prompts

Describing Images Properly

If you're unsure how to describe an icon, ask ChatGPT (GPT-4o) what it would call it, and use that name in your prompt.
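To illustrate how a visual name carries through, a prompt such as click on the "plus icon" would plausibly produce a step like the one below. The hover-image command is listed in the Test Steps reference, but the description and action fields shown here are assumed by analogy with the hover-text output in Example Usage. This page does not confirm them.

```yaml
# Sketch only: field names are assumptions, not confirmed by this page.
commands:
  - command: hover-image
    description: plus icon   # visual name of the icon, not its function
    action: click
```

The description says what the icon looks like ("plus icon") rather than what it does ("new task icon"), which matches what the AI can actually see.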

The AI cannot find small images

Isolated images smaller than 15x15px look like "noise" to the AI and may not be clickable. However, you can use the match-image command to select them using screenshots you capture manually.

No matter what I do, TestDriver will not select my element.

The AI has trouble selecting some specific elements, such as empty gray boxes, some substrings, or areas where a lot of similar text sits close together.

If that's the case, you can always fall back to match-image. In our experience, a suite of 10 tests typically requires only a single screenshot.
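A fallback step might look roughly like the following. This is a sketch under assumptions: the path field and the file name are hypothetical and are not documented on this page, so check the match-image reference for the actual parameters.

```yaml
# Sketch only: `path` and `action` are assumed field names.
commands:
  - command: match-image
    path: small-icon.png   # hypothetical screenshot of the target element
    action: click          # assumed by analogy with hover-text
```

Because the match is made against a screenshot you captured yourself, it does not depend on the AI recognizing or naming the element.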

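Example prompts for the tips above

Dropdown (elements the AI cannot see yet)

Incorrect Prompt

> Click on 'options' and select 'edit'

Recommended prompts

> click on options

> select edit

Icon (elements the AI does not understand)

Incorrect Prompt

> click on the "new task icon"

Correct Prompt

> click on the "plus icon"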