testdriver/match-image.yaml
version: 6.0.0
steps:
  - prompt: login
    commands:
      - command: run
        file: snippets/login.yaml
  - prompt: click the shopping cart icon
    commands:
      - command: match-image
        path: screenshots/cart.png
        action: click
  - prompt: assert that you see an empty shopping cart
    commands:
      - command: assert
        expect: Your cart is empty

Description

The match-image command locates an image on the screen by matching it against a reference image file, then performs an action (for example, click or hover) at the center of the match. This command is particularly useful for interacting with elements that the AI has difficulty locating using descriptions or other methods.

Arguments

Argument | Type   | Description
path     | string | The path to the image that needs to be matched. The path needs to be relative to the current test file.
action   | string | The action to take when the image is found. Available actions are: click, right-click, double-click, hover. Also supports drag-start and drag-end for dragging images.
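
The drag actions are listed above without an example, so here is a minimal sketch of how they might be paired. It assumes that drag-start begins the drag at the center of the first matched image and drag-end releases it at the center of the second; this page does not spell out those semantics. The prompt text and both image file names are hypothetical.

```yaml
# Sketch only: file-icon.png and upload-area.png are hypothetical images,
# assumed to sit in a screenshots/ directory next to the test file.
- prompt: drag the file icon onto the upload area
  commands:
    # Assumed behavior: start the drag at the center of the matched image
    - command: match-image
      path: screenshots/file-icon.png
      action: drag-start
    # Assumed behavior: release the drag at the center of the matched image
    - command: match-image
      path: screenshots/upload-area.png
      action: drag-end
```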

Example usage

command: match-image
path: screenshots/button.png
action: click

How it works

  • The match-image command takes a screenshot of the desktop and searches for the location of the reference image within the screenshot.
  • The matching logic looks for the most similar image within the screenshot, not an exact match. If the similarity is below ~80%, it will search additional scales. If no match is found, the command will fail.
  • Screenshots should be stored in the testdriver/screenshots/ directory.

Protips

  • To create high-quality screenshots for matching:
    • Download the video of the test and open it at full or actual size on your computer.
    • Use a screenshot tool (like Cleanshot X) to capture the target element.
    • Center the clickable element as much as possible within the screenshot.
  • Ensure the image file is clear and free of unnecessary visual noise to improve matching accuracy.

Gotchas

  • The reference image should be the actual size of the on-screen image that's being matched against.
  • If the image match is below ~80% similarity, the program tries to match at different scales, so an actual-size image is recommended.
  • Variations in screen resolution, scaling settings, or platform-specific UI differences may affect matching accuracy.
  • Ensure the image file is stored in the correct directory structure (testdriver/screenshots/) for dynamic resolution.
  • The path needs to be relative to the current test file's directory. For example, if the test file is at testdriver/onboarding/login.yaml and the image is at the conventional testdriver/screenshots/nepal-flag.png, it should be referenced as:
command: match-image
path: ../screenshots/nepal-flag.png
action: click
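
To make the relative-path rule concrete, the sketch below puts the same image reference side by side for two test file locations. The first location mirrors the testdriver/match-image.yaml sample at the top of this page and the second is the onboarding case just described. The two YAML documents are fragments from two separate test files, shown together only for comparison.

```yaml
# Layout assumed for illustration:
#   testdriver/match-image.yaml
#   testdriver/onboarding/login.yaml
#   testdriver/screenshots/nepal-flag.png

# Fragment of testdriver/match-image.yaml:
# screenshots/ is a sibling of the test file, so no "../" is needed.
- command: match-image
  path: screenshots/nepal-flag.png
  action: click
---
# Fragment of testdriver/onboarding/login.yaml:
# the test file is one level deeper, so step up one directory first.
- command: match-image
  path: ../screenshots/nepal-flag.png
  action: click
```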

Notes

  • The match-image command is ideal for interacting with visual elements that can't be reliably described or located using other commands like hover-image.
  • This command supports flexible scaling to account for minor differences in image size or resolution.
  • Use this command as a fallback when other methods fail to locate the desired element.