ywinappdriver

Date: Mar 1, 2021

Keywords: dotnet core, wdio, testing, winappdriver

GitHub: https://github.com/licanhua/ywinappdriver

Overview

Yet Another WinAppDriver (YWinAppDriver)

Microsoft WinAppDriver is the official application to support Selenium-like UI Test Automation on Windows Applications. This service supports testing Universal Windows Platform (UWP), Windows Forms (WinForms), Windows Presentation Foundation (WPF), and Classic Windows (Win32) apps on Windows 10 PCs.

This repo is an open source ASP.NET Core implementation of WinAppDriver that is compatible with Microsoft's WinAppDriver. Most of the ideas come from the WinUI test infrastructure, Microsoft.Windows.Apps.Test, and the WinAppDriver documentation. I combined them to produce this open source implementation.

I named this project YWinAppDriver (Yet Another WinAppDriver).

Project Status: 0.2.x

Most of the functionality is ready, and you should be able to switch from WinAppDriver to YWinAppDriver with little or no change. SessionController.cs defines all the endpoints it supports.

For the XPath syntax, refer to https://docs.microsoft.com/en-us/previous-versions/dotnet/netframework-4.0/ms256086(v=vs.100)

Download & Run YWinAppDriver

  • Download and compile

There are two ways to get WinAppDriver.exe:

  1. Clone this repo, open WinAppDriver.sln, and build the WinAppDriver project.
  2. Download it from https://github.com/licanhua/YWinAppDriver/releases

By default, YWinAppDriver listens on http://127.0.0.1:4723. You can change the port number and base path in any of these ways:

  1. CLI

Run WinAppDriver.exe --urls http://127.0.0.1:4723 --basepath /wd/hub

A complete command line:

WinAppDriver.exe --urls http://127.0.0.1:4723 --basepath /wd/hub --logpath logs

  2. From Visual Studio

Two settings are ready for you. IIS Express /wd/hub is http://127.0.0.1:4723/wd/hub

  3. Using appsettings.json

"Urls": "http://127.0.0.1:4723",
"Basepath": "/wd/hub"
  • Build and run the CalculatorTest in examples. Before you run the test, make sure Calculator is in Standard mode.
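
The Urls and Basepath settings above combine into the prefix that every client request uses. Here is a minimal Python sketch of that composition, assuming the default values shown earlier. The helper names `endpoint` and `server_status` are illustrative and not part of YWinAppDriver, and `server_status` only works when the driver is already running.

```python
import json
import urllib.request


def endpoint(urls: str, basepath: str, path: str) -> str:
    """Join the server URL, an optional base path, and a route into one URL."""
    parts = [urls.rstrip("/")]
    if basepath.strip("/"):
        parts.append(basepath.strip("/"))
    parts.append(path.lstrip("/"))
    return "/".join(parts)


def server_status(urls: str = "http://127.0.0.1:4723", basepath: str = "") -> dict:
    """GET /status from a running server (assumes the driver has been started)."""
    with urllib.request.urlopen(endpoint(urls, basepath, "status"), timeout=5) as resp:
        return json.load(resp)


# Only the URL is built here; no request is sent.
print(endpoint("http://127.0.0.1:4723", "/wd/hub", "/status"))
```

If the server is started with --basepath /wd/hub, clients need the same prefix in their remote URL; with an empty base path the routes hang directly off the host and port.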

nodejs YWinAppDriver/WinAppDriver examples:

If you are authoring the test case with Jest, Jasmine or any other JavaScript framework, you can switch between YWinAppDriver and WinAppDriver very easily.

wdio + YWinAppDriver/WinAppDriver

selenium + YWinAppDriver/WinAppDriver

Knowledge for contributors

  1. Asp.Net Core

This project is developed with ASP.NET Core and references version 3.1.

  2. Microsoft.Windows.Apps.Test

On Windows, UI Automation is the technology a test driver uses to manipulate the UI.

Microsoft.Windows.Apps.Test is the core UI Automation library that was spun off from WinAppDriver; it lets users interact with the test app outside of WinAppDriver. WinUI is the first public user to implement its own automation without WinAppDriver.

Microsoft.Windows.Apps.Test has a public NuGet package and is released in binary form. YWinAppDriver is also based on Microsoft.Windows.Apps.Test and uses this library to interact with the test app.

Microsoft.Windows.Apps.Test documentation can be found here: Microsoft.Windows.Apps.Test.chm.

  3. Protocols

There are two protocols: W3C WebDriver and the Selenium JsonWire Protocol. JsonWire is obsolete, but many clients and servers still don't support the W3C specification, so YWinAppDriver tries to match both protocols.

  4. Locators and capabilities are the same, or nearly the same, as in WinAppDriver.
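
To make the difference between the two protocols concrete, the sketch below shows the two places where they usually diverge for a client: the shape of the new-session body and the key that carries an element id in a response. This is a hedged illustration of the wire formats as defined by the JsonWire and W3C documents, not a description of how YWinAppDriver parses them; the function names are hypothetical, and a strict W3C server may additionally expect vendor-prefixed capability names.

```python
import json

# W3C WebDriver's fixed key for element references; JsonWire uses "ELEMENT".
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def new_session_body(caps: dict) -> str:
    """Build a POST /session body that carries both capability formats."""
    return json.dumps({
        "desiredCapabilities": caps,            # JsonWire form
        "capabilities": {"alwaysMatch": caps},  # W3C form
    })


def element_id(value: dict) -> str:
    """Read an element id from either protocol's find-element response value."""
    if W3C_ELEMENT_KEY in value:
        return value[W3C_ELEMENT_KEY]
    return value["ELEMENT"]


body = json.loads(new_session_body({"app": "Root"}))
print(sorted(body))
print(element_id({"ELEMENT": "42.333896.3.1"}))
```

Sending both forms in one body is a common client-side compatibility trick, since each side simply ignores the key it does not understand.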

API Supported

The table below maps to the Selenium JSON Wire Protocol. Because YWinAppDriver targets desktop applications rather than browsers, some endpoints do not apply: "no" means the endpoint is not supported, and "maybe" means it could be supported but I have not seen any value in doing so.

Dev Status | HTTP Method | Path | Summary
completed | GET | /status | Query the server's current status.
completed | POST | /session | Create a new session.
completed | GET | /sessions | Returns a list of the currently active sessions.
completed | GET | /session/:sessionId | Retrieve the capabilities of the specified session.
completed | DELETE | /session/:sessionId | Delete the session.
completed | POST | /session/:sessionId/timeouts | Configure the amount of time that a particular type of operation can execute for before they are aborted and a Timeout error is returned to the client.
no | POST | /session/:sessionId/timeouts/async_script | Set the amount of time, in milliseconds, that asynchronous scripts executed by /session/:sessionId/execute_async are permitted to run before they are aborted and a Timeout error is returned to the client.
completed | POST | /session/:sessionId/timeouts/implicit_wait | Set the amount of time the driver should wait when searching for elements.
completed | GET | /session/:sessionId/window_handle | Retrieve the current window handle.
completed | GET | /session/:sessionId/window_handles | Retrieve the list of all window handles available to the session.
no | GET | /session/:sessionId/url | Retrieve the URL of the current page.
no | POST | /session/:sessionId/url | Navigate to a new URL.
maybe | POST | /session/:sessionId/forward | Navigate forwards in the browser history, if possible.
maybe | POST | /session/:sessionId/back | Navigate backwards in the browser history, if possible.
maybe | POST | /session/:sessionId/refresh | Refresh the current page.
maybe | POST | /session/:sessionId/execute | Inject a snippet of JavaScript into the page for execution in the context of the currently selected frame.
maybe | POST | /session/:sessionId/execute_async | Inject a snippet of JavaScript into the page for execution in the context of the currently selected frame.
completed | GET | /session/:sessionId/screenshot | Take a screenshot of the current page.
completed | GET | /session/:sessionId/element/:elementId/screenshot | Take a screenshot of the element.
no | GET | /session/:sessionId/ime/available_engines | List all available engines on the machine.
no | GET | /session/:sessionId/ime/active_engine | Get the name of the active IME engine.
no | GET | /session/:sessionId/ime/activated | Indicates whether IME input is active at the moment (not if it's available).
no | POST | /session/:sessionId/ime/deactivate | De-activates the currently-active IME engine.
no | POST | /session/:sessionId/ime/activate | Make an engine that is available (appears on the list returned by getAvailableEngines) active.
no | POST | /session/:sessionId/frame | Change focus to another frame on the page.
no | POST | /session/:sessionId/frame/parent | Change focus to the parent context.
completed | POST | /session/:sessionId/window | Change focus to another window.
completed | DELETE | /session/:sessionId/window | Close the current window.
completed | POST | /session/:sessionId/window/:windowHandle/size | Change the size of the specified window.
completed | GET | /session/:sessionId/window/:windowHandle/size | Get the size of the specified window.
completed | POST | /session/:sessionId/window/:windowHandle/position | Change the position of the specified window.
completed | GET | /session/:sessionId/window/:windowHandle/position | Get the position of the specified window.
completed | POST | /session/:sessionId/window/:windowHandle/maximize | Maximize the specified window if not already maximized.
no | GET | /session/:sessionId/cookie | Retrieve all cookies visible to the current page.
no | POST | /session/:sessionId/cookie | Set a cookie.
no | DELETE | /session/:sessionId/cookie | Delete all cookies visible to the current page.
no | DELETE | /session/:sessionId/cookie/:name | Delete the cookie with the given name.
completed | GET | /session/:sessionId/source | Get the current page source.
completed | GET | /session/:sessionId/title | Get the current page title.
completed | POST | /session/:sessionId/element | Search for an element on the page, starting from the document root.
completed | POST | /session/:sessionId/elements | Search for multiple elements on the page, starting from the document root.
completed | POST | /session/:sessionId/element/active | Get the element on the page that currently has focus.
completed | GET | /session/:sessionId/element/:id | Describe the identified element.
completed | POST | /session/:sessionId/element/:id/element | Search for an element on the page, starting from the identified element.
completed | POST | /session/:sessionId/element/:id/elements | Search for multiple elements on the page, starting from the identified element.
completed | POST | /session/:sessionId/element/:id/click | Click on an element.
no | POST | /session/:sessionId/element/:id/submit | Submit a FORM element.
completed | GET | /session/:sessionId/element/:id/text | Returns the visible text for the element.
completed | POST | /session/:sessionId/element/:id/value | Send a sequence of key strokes to an element.
completed | POST | /session/:sessionId/keys | Send a sequence of key strokes to the active element.
completed | GET | /session/:sessionId/element/:id/name | Query for an element's tag name.
completed | POST | /session/:sessionId/element/:id/clear | Clear a TEXTAREA or text INPUT element's value.
completed | GET | /session/:sessionId/element/:id/selected | Determine if an OPTION element, or an INPUT element of type checkbox or radiobutton is currently selected.
completed | GET | /session/:sessionId/element/:id/enabled | Determine if an element is currently enabled.
completed | GET | /session/:sessionId/element/:id/attribute/:name | Get the value of an element's attribute.
completed | GET | /session/:sessionId/element/:id/equals/:other | Test if two element IDs refer to the same DOM element.
completed | GET | /session/:sessionId/element/:id/displayed | Determine if an element is currently displayed.
completed | GET | /session/:sessionId/element/:id/location | Determine an element's location on the page.
maybe | GET | /session/:sessionId/element/:id/location_in_view | Determine an element's location on the screen once it has been scrolled into view.
completed | GET | /session/:sessionId/element/:id/size | Determine an element's size in pixels.
no | GET | /session/:sessionId/element/:id/css/:propertyName | Query the value of an element's computed CSS property.
no | GET | /session/:sessionId/orientation | Get the current browser orientation.
no | POST | /session/:sessionId/orientation | Set the browser orientation.
no | GET | /session/:sessionId/alert_text | Gets the text of the currently displayed JavaScript alert(), confirm(), or prompt() dialog.
no | POST | /session/:sessionId/alert_text | Sends keystrokes to a JavaScript prompt() dialog.
no | POST | /session/:sessionId/accept_alert | Accepts the currently displayed alert dialog.
no | POST | /session/:sessionId/dismiss_alert | Dismisses the currently displayed alert dialog.
completed | POST | /session/:sessionId/moveto | Move the mouse by an offset of the specified element.
completed | POST | /session/:sessionId/click | Click any mouse button (at the coordinates set by the last moveto command).
completed | POST | /session/:sessionId/buttondown | Click and hold the left mouse button (at the coordinates set by the last moveto command).
completed | POST | /session/:sessionId/buttonup | Releases the mouse button previously held (where the mouse is currently at).
completed | POST | /session/:sessionId/doubleclick | Double-clicks at the current mouse coordinates (set by moveto).
completed | POST | /session/:sessionId/touch/click | Single tap on the touch enabled device.
completed | POST | /session/:sessionId/touch/down | Finger down on the screen.
completed | POST | /session/:sessionId/touch/up | Finger up on the screen.
completed | POST | /session/:sessionId/touch/move | Finger move on the screen.
in progress | POST | /session/:sessionId/touch/scroll | Scroll on the touch screen using finger based motion events.
in progress | POST | /session/:sessionId/touch/scroll | Scroll on the touch screen using finger based motion events.
completed | POST | /session/:sessionId/touch/doubleclick | Double tap on the touch screen using finger motion events.
completed | POST | /session/:sessionId/touch/longclick | Long press on the touch screen using finger motion events.
in progress | POST | /session/:sessionId/touch/flick | Flick on the touch screen using finger motion events.
in progress | POST | /session/:sessionId/touch/flick | Flick on the touch screen using finger motion events.
no | GET | /session/:sessionId/location | Get the current geo location.
no | POST | /session/:sessionId/location | Set the current geo location.
no | GET | /session/:sessionId/local_storage | Get all keys of the storage.
no | POST | /session/:sessionId/local_storage | Set the storage item for the given key.
no | DELETE | /session/:sessionId/local_storage | Clear the storage.
no | GET | /session/:sessionId/local_storage/key/:key | Get the storage item for the given key.
no | DELETE | /session/:sessionId/local_storage/key/:key | Remove the storage item for the given key.
no | GET | /session/:sessionId/local_storage/size | Get the number of items in the storage.
no | GET | /session/:sessionId/session_storage | Get all keys of the storage.
no | POST | /session/:sessionId/session_storage | Set the storage item for the given key.
no | DELETE | /session/:sessionId/session_storage | Clear the storage.
no | GET | /session/:sessionId/session_storage/key/:key | Get the storage item for the given key.
no | DELETE | /session/:sessionId/session_storage/key/:key | Remove the storage item for the given key.
no | GET | /session/:sessionId/session_storage/size | Get the number of items in the storage.
no | POST | /session/:sessionId/log | Get the log for a given log type.
no | GET | /session/:sessionId/log/types | Get available log types.
no | GET | /session/:sessionId/application_cache/status | Get the status of the html5 application cache.
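
The paths in the table use :name placeholders. As a rough illustration of how a client might turn one of these templates into a concrete route and check its status first, here is a small Python sketch. The `ROUTES` list holds only a handful of rows copied from the table, and the helper names and the session id "abc" are hypothetical.

```python
import re

# A few rows copied from the table above: (dev status, HTTP method, path template).
ROUTES = [
    ("completed", "GET", "/status"),
    ("completed", "POST", "/session/:sessionId/element"),
    ("completed", "POST", "/session/:sessionId/element/:id/click"),
    ("no", "GET", "/session/:sessionId/url"),
]


def expand(template: str, **params: str) -> str:
    """Replace each :name placeholder with the matching keyword argument."""
    return re.sub(r":(\w+)", lambda m: params[m.group(1)], template)


def supported(method: str, template: str) -> bool:
    """True when the listed rows mark the route as completed."""
    return ("completed", method, template) in ROUTES


print(expand("/session/:sessionId/element/:id/click", sessionId="abc", id="42"))
print(supported("GET", "/session/:sessionId/url"))
```

A missing placeholder value raises KeyError, which is usually preferable to silently sending a request with a literal ":id" in the path.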

Background

One week ago, another team reached out to me for advice on choosing a UI automation tool. It made me think: although WinAppDriver is the de facto tool recommended by Microsoft, is it the right tool for everybody? I had not seen another open source option, so I spent one weekend creating the 0.1 release.

Two years ago, hassanuz was an active contributor to WinAppDriver and had started promoting its use within Microsoft. I was working on WinUI, which used Microsoft.Windows.Apps.Test, so we sat down together to see whether I could adopt WinAppDriver. After the conversation, I found we had hit the gray box versus black box dilemma: WinUI does gray box testing, while WinAppDriver does black box testing.

WinUI provides a lot of amazing features in its test infrastructure that WinAppDriver doesn't support:

  • Wait.ForIdle is the killer feature for stable tests. When WinAppDriver sees an element, that does not mean the UI is ready for interaction. Wait.ForIdle tells you the UI is ready to take user input, so you avoid the flaky case where a test clicks a button in the test pipeline, nothing responds, and the problem never reproduces locally.
  • Dumping the visual tree when a test fails.
  • Restarting the application when launch fails, and killing the application when a test throws an exception.
  • Pan, drag, scroll, and gamepad support.
  • Speed. WinUI has thousands of test cases, and I want them to finish as soon as possible.

To adopt WinAppDriver, we needed to resolve these problems first, so we introduced a plugin mode into WinAppDriver. Users could build their own business logic into a plugin, and WinAppDriver would load it at startup. For WinUI, I could move the infrastructure part into the plugin. Native applications do not use ExecuteScript, so we could reuse it without any impact on existing Selenium clients. The message flow looked like this:

Selenium Client ExecuteScript -> WinAppDriver -> WinUI Plugin -> UIA -> TestApp

We finished the prototype. Because of legal concerns, and because leadership saw no business value in it, it never reached end users.

I think YWinAppDriver can address the problems above and possibly make everybody happy.

Supported Locators to Find UI Elements

Windows Application Driver supports various locators to find UI elements in the application session. The table below shows all supported locator strategies with their corresponding UI element attributes as shown in inspect.exe.

Client API | Locator Strategy | Matched Attribute in inspect.exe | Example
FindElementByAccessibilityId | accessibility id | AutomationId | AppNameTitle
FindElementByClassName | class name | ClassName | TextBlock
FindElementById | id | RuntimeId (decimal) | 42.333896.3.1
FindElementByName | name | Name | Calculator
FindElementByTagName | tag name | LocalizedControlType (upper camel case) | Text
FindElementByXPath | xpath | Any | //Button[0]
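
On the wire, each client API in the table becomes a find-element request whose body names the locator strategy and the value to match. The Python sketch below shows that mapping using the standard WebDriver `using`/`value` body; the function name `find_element_body` is illustrative only.

```python
import json

# Client API name -> locator strategy string, as listed in the table above.
STRATEGIES = {
    "FindElementByAccessibilityId": "accessibility id",
    "FindElementByClassName": "class name",
    "FindElementById": "id",
    "FindElementByName": "name",
    "FindElementByTagName": "tag name",
    "FindElementByXPath": "xpath",
}


def find_element_body(client_api: str, value: str) -> str:
    """Build the JSON body for POST /session/:sessionId/element."""
    return json.dumps({"using": STRATEGIES[client_api], "value": value})


print(find_element_body("FindElementByName", "Calculator"))
```

The same body shape applies to the /elements variants and to searches that start from an existing element.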

Supported Capabilities

Below are the capabilities that can be used to create a Windows Application Driver session.

Capabilities | Descriptions | Example
app | Application identifier or executable full path | Microsoft.MicrosoftEdge_8wekyb3d8bbwe!MicrosoftEdge
appArguments | Application launch arguments | https://github.com/Microsoft/WinAppDriver
attachToTopLevelWindowClassName | Existing application top-level window to attach to; app should be "Root". If you are using WinAppDriver, use appTopLevelWindow instead | 0xB822E2
appWorkingDir | Application working directory (Classic apps only) | C:\Temp
forceMatchAppTitle | If the app has launched but cannot be matched, YWinAppDriver makes a last attempt to match it by application title | Calculator
forceMatchClassName | If the app has launched but cannot be matched, YWinAppDriver makes a last attempt to match it by class name | Chrome_WidgetWin_1
clickWithInvoke | Use Invoke rather than click for better performance | true/false
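
As a rough sketch of how these capabilities could be assembled on the client side, the Python snippet below collects them into a dictionary and rejects names the table does not list. The helper `build_capabilities` and its strictness are assumptions made for illustration; real clients often send additional standard capabilities that this check would refuse.

```python
import json

# Capability names from the table above.
KNOWN_CAPABILITIES = {
    "app", "appArguments", "attachToTopLevelWindowClassName", "appWorkingDir",
    "forceMatchAppTitle", "forceMatchClassName", "clickWithInvoke",
}


def build_capabilities(**caps) -> dict:
    """Return the capabilities, rejecting names the table does not list."""
    unknown = sorted(set(caps) - KNOWN_CAPABILITIES)
    if unknown:
        raise ValueError(f"unknown capabilities: {unknown}")
    return caps


caps = build_capabilities(
    app=r"c:\windows\system32\calc.exe",
    forceMatchAppTitle="Calculator",  # last-resort match by window title
    clickWithInvoke=True,             # use Invoke instead of a mouse click
)
print(json.dumps({"desiredCapabilities": caps}, indent=2))
```

Catching a misspelled capability name before the session is created tends to be easier to debug than a session that starts but cannot find its window.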

YWinAppDriver fixes some WinAppDriver 1.2 issues

WinAppDriver has problems starting c:\windows\system32\calc.exe, Postman, or Notepad++ (issue 1372)

In YWinAppDriver, you can work around the problem with capabilities like the ones below.

"forceMatchAppTitle": "Calculator"
"forceMatchClassName":"Notepad++"

Appium Desktop can't show all controls WinAppDriver provides

Appium Desktop is a great tool for inspecting an app's elements, and it is very easy to learn and use. Currently, WinAppDriver can't show all elements. I don't know whether the issue is in WinAppDriver or Appium, but YWinAppDriver doesn't have this problem.

YWinAppDriver known issue

Click is slower than in WinAppDriver

You can set the capability {clickWithInvoke: true} to make clicks very fast. The downside is that flyouts such as menu windows will not close until you click somewhere else.