a11y: the Antidote to k6 browser False Positives
About Me
Web UI Test Engineer, KBR
NASA Ames Research Center
What is Open MCT?
Red Bull Formula One
VIPER
In this talk
- What is k6 Browser?
- What can go wrong when writing tests?
- What can we do about it?
- What is web accessibility?
- How does it solve our problems?
- k6 is an open source load testing tool with a huge ecosystem of extensions and protocol-specific testing capabilities
- k6 Browser drives real Chrome browsers with Playwright APIs
- JavaScript - the lingua franca of the FE
- Measurements feed into many (all?) storage backends
- Useful for UX Performance Measurements, Synthetics, generalized performance tests
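A minimal sketch of what the body of a k6 browser test can look like. The function name `createAndSearch`, the URL, and every selector are hypothetical placeholders, not taken from the demo repo. The flow is written as a plain function that receives `browser`, and the k6 wiring is only described in comments, assuming a recent k6 release that exposes the `k6/browser` module.

```javascript
// Hypothetical flow: open the app, create an object, then search for it.
// In a k6 script, `browser` would come from: import { browser } from 'k6/browser';
// and this function would be awaited from the exported default function.
async function createAndSearch(browser, baseUrl, objectName) {
  const page = await browser.newPage();
  try {
    await page.goto(baseUrl);
    // Selectors below are assumptions for illustration only.
    await page.locator('[aria-label="Create"]').click();
    await page.locator('[aria-label="Title"]').fill(objectName);
    await page.locator('[aria-label="Save"]').click();
    await page.locator('input[type="search"]').fill(objectName);
  } finally {
    // Always release the page, even if a step throws.
    await page.close();
  }
}
```

Because the function only depends on the `browser` object it is given, the same flow could in principle be reused from a Playwright functional test.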
Demo Time
- a11y-as-t9y repo
- Create and Search for an object
- DB and Search service
- What can go wrong
"Synthetic tests aren’t very resilient, and they can easily fail when small UI changes are introduced, generating unnecessary alert noise. This means that whenever a minor application element such as a button is changed, the corresponding test must be, too."
dynatrace.com/news/blog/what-is-synthetic-monitoring/
What is a button selector?
<div class="c-create-button--w l-shell__create-button">
<button class="c-create-button c-button--menu c-button--major icon-plus">
<span id="create-button-label" class="c-button__label">Create</span>
</button>
</div>
page.locator('//*[@id="openmct-app"]/div/div[2]/div[1]/button').click();
In a nutshell, locators represent a way to find element(s) on the page at any moment.
What can we do about it?
- Just update it
- Find another team solving this problem... Web UI Testers!
Page Object Model
header/
├── Create Button/
│ └── createList.js
├── Search Bar/
│ └── searchOptions.js
└── Indicators/
└── Clock.js
import { CreateButtonLocator } from './CreateButton/createList.js';
page.locator(CreateButtonLocator).click();
// CreateButton/createList.js
export const CreateButtonLocator = '//*[@id="openmct-app"]/div/div[2]/div[1]/button';
POM Continued
- Instead of 300 changes to the testcode, let’s just do 1!
- Shared POM repo
  xk6-browser-template-typescript-playwright
- Alternatives: Screenplay pattern, App Actions
- Doesn’t actually solve the underlying problem. We’ll still get an alarm.
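The "one change instead of 300" idea can be sketched as a small page object. The class name `HeaderPage` and the method `clickCreate` are hypothetical, and `page` is assumed to be a Playwright or k6 browser page object.

```javascript
// Hypothetical page object for the Open MCT header.
// Every selector lives here, so a markup change means one edit, not hundreds.
class HeaderPage {
  constructor(page) {
    this.page = page;
    // XPath copied from the earlier slide; still brittle, but now in one place.
    this.createButton = '//*[@id="openmct-app"]/div/div[2]/div[1]/button';
  }

  // Tests call intent-level methods instead of passing raw selectors around.
  async clickCreate() {
    await this.page.locator(this.createButton).click();
  }
}
```

This only centralizes the selector. If the layout changes and nobody updates `createButton`, the test still fails and the alarm still fires.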
Locator/Selector "Best Practices"
- Decouple locator logic and App Layout State
- CSS Classnames?
<div class="c-create-button--w l-shell__create-button">
<button class="c-create-button c-button--menu c-button--major icon-plus"
aria-disabled="false" aria-labelledby="create-button-label">
<span id="create-button-label" class="c-button__label">Create</span>
</button>
</div>
page.locator('.c-create-button .c-button__label').click()
- Classnames change too!
Locator/Selector "Best Practices"
- IDs! Unique... not always needed by FrontEnd JS
<div class="c-create-button--w l-shell__create-button">
<button class="c-create-button c-button--menu c-button--major icon-plus">
<span id="create-button-label" class="c-button__label">Create</span>
</button>
</div>
page.locator('#create-button-label').click()
- Let's just DIY
<div class="c-create-button--w l-shell__create-button">
<button class="c-create-button c-button--menu c-button--major icon-plus">
<span id="create-button-label" data-id="create-button" class="c-button__label">Create</span>
</button>
</div>
page.locator('[data-id=create-button]').click()
There must be another way
Assistive Technologies Demo
Web Accessibility (a11y)
Web accessibility ensures web applications are designed to be usable by everyone, including those with disabilities, by following guidelines that enhance perceivability, operability, understandability, and compatibility with assistive technologies.
<div class="c-create-button--w l-shell__create-button">
<button class="c-create-button c-button--menu c-button--major icon-plus"
aria-label="Create">
<span id="create-button-label" data-id="create-button" class="c-button__label">Create</span>
</button>
</div>
page.locator('[data-id=create-button]').click()
page.locator('[aria-label="Create"]').click()
aria demo
a11y only changes
- a11y is driven by user facing behavior
- When user-facing behavior changes
- angular1.7->vue2->vue3->nuxt
How to make something accessible
- Don't focus on standards.
  That's for contrast and extreme key navigation
- Can I just add aria-labels everywhere?
  Sorta. ARIA implies interactivity! Which is fine?
- Priority
- HTML5 Semantics like <button>
- Roles, i.e. menubar, cell
- Aria-label
- data-id
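The priority list above can be read as a ranking rule for choosing a locator. The sketch below is one way to encode it; the `{ strategy, selector }` candidate shape and the function name `pickLocator` are assumptions made for illustration.

```javascript
// Priority order from the slide: semantic HTML, then roles, then aria-label, then data-id.
const LOCATOR_PRIORITY = ['semantic', 'role', 'aria-label', 'data-id'];

// Given candidate selectors for one element, return the most preferred one.
// The candidate shape { strategy, selector } is an assumption for illustration.
function pickLocator(candidates) {
  const ranked = candidates
    .filter((c) => LOCATOR_PRIORITY.includes(c.strategy))
    .sort((a, b) => LOCATOR_PRIORITY.indexOf(a.strategy) - LOCATOR_PRIORITY.indexOf(b.strategy));
  if (ranked.length === 0) {
    throw new Error('No accessible or test-specific selector available');
  }
  return ranked[0].selector;
}

const createButton = pickLocator([
  { strategy: 'data-id', selector: '[data-id=create-button]' },
  { strategy: 'aria-label', selector: '[aria-label="Create"]' },
]);
console.log(createButton); // [aria-label="Create"]
```

A test-only `data-id` remains a valid fallback; it is simply chosen last, when nothing user-facing identifies the element.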
How to get started?
- Don't boil the ocean. Leave it better than you found it. A bird in the hand?
- Your functional test framework already supports it
- MDN
- ChatGPT
- patternfly
In Summary
- k6 browser can avoid the dynatrace trap
- a11y can replace most test-specific code
- a11y is the antidote to flake
CUTTING ROOM FLOOR
Automated Performance Modeling
I didn't go to Art School
[Diagram: Users, Backend, API, OS, Web App, Frontend; Open MCT, YAMCS, CouchDB, k8s, Docker]
Bearer of Bad News
BE Issues = FE Issues
[Diagram: Users, Backend, API, OS, Web App, Frontend]
Happy Accidents
Frontend Changes = Backend Problems
- Google found that when they increased the search results from 10 to 30, the load time increased by half a second and resulted in a 20% decrease in ad revenues.
- Intentional Product Change
- One Change. Everyone's problem.
PDM Changes = BE Problems
[Diagram: Users, Backend, API, OS, Web App, Frontend]
UX Performance
- Twitter "Time to First Tweet"
- User Timing API
- performance.mark() API
performance.mark('search-entered');
// ... some number of clicks or steps ...
performance.mark('search-returned');
performance.measure('total-search-time', 'search-entered', 'search-returned');
// Returned object:
PerformanceMeasure {
name: 'total-search-time',
entryType: 'measure',
startTime: 4727964.299999952,
duration: 12436.700000047684
}
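A self-contained sketch of the same mark/measure pattern, wrapped in a helper. The helper name `measureStep` and the busy loop are illustrative stand-ins. It assumes a runtime where the global `performance` object follows User Timing Level 3, so that `performance.measure()` returns the entry (current browsers and Node.js 16.7 or later).

```javascript
// Wrap a synchronous step between two marks and return the resulting measure.
// Relies on the global `performance` object being available.
function measureStep(name, step) {
  const start = `${name}-start`;
  const end = `${name}-end`;
  performance.mark(start);
  step();
  performance.mark(end);
  // measure(name, startMark, endMark) returns a PerformanceMeasure entry.
  return performance.measure(name, start, end);
}

const measure = measureStep('total-search-time', () => {
  // Stand-in for "some number of clicks or steps".
  for (let i = 0; i < 1e5; i += 1) Math.sqrt(i);
});
console.log(measure.name, measure.entryType, measure.duration >= 0);
```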
Open MCT Search Demo
- openmct-quickstart
- Universal Search
- "Guess the traffic"
- HAR File
github.com/scottbell/openmct-quickstart
Open MCT Performance
- How do we prevent Open MCT from just being the bearer of bad news?
- openmct-performance is an "example repo" on how to write a performance test with Open MCT
- Companion to openmct-quickstart
- 'docker compose up'***
Silos & Asterisks
- FE devs silo'd from the performance testing and infra teams
- Performance test tooling can't run in prod/QA/Staging
- Synthetic User Monitoring is fragile and owned by the wrong team
Performance Test Results
from browserstack.com/speedlab
- Baseline Function
Keep k6-browser tests in sync with development
- UI Locator Contract Tests
Playwright Locator Changes ->
openmct PRs -> openmct-performance
// Automated e2e test run on every PR
await page.locator('[aria-label="OpenMCT Search"] input[type="search"]').click();
// Automated k6 test run downstream
await page.locator('[aria-label="OpenMCT Search"] input[type="search"]').click();
- Community Shoutout!
- Playwright Functional + k6-browser shared repo!
- https://github.com/ticup/xk6-browser-template-typescript-playwright
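One way to picture a locator contract test: keep the shared selectors in one module and have the e2e suite report any that no longer match. This is a sketch only. `LOCATORS` and `findBrokenLocators` are hypothetical names, and the check assumes `page.locator(selector).count()` as provided by Playwright.

```javascript
// Shared locator contract: one module imported by both the Playwright e2e
// suite and the downstream k6 browser tests (file layout is hypothetical).
const LOCATORS = Object.freeze({
  search: '[aria-label="OpenMCT Search"] input[type="search"]',
});

// Return the names of contract locators that match nothing on the page.
// Assumes `page.locator(selector).count()` as provided by Playwright.
async function findBrokenLocators(page, locators = LOCATORS) {
  const broken = [];
  for (const [name, selector] of Object.entries(locators)) {
    const matches = await page.locator(selector).count();
    if (matches === 0) broken.push(name);
  }
  return broken;
}
```

Run on every PR, a non-empty result flags the locator change before the downstream k6 run raises a false alarm.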
Summary
- Performance Modeling
- Open MCT Performance
- Integrated Frontend and Backend Performance
- How to Keep Frontend Performance In Sync With Development
Thanks!
Contact and Links
openmct
openmct-performance
k6-browser
Cutting Room Floor
Synthetic User Monitoring
- Recent Practice of Automated Performance Testing
- "SRE Team running Selenium tests"
- Why? The frontend is bearer of bad news!
- Critical User Journeys covered
- Comes with a lot of asterisks
| First Load | Loaded Page | Session |
|---|---|---|
Network Fix - Control it
- Inside the browser with CDP access
- Inject CDP Network Profiles
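const client = await page.target().createCDPSession();
await client.send('Network.enable');
await client.send('Network.emulateNetworkConditions', {
  // 5% Decrease in Ad revenue
  offline: false,
  latency: 500, // added round-trip latency in ms
  downloadThroughput: -1, // -1 disables throttling
  uploadThroughput: -1,
});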
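Injecting CDP network profiles can be sketched as a lookup table plus a small helper. The profile names and numbers are illustrative, not measured values. `client` is assumed to be an open CDP session that exposes `send(method, params)`, as in the snippet above.

```javascript
// Hypothetical named profiles; the numbers are illustrative, not measured.
const NETWORK_PROFILES = {
  // +500 ms latency, throughput left unthrottled (-1 disables throttling).
  halfSecondDelay: { offline: false, latency: 500, downloadThroughput: -1, uploadThroughput: -1 },
  // Roughly 1.6 Mbit/s down and 750 kbit/s up, expressed in bytes per second.
  slowMobile: { offline: false, latency: 150, downloadThroughput: 200000, uploadThroughput: 93750 },
};

// Apply a profile through an already-open CDP session.
async function applyNetworkProfile(client, profileName) {
  const profile = NETWORK_PROFILES[profileName];
  if (!profile) throw new Error(`Unknown network profile: ${profileName}`);
  await client.send('Network.enable');
  await client.send('Network.emulateNetworkConditions', profile);
  return profile;
}
```

Naming the profiles makes it easier to record which network conditions a given measurement was taken under.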
Network Control Pt 2
- Once you're in the browser, you control it all
- Mock Network Responses with API Interception
- Note: Couldn't Replace 10 -> 30 results :(
await page.route('**/*.png', route => {
  // Serve a local file instead of the real image
  route.fulfill({
    path: './bing.png'
  });
});
Synthetic User Monitoring Challenges (Part Deux)
"And lastly, many synthetic monitoring tools lack the context needed to explain why a specific failure happened or what the business implications might be, lengthening time to resolution and making it unnecessarily difficult to prioritize application performance issues."
dynatrace.com/news/blog/what-is-synthetic-monitoring/
***** Asterisk
xk6-browser
- Open Source
- k6 driving playwright APIs
- beta
- supports .connectOverCDP()
What it solves:
* Generate many, many measurements!
** Tap into the huge k6 ecosystem to integrate and correlate with System Monitoring, Load Testing, etc
Creating a baseline of measurement for transferability
- Let’s get rid of all asterisks associated with our measurement.
- Create an HTML webpage.
- Test and Time the page.
- Demo
- Run this before every run. Run between version changes. Run this local vs CI.
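One possible use of such a baseline, sketched under assumptions: time the fixed baseline page in each environment, then report app timings relative to it. The function names and all numbers are made up for illustration.

```javascript
// Median of a list of timings in milliseconds.
function median(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Express an app measurement relative to the baseline page timed in the same
// environment, so runs on different machines can be compared.
function relativeToBaseline(appMs, baselineSamples) {
  if (baselineSamples.length === 0) throw new Error('Baseline has no samples');
  return appMs / median(baselineSamples);
}

// Made-up numbers: the app looks slower in CI, but so does the baseline.
console.log(relativeToBaseline(1200, [98, 100, 102])); // 12
console.log(relativeToBaseline(2400, [205, 200, 195])); // 12
```

In this made-up case the ratio is identical locally and in CI, which suggests the slowdown comes from the environment rather than the app under test.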
Summary
- Browser Performance and DevTools
- Load Testing and Perf Problems
- What to Measure
- Browserless
- Driving with Puppeteer
- js-perf-toolkit
Demo
What do we need?
- Stable Interface
- Controlled Environment
Demo
What do we need?
- Stable Interface
- Playwright "bless your own damn build" testing
- sitespeed.io is a better specialized tool
Hard?
- Variability
- We "only" want App-under-test variation
- Network
- CPU
- Chrome
- Test Framework
- Transferability
- CPU/GPU
- QA Team giving us bad builds!
"Hard" Demo
Capabilities Demo
JS-Perf-Toolkit
- github.com/unlikelyzero/js-perf-toolkit
- Moves everything* into containers
- browserless
- influxdb+prometheus+grafana
- Puppeteer/Playwright Examples
- NodeJS->InfluxDB (coming soon)
- xk6-browser*
- Network control with toxiproxy
- Integrates into monitoring systems with InfluxDB and Prometheus*
Why?
- Load Testing is what you do to the backend to approximate Rough Response Time
- An HTTP REST API can provide a rough estimate of user experience in the app
Selenium?
- Yes? No? Maybe?
- Variability in results due to waitFor
- Driver? Jmeter+Selenium
- 2 Cores per User!
- CDP in 4.0
How?
(Chrome! Devtools!)
Our first asterisk * !
Load Testing Can't Properly Approximate Frontend Performance Regressions
Chrome
- Loads and renders your web app via static assets, javascript, and APIs
- Web app / javascript
- How well it runs on your machine
- Static Assets
- You can't
- But!
- HTTP/1.1: browsers only allow about 6 concurrent connections per host
- Websockets?
- Graphql queries based on user data?
Network Variability*
- Network is defined by Latency and number of requests necessary to provide User the needed data to present on screen
- Variability in the internet and local machine
Why is *frontend* performance important?
- Amazon found that every 100 milliseconds in load time correlated to 1% decrease in sales.
- Google found that when they increased the search results from 10 to 30, the load time increased by half a second and resulted in a 20% decrease in ad revenues.
- Load time and User Experience
Quick Check-in
- Why performance test
- Load times and lighthouse
- What to look for
- After load and Long Tasks
- RUM, User Timing, and marks
How did Google know the users left?
"Google found that when they increased the search results from 10 to 30, the load time increased by half a second and resulted in a 20% decrease in ad revenues."
Real User Monitoring (RUM)
- "Where real users go and what they do"
- Inject JS code into the application to report back to your marketing team
- Session Replay
- Key Takeaway for Performance Testing
- User Timing
- performance.mark()
What happens after the first load?
Copy of k6
By John Hill
Front-end performance testing is hard. Really hard. There are hundreds of variables that affect end users’ perceived performance. Only a few are measured with traditional load testing tools. Few can be actively controlled outside of a dedicated test environment, and we lose credibility as soon as our tests leave that environment. Worse yet, the available front-end performance tooling blindly focuses on how quickly a page loads. What happens after the first load? At NASA Ames, our Mission Operators have an 18-hour shift. Then there is automation: none of those tools were designed to be automated like our e2e tests. Adding all this up means that, as performance testers, we’re required to add too many asterisks to our results… until now! In this live demo, we’ll detail the metrics that matter and how to measure them without asterisks. Using Browserless, Playwright, and k6, we’ll instrument and automate a performance test. We’ll have a front-end performance tooling state-of-the-union to outline what’s happening in this space and where we’re going. Lastly, we’ll end with ways to integrate these new tools into your existing CI/CD process and test frameworks.