Running A Thousand End-To-End Cypress Tests Every Day

Gleb Bahmutov

Climate Crisis Is Bad

gleb.dev

Join others to fight the climate crisis

Speaker: Gleb Bahmutov PhD

🦋 bahmutov.bsky.social

github.com/bahmutov

glebbahmutov.com/blog

C / C++ / C# / Java / CoffeeScript / JavaScript / Node / Angular / Vue / Cycle.js / functional programming / testing

www.youtube.com/glebbahmutov

🌎 🔥 350.org 🌎 🔥 citizensclimatelobby.org 🌎 🔥

https://cypress.tips/courses

Gleb Bahmutov

Sr Director of Engineering

EveryScape

MathWorks

Kensho

Cypress.io

75 → 20

2000

8 → 100

5 → 50

150 (2000)

The same CTO

Gleb, we need our stuff to work.

Web app
ReactNative mobile app
APIs
Special projects

Agenda

Test speed
Fast(er) CI feedback
Test tags
Everyone is involved
Future is 🔆

https://slides.com/bahmutov/1000-e2e

slides.com/bahmutov/testing-large-org

slides.com/bahmutov/react-next

slides.com/bahmutov/decks/mercari

If you like it,

Put a 💍 test on it...

Gleb ~~Beyonce~~ Bahmutov

A typical Mercari US Cypress E2E test

// create a fresh seller account (project-specific custom command)
cy.signup(seller)

// create two listings for that seller
cy.createListing({
  name: `Macbook one ${Cypress._.random(1e10)}`,
  description: 'Seller will delete all items',
  price: 198,
})
cy.createListing({
  name: `Macbook two ${Cypress._.random(1e10)}`,
  description: 'Seller will delete all items',
  price: 199,
})

// log in via API, then confirm both listings show up as active
visitBlankPage()
cy.loginUserUsingAPI(seller)
cy.visitProtectedPage('/mypage/listings/active')
cy.byTestId('Filter', 'Active').should('be.visible').and('contain', '2')
cy.byTestId('ListingRow').should('be.visible').and('have.length', 2)

Pull request template

Keep the tests readable
- custom commands, utilities, plugins

Do as much as possible via API calls
- login, add an address, add a credit card, create a listing, etc

https://www.youtube.com/watch?v=ubnJ9kWD1yQ

Cache created data
- users, listing

https://slides.com/bahmutov/flexible-cypress-data
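
The "cache created data" idea can be sketched without any Cypress API. The helper below is a hypothetical in-memory cache (the names `dataCache`, `getOrCreate`, and `createUser` are illustrative and do not come from the talk or from any plugin): it runs an expensive setup function once per key and reuses the result on later calls.

```javascript
// Hypothetical sketch: reuse expensive test data instead of recreating it.
const dataCache = new Map()

function getOrCreate(key, create) {
  if (!dataCache.has(key)) {
    dataCache.set(key, create())
  }
  return dataCache.get(key)
}

// Stand-in for a slow API call that signs up a user.
let signups = 0
function createUser() {
  signups += 1
  return { email: `seller-${signups}@example.com` }
}

const first = getOrCreate('seller', createUser)
const second = getOrCreate('seller', createUser)
// Same object both times, and the setup function ran only once.
console.log(first === second, signups)
```

A real suite would likely also need to check that the cached user or listing still exists before reusing it.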

Keep a test shorter than 3 minutes
- Keep each spec shorter than 3 minutes
- Use data-driven testing

https://github.com/bahmutov/cypress-each

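// https://github.com/bahmutov/cypress-each adds it.each
import 'cypress-each'

const searches = ['Wearable', 'Running shoes', 'Dolls']

// generates one test per search keyword
it.each(searches)(
  `Filters results by status for search: %s`,
  (searchKeyword) => {
    const url = formSearchUrl({ searchKeyword })
    cy.visitProtectedPage(url)
    ...
  },
)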

Why E2E Tests when we have unit tests?

It Is A Question Of Scale

1303 tests * 1 minute/test ≅ 22 hours to run all the tests

How to run all the tests?
How to run the tests for PRs?

Parallelize all the things

E2E tests finish in 39 minutes using 20 CI machines
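
The arithmetic behind these numbers can be checked with a small sketch. It is an idealized estimate only: it assumes every test takes the same time and that the load splits perfectly across machines, and `estimateMinutes` is a hypothetical helper, not part of any tool mentioned in the talk.

```javascript
// Rough estimate: total test minutes divided evenly across CI machines.
function estimateMinutes(tests, minutesPerTest, machines = 1) {
  return Math.ceil((tests * minutesPerTest) / machines)
}

const serial = estimateMinutes(1303, 1)
const parallel = estimateMinutes(1303, 1, 20)

console.log(`${(serial / 60).toFixed(1)} hours on one machine`)
console.log(`${parallel} minutes on 20 machines`)
```

Under the 1 minute/test assumption, 20 machines would need about 66 minutes. The measured 39 minutes is lower, which suggests the real average test is faster than one minute.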

https://on.cypress.io/parallelization

# .github/workflows/nightly.yml
# excerpt: the "prepare" job, "runs-on", and other steps are omitted
nightly-tests:
  needs: [prepare]
  container: cypress/browsers:node-22.20.0-chrome-141.0.7390.107-1-ff-144.0-edge-141.0.3537.85-1
  strategy:
    fail-fast: false
    matrix: ${{fromJSON(needs.prepare.outputs.matrix)}}
  steps:
    - name: Cypress nightly tests 🧪
      uses: cypress-io/github-action@v6.6.1
      with:
        record: true
        parallel: true

Use the Cypress GitHub Action

🔥 Dev Environment 🔥

(cannot run 1000 tests in parallel)

We have to prioritize some spec files

Work locally on a single feature spec

Push code to run E2E tests on CI

Pull Request Flow

Web Repo

E2E Repo

PR deploy

Trigger E2E tests

"How to Keep Cypress Tests in Another Repo While Using CircleCI"

https://glebbahmutov.com/blog/how-to-keep-cypress-tests-in-another-repo-with-circleci

"How to Keep Cypress Tests in Another Repo While Using GitHub Actions"

https://glebbahmutov.com/blog/how-to-keep-cypress-tests-in-another-repo/

new / changed specs

💻

CI machines

PR preview environments are isolated (GOOD), but not very powerful even compared to the DEV environment (ughh)

Find the changed specs and run them first

# https://github.com/bahmutov/find-cypress-specs
# specs changed on this branch compared to main, plus their count
specs=$(npx find-cypress-specs --branch main --parent)
n=$(npx find-cypress-specs --branch main --parent --count)

if [ "${n}" -lt 1 ]; then
  echo "No Cypress specs changed, exiting..."
  exit 0
fi

npx cypress run --record --parallel --spec "${specs}"

If changed specs pass, run all or some E2E tests
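
The "changed specs first, everything else after" ordering can be sketched as a plain function. `prioritizeSpecs` and the spec file names are hypothetical and only illustrate the idea. In the flow above the changed list would come from `find-cypress-specs`.

```javascript
// Hypothetical helper: split all specs into "run first" and "run later".
function prioritizeSpecs(allSpecs, changedSpecs) {
  const changed = new Set(changedSpecs)
  const first = allSpecs.filter((spec) => changed.has(spec))
  const rest = allSpecs.filter((spec) => !changed.has(spec))
  return { first, rest }
}

const plan = prioritizeSpecs(
  [
    'cypress/e2e/login.cy.js',
    'cypress/e2e/sell.cy.js',
    'cypress/e2e/search.cy.js',
  ],
  ['cypress/e2e/sell.cy.js'],
)

console.log(plan.first) // changed specs: run these first
console.log(plan.rest) // remaining specs: run only if the first batch passes
```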

https://glebbahmutov.com/blog/pick-tests-using-pull-request/

Bonus 💯

Pick / run tests based on frontend code changes
- https://glebbahmutov.com/blog/using-test-ids-to-pick-specs-to-run/
Pick / run tests based on visited pages
- https://glebbahmutov.com/blog/run-cypress-tests-for-the-given-url/
Pick / run tests based on API calls
- https://glebbahmutov.com/blog/pick-tests-by-network-calls/

Test Tags

🎓 Learn more:

https://github.com/bahmutov/cy-grep

https://glebbahmutov.com/blog/tag-tests/

https://github.com/bahmutov/find-cypress-specs

describe('Shipping', { tags: '@shipping' }, () => {
  it(
    'C1234 uses the default Mercari shipping',
    { tags: ['@sanity', '@regression', '@mobile'] },
    () => {
      ...
    }
  )   
})

Effective tags for the test above (suite tag plus the test's own tags):

@shipping, @sanity, @regression, @mobile
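
How suite and test tags combine can be shown with a short sketch. This is not the cy-grep implementation: `effectiveTags` and `shouldRun` are hypothetical helpers, and the matching assumes simple OR logic (a test runs if it carries any requested tag).

```javascript
// Combine tags inherited from the suite with the test's own tags.
function effectiveTags(suiteTags, testTags) {
  const toList = (tags) => (Array.isArray(tags) ? tags : tags ? [tags] : [])
  return [...new Set([...toList(suiteTags), ...toList(testTags)])]
}

// Assumed OR matching: any requested tag selects the test.
function shouldRun(tags, requested) {
  return requested.some((tag) => tags.includes(tag))
}

const tags = effectiveTags('@shipping', ['@sanity', '@regression', '@mobile'])

console.log(tags)
console.log(shouldRun(tags, ['@shipping'])) // selected via the suite tag
console.log(shouldRun(tags, ['@payment'])) // not selected
```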

$ find-cypress-specs --tags

Tag          Tests
-----------  -----
@balance     4    
@careers     2    
@helpcenter  69   
@local       19   
@login       11   
@messaging   8    
@mobile      77   
@moble       1    
@offer       6    
@payment     34   
@profile     55   
@regression  72   
@sanity      24   
@search      43   
@sell        76   
@shipping    24   
@signup      12   
@w9          10
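
The `@moble` row with a single test looks like a misspelling of `@mobile`. A report like this makes such slips visible, and the check can be automated. The sketch below is an assumption about how one might do it, not a feature of `find-cypress-specs`: it flags any tag that is one edit away from a more frequently used tag. The counts are a subset of the table above.

```javascript
// Classic Levenshtein edit distance between two strings.
function editDistance(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) => [i])
  for (let j = 1; j <= b.length; j++) dp[0][j] = j
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + cost, // substitution
      )
    }
  }
  return dp[a.length][b.length]
}

// Flag rarely used tags that are nearly identical to a more common tag.
function suspiciousTags(counts, maxDistance = 1) {
  const names = Object.keys(counts)
  const found = []
  for (const rare of names) {
    for (const common of names) {
      if (
        counts[rare] < counts[common] &&
        editDistance(rare, common) <= maxDistance
      ) {
        found.push({ tag: rare, probably: common })
      }
    }
  }
  return found
}

const counts = {
  '@mobile': 77,
  '@moble': 1,
  '@search': 43,
  '@sell': 76,
  '@shipping': 24,
}
const suspects = suspiciousTags(counts)
console.log(suspects)
```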

@profile, @sell, @login, @shipping, @payment, @sanity, @regression, all, @...

Run all tests 3x a day
Run each test tag in its own workflow once a day
Run high priority test tags like @payment and @shipping 2-3x a day

@sanity, @regression, all

features

Trigger tests from GitHub UI

https://glebbahmutov.com/blog/cypress-grep-filters

https://glebbahmutov.com/blog/pick-tests-using-pull-request/

Web Repo: PR deploy → trigger tests automatically
API1 Repo: PR deploy → trigger tests manually
API2 Repo: PR deploy
Service X: PR deploy

YOU can run the tests

Custom /cypress command in the PR comment for all repos

/cypress tags=@shipping,@profile
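
A comment command like this has to be parsed before a workflow can act on it. The sketch below shows one possible parser. `parseCypressCommand` is a hypothetical function, and the `key=value,value` grammar is inferred from the single example above rather than taken from the actual implementation.

```javascript
// Hypothetical parser for a "/cypress key=a,b" pull request comment.
// Returns null when the comment does not contain the command.
function parseCypressCommand(comment) {
  const line = comment
    .split('\n')
    .map((s) => s.trim())
    .find((s) => s === '/cypress' || s.startsWith('/cypress '))
  if (!line) return null

  const options = {}
  for (const part of line.split(/\s+/).slice(1)) {
    const [key, value = ''] = part.split('=')
    options[key] = value.split(',').filter(Boolean)
  }
  return options
}

const parsed = parseCypressCommand('/cypress tags=@shipping,@profile')
console.log(parsed.tags)
console.log(parseCypressCommand('Looks good to me'))
```

The parsed tags could then be passed to the test run, for example as the tag filter for cy-grep.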

Test Tags AI 🤖

AI, what is the best test tag(s) for this pull request?

Flake Is Bad

Fighting flaky tests

Enable test retries, look at Cypress Cloud (formerly the Cypress Dashboard)

https://on.cypress.io/flaky-test-management

Increase command timeouts where necessary
Command - assertion pattern
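
The first two items map to standard Cypress configuration options. The values below are assumptions chosen for illustration, not the settings used in the talk. The object is a sketch of what could be exported from `cypress.config.js`.

```javascript
// Sketch of flake-related Cypress options (values are illustrative).
const flakeSettings = {
  // retry failed tests on CI, but not while developing locally
  retries: { runMode: 2, openMode: 0 },
  // the Cypress default is 4000 ms; a slower environment may need more
  defaultCommandTimeout: 10000,
}

// Command - assertion pattern, using a line from the test shown earlier:
//   cy.byTestId('ListingRow').should('have.length', 2)
// The command is retried until the attached assertion passes or times out.

console.log(JSON.stringify(flakeSettings))
```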

spy on GraphQL calls 🎉

"Directly Spying on GraphQL Calls Made By The Application" https://www.youtube.com/watch?v=XadOqS0YNJE

"Set GraphQL Operation Name As Custom Header And Use It In cy.intercept" https://www.youtube.com/watch?v=AcU5mkedchM deserves a lot more ❤️
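
Spying on a specific GraphQL call usually means telling operations apart, since they often share one endpoint. The sketch below shows the matching step as a plain function. `isOperation`, the `/graphql` URL, and the `GetListings` operation name are assumptions for illustration. The commented part shows roughly how it could be used with `cy.intercept` inside a spec.

```javascript
// Hypothetical helper: does this request body carry the given operation?
function isOperation(body, name) {
  return Boolean(body) && body.operationName === name
}

// Inside a Cypress spec it could be used roughly like this:
//
//   cy.intercept('POST', '/graphql', (req) => {
//     if (isOperation(req.body, 'GetListings')) {
//       req.alias = 'GetListings'
//     }
//   })
//   cy.visitProtectedPage('/mypage/listings/active')
//   cy.wait('@GetListings')

const matches = isOperation(
  { operationName: 'GetListings', query: '...' },
  'GetListings',
)
const other = isOperation({ operationName: 'GetUser' }, 'GetListings')
console.log(matches, other)
```

Waiting on the aliased call before asserting on the page removes one common source of flake: checking the UI before its data has arrived.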

Prevent Flaky Tests:

Burn New Tests

CYPRESS_burn=5 npx cypress run ...

https://glebbahmutov.com/blog/burning-tests/

Burn Changed / New Tests

Where are we going...

Automated E2E Tests

Manual E2E Tests

API / unit tests

cheaper!

faster!

never tired!

What we did:
- Cypress demo sessions
- Writing example tests for any discovered errors
- Writing and sharing a LOT of blog posts / examples

Test replays via Cypress Cloud

What we have achieved:
- Web team is writing / updating tests
- AI is writing / updating tests
- Everyone looks at web test runs and results

We now catch bugs the same day or even before they are merged and deployed

recent Mercari US QA team meeting

In this talk, I show how we run a lot of full end-to-end Cypress web application tests every day. In addition to running the full test set, we do separate feature test runs based on test tags. We also allow everyone from all teams to trigger the tests right from the GitHub Actions UI. This lets every group quickly test their feature before merging into the main branch. For pull requests, we employ source code analysis based on data test IDs to run the affected tests first for quicker feedback. The software automation team uses the flaky test information to chase the sources of the underlying errors. The goal is to minimize noise, so that every passing test run gives us confidence in the released code and every failing test run helps us quickly diagnose the real underlying issue.

The presentation covers test writing, test organization, selecting tests to run based on the source code changes, and running tests in different resolutions. I also look into making the tests faster by employing data creation and caching, as well as using API calls to bypass the user interface in some places. Finally, making the tests robust and flake-free and triaging the failed runs is an ongoing activity for the automation team.

Key takeaways:
- How to run 1000 end-to-end tests quickly
- Which tests to run on a pull request
- How AI is helping us pick tests to run

Presented at Nordic Testing Days 2026, 40 minutes

Gleb Bahmutov

JavaScript ninja, image processing expert, software quality fanatic

glebbahmutov.com
