Scaling Selenium Beyond Selenium Grid (Abbreviated)

John Hill
Senior Automation Engineer - Ansible by Red Hat by IBM?

Who am I?


I don't work for:

  • Zalando
  • Aerokube
  • BrowserStack
  • Sauce Labs
  • TestingBot



Overview

  • Start with a live demo! (What could go wrong?)
  • What does it mean to Scale Selenium?
  • One of Two Scaling Strategies
  • Selenium Grid
  • How to go beyond Selenium Grid
  • How Selenium 4 will help us scale

Basic Selenium Test

Live Demo!

Written in NightwatchJS

Code available here:
https://github.com/unlikelyzero/selenium-scaling-demo
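
For readers who have not seen Nightwatch before, a basic test has roughly the following shape. This is a minimal sketch, not the code from the demo repository; the test name, the target URL and the expected title are placeholders chosen for illustration.

```javascript
// Hypothetical Nightwatch test module. The URL and the expected title are
// placeholders, not taken from the demo repository.
const tests = {
  'page loads and shows its title': function (browser) {
    browser
      .url('https://example.com')            // open the application under test
      .waitForElementVisible('body', 5000)   // wait up to 5 s for the page body
      .assert.title('Example Domain')        // check the document title
      .end();                                // close the browser session
  },
};

module.exports = tests;
```

Nightwatch calls each exported function with a browser object, and every chained command becomes one or more WebDriver requests to the browser.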

 

What is the "unit" of Selenium to scale?

Browser

Wrap the browser in a Containerized Selenium image

Selenium Docker Images

SeleniumHQ/docker-selenium
3600 GitHub Stars, ~100M Docker Hub Pulls

 

elgalu/docker-selenium
1200 GitHub Stars, 32M Docker Hub Pulls

 

aerokube/selenoid-images
80 GitHub Stars, 13M Docker Hub Pulls

What's inside?

Small Detour

Dependency Heck

  • House of Cards
  • Hot issues
    • Selenium can't run on Java 9
  • Issues uncovered building demo
    • selenium/issues/5674
    • elgalu/docker-selenium/issues/201

"It works on my machine"


What does it mean to scale selenium testing?

Parallelism

Many browsers at the same time

Demo

  • Serial
  • Local Parallel
  • Remote Parallel
  • Remote Parallel++
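
The difference between the serial, local parallel and remote parallel runs can be expressed almost entirely in Nightwatch configuration. The sketch below assumes a Nightwatch 0.9/1.x style config file; the SELENIUM_HOST and SELENIUM_PORT environment variables and the tests folder name are hypothetical and may not match the demo repository.

```javascript
// nightwatch.conf.js (sketch). SELENIUM_HOST / SELENIUM_PORT are hypothetical
// environment variables used here only for illustration.
const remoteHost = process.env.SELENIUM_HOST || 'localhost';
const remotePort = Number(process.env.SELENIUM_PORT || 4444);

const config = {
  src_folders: ['tests'],
  // Parallel runs: one worker process per test file.
  // Setting enabled to false gives the serial run.
  test_workers: { enabled: true, workers: 'auto' },
  test_settings: {
    default: {
      // Point at localhost for local runs, or at a remote hub/cluster.
      selenium_host: remoteHost,
      selenium_port: remotePort,
      desiredCapabilities: { browserName: 'chrome' },
    },
  },
};

module.exports = config;
```

With this layout, moving from a local run to a remote one only changes the host and port, which is what lets the same tests run locally and in CI.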

Selenium Scaling Constraints

CPU & RAM

  • ~1 CPU core/thread per session
  • ~1000m (millicores) of CPU per session in k8s/OpenShift
  • ~1 GB of RAM per session
  • (More if recording video)
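
These rules of thumb translate into a quick capacity estimate. The following is a back-of-the-envelope sketch; the 16-core, 64 GB node is a hypothetical example, and real numbers will vary with the browser, the application and whether video is recorded.

```javascript
// Rule-of-thumb figures from the list above; treat them as starting points.
const CPU_PER_SESSION = 1;      // cores (1000m in Kubernetes terms)
const RAM_GB_PER_SESSION = 1;   // more if recording video

// How many browser sessions fit on one node, limited by the scarcer resource.
function sessionsPerNode(cpuCores, ramGb) {
  const byCpu = cpuCores / CPU_PER_SESSION;
  const byRam = ramGb / RAM_GB_PER_SESSION;
  return Math.max(0, Math.floor(Math.min(byCpu, byRam)));
}

// How many such nodes are needed for a target number of parallel sessions.
function nodesNeeded(targetSessions, cpuCores, ramGb) {
  const perNode = sessionsPerNode(cpuCores, ramGb);
  if (perNode === 0) throw new Error('node too small for a single session');
  return Math.ceil(targetSessions / perNode);
}

console.log(sessionsPerNode(16, 64));    // 16 (CPU is the limit here)
console.log(nodesNeeded(1000, 16, 64));  // 63
```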

Network

  • Selenium traffic is HTTP traffic
  • Network Latency between each System (Jenkins / Chrome / Application-under-test) directly affects the speed of a test
  • Each Test generates between 1-5 Requests Per Second (RPS)
  • Up to 20 RPS
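
The network figures above scale linearly with the number of sessions, and because each WebDriver command is an HTTP round trip, latency adds up per command. A rough sketch of both effects follows; the 200 commands per test and the 50 ms round trip are hypothetical values picked for illustration.

```javascript
// Total request rate hitting the Selenium endpoint for N parallel sessions.
function aggregateRps(sessions, rpsPerSession) {
  return sessions * rpsPerSession;
}

// Time a test spends purely waiting on the network, assuming commands are
// issued one after another and each costs one round trip.
function latencyOverheadSeconds(commands, roundTripMs) {
  return (commands * roundTripMs) / 1000;
}

console.log(aggregateRps(1000, 5));            // 5000 RPS at the typical upper bound
console.log(aggregateRps(1000, 20));           // 20000 RPS at the peak rate
console.log(latencyOverheadSeconds(200, 50));  // 10 seconds of pure latency
```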

Quick Recap

  • Containerized Selenium
  • Scaling = Parallel Sessions
  • Scaling Constraints

How do we get to 1000 parallel browsers?

Anatomy of a Selenium Test in CI

External Selenium Cluster

External Grid Scaling Path

Cons

  • Network latency
  • Sessions lose durability

Pros

  • Local = CI. Literally.
  • Run locally "at scale"
  • Separated concern
  • Independent resources
  • Many teams and frameworks can share resources

Selenium Grid

  • 2008
  • 1 Hub for Many Nodes
  • Single target for all tests

Who has maintained a Selenium Grid before?

Problems with Se Grid

  • Scales to about 150
  • Single Point of Failure
  • Not Highly Available or Fault Tolerant
  • No Session Durability
  • Hub must be configured, installed and hosted
  • No "Autodiscovery" of Se Nodes
  • Parasitic Network Traffic
  • Dependency Heck++
  • Lack of Observability

Selenium Grid

"... the world has moved on since we wrote the original grid implementations. When we started writing grid, machines were underpowered, had limited memory, and SSDs didn’t really exist. Virtual machines were provided by VMware and getting any real density of these required absolutely huge servers that few could afford. Consequently, we used multiple machines, self-hosted in racks rooms and local data centres. This led to a really simple design for grid, which has allowed it to continue functioning relatively well to this day." - Simon Stewart

Selenium Grid

Solutions

Alternatives

Zalenium

  • zalando/zalenium
  • GitHub Stars 1485, 5M+ Docker Hub Pulls
  • Designed to work with elgalu/docker-selenium
  • Dockerized Selenium Grid Extra on Steroids

Zalenium "Creates" Docker-Selenium Nodes

Pros

  • All-in-one solution
  • "Elastic" Nodes
  • Works as a generic Hub with any Se Node
  • Proxies to SaaS
  • Native k8s/Openshift support
  • Observability
  • Scales to 150 Nodes

Cons

  • Scales to 150 Nodes
  • Not Highly Available
  • No session durability
  • Java-based
  • Only the latest Chrome and Firefox available*

Selenoid

  • aerokube/selenoid
  • GitHub Stars 1400, 3.8M Docker Hub Pulls
  • Designed to work with aerokube/selenoid-images
  • Complete rewrite of Grid in Go

Pros

  • <Similar to Zalenium>
  • Supports many versions of Firefox and Chrome
  • Scales past 150*
  • Consumes 10 times less memory than Java-based Selenium server under the same load

Cons

  • Not Highly Available
  • No session durability
  • No k8s/OpenShift support
  • *Limited by size of VM or docker machine

How do we get to 1000 Browsers?

GoGridRouter

  • aerokube/ggr
  • Golang version of GridRouter
  • 187 GitHub Stars, 105k Docker Hub Pulls
  • An active load balancer for Selenium traffic
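
To make "load balancer for Selenium traffic" concrete, here is a sketch of weighted host selection, one plausible way a router could spread new sessions across several hubs in proportion to their capacity. It is an illustration only, not GGR's actual implementation; the hub names and weights are made up.

```javascript
// Pick a host at random, with probability proportional to its weight
// (for example, the number of browsers it can run).
// `random` is injectable so the behaviour can be tested deterministically.
function pickHost(hosts, random = Math.random) {
  const total = hosts.reduce((sum, h) => sum + h.weight, 0);
  if (total <= 0) throw new Error('no capacity configured');
  let ticket = random() * total;
  for (const host of hosts) {
    ticket -= host.weight;
    if (ticket < 0) return host;
  }
  return hosts[hosts.length - 1]; // guards against floating-point edge cases
}

// Hypothetical hubs: hub-b has three times the capacity of hub-a,
// so it should receive roughly three out of four new sessions.
const hubs = [
  { name: 'hub-a', weight: 50 },
  { name: 'hub-b', weight: 150 },
];

console.log(pickHost(hubs).name);
```

A new session request is routed to the chosen hub; subsequent commands for that session have to reach the same hub, which is why session durability remains a separate problem.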

GGR

Pros

  • Intelligent Load balancing
  • Works with any Selenium Hubs
  • Works with SaaS
  • Enables Highly Available clusters*

Cons

  • Not Fault Tolerant
  • Multiple GGR Nodes and a LB needed for HA
  • Issues with 80% utilization
  • Complexity

GoGridRouter Complexity

Alternatively...

Moon

  • aerokube/moon
  • k8s native
  • Selenoid+GGR

Pros

  • Simple config
  • Perfect Load distribution
  • Elastic!
  • Pay-as-you-go
  • 1000 Sessions with single cluster

Cons

  • Closed Source
  • Not FOSS
  • Not Fault Tolerant

One last demo

Still need Session Durability

Selenium 4

When?

  • Christmas 2018
  • Chinese New Year 2019
  • SeleniumHQ/selenium/projects/2
  • (Opinion) Slowed by backwards compatibility
  • All new Grid Features may not land at launch

Grid 4.0

  • Rewrite
  • 7 Components
    • Edge Router + Distributor = GGR
    • Session Storage = Fault Tolerance via Redis

Summary 

  • (Hopefully) Demonstrated a Selenium Test
  • Scaling Selenium
  • Selenium Grid
  • Selenium Grid Alternatives
  • Selenium 4