Gleb Bahmutov PRO
JavaScript ninja, image processing expert, software quality fanatic
https://lizkeogh.com/2019/07/02/off-the-charts/
+3 degrees Celsius will be the end.
aka "You are always losing customers"
💵
💰
👀
🧑💻
🦸
10-20%
What if you converted all first-time visitors to paying customers?
You would increase your revenue 5 or 10 times!
Why the drop outs?
This tool does not do what I need
It does not work on my platform
Valid reasons
This tool does not do what I need (but it does)
Does not work on my platform (but it does work)
Wrong reasons
This tool does not run on my CI
Valid reasons
I don't know how to make this tool work on my CI
Wrong reasons
I cannot afford this tool
Valid reasons
I don't know what value this tool gives me
Wrong reasons
I am happy with this tool, that's enough for me
Valid reasons
I did not know this tool could do that
Wrong reasons
Why the drop outs?
Market fit
Features
Bug fixes
Bad UI
Bad docs
Lack of examples
Goal: drive the "bad" area to zero
If you avoided losing users in this area, you would increase your revenue 2 or 5 times!
Documentation makes or breaks projects
Information for every user persona
Is there a way to disable command snapshots?
How to preserve session cookie?
How to check the length of text?
Every question here is a failure of the documentation
Every question here is probably a:
questions that have an answer in your documentation / all well-defined questions across chat, support emails, GitHub issues * 100%
(a low value: many things are not documented)
(a high value: users cannot find the right documentation)
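The ratio above (questions the documentation already answers, divided by all well-defined questions) is easy to track once the questions are collected. A minimal sketch, assuming a hypothetical record shape with an `answeredInDocs` flag; the flag values below are invented for illustration and are not taken from the talk:

```javascript
// Hypothetical record shape: each collected question notes whether
// the existing documentation already answers it.
function documentedShare(questions) {
  if (questions.length === 0) {
    return null // nothing collected yet, so there is nothing to measure
  }
  const answered = questions.filter((q) => q.answeredInDocs).length
  return (answered / questions.length) * 100
}

// The question texts come from the slides; the flags are made up.
const questions = [
  { text: 'Is there a way to disable command snapshots?', answeredInDocs: true },
  { text: 'How to preserve session cookie?', answeredInDocs: true },
  { text: 'How to check the length of text?', answeredInDocs: false },
]

console.log(documentedShare(questions).toFixed(0) + '%') // prints "67%"
```

Tracking this number over time shows whether the effort should go into writing missing pages or into making existing pages easier to find.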
💥
"show me Hello, World!"
"show me the changelog diff from version X to Y"
"show me a tutorial"
"show me how to do X"
"how do I solve my issue or bug?"
Then add all your documentation to the site reachable from the index
Hint: look at the search results for "about" query
Admin API key: to add records to the index. Keep Admin API key private!
Search-only API key (public): to search index from site
{
  "index_name": "scrape-test",
  "start_urls": ["https://glebbahmutov.com/triple-tested/"],
  "selectors": {
    "lvl0": {
      "selector": ".site-name",
      "global": true
    },
    "lvl1": ".content__default h1",
    "lvl2": ".content__default h2",
    "lvl3": ".content__default h3",
    "lvl4": ".content__default h4",
    "lvl5": ".content__default h5",
    "text": ".content__default p, .content__default li"
  }
}
Algolia config (JSON)
# when scraping the site, inject secrets as environment variables
# then pass their values into the Docker container using "-e" syntax
# and inject config.json contents as another variable
- name: scrape the site 🧽
  env:
    APPLICATION_ID: ${{ secrets.APPLICATION_ID }}
    API_KEY: ${{ secrets.API_KEY }}
  run: |
    docker run \
    -e APPLICATION_ID -e API_KEY \
    -e CONFIG="$(cat config.json)" \
    algolia/docsearch-scraper:v1.6.0
use Algolia Docker image
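Because the whole config travels into the container as a single `CONFIG` string, a malformed `config.json` only surfaces once the scraper is already running. A pre-flight check could catch that earlier. This is a sketch only: `checkScraperConfig` is a hypothetical helper, and the three fields it checks are simply the ones used in the config shown earlier, not an official schema.

```javascript
// Hypothetical pre-flight check for the text of config.json.
// Returns a list of problems; an empty list means the basic shape looks fine.
function checkScraperConfig(jsonText) {
  const config = JSON.parse(jsonText) // throws SyntaxError on malformed JSON
  const problems = []
  if (typeof config.index_name !== 'string' || config.index_name === '') {
    problems.push('index_name must be a non-empty string')
  }
  if (!Array.isArray(config.start_urls) || config.start_urls.length === 0) {
    problems.push('start_urls must be a non-empty array')
  }
  if (typeof config.selectors !== 'object' || config.selectors === null) {
    problems.push('selectors must be an object')
  }
  return problems
}

const sample = JSON.stringify({
  index_name: 'scrape-test',
  start_urls: ['https://glebbahmutov.com/triple-tested/'],
  selectors: { lvl1: '.content__default h1', text: '.content__default p' },
})

console.log(checkScraperConfig(sample)) // prints []
```

Run as an extra CI step before `docker run`, a check like this would fail the build with a readable message instead of a scraper stack trace.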
blog
examples
{
  "index_name": "cypress",
  "start_urls": [
    {
      "url": "https://docs.cypress.io/",
      "page_rank": 10
    },
    {
      "url": "https://example.cypress.io/",
      "selectors_key": "kitchensink",
      "tags": ["example"],
      "page_rank": 2
    },
    {
      "url": "https://www.cypress.io/blog/",
      "selectors_key": "blog",
      "tags": ["blog post"],
      "page_rank": 1
    }
  ],
  "stop_urls": [
    "^https://docs.cypress.io/ja/",
    "^https://docs.cypress.io/zh-cn/",
    "^https://docs.cypress.io/pt-br/",
    "^https://docs.cypress.io/ru/"
  ],
  "selectors_exclude": [],
  "selectors": {
    "default": {
      "lvl0": "article h1.article-title",
      "lvl1": "article h1.article-heading",
      "lvl2": "article h2.article-heading",
      "text": "article .article-content p, article .article-content tr, article .article-content li, article .article-content pre"
    },
    "blog": {
      "lvl0": "article h1",
      "lvl1": "article h2",
      "text": "article p, article pre"
    },
    "kitchensink": {
      "lvl0": ".container h1",
      "lvl1": ".container h4",
      "text": ".container p, .container pre"
    }
  },
  "nb_hits": 0,
  "min_indexed_level": 2
}
Algolia config (JSON)
2 minutes to scrape, creates ~ 21k records
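The multi-site config above combines two ideas: `stop_urls` patterns that exclude pages (here, the translated docs), and a `selectors_key` that picks a selector set per start URL. The sketch below is one reading of that config, not the scraper's actual code. Both helpers are hypothetical, and matching pages to start URLs by prefix is a simplifying assumption.

```javascript
// Trimmed copy of the config; only the fields this sketch reads.
const config = {
  start_urls: [
    { url: 'https://docs.cypress.io/', page_rank: 10 },
    { url: 'https://example.cypress.io/', selectors_key: 'kitchensink', page_rank: 2 },
    { url: 'https://www.cypress.io/blog/', selectors_key: 'blog', page_rank: 1 },
  ],
  stop_urls: ['^https://docs.cypress.io/ja/', '^https://docs.cypress.io/zh-cn/'],
  selectors: {
    default: { lvl0: 'article h1.article-title' },
    blog: { lvl0: 'article h1' },
    kitchensink: { lvl0: '.container h1' },
  },
}

// Hypothetical helper: a page is skipped when any stop_urls pattern matches it.
function isStopped(url, cfg) {
  return cfg.stop_urls.some((pattern) => new RegExp(pattern).test(url))
}

// Hypothetical helper: find the start URL that prefixes the page and
// use its selectors_key, falling back to the "default" selector set.
function selectorsFor(url, cfg) {
  const start = cfg.start_urls.find((s) => url.startsWith(s.url))
  const key = (start && start.selectors_key) || 'default'
  return cfg.selectors[key]
}

console.log(isStopped('https://docs.cypress.io/ja/guides/', config)) // prints true
console.log(selectorsFor('https://www.cypress.io/blog/some-post', config).lvl0) // prints "article h1"
```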
<ais-search-box
  // Optional parameters
  placeholder="string"
  submitTitle="string"
  resetTitle="string"
  [searchAsYouType]="boolean"
  [autofocus]="boolean"
></ais-search-box>
Let's test the search
describe('Angular Doc Search', () => {
  it('shows native results', () => {
    cy.visit('/', {
      onBeforeLoad(win) {
        // ServiceWorker messes up with the page load
        delete win.navigator.__proto__.serviceWorker
      }
    })
    // delay each keystroke for the demo
    cy.get('input[aria-label=search]').type('testing', {delay: 70})
    // six search results columns
    cy.get('.search-section-header').should('have.length', 6)
    cy.contains('.search-section-header', 'cli')
      .parent('.search-area')
      .find('.search-page').should('have.length.gte', 3)
  })
})
a "blind" test guesses the search term and results
it('shows single search result', () => {
  // https://on.cypress.io/intercept
  cy.intercept('/search-data.json', { fixture: 'single-result.json' })
  cy.visit('/', {
    onBeforeLoad(win) {
      // ServiceWorker messes up with the page load
      delete win.navigator.__proto__.serviceWorker
    }
  })
  // delay each keystroke for the demo
  cy.get('input[aria-label=search]').type('testing', {delay: 70})
})
[{
  "headingWords": "testing",
  "keywords": "testing unit component e2e",
  "path": "cli/test",
  "title": "Testing is fun",
  "titleWords": "testing is fun",
  "type": "content"
}]
cypress/fixtures/single-result.json
cy.get('input[aria-label=search]')
  .type('testing', { delay: 70 })
cy.contains('.search-section-header', 'cli')
  .parent('.search-area')
  .contains('.search-page', 'Testing is fun')
  .click()
cy.location('pathname').should('equal', '/cli/test')
name: ci
on: [push]
jobs:
  cypress-run:
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout
        uses: actions/checkout@v1
      # Install NPM dependencies, cache them correctly
      # and run all Cypress tests
      - name: Cypress run
        uses: cypress-io/github-action@v2
🎉 Break is Over 🎊
cy.get('input[aria-label=search]')
  .type('testing', { delay: 70 })
cy.contains('.search-section-header', 'cli')
  .parent('.search-area')
  .contains('.search-page', 'Testing is fun')
  .click()
cy.location('pathname')
  .should('equal', '/cli/test')
Hardcoded test data
The same data is in the fixture JSON file ...
import singleResult from '../fixtures/single-result.json'
// stub the search data with the fixture contents
cy.intercept('/search-data.json', singleResult)
cy.visit('/')
// derive the search term and expectations from the same fixture
const { headingWords, title, path } = singleResult[0]
cy.get('input[aria-label=search]')
  .type(headingWords, { delay: 70 })
cy.get('.search-section-header')
  .parent('.search-area')
  .contains('.search-page', title)
  .click()
cy.location('pathname')
  .should('equal', '/' + path)
Load test data from a fixture file & intercept too
cy.intercept('/search-data.json').as('search')
cy.visit('/')
cy.wait('@search').its('response.body')
  .then(list => {
    return Cypress._.find(list, { title: 'Accessibility in Angular' })
  })
  .then(result => {
    expect(result).to.be.an('object')
    const { headingWords, title, path } = result
    const search = headingWords.split(' ')[0]
    // delay each keystroke for the demo
    cy.get('input[aria-label=search]')
      .type(search, { delay: 70 })
    cy.contains('.search-page a', title).click()
    cy.location('pathname')
      .should('equal', '/' + path)
  })
spy on network call
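The data handling inside the spy test (find a record by title, take the first heading word as the search term, build the expected pathname) can be pulled into a plain function and checked without a browser. A minimal sketch, assuming records shaped like the fixture file. `planSearchTest` is a hypothetical helper, and the `headingWords` and `path` values of the second record are invented for illustration.

```javascript
// Records shaped like cypress/fixtures/single-result.json.
// The second record's headingWords and path are made up.
const list = [
  { headingWords: 'testing', path: 'cli/test', title: 'Testing is fun' },
  {
    headingWords: 'accessibility angular',
    path: 'guide/accessibility',
    title: 'Accessibility in Angular',
  },
]

// Hypothetical helper: derive what to type and what to expect
// from the record with the given title.
function planSearchTest(records, title) {
  const result = records.find((r) => r.title === title)
  if (!result) {
    throw new Error(`no search record titled "${title}"`)
  }
  return {
    search: result.headingWords.split(' ')[0], // first heading word only
    title: result.title,
    pathname: '/' + result.path,
  }
}

console.log(planSearchTest(list, 'Accessibility in Angular'))
```

Failing loudly when the record is missing plays the same role as the `expect(result).to.be.an('object')` assertion in the Cypress test: the test stops with a clear message if the real search data changes.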
code comments are not indexed 😟
{
  "index_name": "cypress-examples",
  "start_urls": ["https://glebbahmutov.com/cypress-examples/"],
  "selectors": {
    "lvl0": {
      "selector": ".site-name",
      "global": true
    },
    "lvl1": ".content__default h1",
    "lvl2": ".content__default h2",
    "lvl3": ".content__default h3",
    "lvl4": ".content__default h4",
    "lvl5": ".content__default h5",
    "text": ".content__default p, .content__default li, .content__default pre .comment"
  }
}
algolia-config.json
scrape paragraphs AND list items AND code comments
Tip: use $$(selector) in DevTools
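In the DevTools console, `$$(selector)` returns the elements matching a CSS selector, which makes it a quick way to preview what a scraper selector would pick up before changing the config. The same check can be written as a small function. `previewSelector` is a hypothetical helper name; it relies only on the standard `querySelectorAll` and `textContent` DOM APIs.

```javascript
// Hypothetical helper: list the non-empty text of every element
// that a scraper selector would match under the given root.
function previewSelector(root, selector) {
  return Array.from(root.querySelectorAll(selector))
    .map((el) => el.textContent.trim())
    .filter((text) => text.length > 0)
}

// Only runs where a DOM exists, for example pasted into the DevTools console.
if (typeof document !== 'undefined') {
  console.log(previewSelector(document, '.content__default pre .comment'))
}
```

An empty list for `.content__default pre .comment` would suggest the syntax highlighter on the site uses a different class name for comments, so the selector needs adjusting before the next scrape.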
⚠️ free Algolia plans only get weekly emails with such queries
write docs for these searches!
Browse slides https://slides.com/bahmutov/testing-docs
Look at code https://github.com/bahmutov/testing-angular-docs-search
Read https://glebbahmutov.com/blog/scrape-static-site-with-algolia/
Follow @bahmutov
Act: on climate
By Gleb Bahmutov
Good documentation with powerful search is the key to the project's success with users. I will show how to configure Algolia search to scrape your site, and how to test the search using Cypress.io test runner. Presented at AngularUp in Nov 2020. Video at https://www.youtube.com/watch?v=cqhV8UbT5LQ