Performance
Sites have more features than ever before.
So much so, that many sites now struggle to achieve a high level of performance across a variety of network conditions and devices.
Performance is about retaining users
Performance is about the user experience
Performance is about people
Poorly performing sites and applications can also pose real costs for the people who use them.
As mobile users continue to make up a larger portion of internet users worldwide, it's important to bear in mind that many of these users access the web through mobile LTE, 4G, 3G and even 2G networks.
Mind what resources you send
Mind how you send resources
Migrate to HTTP/2. HTTP/2 addresses many performance problems inherent in HTTP/1.1, such as concurrent request limits and the lack of header compression.
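For example, on Apache 2.4.17+ with mod_http2 enabled, HTTP/2 can be switched on with a single directive in the server or virtual-host configuration (a sketch; browsers only speak h2 over HTTPS, so TLS must already be set up):
Protocols h2 http/1.1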
Mind how much data you send
<img
  srcset="
    /wp-content/uploads/flamingo4x.jpg 4x,
    /wp-content/uploads/flamingo3x.jpg 3x,
    /wp-content/uploads/flamingo2x.jpg 2x,
    /wp-content/uploads/flamingo1x.jpg 1x
  "
  src="/wp-content/uploads/flamingo-fallback.jpg"
>
Example
Frames
For websites to show silky smooth animation, the browser has to render at least 60 frames per second.
To render at 60 fps, we have only 16.67 ms to process each frame. This involves parsing and executing JavaScript, rendering, compositing layers, and painting.
But the browser also has housekeeping work to do (such as monitoring network responses and coordinating with the OS), so realistically we only have ~10 ms per frame.
Critical Rendering Path
Optimizing the critical rendering path refers to prioritizing the display of content that relates to the current user action.
The intermediate steps between receiving the HTML, CSS, and JavaScript bytes and the required processing to turn them into rendered pixels - that's the critical rendering path.
In Chrome DevTools, this phase is labeled "Parse HTML".
The DOM tree captures the properties and relationships of the document markup, but it doesn't tell us how the element will look when rendered. That’s the responsibility of the CSSOM.
body { font-size: 16px }
p { font-weight: bold }
span { color: red }
p span { display: none }
img { float: right }
Why does the CSSOM have a tree structure?
When computing the final set of styles for any object on the page, the browser starts with the most general rule applicable to that node (for example, if it is a child of a body element, then all body styles apply) and then recursively refines the computed styles by applying more specific rules; that is, the rules "cascade down."
The CSSOM and DOM trees are combined into a render tree, which is then used to compute the layout of each visible element and serves as an input to the paint process that renders the pixels to screen.
<!DOCTYPE html>
<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Critical Path: Hello world!</title>
  </head>
  <body>
    <div style="width: 50%">
      <div style="width: 50%">Hello world!</div>
    </div>
  </body>
</html>
The output of the layout process is a "box model," which precisely captures the exact position and size of each element within the viewport: all of the relative measurements are converted to absolute pixels on the screen.
Now that we know which nodes are visible, and their computed styles and geometry, we can pass this information to the final stage, which converts each node in the render tree to actual pixels on the screen. This step is often referred to as "painting" or "rasterizing."
The "Layout" event captures the render tree construction, position, and size calculation in the Timeline.
When layout is complete, the browser issues "Paint Setup" and "Paint" events, which convert the render tree to pixels on the screen.
Here's a quick recap of the browser's steps:
1. Process HTML markup and build the DOM tree.
2. Process CSS markup and build the CSSOM tree.
3. Combine the DOM and CSSOM into a render tree.
4. Run layout on the render tree to compute the geometry of each node.
5. Paint the individual nodes to the screen.
Optimizing the critical rendering path is the process of minimizing the total amount of time spent performing steps 1 through 5 in the above sequence.
This example, showing the NYTimes website with and without CSS, demonstrates why rendering is blocked until CSS is available: without CSS, the page is relatively unusable. The unstyled experience is often referred to as a "Flash of Unstyled Content" (FOUC).
CSS is a render blocking resource. Get it to the client as soon and as quickly as possible to optimize the time to first render.
<link href="style.css" rel="stylesheet">
<link href="print.css" rel="stylesheet" media="print">
<link href="other.css" rel="stylesheet" media="(min-width: 40em)">
<link href="portrait.css" rel="stylesheet" media="orientation:portrait">
However, what if we have some CSS styles that are only used under certain conditions, for example, when the page is being printed or being projected onto a large monitor?
JavaScript can also block DOM construction and delay when the page is rendered.
When the HTML parser encounters a script tag, it pauses its process of constructing the DOM and yields control to the JavaScript engine; after the JavaScript engine finishes running, the browser then picks up where it left off and resumes DOM construction.
Another subtle property of introducing scripts into our page is that they can read and modify not just the DOM, but also the CSSOM properties. The end result? We now have a race condition.
What if the browser hasn't finished downloading and building the CSSOM when we want to run our script?
The answer is simple and not very good for performance:
the browser delays script execution and DOM construction until it has finished downloading and constructing the CSSOM.
WITH ASYNC, IN THE HEAD
The script is fetched asynchronously, and when it’s ready the HTML parsing is paused to execute the script, then it’s resumed.
WITH DEFER, IN THE HEAD
The script is fetched asynchronously, and it’s executed only after the HTML parsing is done.
Deferred scripts are guaranteed to execute in the order in which they appear in the page.
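A minimal sketch of the two attributes side by side (the file names are placeholders):
<head>
  <!-- async: fetched in parallel, executed as soon as it arrives, pausing the parser -->
  <script async src="analytics.js"></script>
  <!-- defer: fetched in parallel, executed after parsing finishes, in document order -->
  <script defer src="app.js"></script>
  <script defer src="widgets.js"></script>
</head>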
The combination of the Navigation Timing API and other browser events emitted as the page loads allows you to capture and record the real-world CRP performance of any page.
Each of these Navigation Timing labels (domLoading, domInteractive, domContentLoaded, domComplete, loadEvent) corresponds to a high-resolution timestamp that the browser tracks for each and every page it loads.
So, what do these timestamps mean?
domLoading: the starting timestamp; the browser is about to start parsing the first received bytes of the HTML document.
domInteractive: the browser has finished parsing all of the HTML and DOM construction is complete.
domContentLoaded: the DOM is ready and there are no stylesheets blocking JavaScript execution, so the render tree can (potentially) be constructed.
domComplete: all processing is complete and all resources on the page (images, etc.) have finished downloading.
loadEvent: as a final step, the browser fires the onload event, which can trigger additional application logic.
<!DOCTYPE html>
<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Critical Path: No Style</title>
  </head>
  <body>
    <p>Hello <span>web performance</span> students!</p>
    <div><img src="awesome-photo.jpg"></div>
  </body>
</html>
<!DOCTYPE html>
<html>
  <head>
    <title>Critical Path: Measure Script</title>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <link href="style.css" rel="stylesheet">
  </head>
  <body onload="measureCRP()">
    <p>Hello <span>web performance</span> students!</p>
    <div><img src="awesome-photo.jpg"></div>
    <script src="timing.js"></script>
  </body>
</html>
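The page above loads timing.js and calls measureCRP() on load; the notes don't include that script, but a minimal sketch using the legacy performance.timing interface might look like this:
function measureCRP() {
  var t = window.performance.timing;
  console.log({
    // DOM fully parsed.
    domInteractive: t.domInteractive - t.navigationStart,
    // DOM ready and no stylesheets blocking JavaScript execution.
    domContentLoaded: t.domContentLoadedEventStart - t.navigationStart,
    // Page and all of its subresources (images, etc.) finished loading.
    domComplete: t.domComplete - t.navigationStart
  });
}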
What happened?
The simplest possible page consists of just the HTML markup: no CSS, no JavaScript, and no other types of resources.
Let's define the vocabulary we use to describe the critical rendering path:
If the CSS stylesheet were only needed for print, how would that look?
To deliver the fastest possible time to first render, we need to minimize three variables:
The number of critical resources.
The critical path length.
The number of critical bytes.
A critical resource is a resource that could block initial rendering of the page. The fewer of these resources, the less work for the browser, the CPU, and other resources.
Similarly, the critical path length is a function of the dependency graph between the critical resources and their bytesize:
some resource downloads can only be initiated after a previous resource has been processed, and the larger the resource the more roundtrips it takes to download.
Finally, the fewer critical bytes the browser has to download, the faster it can process content and render it visible on the screen.
To reduce the number of bytes, we can reduce the number of resources (eliminate them or make them non-critical) and ensure that we minimize the transfer size by compressing and optimizing each resource.
Minify, Compress, and Cache
The general sequence of steps to optimize the critical rendering path is:
Analyze and characterize your critical path: number of resources, bytes, length.
Minimize the number of critical resources: eliminate them, defer their download, mark them as async, and so on.
Optimize the number of critical bytes to reduce the download time (number of roundtrips).
Optimize the order in which the remaining critical resources are loaded: download all critical assets as early as possible to shorten the critical path length.
If we're certain that a specific resource will be required in the future, then we can ask the browser to request that item and store it in the cache for reference later.
However, this is dependent on a number of conditions, as prefetching can be ignored by the browser.
For example, a client might abandon the request of a large font file on a slow network.
Firefox will only prefetch resources when "the browser is idle".
As developers, we know our applications better than the browser does. We can use this information to inform the browser about core resources.
<link rel="dns-prefetch" href="//example.com">
That simple line tells supporting browsers to start prefetching the DNS for that domain a fraction before it's actually needed.
This means that the DNS lookup process will already be underway by the time the browser hits the script element that actually requests the widget.
It just gives the browser a small head start.
Much like dns-prefetch, preconnect resolves the DNS, but it also performs the TCP handshake and optional TLS negotiation.
<link rel="preconnect" href="http://css-tricks.com">
This is the nuclear option, as prerender gives us the ability to preemptively load all of the assets of a certain document, like so:
<link rel="prerender" href="http://css-tricks.com">
This is like opening the URL in a hidden tab – all the resources are downloaded, the DOM is created, the page is laid out, the CSS is applied, the JavaScript is executed, etc.
If the user navigates to the specified href, then the hidden page is swapped into view making it appear to load instantly.
Preload
The preload value of the <link> element's rel attribute allows you to write declarative fetch requests in your HTML <head>, specifying resources that your pages will need very soon after loading, which you therefore want to start preloading early in the lifecycle of a page load, before the browser's main rendering machinery kicks in.
It allows you to force the browser to make a request for a resource without blocking the document’s onload event.
<head>
  <meta charset="utf-8">
  <title>JS and CSS preload example</title>

  <link rel="preload" href="style.css" as="style">
  <link rel="preload" href="main.js" as="script">

  <link rel="stylesheet" href="style.css">
</head>
<body>
  <h1>bouncing balls</h1>
  <canvas></canvas>

  <script src="main.js"></script>
</body>
Making a Frame
RAIL is a user-centric performance model that breaks down the user's experience into key actions.
Every web app has four distinct aspects to its life cycle (Response, Animation, Idle, and Load), and performance fits into them in different ways.
In the context of RAIL, the terms goals and guidelines have specific meanings: goals are key performance metrics related to user experience (since human perception is relatively constant, these are unlikely to change), while guidelines are recommendations that help you achieve those goals and may change as hardware and network conditions change.
Between 100 and 300 milliseconds, users experience a slight perceptible delay.
Beyond 1000 milliseconds (1 second), users lose focus on the task they are performing.
Keep in mind that although your typical mobile user's device might claim that it's on a 2G, 3G, or 4G connection, in reality the effective connection speed is often significantly slower, due to packet loss and network variance.
Goal: Maximize idle time to increase the odds that the page responds to user input within 50ms.
Goals:
Technically, the maximum budget for each frame is 16ms (1000ms / 60 frames per second ≈ 16ms), but browsers need about 6ms to render each frame, hence the guideline of ~10ms of work per frame.
Guidelines:
In high pressure points like animations, the key is to do nothing where you can, and the absolute minimum where you can't.
Also animations
Goal: Complete a transition initiated by user input within 100ms.
Guidelines:
Goals:
Guidelines
Optimizing JavaScript
What is tree shaking?
Tree shaking is a form of dead code elimination: a bundler such as webpack or Rollup statically analyzes ES6 import/export statements and drops exports that are never imported.
// Import all the array utilities!
import arrayUtils from "array-utils";
// Import only some of the utilities!
import { unique, implode, explode } from "array-utils";
Keeping Babel from transpiling ES6 modules to CommonJS modules
{
  "presets": [
    ["env", {
      "modules": false
    }]
  ]
}
Keeping side effects in mind
let fruits = ["apple", "orange", "pear"];
console.log(fruits); // (3) ["apple", "orange", "pear"]
const addFruit = function(fruit) {
fruits.push(fruit);
};
addFruit("kiwi");
console.log(fruits); // (4) ["apple", "orange", "pear", "kiwi"]
Optimize JavaScript Execution
Use requestAnimationFrame for visual changes
Reduce complexity or use Web Workers
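A rough sketch of both guidelines together: heavy, non-visual work moves into a Web Worker, and the resulting visual update is scheduled with requestAnimationFrame (sort-worker.js, hugeArray, and render() are assumptions for illustration):
// main.js
const worker = new Worker('sort-worker.js');  // heavy work runs off the main thread
worker.postMessage(hugeArray);
worker.onmessage = (e) => {
  // Apply the visual change at the start of the next frame.
  requestAnimationFrame(() => render(e.data));
};

// sort-worker.js
self.onmessage = (e) => {
  const sorted = e.data.slice().sort((a, b) => a - b);
  self.postMessage(sorted);
};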
Reduce the Scope and Complexity of Style Calculations
Changing the DOM, by adding and removing elements, changing attributes or classes, or animating, causes the browser to recalculate element styles and, in many cases, to lay out (reflow) the page or parts of it.
.box:nth-last-child(-n+1) .title {
  /* styles */
}

.final-box-title {
  /* styles */
}
For the first selector, the browser has to ask: is this an element with a class of title which has a parent who happens to be the minus nth child plus 1 element with a class of box? Answering that is far more expensive than matching a single class.
Reduce the complexity of your selectors; use a class-centric methodology like BEM.
Avoid layout wherever possible
Changes to “geometric properties”, such as widths, heights, left, or top, all require layout.
Layout is almost always scoped to the entire document. If you have a lot of elements, it’s going to take a long time to figure out the locations and dimensions of them all.
Use flexbox over older layout models
If we update the sample to use Flexbox, a more recent addition to the web platform, we get a different picture: the same layout takes significantly less time.
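For instance, a simple row of equal-width items (a sketch; the class names are arbitrary):
/* Older, float-based approach */
.container--float .item {
  float: left;
  width: 25%;
}

/* Flexbox: typically cheaper to lay out and easier to reason about */
.container--flex {
  display: flex;
}
.container--flex .item {
  flex: 1;
}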
Avoid forced synchronous layouts
// Schedule our function to run at the start of the frame.
requestAnimationFrame(logBoxHeight);

function logBoxHeight() {
  // Gets the height of the box in pixels and logs it out.
  console.log(box.offsetHeight);
}
// This version forces a synchronous layout: adding the class invalidates styles,
// so reading offsetHeight makes the browser recalculate styles and layout immediately.
function logBoxHeight() {
  box.classList.add('super-big');

  // Gets the height of the box in pixels
  // and logs it out.
  console.log(box.offsetHeight);
}
Avoid layout thrashing
There’s a way to make forced synchronous layouts even worse: do lots of them in quick succession. Take a look at this code:
function resizeAllParagraphsToMatchBlockWidth() {
  // Puts the browser into a read-write-read-write cycle.
  for (var i = 0; i < paragraphs.length; i++) {
    paragraphs[i].style.width = box.offsetWidth + 'px';
  }
}
// Read once, up front.
var width = box.offsetWidth;

function resizeAllParagraphsToMatchBlockWidth() {
  for (var i = 0; i < paragraphs.length; i++) {
    // Now write.
    paragraphs[i].style.width = width + 'px';
  }
}
In a web page, every character of content, structure, formatting, and behavior must be fetched from the server and downloaded to the browser, a decidedly non-trivial task.
Separate Development from Deployment
Always keep development and deployment files separate to avoid replacing a development file with a deployment version.
Minify Your Code
HTML, CSS, and JS can all be minified.
Frameworks
Use task runners such as gulp or a module bundler such as webpack to automate the task.
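A minimal webpack sketch of this (the entry and output names are assumptions); production mode turns on minification (via Terser) and related optimizations automatically:
// webpack.config.js
module.exports = {
  mode: 'production',  // enables minification and other build-time optimizations
  entry: './src/index.js',
  output: { filename: 'bundle.[contenthash].js' }
};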
Compress Text Resources
Gzip performs best on text resources and can regularly achieve up to 70% compression.
Already-compressed content: Most images, music and videos are already compressed. Don’t waste time compressing them again.
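On Apache, for example, text resources can be compressed with mod_deflate; a minimal sketch, assuming the module is enabled, that deliberately leaves out already-compressed types:
AddOutputFilterByType DEFLATE text/html text/css application/javascript application/json image/svg+xml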
Reduce Library Use
jQuery, for example, might range from 28 KB to over 250 KB, depending on the version and the compression algorithms applied.
HTTPArchive notes that images account for approximately 50% of the average web site's content.
Remove Unnecessary Images
Whether an image is really needed is a question that apparently isn't asked enough.
Choose Appropriate Image Types
As a rule of thumb, use PNGs for clip art, line drawings, or wherever you need transparency, JPGs for photographs, and GIFs when you need animation.
Remove Image Metadata
Removing metadata can reduce file size by up to 10%.
Resize Images
Size images based on their intended use
All your images should be appropriately sized for their intended use and should not rely on the browser to resize them for rendering.
For example, you might have a 1200 x 600 pixel image that you present at 60 x 30 (as a 5% thumbnail) and roll it up to full size on hover using a CSS transition.
It works fine and looks great, but if the user never actually hovers over the thumbnail, then 95% of the time that image took to download was wasted.
Crop images to show only what's important, omitting parts of the image that aren't important to information delivery.
Reduce image quality
burger80.jpg: 80% quality, 60 KB
burger25.jpg: 25% quality, 20 KB
Compress Images
PNG and JPG images can be squashed down even more using compression tools.
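As a sketch with common command-line tools (jpegtran, which these notes mention below, and optipng), assuming they are installed; file names are placeholders:
# Lossless JPEG optimization: strip metadata and emit a progressive file
jpegtran -optimize -progressive -copy none photo.jpg > photo.min.jpg
# Lossless PNG recompression (higher -o levels are slower but smaller)
optipng -o5 logo.png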
Vector vs. Raster images
Implications of high-resolution screens
| Screen resolution | Total pixels | Uncompressed filesize (4 bytes per pixel) |
|---|---|---|
| 1x | 100 x 100 = 10,000 | 40,000 bytes |
| 2x | 100 x 100 x 4 = 40,000 | 160,000 bytes |
| 3x | 100 x 100 x 9 = 90,000 | 360,000 bytes |
When we double the resolution of the physical screen, the total number of pixels increases by a factor of four: double the number of horizontal pixels, times double the number of vertical pixels.
Delivering HiDPI images using srcset
The Device Pixel Ratio (DPR) (also called the "CSS pixel ratio") determines how a device’s screen resolution is interpreted by CSS.
<img srcset="paul-irish-320w.jpg,
paul-irish-640w.jpg 2x,
paul-irish-960w.jpg 3x"
src="paul-irish-960w.jpg" alt="Paul Irish cameo">
Optimizing vector images
Automating image optimization
Most CDNs (e.g., Akamai) and third-party solutions like Cloudinary, imgix, Fastly's Image Optimizer, Instart Logic's SmartVision or ImageOptim API offer comprehensive automated image optimization solutions.
Everyone should be compressing their images efficiently.
At minimum: use ImageOptim. It can significantly reduce the size of images while preserving visual quality.
JPEG compression modes
Three popular modes are baseline (sequential), Progressive JPEG (PJPEG) and lossless.
Baseline JPEGs (the default for most image editing and optimization tools) are encoded and decoded in a relatively simple manner: top to bottom.
Lossless JPEGs are similar but have a smaller compression ratio.
Baseline JPEGs load top to bottom while Progressive JPEGs load from blurry to sharp.
The advantages of Progressive JPEGs
The ability for PJPEGs to offer low-resolution "previews" of an image as it loads improves perceived performance - users can feel like the image is loading faster compared to adaptive images.
On slower 3G connections, this allows users to see (roughly) what's in an image when only part of the file has been received and make a call on whether to wait for it to fully load.
The disadvantages of Progressive JPEGs
PJPEGs can be slower to decode than baseline JPEGs - sometimes taking 3x as long.
Progressive JPEGs are also not always smaller. For very small images (like thumbnails), progressive JPEGs can be larger than their baseline counterparts.
How do you create Progressive JPEGs?
Tools and libraries like ImageMagick, libjpeg, jpegtran, jpeg-recompress and imagemin support exporting Progressive JPEGs.
Chroma (or color) subsampling
Because contrast is responsible for forming the shapes we see in an image, luma, which defines contrast, is pretty important.
compress-or-die.com recommends sticking with a subsampling of 4:4:4 (1x1) when working with images containing text.
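With ImageMagick, for example, the subsampling can be set explicitly at export time (a sketch; the file names and quality value are arbitrary):
# Keep full chroma resolution (4:4:4) so text edges stay crisp
convert screenshot.png -sampling-factor 4:4:4 -quality 85 screenshot.jpg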
Compressing Animated GIFs and why <video> is better
Why are GIFs many times larger? GIF limits each frame to a 256-color palette and compresses frames losslessly (LZW) with no inter-frame motion compression, whereas video codecs like H.264 exploit the similarity between successive frames.
Use ffmpeg to convert your animated GIFs (or sources) to H.264 MP4s.
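One common invocation looks like this (a sketch; file names are placeholders): yuv420p maximizes player compatibility, the scale filter forces the even dimensions that yuv420p requires, and faststart moves metadata to the front of the file for streaming.
ffmpeg -i animation.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" animation.mp4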
Consider a lossy GIF encoder. The Giflossy fork of Gifsicle supports this with the --lossy flag and can shave ~60-65% off size.
Avoid recompressing images with lossy codecs
Lazy-load non-critical images
Lazy loading is a web performance pattern that delays the loading of images in the browser until the user needs to see them.
Images that must appear "above the fold," where the web page first appears, are loaded straight away.
Why is Lazy Loading Useful?
Caveats
How Can I Apply Lazy Loading to My Pages?
I recommend lazysizes by Alexander Farkas because of its decent performance, features, its optional integration with Intersection Observer, and support for plugins.
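If you want to see the underlying idea without a library, here is a minimal Intersection Observer sketch; the lazy class and data-src attribute are conventions assumed for this example, not lazysizes' own API:
<img class="lazy" data-src="photo.jpg" alt="…">
<script>
  const observer = new IntersectionObserver((entries, obs) => {
    entries.forEach(entry => {
      if (!entry.isIntersecting) return;
      const img = entry.target;
      img.src = img.dataset.src;  // start the real download
      obs.unobserve(img);         // each image only needs to be upgraded once
    });
  });
  document.querySelectorAll('img.lazy').forEach(img => observer.observe(img));
</script>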
Avoiding the display:none trap
<img src="img.jpg">
<style>
@media (max-width: 640px) {
img {
display: none;
}
}
</style>
A quick check against the Chrome DevTools network panel will verify that images hidden using these approaches still get fetched, even when we expect them not to be.
Again, where possible, use <picture> and <img srcset> instead of relying on display:none.
Memory Cache
Service Worker Cache
HTTP Caching ("Disk Cache")
Cache Headers
Two main types of cache headers, cache-control and expires, define the caching characteristics for your resources.
<filesMatch ".(ico|jpg|jpeg|png|gif)$">
Header set Cache-Control "max-age=2592000, public"
</filesMatch>
<filesMatch ".(css|js)$">
Header set Cache-Control "max-age=86400, public"
</filesMatch>
Expires Caching
You can also enable caching by specifying expiration, or expiry, times for certain types of files, which tell browsers how long to use a cached resource before requesting a fresh copy from the server.
Tip: Don't use an expiry greater than one year; that's effectively forever on the internet and, as noted above, is the maximum value for max-age under cache-control.
## EXPIRES CACHING ##
ExpiresActive On
ExpiresByType image/jpg "access plus 1 year"
ExpiresByType image/jpeg "access plus 1 year"
ExpiresByType text/x-javascript "access plus 1 month"
ExpiresByType application/x-shockwave-flash "access plus 1 month"
ExpiresByType image/x-icon "access plus 1 year"
ExpiresDefault "access plus 2 days"
## EXPIRES CACHING ##
Pattern 1: Immutable content + long max-age
Cache-Control: max-age=31536000
<script src="/script-f93bca2c.js"></script>
<link rel="stylesheet" href="/styles-a837cb1e.css">
<img src="/cats-0e9a2ef4.jpg" alt="…">
However, this pattern doesn't work for things like articles & blog posts. Their URLs cannot be versioned and their content must be able to change.
Pattern 2: Mutable content, always server-revalidated
Cache-Control: no-cache
Here, no-cache doesn't mean "don't cache"; it means the browser must revalidate the cached response with the server (for example via an ETag) before using it.
ETag
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
ETag: W/"0815"
Avoiding mid-air collisions
max-age on mutable content is often the wrong choice
A refresh sometimes fixes it
If the page is loaded as part of a refresh, browsers will always revalidate with the server, ignoring max-age.
A service worker can extend the life of these bugs
const version = '2';

self.addEventListener('install', event => {
  event.waitUntil(
    caches.open(`static-${version}`)
      .then(cache => cache.addAll([
        '/styles.css',
        '/script.js'
      ]))
  );
});

self.addEventListener('activate', event => {
  // …delete old caches…
});

self.addEventListener('fetch', event => {
  event.respondWith(
    caches.match(event.request)
      .then(response => response || fetch(event.request))
  );
});
Caching checklist