June 26, 2013
This is one of the graphs that you prefer to watch go down and to the right: median site load time over the last three months.
Page load time has been shown to correlate directly with user frustration; the slower a site loads, the more likely users are to leave after their first visit and never come back. With the benefit of great tooling and advice, we’ve been able to cut our site load time by about 25% and create a clear vision of how to continue on the path to an ever-faster website.
Tooling for Identifying and Tracking
Two types of tools are essential for our performance testing: the tools to track what makes page loads slow, and the tools to log historical page load data and correlate it against code changes.
Your favorite browser will have dev tools that show a list of network requests, known as a “waterfall”. Chrome’s looks something like this:
The waterfall will also show you information about asset size, load order, and which assets are blocking rendering and downloading. You can also use Chrome’s dev tools to inspect the headers of individual requests to check for large cookie sizes and gzip compression. For further performance suggestions, install and run PageSpeed Insights, which will point out which assets are too large, uncompressed, uncached, loaded improperly, or otherwise unoptimized.
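The two header checks mentioned above (oversized cookies and missing gzip) can also be scripted once you have a request's headers in hand. The following is a minimal sketch, not a tool we describe using: the function names, the 1 KB cookie threshold, and the list of compressible content types are all assumptions chosen for illustration.

```javascript
// Assumed threshold, not from the article: tune it to your own traffic.
const MAX_COOKIE_BYTES = 1024;

// Assumed list of content types that normally benefit from gzip.
const COMPRESSIBLE = /^(text\/|application\/(javascript|json|xml))/i;

// Header names are case-insensitive, so normalize them to lowercase first.
function lowerKeys(headers) {
  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    out[name.toLowerCase()] = String(value);
  }
  return out;
}

// Returns a list of human-readable warnings for one request/response pair.
function auditHeaders(requestHeaders, responseHeaders) {
  const req = lowerKeys(requestHeaders);
  const res = lowerKeys(responseHeaders);
  const warnings = [];

  // Cookies are sent with every request to the domain, so large ones
  // add upload cost to each asset fetch.
  const cookieBytes = new TextEncoder().encode(req['cookie'] || '').length;
  if (cookieBytes > MAX_COOKIE_BYTES) {
    warnings.push(`cookie header is ${cookieBytes} bytes`);
  }

  // Flag text-like responses that arrive without gzip encoding.
  const type = res['content-type'] || '';
  const encoding = res['content-encoding'] || '';
  if (COMPRESSIBLE.test(type) && !/gzip/i.test(encoding)) {
    warnings.push(`${type} served without gzip`);
  }
  return warnings;
}
```

Calling `auditHeaders` with the request and response headers copied from the dev tools network panel returns an empty array when neither problem is present.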
Another fantastic tool for finding performance issues is the free WebPagetest. It allows you to test site performance from a variety of worldwide locations, and provides a ton of information in its output. For example, here’s a recent test of the Airbnb homepage. We can select the median result, then look at what assets are loaded and when page rendering begins and ends. It has several comparison views, and will even export a video. The visual aid it provides may show that although time-to-pageload is low, visual completion might come significantly later.
On the server, we use New Relic, which will highlight server inefficiencies and allow you to track down things like cache misses. It also has some tremendously useful information about performance in general, such as which pages account for the most total time users spend loading, so you know where to invest your optimization efforts. You can also watch as deploys go out and compare the current average load time against prior time periods to look for both regressions and the results of your optimizations.
Lastly, we use a real user monitoring (RUM) tracker to watch performance trends in conjunction with historical WebPagetest data. That RUM data is compiled using episodes.js and sent to a Hive cluster, which we then query by page, by location, or for composite medians and averages.
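To make the per-page median concrete, here is a small sketch of that aggregation in JavaScript. It is not our Hive query; the sample shape (`page` and `loadMs` fields) and the function names are hypothetical, and the point is only to show what "median load time by page" computes.

```javascript
// Median of a non-empty array of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Groups RUM samples by page and returns { page: medianLoadMs }.
// The `page` and `loadMs` field names are hypothetical.
function medianLoadByPage(samples) {
  const byPage = new Map();
  for (const { page, loadMs } of samples) {
    if (!byPage.has(page)) byPage.set(page, []);
    byPage.get(page).push(loadMs);
  }
  const result = {};
  for (const [page, times] of byPage) {
    result[page] = median(times);
  }
  return result;
}
```

The median is less sensitive than the average to a handful of very slow page loads, which is one reason to track both.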
Acting Upon this Wealth of Information
$(function) handler, but rather with $(window).on('load', function); where possible. This allows the DOM to render and higher-priority scripts to execute while things like hidden gallery images are loaded.)
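As a rough illustration of the difference between the two jQuery hooks, the sketch below wires one callback to DOM ready and another to the window's load event. The helper name `wireUpPage` and the two callbacks are hypothetical; jQuery and the window object are passed in as arguments only so the wiring can be exercised outside a browser.

```javascript
// `$` is jQuery and `win` is the window object. `initCritical` and
// `initDeferred` are hypothetical placeholders for your own setup code.
function wireUpPage($, win, initCritical, initDeferred) {
  // $(fn) runs fn as soon as the DOM is ready.
  $(initCritical);

  // The window 'load' event fires later, once images and other
  // subresources have finished loading, so low-priority setup
  // placed here does not compete with the initial render.
  $(win).on('load', initDeferred);
}

// In a page with jQuery loaded, usage might look like:
// wireUpPage(jQuery, window, bindSearchForm, initHiddenGallery);
```

Anything the user needs immediately belongs in the first callback; work such as setting up a hidden image gallery can wait for the second.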
We also cache non-personalized pages aggressively with Akamai. For example, a recent change to begin caching the homepage has dropped load time for users by several hundred milliseconds:
With this approach, we can cut page load time as much as possible by letting users load directly from edge CDN servers close to them. If you’re using Dyn DNS or Amazon’s Route 53, you can experiment with sending users to different CDNs based on latency or geography.
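One common way to tell a CDN which pages are safe to cache is the `Cache-Control` response header. The sketch below is an illustration under assumptions, not our Akamai configuration: it assumes the CDN is set up to honor origin `Cache-Control` headers, and the `isPersonalized` flag, the function name, and the five-minute TTL are hypothetical.

```javascript
// Assumed TTL; the article does not state its cache lifetimes.
const EDGE_TTL_SECONDS = 300;

// Picks a Cache-Control value for a page response.
// `isPersonalized` is a hypothetical flag the application would set
// when the HTML depends on the logged-in user.
function cacheControlFor(isPersonalized) {
  if (isPersonalized) {
    // Keep per-user HTML out of shared caches entirely.
    return 'private, no-store';
  }
  // s-maxage applies to shared caches such as a CDN edge, while
  // max-age=0 makes browsers revalidate on each visit.
  return `public, max-age=0, s-maxage=${EDGE_TTL_SECONDS}`;
}
```

A logged-out homepage would get the shared-cache header, while a page showing account details would get the private one.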
We have our work cut out for us. Next we’ll be working on:
- SPDY integration
- HTTP streaming; possible in Rails 3.1+, Node, and many other frameworks
- Smarter edge caching with better global distribution
- Continuing to go page-by-page with Chrome’s dev tools and WebPagetest
Special thanks to Steve Souders for the fantastic tech talk he gave at our office in March, and for the advice on how to make our pages faster.
Be sure to check out his tech talk here: Dive into Performance