Performance Tuning

About Jack Lawson

This is one of the graphs that you prefer to watch go down and to the right: median site load time over the last three months.

global-page-load

Page load time has been shown to correlate directly with user frustration: the slower a site loads, the more likely users are to leave after their first visit and never come back. With the benefit of great tooling and advice, we’ve been able to cut our site load time by about 25% and create a clear vision of how to continue on the path to an ever-faster website.

Tooling for Identifying and Tracking

Two types of tools are essential for our performance testing: the tools to track what makes page loads slow, and the tools to log historical page load data and correlate it against code changes.

The first place to start is with a network waterfall graph. Chances are, unless you have a particularly fast website already, optimizing the number, order, and size of your assets will net you the largest gain. In all likelihood, the 700ms it takes your browser to download a blocking JavaScript file will outweigh any server change you could make by an order of magnitude, and will be easier to fix. If your goal is to drive down page load time, start here.
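To make the "number, order, and size" idea concrete, here is a minimal sketch that summarizes waterfall-style data in code. It assumes entries shaped like the browser's Resource Timing entries (`name`, `startTime`, `duration`, `transferSize`); the function name `summarizeWaterfall` is hypothetical, and this is not a tool the post describes using.

```javascript
// Summarize waterfall-style entries: request count, total bytes,
// and the slowest requests first.
// Each entry is assumed to look like a Resource Timing entry:
// { name, startTime, duration, transferSize }.
function summarizeWaterfall(entries, topN) {
  const slowest = entries
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.duration - a.duration)
    .slice(0, topN)
    .map((e) => ({ name: e.name, duration: Math.round(e.duration) }));
  const totalBytes = entries.reduce((sum, e) => sum + (e.transferSize || 0), 0);
  return { requestCount: entries.length, totalBytes: totalBytes, slowest: slowest };
}

// In a browser that supports the Resource Timing API, log the five
// slowest requests once the page has finished loading.
if (typeof window !== 'undefined' && window.performance && performance.getEntriesByType) {
  window.addEventListener('load', function () {
    console.log(summarizeWaterfall(performance.getEntriesByType('resource'), 5));
  });
}
```

The dev tools waterfall remains the richer view; a summary like this is mainly useful for spotting the handful of requests worth looking at first.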

Your favorite browser will have dev tools that show a list of network requests, known as a “waterfall”. Chrome’s looks something like this:

chrome-dev-tools

The waterfall will also show you information about asset size, load order, and which assets are blocking rendering and downloading. You can also use Chrome’s dev tools to inspect the headers of individual requests to check for large cookie sizes and missing gzip compression. A related source of performance suggestions is PageSpeed Insights, which will point out which assets are too large, uncompressed, uncached, loaded improperly, or otherwise unoptimized.
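The header checks mentioned above (cookie size and gzip) can also be expressed as a small function. This is a sketch under assumptions: header names are lowercased keys in plain objects, and the function name `auditHeaders` and the size limit passed to it are illustrative, not values from the post.

```javascript
// Return a list of warnings for one request/response pair.
// Header names are assumed to be lowercased keys of plain objects.
function auditHeaders(requestHeaders, responseHeaders, maxCookieChars) {
  const cookie = requestHeaders['cookie'] || '';
  const encoding = (responseHeaders['content-encoding'] || '').toLowerCase();
  const warnings = [];
  // Cookies are sent with every request to the domain, so a large
  // cookie header adds upload cost to each asset fetched from it.
  if (cookie.length > maxCookieChars) {
    warnings.push('cookie header is ' + cookie.length + ' characters');
  }
  if (encoding !== 'gzip') {
    warnings.push('response is not gzip-compressed');
  }
  return warnings;
}

// Example: a 5000-character cookie and no Content-Encoding header
// produces both warnings.
console.log(auditHeaders({ cookie: 'a'.repeat(5000) }, {}, 4096));
```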

page-speed

Another fantastic tool for finding performance issues is the free WebPagetest. It allows you to test site performance from a variety of worldwide locations, and provides a ton of information in its output. For example, here’s a recent test of the Airbnb homepage. We can select the median result, then look at what assets are loaded and when page rendering begins and ends. It has several comparison views, and will even export a video. The visual aid it provides may show that although time-to-pageload is low, visual completion might come significantly later.

On the server, we use New Relic, which will highlight server inefficiencies and allow you to track down things like cache misses. It also has some tremendously useful information about performance in general, such as which pages account for the most total load time across users, so you know where to invest your optimization efforts. You can also watch as deploys go out and compare the current average load time against prior time periods to look for both regressions and the results of your optimizations.

Lastly, we use a real user monitoring (RUM) tracker to watch performance trends in conjunction with historical WebPagetest data. That RUM data is compiled using episodes.js and sent to a Hive cluster, which we then query for data based on page, location, or composite medians and averages.
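As an illustration of the per-page medians and averages mentioned above, here is a minimal sketch in plain JavaScript. The sample shape (`{ page, loadMs }`) and the function names are assumptions made for the example; the post's actual queries run against Hive and are not shown here.

```javascript
// Median of a list of numbers (mean of the two middle values when the
// list has an even length).
function median(values) {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Arithmetic mean of a list of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Group RUM samples by page and compute count, median, and mean load time.
// Each sample is assumed to look like { page: '/', loadMs: 1234 }.
function summarizeByPage(samples) {
  const byPage = {};
  for (const s of samples) {
    (byPage[s.page] = byPage[s.page] || []).push(s.loadMs);
  }
  const summary = {};
  for (const page of Object.keys(byPage)) {
    const times = byPage[page];
    summary[page] = { count: times.length, median: median(times), mean: mean(times) };
  }
  return summary;
}
```

Tracking both numbers is useful because a few very slow loads pull the mean up sharply while barely moving the median.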

Acting Upon this Wealth of Information

Following the list of PageSpeed Insights suggestions will get you most of the way there. In our experience, most gains have come from the simplest of rules: “reduce requests.” Sprite images in stylesheets, combine and minify multiple JavaScript files, and embed small images as data URIs. Watch the dev tools waterfall and make sure that you don’t have assets blocking the download of other assets. Finally, put JavaScript at the bottom of the page. An extension of that is to lazy-load content; for example, on our homepage, additional gallery images are loaded after page load. (Note that, where possible, lazy-loading is done not in jQuery’s DOM-ready handler, $(function () { ... }), but in the window load handler, $(window).on('load', function () { ... }). This allows the DOM to render and higher-priority scripts to execute before things like hidden gallery images are loaded.)
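The lazy-loading approach described above might look roughly like the following. This is a sketch, not our production code: it assumes deferred images are marked up with a `data-src` attribute instead of `src`, and the function name `loadDeferredImages` is hypothetical.

```javascript
// Copy each element's deferred URL (data-src) into src so the browser
// starts fetching it. Works on anything with getAttribute/setAttribute/
// removeAttribute, which keeps the function testable outside a browser.
// Returns the number of images whose download was started.
function loadDeferredImages(images) {
  let started = 0;
  for (const img of images) {
    const deferred = img.getAttribute('data-src');
    if (deferred) {
      img.setAttribute('src', deferred);
      img.removeAttribute('data-src');
      started += 1;
    }
  }
  return started;
}

// In the browser, with jQuery present: the window load event fires after
// the page and its initial assets have loaded, whereas $(function) fires
// earlier, at DOM ready.
if (typeof window !== 'undefined' && window.jQuery) {
  window.jQuery(window).on('load', function () {
    loadDeferredImages(window.jQuery('img[data-src]').get());
  });
}
```

Waiting for the load event means the deferred images never compete with the assets needed for the first render.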

Because we use Rails, we get automatic JavaScript and CSS combining and minifying through Sprockets. Sprockets also provides data-uri helpers, so you can embed small images with ease. You can use tools like browserify and Grunt in Node, or equivalents in your environment of choice. Google has also released PageSpeed modules for nginx and Apache (ngx_pagespeed and mod_pagespeed) that will apply some of these rules for you automatically.
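For readers outside Rails, the idea behind a data-uri helper can be sketched in a few lines of Node. This is not the Sprockets implementation; the function names and the size threshold are assumptions for illustration.

```javascript
// Build a data URI from raw bytes (a Node Buffer or an array of byte values).
function toDataUri(bytes, mimeType) {
  return 'data:' + mimeType + ';base64,' + Buffer.from(bytes).toString('base64');
}

// Embed the asset only when it is small; otherwise keep the normal URL.
// Base64 makes the payload roughly a third larger, so inlining only
// pays off when it saves a request for a small file.
function embedIfSmall(bytes, mimeType, url, maxBytes) {
  return bytes.length <= maxBytes ? toDataUri(bytes, mimeType) : url;
}

console.log(toDataUri(Buffer.from('hi'), 'text/plain'));
// prints "data:text/plain;base64,aGk="
```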

We also cache non-personalized pages aggressively with Akamai. For example, a recent change to begin caching the homepage has dropped load time for users by several hundred milliseconds:

p1-caching

With this approach, users load the page directly from edge CDN servers close to them rather than waiting on a round trip to our origin servers. If you’re using Dyn DNS or Amazon’s Route 53, you can experiment with sending users to different CDNs based on latency or geography.
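One way to tell an edge cache which pages are safe to share is the Cache-Control response header. The sketch below is an assumption about how such a rule could look, not a description of our Akamai configuration: the `session_id` cookie name, the function name, and the TTL are all hypothetical.

```javascript
// Decide a Cache-Control value for a response.
// Assumption: a request counts as personalized when it carries a cookie
// named "session_id" (a hypothetical name chosen for this example).
function cacheControlFor(request, edgeTtlSeconds) {
  const cookies = request.headers['cookie'] || '';
  const personalized = /(^|;\s*)session_id=/.test(cookies);
  if (request.method !== 'GET' || personalized) {
    // Never let a shared cache store per-user or non-GET responses.
    return 'private, no-store';
  }
  // s-maxage applies to shared caches such as CDN edge servers.
  return 'public, s-maxage=' + edgeTtlSeconds;
}

console.log(cacheControlFor({ method: 'GET', headers: {} }, 300));
// prints "public, s-maxage=300"
```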

We shard static assets (images, CSS, JavaScript, and web fonts) across three cookieless domains. This lets browsers download more assets in parallel, because browsers limit the number of simultaneous connections per hostname. Once we have SPDY enabled, however, we’ll reduce the domains to a single asset domain in order to take advantage of SPDY’s multiplexing of requests over a single connection.
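A minimal sketch of how asset paths can be assigned to shards follows. The hostnames are placeholders on example.com, and the hash function and function names are assumptions; the post does not describe how our shards are actually chosen.

```javascript
// Hypothetical shard hostnames; three cookieless asset domains.
const SHARD_HOSTS = [
  'a0.static.example.com',
  'a1.static.example.com',
  'a2.static.example.com',
];

// Simple deterministic string hash, kept within unsigned 32-bit range.
function hashPath(path) {
  let h = 0;
  for (let i = 0; i < path.length; i++) {
    h = (h * 31 + path.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Map an asset path to a full URL on one of the shard hosts.
function shardUrl(path) {
  const host = SHARD_HOSTS[hashPath(path) % SHARD_HOSTS.length];
  return 'https://' + host + path;
}
```

Hashing the path, rather than picking a shard at random, means a given asset always resolves to the same hostname, so the browser cache is reused across page views.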

Future Steps

We have our work cut out for us. Next we’ll be working on:

  • SPDY integration
  • HTTP streaming, which is possible in Rails 3.1+, Node, and many other frameworks
  • Cutting down bundled asset size by removing old JavaScript and CSS and relying more heavily on our style platform
  • Smarter edge caching with better global distribution
  • Continuing to go page-by-page with Chrome’s dev tools and WebPagetest

Special thanks to Steve Souders for the fantastic tech talk he gave at our office in March, and for the advice on how to make our pages faster.

Be sure to check out his tech talk here: Dive into Performance

Comments

  1. Jennifer Showe

    Congrats and nice walk through! It’s really amazing how much can be accomplished even with free tools (mostly) available for profiling and testing. Glad to see the note about moving away from domain sharding though. There’s a good explanation of why that can be particularly bad for mobile in @guypod slides here: http://www.slideshare.net/guypod/unravelling-mobile-web-performance

    Cheers!

  2. Jim Haughwout

    I love these down-and-to-right graphs. Not enough people realize how much they bring traffic/revenue/etc up-and-to-the-right. Thanks for sharing

  3. Perry Huang

    The memcached monitor plugin for New Relic does not really provide much information about cache misses if you’re trying to really dig deep into use of cache servers. It just polls general stats that memcached spits out. I’m releasing a simple tool soon that can analyze (either live or offline) memcached packets to tell the hit/miss rate, value size stats, raw # of get/set/other operations, and more for specific groups of keys. It’ll be useful for very heavy users of memcached so that they can track cache use errors in large code bases (such as never setting a specific group of keys after a cache miss occurs).

  4. hernan

    The web framework “Perl Catalyst” generates debug output on the fly for each URL you access, which makes it easy to debug everything that your application is doing, e.g.:
    - how long each method takes to finish
    - how many methods it calls
    and you can easily identify where it’s taking too much time.

    Here is the output example:

    $ script/myapp_server.pl -r
    [debug] Debug messages enabled
    [debug] Statistics enabled
    [debug] Loaded plugins:
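    .----------------------------------------------------------------------------.
    | Catalyst::Plugin::ConfigLoader 0.30 |
    | Catalyst::Plugin::StackTrace 0.11 |
    '----------------------------------------------------------------------------'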

    [debug] Loaded dispatcher "Catalyst::Dispatcher"
    [debug] Loaded engine "Catalyst::Engine"
    [debug] Found home "/home/catalyst/MyApp"
    [debug] Loaded Config "/home/catalyst/MyApp/myapp.conf"
    [debug] Loaded components:
    .-----------------------------------------------------------------+----------.
    | Class | Type |
    +-----------------------------------------------------------------+----------+
    | MyApp::Controller::Books | instance |
    | MyApp::Controller::Root | instance |
    | MyApp::Model::DB | instance |
    | MyApp::Model::DB::Author | class |
    | MyApp::Model::DB::Book | class |
    | MyApp::Model::DB::BookAuthor | class |
    | MyApp::View::HTML | instance |
    '-----------------------------------------------------------------+----------'

    [debug] Loaded Private actions:
    .----------------------+--------------------------------------+--------------.
    | Private | Class | Method |
    +----------------------+--------------------------------------+--------------+
    | /default | MyApp::Controller::Root | default |
    | /end | MyApp::Controller::Root | end |
    | /index | MyApp::Controller::Root | index |
    | /books/index | MyApp::Controller::Books | index |
    | /books/list | MyApp::Controller::Books | list |
    '----------------------+--------------------------------------+--------------'

    [debug] Loaded Path actions:
    .-------------------------------------+--------------------------------------.
    | Path | Private |
    +-------------------------------------+--------------------------------------+
    | / | /default |
    | / | /index |
    | /books | /books/index |
    | /books/list | /books/list |
    '-------------------------------------+--------------------------------------'

    [info] MyApp powered by Catalyst 5.80020
    HTTP::Server::PSGI: Accepting connections at http://0:3000

    [info] WWW::Local::Webserver powered by Catalyst 5.90020
    HTTP::Server::PSGI: Accepting connections at http://0:15000/
    [info] *** Request 1 (0.001/s) [2732] [Wed Jun 26 21:52:28 2013] ***
    [debug] Path is "/"
    [debug] "GET" request for "/" from "127.0.0.1"
    [debug] Response Code: 200; Content-Type: text/html; charset=utf-8; Content-Length: 5521
    [info] Request took 0.015545s (64.329/s)
    .------------------------------------------------------------+-----------.
    | Action | Time |
    +------------------------------------------------------------+-----------+
    | /index | 0.000151s |
    | /end | 0.000326s |
    '------------------------------------------------------------+-----------'

  5. Will Moss

    Glad to hear about the performance gains. Here’s another set of slides on the subject of optimization I’ve found interesting:

    http://www.igvita.com/slides/2013/fluent-perfcourse.pdf

  6. Mike Schoeffler

    Sweet improvements in your site load times – congratulations! Your users may never thank you with words, but they’ll express their appreciation in more bookings.

    I’m calculating the value side of performance speedups for different websites and pages within sites (“how much do we make if this page speeds up 50%?”).

    I’d love to get more detail on the “getting it done” side.

    How much engineering effort was involved in the lowest hanging fruit (planning, execution, future maintenance)?
    How hard do you think the next round will be?
    Can you dig even deeper if it’s worth it?

  7. Steve Clay

    What exactly is the measurement in the first graph? How do you define “load time”?

  8. jason

    How do you guys handle load balancing and fault tolerance? In case one of the webservers goes down, etc.

    Thanks