Wednesday 11 January 2012

Writing Quality Third-Party JS - Part 2: Loading Your Code

In the previous post Writing Quality Third-Party JS - Part 1: The First Rule, I wrote about the fact that you don't own the page, and how that affects even little things in your code. With that background, this post focuses on how to load your code in the host page.

This is a three-part series of articles aimed at people who want to write JavaScript widgets (or other kinds of scripts) to make their application's data/services/UI available on other websites. This is the second in the series, tackling the topic of loading your code in the host page.

Bootstrapping

Remember how in my last post I kept harping on that you don't own the page? It applies here too. Two factors to consider are:

  • Can the network performance effects of downloading your code be eliminated completely?
  • If your servers ever go down, will it affect the host page at all?

What's wrong with a <script> tag?

I won't go into too much detail here since it has been written about in enough detail elsewhere, but just to quickly summarize: a script tag will block download of resources on your page AND will block the rendering of the page. It's what Steve Souders calls the Frontend Single Point of Failure. You don't want to be a point of failure.

It surprises me then, that Twitter not only requires you to include a vanilla script tag, but also that you should put it in the <head> of your page. This is despite it being common knowledge at the time of the launch of their @Anywhere APIs that this is a bad practice. In fact, on the contrary, they say in their docs:

As a best practice always place the anywhere.js file as close to the top of the page as possible.

Now, I don't mean to make them look bad — they are smart people and know what they are doing. What I'm saying is that I can't think of any reason to do this. They claim that it's for a better experience with OAuth 2.0 authentication, but I'm not convinced. I'd recommend that you do not use Twitter's model as an example for how to load your script. There are much better mechanisms available.

To be fair, even Google Analytics was doing something very similar until recently, but at least they asked for the script tag to be before the closing </body> tag. They've since deprecated this technique anyway.

Eliminating the performance hit

Realize that my first point above isn't about reducing the performance cost, but about eliminating it. Techniques for reducing the performance hit are already well known — caching and expiry, CDNs, etc. That said, how much ever you reduce the performance hit, there is still a performance hit anyway. How can this be eliminated?

async FTW?

HTML(5) introduced the async attribute, which instructs the browser's parser-renderer to load the file asynchronously, while continuing to render the page as usual. This is a life-saver, but unfortunately not good enough for use just yet.

Why isn't it good enough? Because it isn't supported in older browsers, including IE9 and lower. Considering that IE is a large part of the browser market, and as of right now IE10 isn't in lay-people's hands yet, you will need to do better than slap on an async attribute on your script tag. The async attribute is amazing, just not ready yet.

Dynamic script tag creation

Another way to load your code is to create a script tag dynamically. Nicholas Zakas has gone into details about this technique in a blog post, so I won't mention it all here. I'll just show his code here instead:

Seems simple enough. He uses DOM methods to create a script tag, and then appends it to the page. As Souders has explained, this technique causes the script to be downloaded immediately, but asynchronously. This is usually what you want. In fact, this is the approach Facebook takes to load their JavaScript SDK, though they append to the head instead of the body, which also has the same effect.

There's a minor improvement that has been common knowledge since some time now. Remember in my previous post I said that you cannot make any assumptions about the page? The snippets above assume that document.body or document.getElementsByTagName("head")[0] reliably exists for use. Turns out, that assumption might be wrong. So, the minor improvement is to not depend on document.body or the head. Instead, only assume that at least one script tag exists on the page. This assumption is always right, since that's how your code is running in the first place. So, instead of appending to document.body, do what Google does:

The code inside the self-executing anonymous function immediately invoked function (hat-tip) creates the script tag, and then appends it as a sibling of the first script tag on the page. It doesn't matter which the first script tag is — we rely on the fact that a script tag surely exists on the page. We also rely on the fact that the script tag has a parentNode. Turns out, this is a safe assumption to make.

Both these scripts set script.async = true;. I'm unsure why they do this. Firefox's documentation implies that this is not necessary in Firefox, IE, or Webkit:

In Firefox 4.0, the async DOM property defaults to true for script-created scripts, so the default behavior matches the behavior of IE and WebKit.

I guess there's no harm done in setting the async flag to true, so everyone just does it. Either that, or I simply don't know. As pointed by Steve Souders in a comment below, async=true is required for some edge-case browsers, including Firefox 3.0.

But when does the download happen?

All browsers that implement async start the download immediately and asynchronously. This means that your script will be parsed and executed at some time in the future. The question I had at this point was: At what point in the page load cycle does my script get evaluated? Does onload get blocked? This addresses my second point at the top of the post. There might be situations when my script is unreachable due to either server or network problems, and I didn't want Errorception to affect any other scripts on the page in such a situation.

So, I ran quick tests to find the answer, and unsurprisingly the answer is you can't be sure. Just today, Steve Souders published a browser-scope test page that tests just this behavior. It looks like older implementations of async were indeed holding back page load events. This seems to be getting phased out in newer versions of browsers. If I were writing the tracking snippet today, I would probably have used the technique mentioned above.

Instead, I decided to do slightly better than these techniques while coding the Errorception tracking script. I decided that I'll explicitly wait for the page onload to fire first, and only then include the Errorception tracking script. The tracking snippet in Errorception's case looks as follows:

The snippet above completely gets out of the way during the page load process, hence ensuring that we do not depend on browser behavior of event firing sequences when using the async attribute. It has an additional minor benefit that the end user's bandwidth is completely available for loading up the site's intended resources first before loading the Errorception tracking code. Effectively, it prioritizes the site's code and content over Errorception's own code. This fits in perfectly with Errorception's philosophy of not affecting page load time at all. Depending on the type of connection the end-user has, this bandwidth-prioritization technique might have merits.

I'll be the first to admit that what Errorception is doing might not be for everyone. Take Facebook for example. If you are including their API, chances are, you want to do something with it. Delaying the loading will only deteriorate the end-user experience. However, in Errorception's case, there's no such interaction that will ever block, so this works out fine.

I initially thought I'll discuss about APIs as well in this post, but the post has seemed to run too long already. I'll save that bit for the next post.

Errorception is a ridiculously easy-to-use JavaScript error tracking system, designed for high performance and high reliability. There's a free trial so you can give it a spin. Find out more »

In Part 3…

In the next post in the series, I'll discuss mechanisms to signal the availability of your APIs to your developers considering that your code is downloaded at some arbitrary time in the future, and methods of communicating with your server by bypassing the same-origin policy.

4 comments:

  1. async = true is necessary in FF < 4.0. Support for it was added in 3.6.

    ReplyDelete
  2. You want to set async=true for some edge case browsers (including Firefox 3). Those browsers are found in the results from yesterday's Browserscope test ( http://www.browserscope.org/user/tests/table/agt1YS1wcm9maWxlcnINCxIEVGVzdBjrq90MDA?v=3&layout=simple&highlight=1 ).

    I think you have to use onreadystatechange for IE (in addition to onload for other browsers).

    You might also want to look at ControlJS ( http://controljs.org/ ).

    ReplyDelete
  3. Thanks for the clarification, Steve. Updated my post above to reflect this.

    ReplyDelete
  4. Twitter suggests loading their script in the head because they inject the real stuff asynchronously, so you can often gain extra performance by downloading their stuff concurrently with yours. That occurs with the line of code in anywhere.js

    document.body.insertBefore(b,document.body.firstChild);

    The only bad part about their implementation is the assumption of document.body, but it's likely faster if you include this script in the head, not at the bottom. The best advice is to always test, and not assume anything.

    ReplyDelete