Monday 9 January 2012

Writing Quality Third-Party JS - Part 1: The First Rule

It's fascinating how JavaScript has quickly become the de facto mechanism to deliver third-party integrations that are easily pluggable into people's websites. Services like Facebook, Twitter and Disqus, programmable widgets like Google Maps, and even invisible scripts like Google Analytics, KissMetrics and our own Errorception give you ways to integrate their services with your website.

You'd think that with these mechanisms becoming so popular, there'd be a good deal of information available about how to build great third-party integrations in JavaScript. Turns out, the information available on the Interwebs is actually rather sparse. This series of posts aims to add to that repository of knowledge, based on my experience building Errorception.

This is a three-part series of articles aimed at people who want to write JavaScript widgets (or other kinds of scripts) to make their application's data/services/UI available on other websites. This is the first in the series, highlighting important considerations.

The First Rule

The First Rule of Third-Party JavaScript is... man, this will never sound as epic as Tyler Durden. Anyway, here's the first rule, and the most important consideration:

The Rule: You DO NOT Own The Page

Understanding and assimilating this rule gives you two principal considerations when designing your script.

  • The impact of adding your script should be minimal. Preferably none.
  • You cannot make any assumptions about how the page is coded.

Let's start with the first point. How can you make the impact of your script minimal? A couple of considerations come to mind.

No globals

You should ideally have no global variables in your code. Making sure this happens is rather simple. Firstly, enclose your code in a self-executing anonymous function. Secondly, pass your code through a good lint tool to ensure that you've not used any undeclared variables, since assigning to an undeclared variable creates an implicit global. This is also regarded as a general best practice for JS development, and there's absolutely no reason you shouldn't adhere to it.

A typical self-executing anonymous function looks as follows, in its most minimal form:
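A minimal sketch of that form, where the `counter` variable is only a hypothetical placeholder for your own code:

```javascript
(function () {
    // Everything declared with `var` in here is local to this function,
    // so nothing leaks into the host page's global scope.
    var counter = 0; // hypothetical placeholder for your script's state
    counter += 1;
})();
```

After this runs, `counter` does not exist outside the function.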

To do this slightly better, I recommend the following form instead:
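A sketch of that form follows. The `typeof` guards in the call and the `globalThis` fallback are my own assumption, added only so the snippet also loads outside a browser (in Node.js, say). The `pageTitle` and `hasTitle` variables are hypothetical.

```javascript
(function (window, document, undefined) {
    // `window` and `document` are now local names that a minifier can shorten.
    // No third argument is passed, so `undefined` here really is undefined.
    var pageTitle = document ? document.title : undefined; // hypothetical use
    var hasTitle = pageTitle !== undefined;
})(
    // Assumption: fall back gracefully when there is no browser environment.
    typeof window !== "undefined" ? window : globalThis,
    typeof document !== "undefined" ? document : null
);
```

On a real page the call would simply be `})(window, document);`.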

In the pattern above, the window and document objects, both rather frequently used, become local-scope variables. Local variables are usually reduced to one- or two-letter names when passed through the most popular JS minifiers, so the size of your code reduces somewhat by using this pattern. I'll come back to the undefined variable in just a bit. Once minified, your code will look something like the following (using closure-compiler in this case). Notice how window and document have been reduced to single-letter variable names.
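This is a hand-shortened illustration, not actual closure-compiler output, which varies with the tool's version and settings. It assumes an IIFE that takes window, document and undefined as parameters and reads document.title. The `typeof` guards are again my own assumption so that it runs outside a browser.

```javascript
// Hand-minified for illustration only; real minifier output will differ.
// a = window, b = document, c = undefined
(function(a,b,c){var d=b?b.title:c;var e=d!==c})(typeof window!=="undefined"?window:globalThis,typeof document!=="undefined"?document:null);
```

Every use of `document` inside the function body now costs one byte instead of eight.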

Wait, what? No globals?

Ok, there are some cases when a global variable is absolutely necessary. Since linkage in JS happens through global variables, it might be necessary to add a global variable just to provide an API namespace to your users. Here's what the most popular third-party snippets do:

  • Google Analytics exposes a _gaq variable (docs) with one method on it — push.
  • Facebook exposes a FB variable (docs), which is the namespace for their API. (Using FB also requires you to define a global fbAsyncInit function, which could've been avoided. I guess they're doing what they do for ease of use, even though it's against best practice.)
  • Twitter @Anywhere exposes a twttr variable (docs), which like FB is their API's container namespace.
  • For completeness, Errorception exposes a _errs variable. We currently do not have an API (coming soon!), but this has been left as a placeholder.

Take a moment to think about those variable names. They are pretty unique to the service-provider, and will usually not conflict. The possibility of conflict cannot be completely avoided, but can be reduced significantly by picking one that's unique to your service. So, exporting a global of $, $$ or _ is just a horrible idea, however easy-to-type it may seem.
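As a rough sketch of exposing exactly one service-specific global, here is a push-style namespace loosely inspired by the single-method shape described for `_gaq` above. It is not how any of those services actually implement theirs. The name `_acmeWidget`, the command format, and the `globalThis` fallback are all assumptions for illustration.

```javascript
(function (root) {
    var NAME = "_acmeWidget"; // hypothetical, service-specific global name
    var handled = [];         // private: not reachable from the page

    function handle(command) {
        handled.push(command);
        return handled.length; // number of commands handled so far
    }

    // If the page queued commands in an array before we loaded, replay them.
    var queued = root[NAME];
    if (Object.prototype.toString.call(queued) === "[object Array]") {
        for (var i = 0; i < queued.length; i++) {
            handle(queued[i]);
        }
    }

    // The one and only global this script creates.
    root[NAME] = { push: handle };
})(typeof window !== "undefined" ? window : globalThis);
```

The page would then call something like `_acmeWidget.push(["track", "signup"])`, and everything else stays hidden inside the closure.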

No modifications to shared objects

Several objects are shared in the JS runtime — even more than you might imagine at first. For example, the DOM is shared, but then so are the String and Number objects. Do not modify these objects either by altering their instances directly or by modifying their prototypes. That's simply not cool.

The only cautious exception to this rule might be if you need to polyfill browser methods. In all honesty, I would avoid using polyfills as much as possible in my third-party code and I don't recommend it at all, but you could do this if you are feeling adventurous. The reason I wouldn't recommend it is two-fold:

  • The code on the page may make assumptions about browser capabilities based on browser detection rather than feature detection. (Remember, even jQuery removed browser detection only recently, and many popular libraries still do a good deal of browser detection.)
  • The code on the page might iterate through arrays using object enumeration (for...in) instead of index-based iteration (for(var i=0;i<len;i++)). Even when enumerating object properties (which is a legit case of for...in), the code on the page might not use hasOwnProperty.

Either of these will break the code on the host page. You don't want code on the host page to break just because they've added your script.
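To make the second failure mode concrete, here is a small sketch. The method name `acmeLast` is made up, and the demo removes it again so that it leaves nothing behind.

```javascript
function demoPrototypeLeak() {
    var items = ["a", "b", "c"];

    // Host-page style loop: for...in over an array, no hasOwnProperty guard.
    function naiveKeys(arr) {
        var keys = [];
        for (var key in arr) {
            keys.push(key);
        }
        return keys;
    }

    var before = naiveKeys(items);

    // What a careless third-party script might do (hypothetical method name).
    Array.prototype.acmeLast = function () {
        return this[this.length - 1];
    };
    var after = naiveKeys(items);

    // Clean up so the demo itself does not modify a shared object for good.
    delete Array.prototype.acmeLast;

    return { before: before, after: after };
}
```

Calling `demoPrototypeLeak()` shows the host's loop seeing three keys before the prototype is touched and four afterwards, the extra one being `"acmeLast"`.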

No DOM modifications

Just like you don't own the global namespace, you don't own the DOM either. Making any changes to the DOM is simply unacceptable. Do not add properties or attributes to the DOM, and do not add or remove elements in the DOM. Your widget is never important enough to add extra nodes or attributes to any element of the DOM. This is because code on the page might be too tightly dependent on the DOM being one way, and if you modify it their code is likely to break.

That said, there are two permissible cases when you can modify the DOM. The first is when you are expected to present a UI, and the second is when you need to communicate with your server and circumvent the same-origin policy of the browser. In the first case, make the contract explicit by asking for a DOM node within which you will make the modifications, so that the developer of the page knows what to expect. I'll address the second case in detail in the third post in this series.
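For the UI case, one way to keep that contract explicit is to have your entry point take the container as an argument and touch nothing else. This is only a sketch. `renderWidget` and its parameters are hypothetical names, not part of any real API.

```javascript
// `renderWidget` and its arguments are hypothetical names for this sketch.
function renderWidget(container, message) {
    if (!container || typeof container.appendChild !== "function") {
        throw new Error("renderWidget needs a DOM node to render into");
    }
    // Use the container's own document instead of assuming a global one.
    var doc = container.ownerDocument;
    var box = doc.createElement("div");
    box.appendChild(doc.createTextNode(message));
    // The only DOM change: one child appended inside the node we were given.
    container.appendChild(box);
    return box;
}
```

On a page this might be called as `renderWidget(document.getElementById("acme-widget"), "Hello")`, where the page's developer has created the `acme-widget` element for you.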

Make no assumptions about the page

This is the most complex to ensure. Even Google Analytics has had issues with their tracking script because of assumptions they inadvertently made. Steve Souders has enumerated some on his blog. So, for example, you can't even assume that the DOM will have a <head> node even after parsing the HTML! jQuery also had some bugs in their dynamic loader due to assumptions they inadvertently made. It seems that the only real thing you can rely on is that your script tag is on the page (and hence in the DOM). Nothing else should be taken for granted.
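As one illustration of leaning only on a script element rather than on `<head>`, a loader can insert next to the first script it finds. This is a sketch under the assumption that at least one script element is still in the DOM. Even that can fail, so the hypothetical `insertScript` helper gives up rather than guessing.

```javascript
// `insertScript` is a hypothetical helper; `doc` is normally `document`.
function insertScript(doc, src) {
    var script = doc.createElement("script");
    script.async = true;
    script.src = src;
    // Don't look for <head> or <body>; anchor on a script element instead.
    var firstScript = doc.getElementsByTagName("script")[0];
    if (!firstScript || !firstScript.parentNode) {
        return null; // Nothing safe to anchor on; give up rather than guess.
    }
    firstScript.parentNode.insertBefore(script, firstScript);
    return script;
}
```

Passing the document in as a parameter also makes the helper easy to exercise against a stub, which helps when testing in many environments.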

Unfortunately, I don't have a clean solution for this problem. The only solution seems to be that you should keep your dependencies on the DOM and on native objects to a bare minimum, and test in every environment you can lay your hands on. In Errorception, I test on way more browsers than I'd care about, even including old browser versions and mobile phones. It's the only real way to be sure. I have to do this irrespective of whether I support the browser or not, because it might be perfectly supported by the developers of the page.

undefined redefined

A slightly less scary but equally dangerous problem is that undefined is not a keyword or literal in JavaScript. It really should have been. Since it's not a keyword, it's possible for someone to create a variable called undefined, and that can mess with your script. (ES5 engines make the global undefined read-only, but older browsers don't, and it can still be shadowed inside a function.) If you really need to use undefined, you should ensure that you have a clean, unassigned undefined for your use. There are several ways to make sure you are working with a clean undefined. The anonymous function I've shown above implements one of these mechanisms, such that undefined inside the function is the undefined you expect.
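A small sketch of both the problem and the parameter-based fix. The shadowing here is done inside a function so that the demo behaves the same in old and new engines. `demoUndefined` is a hypothetical name.

```javascript
function demoUndefined() {
    // Legal: `undefined` is an identifier, so a scope can shadow it.
    var undefined = "clobbered";

    // A naive comparison now tests against the string "clobbered".
    var naive = (void 0 === undefined);

    // The parameter trick: declare `undefined` as a parameter and pass nothing,
    // so inside this function it is guaranteed to hold the real undefined value.
    var viaParameter = (function (undefined) {
        return void 0 === undefined;
    })();

    return { naive: naive, viaParameter: viaParameter };
}
```

Here `naive` comes out `false` and `viaParameter` comes out `true`. The `void 0` expression is another common way to obtain the real value.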

Trust

Before closing this post, I want to touch upon the issue of trust. By adding your script to a page, the developer is knowingly or unknowingly placing a lot of trust in you. I completely dislike this, but unfortunately JavaScript has no built-in mechanism to reduce the problems of trust. Several projects exist to reduce the possibility of vulnerabilities, but they seem too heavy to use. Douglas Crockford has been trying to educate people about the issues, but it seems to be mostly falling on deaf ears.

One post in particular is relevant here: an excellent post by Philip Tellis titled "How much do you trust third-party widgets?". It's a must-read to get the gist of the issues surrounding trust in third-party widgets, along with some high-level solutions.

In Part 2…

In the next installment of this article series, I'll talk about strategies to load and bootstrap your code in the page without causing a performance hit. I'll also be touching upon how, at Errorception, we mitigate the risk of the service going down, if ever. It's an approach that's definitely not unique to Errorception, but isn't as widely used as you'd think.

13 comments:

  1. Nice post!

    Keep them coming!

  2. Good post overall, but wanted to nitpick over: "you enclose your code in a self-executing anonymous function"

    It's not a self-executing function. Saying that it is a self-executing function suggests that recursion is happening. However, the function never actually executes itself, and therefore it is incorrect to say that it is 'self-executing'. Instead, what you describe is an example of an 'immediately invoked' function.

    I think this dude does a pretty good job of explaining it, if you want to read a more thorough explanation: http://benalman.com/news/2010/11/immediately-invoked-function-expression/

  3. Good post, except the part of using 'undefined'. Undefined is simply an undefined variable, so:

    myUndefinedVar == undefined // true

    but only because they're both undefined/falsy.

    If you want to check if a variable is undefined check it using:

    typeof myUndefinedVar === 'undefined'

    That's the true check to see if something is undefined.

    Greets,
    Johan

  4. Deadelus, I hear you, and you are mostly right.

    In the sample I provided, since there's no third parameter to the function being passed in, the third parameter being accepted is the literal "undefined" (except undefined isn't a literal technically, but let's ignore that). So it satisfies the check where myUndefinedVar == undefined within that function. So you and me are talking about exactly the same thing, except that the typeof check becomes unnecessary.

    Replies
    1. I've seen the same technique being used in jQuery's outermost IIFE for getting the undefined value. This way we get the true uncorrupted 'undefined' without the overhead of 'typeof'.

  5. "It seems that the only real thing you can rely on is that your script tag is on the page (and hence in the DOM)." -- Not even that.

    The script tag may have been removed immediately after being added. That'll keep its execution environment, but the script element is not in the DOM anymore. jQuery's $.getScript() does this, for example: http://balpha.de/2011/10/jquery-script-insertion-and-its-consequences-for-debugging/

  6. Good tips, but I never understood all the anxiety about "undefined." It's just as easy to overwrite any global variable. Consider this statement:

    Math = 123;

    I just blew away the entire Math object. :(

  7. balpha: You are right, if you use jQuery to insert the script tag. However, we are assuming here that the site's developer has been given a script tag to add, and this doesn't use $.getScript to append the script, and adds the script tag directly inline on the page. If the developer decides to use jQuery to do this, it'll be unsupported.

    Adam: You are right. There's an element of (unreasonable?) paranoia on my part. There has been some discussion about this on Reddit as well: http://www.reddit.com/r/javascript/comments/o95na/writing_quality_thirdparty_js_part_1_the_first/

  8. I also don't understand the need for the undefined parameter... It is quite easy to not even use undefined in your code, or you can always check against void(0) if you ever need to check against undefined.

  9. Nice post !

    It got me started on my way to write third party widgets :)

  10. Starting our app's widget and finally I found a more detailed explanation. Thanks! :)

  11. There is actually a secure way of running 3rd-party code without letting it mess up the page or the scope:

    http://asvd.github.io/jailed/demos/web/banner/
