Wednesday 5 November 2014

Enabling CORS on Amazon CloudFront with S3 as your Origin Server

Today I was debugging a customer's CloudFront setup to ensure that they were supporting CORS correctly. Amazon has documented the process, but the docs seem to be structured to work as a reference rather than a how-to. I thought I'd document what I had to do to get things working right, so that this serves as a starting point for others to get set up.

As you probably know, enabling CORS is important if you want to catch cross-domain JavaScript errors. If you are using CloudFront as a CDN, you are most likely using a different domain (or subdomain) to serve your files, and will need to set up CORS at CloudFront.

The bulk of the surprises with setting up CORS with CloudFront are with configuring S3 correctly. This is the typical setup for most people (CloudFront using S3 as their Origin Server), so you'll probably have to deal with this first.

Configuring S3

S3 has this unnecessarily complicated "CORS configuration" that you need to create. Here's the steps to get that right:

  • Log into your AWS S3 console, select your bucket, and select "Properties". S3 CORS configurations seem to apply at the level of the bucket, and not the file. I have no clue why.
  • Expand the "Permissions" pane, and click on "Add CORS configuration" or "Edit CORS configuration" depending on what you see.
  • You should already be provided with a default permission configuration XML. (Seriously, Amazon? 2014? XML?) If not, use the following XML to get started.
    <?xml version="1.0" encoding="UTF-8"?>
    <CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
        <CORSRule>
            <AllowedOrigin>*</AllowedOrigin>
            <AllowedMethod>GET</AllowedMethod>
            <MaxAgeSeconds>3000</MaxAgeSeconds>
            <AllowedHeader>Authorization</AllowedHeader>
        </CORSRule>
    </CORSConfiguration>
    

    You should look at Amazon's docs to see what this configuration means.

    In the course of this debugging exercise, I discovered the hard way that Amazon's XML parser cares about the <?xml ?> declaration, and the xmlns on the root node. If you omit these, Amazon will fail silently, showing you a happy looking green tick! (Can you imagine how hard it was to figure this out?)

  • Once you've saved the configuration, go get a coffee (or other preferred poison) while you wait for S3 to be one with your new configuration, and really internalise it's true meaning. (It takes a couple of minutes. Some sort of caching, I guess.)
  • Test if everything's looking right. This really tripped me up. Turns out, for a really complicated reason, you can't simply hit a URL in your S3 bucket from within your browser and check the "Network" tab in your dev tools. That would be too easy. No, you need to ensure that you specify certain extra headers in your request. Now, it turns out that browsers send this with CORS requests anyway, so you are covered in the real-world, but it's crazy that the dev-experience for the common case isn't what you'd expect. You could use a tool like curl to specify the additional headers needed for a "correct" CORS request:
    $ curl -sI -H "Origin: example.com" -H "Access-Control-Request-Method: GET" https://s3.amazonaws.com/bucket/script.js
    
    HTTP/1.1 200 OK
    Date: Wed, 05 Nov 2014 13:37:20 GMT
    Access-Control-Allow-Origin: *
    Access-Control-Allow-Methods: GET
    Access-Control-Max-Age: 3000
    Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
    Cache-Control: max-age=604800, public
    ...snip...
    
    You should see the "Access-Control-Allow-Origin: *" header, and the "Vary: Origin" header in the output. If you do, you're golden.

With that, you are almost done! CloudFront's configuration is a piece of cake in comparison.

Configuring CloudFront

  • Go to your CloudFront console, select your distribution and go to the "Behaviors" tab.
  • You should already have a "Default" behavior listed there. Select it and hit "Edit".
  • Under the "Whitelist Headers" section, navigate their clunky UI to add the "Origin" header to the whitelist.
  • Save, get another coffee, and wait for this to propagate through CloudFront's caches. This will take some time.
  • Test! Again, you will have to use the process above to make sure you are flipping the right switches within Amazon. That is, use curl (or some HTTP client), and ensure that you specify the extra headers. You should see the "Access-Control-" headers in the response.

There you go! That should get you set up. I can't believe Amazon has made it so complicated to essentially send an additional HTTP header. Well, regardless, I hope this post helps you get set up correctly.

You will also need to modify your script tags to ensure that you catch JS errors correctly. You can read more about that in the docs.

Not catching JS errors yet? You should really give Errorception a shot.

Wednesday 8 October 2014

Preventing Flooding

Flooding in the IRC sense, that is. Not the global warming sense, of course. Because global warming isn't real, amirite?

We've all encountered this in the past. Your logs reveal that some user is using a rogue plugin, and the plugin throws errors in a loop. Hundreds, if not thousands of errors get sent to Errorception in a short period of time. This eats into a significant chunk of your daily rate limit, without actually giving you quality data. Some people end up exhausting their daily rate limit in just a few seconds because of this!

No more! Errorception now imposes a per-user rate limit. This is an arbitrary limit to try to separate the wheat from the chaff. Currently, the per-user rate limit is set to 50 errors per 250ms.

Here's how it works: If the user generates more than 50 errors within 250ms, that's some serious error generation going on. Errorception takes 50 of those errors and posts them to the server, just so that you are informed about the problem. Errorception then goes ahead and flags that user as being "banned" until the page unloads. This means that no more errors will be posted to Errorception from this user while he's on that page. That way, you are informed about the problem since 50 errors got posted, but your daily rate-limit isn't completely eaten into.

There's another angle to this problem that has always worried me: If users are generating so many errors in such a short time, and Errorception then tries to process this large number of errors in the browser even if just to upload them, there's bound to be a perceptible sluggishness. That simply isn't cool. Errorception should never cause any perceptible performance lag. Now, with this fix, when a user is "banned", Errorception completely steps out of the way. That way, even though the user is probably stuck generating errors in a loop, at least Errorception isn't causing any additional performance lag. One more feather in the cap for high performance!

As always, suggestion and feedback always welcome.

Friday 19 September 2014

Fresh Coat of Paint

Errorception just got a redesign! Give it a look! I'm particularly excited about the revamped docs.

Not only is the new site a huge visual improvement, it's also far more functional and accessible, and much more secure thanks to a very strict Content Security Policy (CSP). It also has a fully buzzword-compliant front-end stack now, while maintaining the blazing performance of a very light-weight site.

The logged-in area of the site hasn't been redesigned yet, but there has to be something for next time, right? ;)

Let me know what you think! Feedback welcome, as always.

Tuesday 26 August 2014

Private Source Maps

Since the launch of source maps, you have been asking for a way to support source maps without having to make all your code public. Today I'm glad to announce private source map support in Errorception.

Rather than explain how the feature works, I shall take you through the thought process behind private source maps support. It has taken 2 months to fine-tune the experience. I was actively working with a bunch of people in a private beta for this feature. (Thanks to you fine folk. You are awesome!)

A quick primer: Source map files, generated by your minifier, contains a mapping between your original source code and your minfied code. Your source map files are linked to from within your minified code, using a //#sourceMappingUrl=... comment in your minified code, inserted by your minifier. The source map file doesn't contain the original source code itself — just the mappings. Instead, the source map file links to your source code, which a consumer (like Errorception, or your browser's dev tools) will have to fetch separately.

The First Cut

When I had launched source map support two months ago, I had already written the code for private source maps too. The way I intended it to work was that I'd ask you to upload your source map files and your source code to Errorception as part of your deploy script. I wasn't sure how everyone would take to the idea of making an API call from within their deploy script, so I decided to start a private beta and work with a few select people directly to see what their experience would be.

However, everyone disliked the idea of making API calls from their deploy scripts. Every single one of them! Here's the big reasons:

  • It introduces a remote network dependency in the deploy script. This means that the deployment's success or failure depends on a third-party service (Errorception in this case), and on the network. This simply isn't a good idea.
  • If, for some reason, the API call fails, it will cause inconsistency in data at Errorception's end. It's not just that source maps won't work, it's worse than that — source maps will point to the wrong location in your files! That's misleading, and actively inhibits easy debugging. Not good at all.
  • Making API calls isn't really the simplest of things to do, especially when you have to do it from a deploy script. You'd first have to prepare your bundle by zipping up your files (which is likely to be a pretty big zip), then send it across to Errorception using some HTTP client, after you've figured out how to do multipart uploads. To deal with failures, you'd also have to implement some kind of retry mechanism. It doesn't have to be this complicated.
  • As an aside, this is also tremendously wasteful. Unless your errors are all over your code-base, Errorception doesn't need all your files at all. The bulk of the zip you'll have uploaded won't be useful for debugging at all, but there's no way to know which parts are needed beforehand.

Errorception's Crawler

Instead of asking you to upload your source maps and source code to Errorception, we decided that you could instead upload the files to your own web server/CDN. This is much more easy to do, considering that your deploy script already does this with your built code. It also eliminates the external network request to Errorception at deploy-time, which makes your deployment script simpler and far more reliable.

Errorception already has a crawler that crawls your site to get the required JavaScript files, needed both for the code view and for source maps support. This crawler is actually pretty powerful, and has evolved quite a bit within its few months of existence. It has baked into it a couple of really cool ideas to ensure consistency of data. This ensures that Errorception can always show you the correct version of the file that caused the error, even if the file has since changed with newer deployments. It is also resilient to network failures, retrying failed requests intelligently, and retroactively updating existing errors with the file. (Building this has been pretty crazy engineering-wise — heck, I even had to create a versioned file store to manage file versions correctly!)

Let's say you have an error in one of your script files. Errorception's crawler parses your JavaScript file to look for the //#sourceMappingUrl pragma comment. If it finds this comment, it already has everything it needs to crawl your source map files and your source code. This is how public source maps work already.

However, many people would rather not have that //#sourceMappingUrl comment in their code. That's because this comment is the one link to all of your code, and will let anyone with a browser access the original unminified source code.

Private Source Maps

If this //#sourceMappingUrl comment is removed from your minified file, your source maps are now effectively private. This is because no one can know where you've put your files if there isn't a link pointing them to them. HTTP doesn't have any discovery mechanism built in, and a secret path is just as unguessable as a password, since no one else knows the secret. (This assumes that you don't have directory listing turned on.)

So, this is how private source maps work in Errorception: You specify a secret folder name in Errorception's Settings > Source Maps. This secret folder should be as unguessable as you would want a password to be. Then, modify your build/deploy script such that Errorception can find your source map on your web server by constructing a path that incorporates this secret folder. (More about this below.) Once the crawler gets your source map file, it has everything it needs to figure its way about your code.

Examples

Here's how Errorception uses your secret folder to discover your source map file: Let's say an error occurred in your script at http://example.com/script.js, and you've specified your secret folder to be deadbeef, Errorception will look for the source map at http://example.com/deadbeef/script.js.map. That is, it looks inside a secret folder (which is expected to be a sibling of the script file), for a file that has the same name as the script file with a .map appended to it.

To give you another example, if the error was in http://example.com/a/b/c/script.js and you specify your secret folder to be secret, Errorception will look for the source map at http://example.com/a/b/c/secret/script.js.map.

All of this sounds complicated, but it really isn't. In fact, in most cases, it will simply be one or two lines in your deploy script — to strip the //#sourceMappingUrl comment, and to copy your source map files and original source code to the secret folder. Doing stuff like copying and modifying files is exactly what deploy scripts are good at, so it plays to the strengths of the deploy script too.

But this isn't really private at all

Yes, in a sense, this is really only security by obscurity. However, it is security by obscurity in the same way that passwords are security by obscurity. As with good passwords, a good folder name would be just as unguessable. Since it is impossible to discover anything over HTTP if you don't explicitly link to it, unwanted access to your source code should be near-impossible.

That said, I can see how you might be worried that all your files are still public. I'm open to consider even more stringent security, if you like. Feel free to get in touch. However, like I said, you shouldn't have to worry about it in the first place.

Also, Errorception turned three last week. Drink one for Errorception! Cheers!

Tuesday 24 June 2014

Source Maps Are Here!

In the previous blog post I talked about the exciting new feature of highlighting exactly where the error is, in your code. The fact that this is even possible to do externally, is the kind of stuff that distinguishes JavaScript from all other languages. It is why Errorception has this singular focus on JavaScript.

This post is to highlight one more such feature — source maps.

If you have errors in your minified code and Errorception's crawler discovers a source-map file associated with your code, it downloads the source-map file and your associated original source files, and versions & saves it in Errorception's file store. After that, all the goodness of pointing out the error isn't just applied to your minified file, but also to your original source-code.

Mapped! Showing you the error in your un-minified source!

Errorception shows you exactly where the error is in your original, unminified source-code. Not just that, it does all of this automatically, and across all your stack frames! Isn't that just awesome?

You just need to make your source-map file available, and Errorception will do the rest. The tweaks needed to your build script are real simple too — it's usually just a flag in most minifiers.

Your minified code is just a click away. Note the tabs at the top-right of your code.

Source maps have been around for some time now. However, I didn't want to implement source maps just so Errorception could wear it as a badge — I wanted to make it actually useful to you. The previous release was a step in this direction — putting your code front and center, and pointing out exactly where the error was in your deployed code. Now, source-maps completes this by not only pointing out exactly the token that caused the error, but also by doing so in your original unminified source file.

Oh, and of course, this also means that Errorception now supports compile-to-JS languages as well. CoffeeScript, TypeScript, ClojureScript and others, welcome to Errorception! You should feel at home.

As always, suggestions and feedback always welcome. Mail, Tweet, or leave a comment below.

Thursday 19 June 2014

Here's Your Error!

Today's release is a game changer!

Now, whenever possible, your code and stack-traces take center stage in your error reports. Errorception looks at your code and the data from the error, and attempts to point out where exactly in your source file the error occurred.

Although the logic that points out your errors only gets incomplete data to work with, the predictions it makes about the error's cause are stunningly accurate in most cases. Not only does it point out the exact line and column number of the error, it also tries to make sense of your code to find the exact keyword or token that caused the error. And it does this across all stack frames!

Two stack frames of an error

See that screaming out: "Here's your error!" Isn't it just amazing how accurate it is? Do you see how easily you will be able to smash bugs with this?

Browser support

This feature relies somewhat heavily on stack-traces being available. All recent versions of Chrome (Desktop and Android) provide stack-traces in window.onerror out of the box, so you are already covered there. Stack-traces in window.onerror are new, having made it into the spec only recently. Firefox just implemented this about a week ago, so I expect the next release will ship with window.onerror stack traces. This also works for errors from IE10 for at least the first stack-frame, since IE10 provides a column number for every window.onerror error.

Additionally, this works just about everywhere if you .push your errors to Errorception. Until recently, Firefox didn't provide column numbers in their stacks, but that just got fixed a couple of days ago, expanding browser support to all popular browsers. In older versions of Firefox, you should still be able to see the highlight for the first stack-frame for .pushed errors.

It even works accurately with minified code!

If it isn't possible to highlight the offending keyword/token, Errorception will attempt to highlight the offending column. Errorception will definitely highlight the line of the error in all cases regardless.

The tech

The engineering needed to pull this off has been significant — certainly one of the more complex things I've worked on. At its heart lies a file-store, written from scratch, purpose-built for just this job. A crawler spiders your site for your JS files when errors occur, and saves them in the file-store, which caches and versions your files. This caching and versioning ensures that you see the rightest possible version of the file that caused the error, even if the file might have subsequently changed with new updates. Your files and the history of your error occurrences are then mined to figure out what might have caused the error.

There are no settings to be configured, no knobs to be turned — It All Just Works. Best of all, this is all done without any extra work on the client-side at all, so your users don't face any performance penalty whatsoever.

This release has been brewing for months, and has been in closed beta for a few weeks now. And today's feature isn't even the main reason I built all of this — it's just an awesome side-effect. But I'll save those details for the next blog post. (Can you spot it?)

As usual, feedback always welcome. Mail, Tweet or leave a comment below.

Tuesday 25 March 2014

Fine-grained control over error posting

While Errorception has given you some control over error posting since a very long time, this has at best been very coarse-grained. Now, that gets fixed.

You now have full programmatic control over which errors get posted to Errorception. You just have to define _errs.allow to be a function that returns true or false. Examples are the best way to demonstrate this, so here goes:

To ignore all errors from say IE6:

_errs.allow = function() {
  return (navigator.userAgent.indexOf("MSIE 6") == -1);
}

To only allow errors from yourdomain.com and its subdomains:

_errs.allow = function() {
  return (location.hostname.indexOf("yourdomain.com") != -1);
}

You also get the error that's about to be posted as an argument in this function. The error is represented as an object with three properties: the error message, the line number and the script-source URL of the error. So, to ignore all errors that are from ad-script.js.

_errs.allow = function(err) {
  return (err.url.indexOf("ad-script.js") != -1);
}

On a side note, this indexOf and -1 business above is so ugly! String.prototype.contains can't come soon enough!

This was a fun feature to build, especially because of a very interesting corner-case. All of this has been well documented, so give the docs a look. It's very interesting how Errorception uses itself to log errors encountered in this edge-case in a way that doesn't cause the world to implode.

Monday 13 January 2014

Country of Origin

Sometimes, when debugging client-side errors, knowing where the user is from can be useful. For example, I recently had a situation where I had debug an error that only occurred for users behind the Great Firewall of China. Granted these kinds of issues only crop up rarely, but at such times knowing that this error only occurs in certain geographical locations can be immensely useful when trying to debug.

Errorception now displays the country of origin of the error occurrence, whenever possible. Due the very nature of IP-address-based geo-location, the location accuracy can only be very coarse-grained, and might even be entirely wrong, but is right most of the time. Also, I only started collecting geo-data over the weekend, so older error occurrences will not have geo data tagged along with it.

Happy debugging!