{errorception} blog

Saturday, 17 December 2011

Control Error Posting Better

One constant problem with Errorception is managing the huge influx of errors, because Errorception does such a good job of recording errors. (You see how I sneaked that last bit in?) We've always had automatic "muting" of possibly uninteresting errors. Some time ago, we started automatically marking bugs as duplicates, which drastically brought down the number of errors you had to deal with.

Today, we've launched yet another feature to help reduce the number of errors you need to care about. You can find this under "Settings > Posting errors".

Allowed domains

You can now specify a white-list of domains from which you want to record errors. So, if you whitelist www.mydomain.com, errors from localhost will not be recorded. You can specify as many domains as you want - just comma-separate them. This also helps safeguard you from people who might have stolen your tracking code and are spamming you.
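To make the idea concrete, here's a minimal sketch of how a comma-separated allow-list could be checked against a page's hostname. The function names and the "empty list allows everything" rule are assumptions made for illustration, not Errorception's actual implementation.

```javascript
// Hypothetical helper: turn the comma-separated setting into a lookup table.
function parseAllowedDomains(setting) {
  var allowed = Object.create(null);
  setting.split(",").forEach(function (entry) {
    var host = entry.trim().toLowerCase();
    if (host) allowed[host] = true;
  });
  return allowed;
}

// Returns true when the hostname is on the list.
// An empty list is treated as "allow everything" (an assumption).
function isAllowedDomain(hostname, allowed) {
  if (Object.keys(allowed).length === 0) return true;
  return hostname.toLowerCase() in allowed;
}

var allowed = parseAllowedDomains("www.mydomain.com, shop.mydomain.com");
console.log(isAllowedDomain("www.mydomain.com", allowed)); // true
console.log(isAllowedDomain("localhost", allowed));        // false
```

Matching on the exact hostname keeps the check cheap and predictable; wildcard or subdomain matching would need extra rules that the post doesn't describe.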

Ignored scripts

You can specify a black-list of paths for script files. If an error occurs in a script file matching one of these paths, it is not recorded. This means that you can now block third-party scripts, for example, from posting errors to Errorception. So you no longer need to get errors from the Facebook Like button, or from your customer support chat widget.
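As a rough illustration, the filter could look something like the sketch below. Substring matching on the script URL is an assumption (the exact matching rules aren't spelled out here), and the paths and error objects are made up for the example.

```javascript
// Hypothetical matcher: an error is ignored when its script URL contains
// any of the configured path fragments (substring matching is an assumption).
function isIgnoredScript(scriptUrl, ignoredPaths) {
  return ignoredPaths.some(function (path) {
    return path !== "" && scriptUrl.indexOf(path) !== -1;
  });
}

var ignoredPaths = ["connect.facebook.net/", "/vendor/chat-widget/"];

var errors = [
  { message: "x is undefined", url: "https://www.mydomain.com/js/app.js" },
  { message: "Script error.", url: "https://connect.facebook.net/en_US/all.js" }
];

// Only errors from scripts that are not on the list are kept.
var recorded = errors.filter(function (err) {
  return !isIgnoredScript(err.url, ignoredPaths);
});

console.log(recorded.length); // 1
```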

Also, as a bonus, if the errors are ignored due to any of the reasons above, such errors will not count against your rate limit. This gives you more room every day to record the errors you really care about.

Fewer errors = happier developer, right?

Tuesday, 13 December 2011

Call Stacks in IE

Over the weekend I rolled out a build that provided call stacks for certain errors, under certain conditions. You only get a call stack if:

The error occurs in Internet Explorer, and
The error happens after page load.

I must admit, because of these conditions I was unsure how useful the feature would be. But after having it run for some time now, it looks awesome!

If we were able to capture the call stack of an error, it is highlighted in the errors listing.

The call stack is shown in the error details page for the error.

How it works

One interesting behaviour of IE is that when window.onerror is called, it doesn't actually destroy the call stack — in fact, the window.onerror function call is placed on the top of the currently executing stack. This is different from how all other browsers behave. We exploit this behaviour in IE.

However, IE doesn't give you a nicely formatted stack trace. In fact, there's no explicit way to get the call stack at all. What IE does give you is a meaningful arguments.callee.caller, which is what we use here. We walk the call stack one frame at a time using arguments.callee.caller.caller and so on, building an individual stack frame for each function. We do this for 10 stack frames and stop there, just in case the call stack is for a recursive function.

`arguments.callee`?

arguments.callee refers to the currently executing function. It's a way for a function to know itself, if you will. In JavaScript, since every function is also an object, the function has its own properties as well. One such property of a function is its .caller, which is a reference to the function that called it. In other words, it's a reference to the function one stack frame below itself.

In the case of window.onerror, arguments.callee will refer to the window.onerror handler itself, since it points to itself. Once we have a handle to our own function, we can then access properties of our function, as described above.

There's one caveat though. Only IE retains a meaningful arguments.callee.caller. In all other browsers, this value is null. That's because all other browsers destroy the call stack before calling window.onerror. IE retains the call stack. So, since we have a call stack in IE, we can follow .caller from each function in the stack to find the function that called it.
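Here's a minimal sketch of the stack walk described above. It is an illustration rather than Errorception's actual code: walkCallers and the three demo functions are hypothetical names, the walk starts from a named function reference instead of arguments.callee so that it can be tried outside of a window.onerror handler, and it reads fn.name, which older versions of IE don't provide. Keep in mind that .caller is non-standard and is not accessible on strict-mode functions, hence the try/catch.

```javascript
// Follow the .caller chain starting at `fn`, collecting at most `maxFrames`
// function names. The cap keeps a recursive function from looping forever.
function walkCallers(fn, maxFrames) {
  var frames = [];
  while (fn && frames.length < maxFrames) {
    frames.push(fn.name || "(anonymous)");
    try {
      fn = fn.caller; // null when there is no caller, or when it is hidden
    } catch (e) {
      break; // strict-mode functions refuse access to .caller
    }
  }
  return frames;
}

function loadPage() { return renderWidget(); }
function renderWidget() { return reportError(); }
// Stand-in for the window.onerror handler: it starts the walk at itself.
function reportError() { return walkCallers(reportError, 10); }

// The first frame is always "reportError"; later frames depend on the engine
// and on whether the code runs in sloppy or strict mode.
console.log(loadPage());
```

Without the frame cap, a function that calls itself would keep returning itself as its own .caller, which is why the walk stops after a fixed number of frames.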

Why after page load?

As I've mentioned several times before, Errorception will always maintain a zero performance cost. I wouldn't use an error-reporting system that would cause page load delays for every single user. To ensure that the performance cost is zero, Errorception introduces its script after page load. This doesn't mean that we don't catch errors from before page load, of course. We do. But we only process them once our script has loaded.

Because of this loading pattern, we won't have access to call stacks from before page load. By the time our script processes those errors, arguments.callee.caller will always be null since the stack has already unwound. This is why we cannot generate call stacks before page load, even though we have all other error details.

But that sucks!

Not really, if you think about it. While it would be awesome if we could get call stacks from before page load as well, these errors are really easy to replicate. That's because the error happens in such a small and predictable time interval that it's easy to recreate it locally. The error happens between when the page started loading and when page load was fired - usually within a couple of seconds, maybe before any user interaction has even occurred.

Caveats

In a lot of cases, the call stack consists of anonymous functions, which isn't very descriptive. I would ask you to rewrite your code using named function expressions, but we all know that no one's going to do that. I'm still on the lookout for a decent way to solve this. If you have any suggestions, I'm all ears.
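For what it's worth, here's a small sketch of the difference a name makes. frameName is a hypothetical formatter of the kind a stack walker might use; it relies on fn.name, which modern engines provide (in older IE you would have to parse the function's source text instead).

```javascript
// Hypothetical frame formatter, like one a stack walker might use.
function frameName(fn) {
  return fn.name || "(anonymous)";
}

// An anonymous function expression passed directly as a callback has no name.
console.log(frameName(function () {}));            // "(anonymous)"

// A named function expression carries its name into the stack frame.
console.log(frameName(function onSaveClick() {})); // "onSaveClick"
```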

Saturday, 10 December 2011

MOAR Stats!

Just released a build that takes all your error data and makes pretty little pie charts out of it. Who doesn't like pie charts, right?

To access this data, hover over your projects menu at the top right when you log in, and click on the "See more stats" link in the drop-down.

Monday, 5 December 2011

Paid plans - ACTIVATE!

I've just opened up sign up for the paid plans. The paid plans had been announced some time ago, and I was waiting for the paperwork to go through for launch.

The paid plans are available on the pricing page. As always, your feedback is welcome.

Monday, 28 November 2011

Inline Code Previews

Just pushed a build up that eliminates a minor annoyance. If you now click on an error's file and line number, you will get a popup modal that loads up the file in which the error occurred, and highlights the line at which the error occurred.

It's only a minor enhancement, but it lets you avoid the little dance that you have to do: Alt+Tab / Cmd+Tab to your editor, open the right file, and jump to the right line.

This is only available when the error occurs in an external JS file, since inline-scripts may be on dynamic pages and the line numbers may be off. But it shouldn't affect you because you don't write any of that dirty inline code, right? Right? :)

Sunday, 13 November 2011

Announcing Paid Plans

Just announced paid plans for Errorception two days ago. I'm still getting all the machinery in place to get it to work perfectly (and so one can't really sign up for them yet), but I thought I'd get my thoughts across even before I have everything in motion.

The plans are based on resource consumption (because I believe that artificial restrictions are meaningless), and storing and processing errors is the most resource-consuming activity in Errorception. So, based on how many errors you expect to have, the pricing varies.

Pricing starts at only $14/month, so it's rather cheap, considering that you find out about problems faced by thousands of your users every day for that price. There's also a free trial so you can give it a spin without having to whip out cash first.

There are a couple of ancillary changes that deserve mention as well:

Signup is now public. It was invitation based before. Now that hurdle has been removed.
Signup is only open for the free trial. I hope to open up the paid plans for signup as soon as possible. I'm working hard on this.
All existing projects have been moved to the most expensive plan for free, and will remain there for some more time. The benefits of being on-board early!
Oh, and the site has been re-designed. Responsive too! Tell me what you think.

The pricing is not written in stone, so if you can change my mind I'm open to suggestions. Please feel free to get in touch if you have any thoughts.

Monday, 10 October 2011

Mark as Duplicate

Among the few people who have access to errorception right now, some run very high-traffic sites. More traffic leads to more errors recorded, for obvious reasons. That's awesome because it's a great way to improve the site's UX. However, it had become a challenge to manage these large numbers of errors. It's only human nature that when there's a lot of noise you start missing the real signals. Something had to be done to manage errors better.

A quick look at other bug-reporting systems to see how they handle noise like this, and it's obvious that a "mark as duplicate" functionality was sorely needed in errorception. However, no one likes to be the guy marking bugs as duplicate. It requires review, testing, some familiarity with the code, and usually a lot of discussion around it. Not to mention that you need to remember that you had seen a similar bug filed before, find it, and cross-link them when marking as a dupe. Painful.

Errorception is in a unique position, though. Since errorception already knows a lot about the bug (errorception filed the bug report after all), it should be easy to reliably figure out which bugs are duplicates of each other, and mark them as such.

Yesterday, I rolled out an update that automatically marks duplicates of different errors. It's super simple to use - there's literally nothing you need to do! All old errors in the system, and new errors that will be logged are now all going to automatically mark themselves as duplicates of each other whenever appropriate.

The functionality is currently a little conservative, so the signals it uses don't work very well with inline scripts in dynamic pages. However, even with this conservative approach, the number of errors you need to worry about has come down by 55% on average, and as much as 75% in some cases! Fewer errors = happy developer. :)
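The post doesn't say which signals are used, so the following is purely a hypothetical sketch of one way duplicates could be grouped: by a fingerprint built from the error message, script URL and line number. All names and the sample data are assumptions made for illustration.

```javascript
// Hypothetical fingerprint: errors with the same message, script URL and
// line number are treated as duplicates. The real signals are not described
// in the post; this choice is an assumption.
function fingerprint(err) {
  return [err.message, err.url, err.line].join("|");
}

// Group errors by fingerprint; the first occurrence is the "original" and
// later ones are counted as duplicates of it.
function groupDuplicates(errors) {
  var groups = Object.create(null);
  errors.forEach(function (err) {
    var key = fingerprint(err);
    if (!groups[key]) groups[key] = { original: err, duplicates: 0 };
    else groups[key].duplicates += 1;
  });
  return Object.keys(groups).map(function (key) { return groups[key]; });
}

var grouped = groupDuplicates([
  { message: "x is undefined", url: "/js/app.js", line: 12 },
  { message: "x is undefined", url: "/js/app.js", line: 12 },
  { message: "y is null", url: "/js/cart.js", line: 40 }
]);

console.log(grouped.length); // 2
```

A fingerprint like this also hints at why inline scripts on dynamic pages are hard: the same bug can show up at a different line number on every page, so the fingerprints never match.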

Feedback welcome, as always.

Monday, 3 October 2011

Slides from JSFoo, Bangalore

I presented Errorception at JSFoo in Bangalore this weekend. Since I hate out-and-out product pitches, I decided to turn it on its head a bit and talk about the considerations I had in mind when taking the decisions that I did for errorception, and what its pros and cons are. The slides are almost entirely about different error catching techniques in JavaScript. These techniques are graded on a scale I made up based on my opinions about what an error catching mechanism's characteristics should be. The scale is so brutal, even errorception doesn't get full marks. :)

View slides about error-catching techniques and how errorception works.

Overall, the talk was well received. Discussions during the talk were great, and it generated a good deal of chat after the talk. Daily signups went up 10x in the three days now since the talk. I call success!

Wednesday, 7 September 2011

Errors on https are now supported!

It's been less than three weeks since errorception went live, and the feedback has been awesome! I've been iterating quickly to improve based on your feedback. The deployment I did earlier today was a significant one, and it deserves a whole blog post.

HTTPS is now supported. This is the big one. So far, the tracking snippet used to disable itself on https pages. Now, this restriction has been lifted. You can now track errors on secure pages as well.
Smaller tracking snippet. The tracking snippet has lost 10% of its weight, giving it even higher load-time performance. In the process, it has also become slightly faster at runtime because of changes in the way some parameters were being recorded. #win
Better debugging. The previous tracking snippet was swallowing up errors, so it was very hard to debug your code. The new snippet doesn't swallow up errors, allowing your regular tools (Firebug, Web Inspector, etc.) to log the error as it happens in the debugging console.
Batched upload of errors. One of the lessons I learnt was that in many cases if there is an error in the JS, it has a cascading effect and creates several more errors. For example, if jQuery didn't load correctly for some reason, all the plugins and your code would throw separate errors. Previously, each of these errors was pushed to the server separately. This caused many problems. For example, in some browsers it hit the http connection limit, essentially queueing up errors. This meant that newer errors might never be posted to the server if the user navigated away. Also, it's terribly inefficient to create so many http connections anyway. Now, the errors are batched up on the client before they are posted to the server, so that several errors can be sent to the server at one time. This is far more efficient.
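The batching idea from the last item can be sketched roughly as follows. This is an illustration under assumptions, not the actual snippet: createBatcher, the 50 ms delay and the send callback are hypothetical, and send just records its calls here instead of making a network request.

```javascript
// Hypothetical batcher: errors are queued and flushed together after a short
// delay, so a burst of errors results in one upload instead of many.
function createBatcher(send, delayMs) {
  var pending = [];
  var timer = null;

  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (pending.length === 0) return;
    var batch = pending;
    pending = [];
    send(batch); // one request carrying every queued error
  }

  return {
    push: function (err) {
      pending.push(err);
      if (timer === null) timer = setTimeout(flush, delayMs);
    },
    flush: flush
  };
}

// Usage: `send` would normally POST to the server; here it just records calls.
var uploads = [];
var batcher = createBatcher(function (batch) { uploads.push(batch); }, 50);

batcher.push({ message: "jQuery is not defined" });
batcher.push({ message: "$ is not defined" });
batcher.push({ message: "plugin init failed" });
batcher.flush(); // flush immediately instead of waiting for the timer

console.log(uploads.length, uploads[0].length); // 1 3
```

Three cascading errors end up in a single upload, which is the point of batching: fewer connections, and less chance of hitting the browser's connection limit.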

Unfortunately, to take advantage of most of these features you'll have to modify the tracking snippet on your site. I encourage you to log in and get the new snippet. Though I've maintained backwards compatibility for now, you should update as soon as possible to take advantage of the new features.

As always, your feedback and thoughts are welcome.

Monday, 15 August 2011

Reliable, High Performance JS Error Tracking

At first glance, errorception seems to be demanding a lot. Firstly, you have to add a script tag on your page, a tag given by me - some third-party indie developer. Secondly, the script tag has to be placed in the head of the page. That's just ridiculous!

It should bother you that there's such a requirement. If you don't know why, I'll quickly summarize:

A number of things could go wrong when fetching the script from our servers. What if our servers are down? What if the DNS can't be resolved? Are you going to hold your page-load hostage just because you want to track errors?
Errorception could be malicious. We could be in your pages, messing with your pages. (This is a huge security issue, though I won't be addressing it in this post.)
Anyone familiar with YSlow! or PageSpeed will tell you that the head tag is the worst place for putting a script tag, because of the performance penalties associated with blocking network requests for scripts.

How audacious of me to ask you to do this!

The design goal from the outset for errorception has been exceptionally high performance. You should have zero impact from errorception on account of any possible network latency between your users and our server, and zero impact if our servers go down or are otherwise unreachable. Yes, you read that right - ZERO impact.

Here's a quick outline of how our script snippet works:

The first thing it does is create a client-side local queue for processing errors. Any error that occurs on the page is pushed into this queue. This is done completely inline on your page, so there's no network request and associated performance latency at all.
Next, it waits for page load to occur. Only after page load does it inject our external script asynchronously into the page which then processes this queue.
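The two steps above can be sketched roughly like this. It is a simplified illustration, not the real tracking snippet: installErrorQueue, the queue's shape and the script URL are all hypothetical, and it assumes addEventListener is available (older IE would need attachEvent). The window and document are passed in as parameters so the sketch can also be exercised with fake objects.

```javascript
// Hypothetical snippet installer. `win` and `doc` stand in for the browser's
// window and document so the sketch can also be exercised with fakes.
function installErrorQueue(win, doc, scriptUrl) {
  var queue = [];                        // local queue, no network involved

  // Step 1: record every error inline, as early as possible.
  win.onerror = function (message, url, line) {
    queue.push({ message: message, url: url, line: line });
    return false;                        // let the browser log the error too
  };

  // Step 2: after page load, inject the external script asynchronously.
  win.addEventListener("load", function () {
    var script = doc.createElement("script");
    script.async = true;
    script.src = scriptUrl;
    doc.getElementsByTagName("head")[0].appendChild(script);
  });

  return queue;
}

// In a browser this would be called inline in the head, for example:
// installErrorQueue(window, document, "https://example.invalid/tracker.js");
```

Nothing in step 1 touches the network, so errors are captured even if the external script is slow to arrive or never loads at all.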

This gives us huge advantages:

We can trap errors very early in the page cycle. Even before our own script has loaded from over the network!
Since we do not introduce any network latency in the process of loading the page, there is no impact on your page load time whatsoever.
Your scripts will typically kick in well before the page loads. If you're using $.ready or other similar functions in your favourite library, chances are you are using DOMContentLoaded. This means that your scripts would get a head-start for execution, while the errorception script's network request wouldn't even have gone out yet! I have completely eliminated the possibility that the script could interfere with your page load time. The script could even fail completely, and it will have no impact at all.

Put another way, I simply cannot mess with your page's performance characteristics. It's impossible for me to do so. Even if errorception's servers crash and burn, it cannot affect your page performance. The need for trust has been eliminated. Errorception is fast because it proactively gets out of the way. Errorception is reliable because it cannot impact your site negatively. That's the way it's designed.

There are very few third-party tracking scripts that do this. Google Analytics comes the closest with the asynchronous tracking script. And even then, they only help you in your rendering speed. Your window.onload is still at Google's servers' mercy. I can't believe there are so few people doing this right - honestly, it's not even that hard!

There is a minor disadvantage to this approach though. It's possible that your users will go click-happy on your page, and they might be navigating between pages before page-load is fired. In that case the local queue wouldn't be flushed to errorception's server, essentially meaning that we won't be able to track such errors. I'm fully aware of this, and took the call that that was fine. The philosophy in three words is: "Performance over accuracy". I'd err on the side of not recording data, rather than doing anything to hamper your site's performance.

For those who are interested, the following is the extra-verbose version of the tracking script.

Don't worry that it looks too big. It actually compresses very well (298 bytes gzipped):

Thursday, 11 August 2011

Introducing Errorception

I've been a client-side JavaScript developer for most of my professional life. Being a JS developer is very different from being a server-side web developer, and most people don't understand or appreciate the difference. This post is about one of those differences.

In sharp contrast to server-side setups, my JS doesn't run in controlled environments - I don't have any control over either the runtime or the OS, or for that matter almost any aspect of how the application will run. It's code running literally in the wild.

Several times, users call in to say that "the website doesn't work", and I end up having no idea why. I have no insight into the end-user's machine or browser. It's like debugging with a blindfold on. It's beyond painful. It worries me that for every user who called in, there might be many more who didn't. At best (if you can call it that) it's a horrid user-experience; at worst it's lost revenue. Dayam!

It doesn't have to be that way. Errorception aims to solve a simple problem: Get some insight into what is happening at the user's end when an error occurs. Gather as much data as possible to describe the state of the runtime when your application encountered the error. And do this with a strong philosophical stand: that any performance hit is simply not tolerable, whether run-time or load-time.

As a teaser, here's a screenshot of the errorception admin UI on my current dev box. It's still rather early, and things are likely to change. If you have any thoughts, I'd love to hear from you.

If you've signed up for early access at errorception.com, I'll be sending you invites in a couple of days, and you'll get to use the app for free for a limited period. If you haven't signed up, I urge you to, so that you can get priority access.