Wednesday 8 October 2014

Preventing Flooding

Flooding in the IRC sense, that is. Not the global warming sense, of course. Because global warming isn't real, amirite?

We've all encountered this in the past. Your logs reveal that some user is running a rogue plugin, and the plugin throws errors in a loop. Hundreds, if not thousands, of errors get sent to Errorception in a short period of time. This eats into a significant chunk of your daily rate limit without actually giving you quality data. Some people end up exhausting their daily rate limit in just a few seconds because of this!

No more! Errorception now imposes a per-user rate limit. This is an arbitrary limit to try to separate the wheat from the chaff. Currently, the per-user rate limit is set to 50 errors per 250ms.

Here's how it works: If the user generates more than 50 errors within 250ms, that's some serious error generation going on. Errorception takes 50 of those errors and posts them to the server, just so that you are informed about the problem. Errorception then goes ahead and flags that user as being "banned" until the page unloads. This means that no more errors will be posted to Errorception from this user while they're on that page. That way, you are informed about the problem since 50 errors got posted, but your daily rate limit isn't completely eaten into.
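
To make the flow above concrete, here's a rough sketch of how such a limiter could look. This is not Errorception's actual code: the `FloodGuard` class, its `report` and `isBanned` methods, and the injected `post` and `now` callbacks are all made-up names for illustration, and the sliding-window bookkeeping is just one plausible way to count "50 errors per 250ms".

```typescript
// Hypothetical sketch only: names and structure are assumptions,
// not Errorception's real implementation.
const MAX_ERRORS = 50; // per-user limit quoted in the post
const WINDOW_MS = 250; // window length quoted in the post

class FloodGuard {
  private timestamps: number[] = []; // times of recently posted errors
  private banned = false;            // stays true until the page unloads
  private post: (error: string) => void;
  private now: () => number;

  // `post` stands in for whatever uploads an error to the server;
  // `now` is injectable so the clock can be faked in tests.
  constructor(post: (error: string) => void, now: () => number = () => Date.now()) {
    this.post = post;
    this.now = now;
  }

  isBanned(): boolean {
    return this.banned;
  }

  // Returns true if the error was posted, false if it was dropped.
  report(error: string): boolean {
    // Once banned, step out of the way immediately: no bookkeeping, no upload.
    if (this.banned) return false;

    const t = this.now();

    // Forget posted errors that have fallen out of the 250ms window.
    while (this.timestamps.length > 0) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || t - oldest < WINDOW_MS) break;
      this.timestamps.shift();
    }

    // 50 errors already went out inside this window, so this one is the
    // 51st: ban the user for the rest of the page's lifetime.
    if (this.timestamps.length >= MAX_ERRORS) {
      this.banned = true;
      this.timestamps = [];
      return false;
    }

    this.timestamps.push(t);
    this.post(error);
    return true;
  }
}
```

Since the guard lives in page memory, a fresh page load creates a fresh guard, which is what gives you the "banned until the page unloads" behaviour in this sketch. A user who throws errors slowly never fills the window and is never banned.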

There's another angle to this problem that has always worried me: If users are generating so many errors in such a short time, and Errorception then tries to process this large number of errors in the browser even if just to upload them, there's bound to be a perceptible sluggishness. That simply isn't cool. Errorception should never cause any perceptible performance lag. Now, with this fix, when a user is "banned", Errorception completely steps out of the way. That way, even though the user is probably stuck generating errors in a loop, at least Errorception isn't causing any additional performance lag. One more feather in the cap for high performance!

As always, suggestions and feedback are welcome.

2 comments:

  1. Hi Rakesh,

    This may present some problems for us due to the architecture of our application. We have a single-page web app, so it's possible that the page is not unloaded for a long time. Would it be possible to change the "unblocking" criteria to be either an unload or a history event?

    - Evan

    1. I could do that. It might be easier to do this on a timer, so that the user gets "unbanned" after say 5 minutes, and goes through the same flow again. I'll update this thread when I have this deployed.
