Tracing errors in client-side JavaScript applications

ARTHR, Newspaper Club’s online layout tool, is what the cool kids call a ‘rich web application’. It’s a Backbone.js application that renders templated JavaScript views, generated dynamically.

As you should do with any grown-up piece of software, we put in an error logging system, so if someone manages to trigger an error, we get notified of it and can try to fix it.

We’re using the window.onerror callback, which takes three arguments: the error message, the URL of the script that triggered it, and the line number. On some browsers there are two more parameters: the column (useful for heavily minified code) and the error object, but that’s a fairly recent addition to the spec and not widely supported.

It quickly became obvious that some people have pretty odd browser setups. We spent a while trying to track down errors which turned out not to be in our code, or in any code we served on that page. They came from two sources: JS injected by HTTP proxies, and JS injected by browser extensions.

We fixed the problems with JS injected by HTTP proxies by running the whole application over SSL/TLS. The performance impact is negligible and a whole class of errors disappeared immediately.

And we fixed the problems caused by browser extensions by ignoring errors from any script URL outside our domain and that of our CDN. They’ll still cause errors that’ll be visible in the console, but our code won’t trap and log them.
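As a rough sketch of a stricter version of that check, the script URL can be parsed and its hostname compared against an allowlist, rather than pattern-matching the whole string. The function name `isOurScript` and the exact domain suffixes are assumptions for illustration, not what ARTHR actually uses, and the `URL` constructor isn't available in older browsers.

```javascript
// Hypothetical allowlist: the real domains may differ.
var ALLOWED_SUFFIXES = ['newspaperclub.com', 'cloudfront.net'];

// Returns true only if fileUrl is served over HTTP(S) from an allowed host.
function isOurScript(fileUrl) {
  var parsed;
  try {
    parsed = new URL(fileUrl);
  } catch (e) {
    // Empty or unparseable URLs (e.g. injected inline scripts) are not ours.
    return false;
  }

  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return false;
  }

  var host = parsed.hostname;
  return ALLOWED_SUFFIXES.some(function (suffix) {
    // Match the domain itself or any subdomain of it.
    return host === suffix ||
      host.slice(-(suffix.length + 1)) === '.' + suffix;
  });
}
```

Comparing hostnames means a URL like `https://evil.example/newspaperclub.com.js` is rejected, although allowing every `cloudfront.net` host is still looser than it could be.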

We ended up with a window.onerror function that looks a bit like this:

<script>
  var errorSent = false;

  window.onerror = function (message, file, line, column, error) {
    // If the error has occurred in a file beyond our control, we don't
    // handle the error. That's up to your crazy browser extensions.
    // This isn't a particularly robust check.
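    // A regex literal is used so that "\." really matches a literal dot; inside
    // a string passed to new RegExp() the backslash would be dropped.
    var domainRegexp = /^https?:\/\/[^.]+\.(newspaperclub|cloudfront)\./;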
    if (!domainRegexp.test(file)) {
      return false;
    }

    if (!errorSent) {
      setTimeout(function () {
        var data, stack = '()@' + file + ':' + line;

        // typeof avoids a ReferenceError if ARTHR itself never loaded.
        if (typeof ARTHR !== "undefined" && ARTHR && ARTHR.log) {
          // Add the column of the exception to the stack, if available.
          if (column) {
            stack = stack + '#' + column;
          }

          // This is what we log.
          data = {
            message: message,
            stack: stack
          };

          // If the browser supports the new error parameter, try to unpack
          // the stack and message and pass it along to the logged data too.
          if (error) {
            if (typeof error === "string") {
              data.error = error;
            } else if (error instanceof Error) {
              data.error = error.name;
              data.error_message = error.message;
              data.error_stack = error.stack;
            } else {
              // Fallback to just logging whatever we've got.
              data.error = error;
            }
          }

          ARTHR.log.error(data);
        }
      }, 10);

      setTimeout(function () {
        // typeof avoids a ReferenceError if ARTHR itself never loaded.
        if (typeof ARTHR !== "undefined" && ARTHR && ARTHR.GlobalNotificationView) {
          var notification = new ARTHR.GlobalNotificationView({
            title: "Sorry, Something Went Wrong",
            description: "<p>We're sorry, something has gone wrong with ARTHR.</p><p>The Newspaper Club team has been notified, but if the problem persists, please <a href=\"http://www.newspaperclub.com/about/contact\">contact us directly</a>, and we'll try and work out what's going wrong. Otherwise, please refresh the page and ARTHR will reload with the latest changes.</p><p><a href='#' class='button' onclick='javascript:window.location.reload(true)'>Restart ARTHR</a></p>",
            noticeType: "error",
            fullScreen: true,
            disableKeyboardClose: true,
            permanent: true
          });

          notification.render().display();
        }
      }, 10);

      errorSent = true;
    }

    return true;
  };
</script>

There are a few things going on here. Firstly, we only handle the first error the browser reports on a page, so we don’t get swamped. Secondly, we’re calling our own ARTHR.log.error function, which switches between local and remote logging depending on the environment. In production it logs to the server over AJAX, so an entry appears in our logging system and in some situations an email is sent out to the team.
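To give a feel for that kind of switching, here is a minimal sketch of a logger. It is not the real ARTHR.log: `createLogger`, the `environment` string and the `sendToServer` callback are all hypothetical names, and in a browser `sendToServer` would typically wrap an AJAX POST to a logging endpoint.

```javascript
// Hypothetical logger factory: not ARTHR's actual implementation.
// environment: e.g. 'production' or 'development'.
// sendToServer: function that ships one log entry to the server.
function createLogger(environment, sendToServer) {
  function log(level, data) {
    if (environment === 'production') {
      // Remote logging: hand the entry to the transport.
      sendToServer({ level: level, data: data });
    } else {
      // Local logging: just write to the developer console.
      console[level](data);
    }
  }

  return {
    error: function (data) { log('error', data); },
    warn: function (data) { log('warn', data); }
  };
}
```

Passing the transport in as a function keeps the logger easy to exercise without a server: a test can hand it a stub and inspect what would have been sent.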

The code that displays the error to the user, and the code that sends the log message, are each scheduled with their own setTimeout call and a short delay (10ms). Each one then runs asynchronously as a separate task, so if one throws (due to a bug or an odd situation) the other still runs.
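The same idea can be pulled out into a small helper, sketched here under the assumption that each task is a plain function. `runIndependently` is a made-up name, and the optional `schedule` argument exists only so the scheduler can be swapped out in tests; by default it is setTimeout.

```javascript
// Hypothetical helper: schedules every task as its own timer callback,
// so an exception thrown by one task cannot stop the others from running.
function runIndependently(tasks, delay, schedule) {
  schedule = schedule || setTimeout;
  tasks.forEach(function (task) {
    schedule(task, delay);
  });
}
```

Compare this with calling the tasks one after another in the same function, where the first exception would abort everything that follows it.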

We still get a class of errors that are difficult to trace. Dumping the entire state of the application might be helpful here, including a snapshot of the DOM, all the bound events and so on. That might be straightforward, but I’ve not looked into it.

It’s often better to try to catch exceptions deeper in the code, rather than letting window.onerror handle them, but for unknown unknowns, this is a useful tool in the debugging arsenal. If you’ve got your own version of this, or there’s a much smarter way of handling JS error tracking, I’d love to hear it.