joshbreckman

Google Analytics Followup...

So yesterday afternoon I received an email from Google saying they were incorporating my change into the next release of Google Analytics. The ga.js reference always points to Google's servers, so this should fix the problem almost immediately once the change goes live.

I also want to address the following two "issues" brought up:

Issue #1: Is Microsoft to blame? Is Google to blame? Why did I say "you should know better" to Google?

Of course Microsoft is to blame for the root problem. They made a crummy web browser and this has made every web developer's life difficult ever since.

However, Google encourages webmasters to put a tiny snippet of JavaScript at the bottom of every single page on their site, which in turn references a centralized JavaScript file. This one file is referenced on millions of pages! Any change they make to this file affects these pages immediately! (Well... as soon as people's cached copies expire.)

Call me what you want, but I think this puts the burden on Google to make sure their stuff really works. It's their job to make sure that Google Analytics stays out of everyone's way and doesn't create any new problems on the web. Even if it means working around Microsoft issues. Any changes to ga.js should be tested in every way on every type of page before being pushed live. It simply affects too many pages not to. They should know better.

Issue #2: Apparently I was being sensationalistic and think that I never make mistakes.

Sorry?

I found a fairly serious problem in a JavaScript file referenced by millions of pages and I got excited about it. I found a solution and wanted to get it to Google to get it fixed.

The only reason I know so much about these memory leaks is that I am often fixing them. It's a pain in the neck, but you learn what to look for after a while.

Someday I'll make a mistake. I'll write a sweet-ass blog entry when it happens.

Google, you should know better. I fixed your memory leak for you.

This morning, I got a report of our CEO complaining of a memory leak when he left our website up over the weekend. This is a fairly common occurrence as he is often gone for several days at a time, uses IE7, and has decided that our website must refresh itself every 10 minutes. (We don't have ads on our pages; he just wants it to always be up-to-date on people's computers.)

Each time our page refreshed, we seemed to lose 100-900 KB of memory. Left running over a weekend, memory usage got past 200 MB very easily. Sort of a scary issue.

We investigated, and after some digging figured out that our recent addition of Google Analytics was to blame. Surely Google Analytics, used by thousands upon thousands of websites, couldn't be to blame? Plus it's GOOGLE! Google doesn't make mistakes like that. However, after some quick searches, it became pretty obvious we weren't the only ones to discover this.

After we verified that Google Analytics was really at fault, I tried to dig into the source. They obfuscate/compress the code pretty well, so I ran it through an online JavaScript beautifier to try and figure out what was going on.

It didn't help much, so we started looking for some likely culprits. Anything ActiveX related, etc. Most of those things could be turned off, and doing so seemed to have no effect on our memory leak.

And then I found it. Something that Google should just know. The problem with IE has been very well documented and should be well understood by any JavaScript developer working today.

Basically, IE has two sets of memory that get garbage collected: the DOM objects and the JavaScript objects. Serious problems arise when JavaScript objects reference DOM objects and those DOM objects in turn reference a JavaScript object. The latter half of that cycle is often created via callbacks and anonymous functions. The garbage collector doesn't know what to do, and ends up not collecting something it otherwise should.
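To make that concrete, here is a minimal sketch of the cycle. The names (makeFakeElement, attachHandler, detachHandler) are made up for illustration, and the "element" is a plain object standing in for a real DOM node so the snippet can run outside a browser. With a plain object nothing actually leaks; the point is only to show the shape of the references.

```javascript
// Stand-in for a DOM element (assumption: in a real page this would be
// something like document.createElement('div') or new Image()).
function makeFakeElement() {
    return { onclick: null };
}

function attachHandler(element) {
    // The handler closes over `element`, and `element.onclick` holds the handler:
    // element -> handler -> closure scope -> element.
    // When `element` is a real DOM node, this is the cycle described above.
    element.onclick = function () {
        return element;
    };
}

function detachHandler(element) {
    // Clearing the property breaks the cycle by hand.
    element.onclick = null;
}

var el = makeFakeElement();
attachHandler(el);
var cycleExists = el.onclick() === el;  // the handler can reach its own element
detachHandler(el);
var cycleBroken = el.onclick === null;  // the element no longer points at the handler
```

The usual workaround is the detachHandler step: once the handler is no longer needed, null it out so the element stops pointing back into JavaScript.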

What does this have to do with Google Analytics? I found two bits of code that followed this pattern:

if (0 == z || 2 == z) {
    var A = new Image(1, 1);
    A.src = h.Da + l;
    var p = 2 == z ? function() {} : x || function() {};
    A.onload = p
}

The key bit here is "A.onload = p". They are using an image as a low-budget AJAX call to report stats up to their server, and are apparently doing something upon completion of that call. The problem is that unless the handler is cleared afterward, that image will keep a reference to the Google Analytics object, and neither will get cleaned up properly.

I changed that bit of code to:

if (0 == z || 2 == z) {
    var A = new Image(1, 1);
    A.src = h.Da + l;
    A.onload = function() {
        // Clear the handler first so the image stops referencing this closure.
        A.onload = null;
        if ((z != 2) && (x != null)) {
            x();
        }
    };
}

... and our memory leaks went away.
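
If you use the same image trick in your own code, the fix generalizes to a small helper. This is only a sketch under my own assumptions, not anything from ga.js: firePixel and its createImage parameter are hypothetical names, and createImage exists purely so the example can run with a fake image outside a browser. I also clear the handlers on error and set src last, which the snippet above does not do.

```javascript
// Hypothetical helper: request a tracking pixel and call `callback` once it loads.
// `createImage` is an assumption for testability; in a page it can be omitted
// and a real 1x1 Image is used.
function firePixel(url, callback, createImage) {
    var img = createImage ? createImage() : new Image(1, 1);

    function cleanup() {
        // Drop both handlers so the image no longer points back into JavaScript.
        img.onload = null;
        img.onerror = null;
    }

    img.onload = function () {
        cleanup();
        if (typeof callback === 'function') {
            callback();
        }
    };
    img.onerror = cleanup;

    // Assign src last so the handlers are in place before the request starts.
    img.src = url;
    return img;
}

// Usage with a fake image object standing in for the browser's Image.
var fakeImage = { onload: null, onerror: null, src: '' };
var callbackCalls = 0;
firePixel(
    'http://example.com/pixel.gif',
    function () { callbackCalls += 1; },
    function () { return fakeImage; }
);
fakeImage.onload(); // simulate the browser firing the load event
```

After the simulated load, the callback has run once and both handlers on the image are back to null, so nothing is left holding the cycle together.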