Peterbe.com

A blog and website by Peter Bengtsson

Filtered home page! Currently only showing blog entries under the category: JavaScript.

I just rolled out a change here on my personal blog which I hope will make my few visitors happy.

Basically, when you hover over a local link long enough, it gets prefetched (with AJAX) so that if you do click, it's hopefully already cached in your browser.

If you hover over a link and almost instantly hover out it cancels the prefetching. The assumption here is that if you deliberately put your mouse cursor over a link and proceed to click on it you want to go there. Because your hand is relatively slow I'm using the opportunity to prefetch it even before you have clicked. Some hands are quicker than others so it's not going to help for the really quick clickers.

What I also had to do was set a Cache-Control header of 1 hour on every page so that the browser can learn to cache it.

The effect is that when you do finally click the link, by the time your browser loads it and changes the rendered output, it'll hopefully be able to render it from its cache and thus it becomes visually ready faster.

Let's try to demonstrate this with this horrible animated gif:
(or download the screencast.mov file)

Screencast
1. Hover over a link (in this case the "Now I have a Gmail account" from 2004)
2. Notice how the Network panel preloads it
3. Click it after a slight human delay
4. Notice that when the clicked page is loaded, it's served from the browser cache
5. Profit!

So the code that does this is quite simply:

$(function() {
  var prefetched = [];
  var prefetch_timer = null;
  $('div.navbar, div.content').on('mouseover', 'a', function() {
    // `this` is the matched <a>, even when the cursor is over a child element
    var value = $(this).attr('href');
    // only prefetch local links, and only once each
    if (value && value.indexOf('/') === 0) {
      if (prefetched.indexOf(value) === -1) {
        if (prefetch_timer) {
          clearTimeout(prefetch_timer);
        }
        prefetch_timer = setTimeout(function() {
          $.get(value, function() {
            // necessary for $.ajax to start the request :(
          });
          prefetched.push(value);
        }, 200);
      }
    }
  }).on('mouseout', 'a', function() {
    // left the link before the delay elapsed; cancel the pending prefetch
    if (prefetch_timer) {
      clearTimeout(prefetch_timer);
    }
  });
});

Also, available on GitHub.
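
The snippet above is tied to jQuery and the DOM, which makes the delay-and-cancel logic hard to try in isolation. Here is a minimal, DOM-free sketch of the same idea. The name createHoverPrefetcher, the injected fetchUrl callback and the optional timers object are all hypothetical and not part of the code running on this site; they only exist so the timing can be exercised with fake timers.

```javascript
// Hypothetical helper, not the code running on this blog.
// fetchUrl(href) does the actual request; timers is optional and lets a
// test supply fake set/clear functions instead of setTimeout/clearTimeout.
function createHoverPrefetcher(fetchUrl, delayMs, timers) {
  timers = timers || {
    set: function(fn, ms) { return setTimeout(fn, ms); },
    clear: function(id) { clearTimeout(id); }
  };
  var prefetched = [];
  var timer = null;

  return {
    // call on mouseover with the link's href
    enter: function(href) {
      // only local links, and each one at most once
      if (!href || href.indexOf('/') !== 0) return;
      if (prefetched.indexOf(href) !== -1) return;
      if (timer !== null) timers.clear(timer);
      timer = timers.set(function() {
        timer = null;
        prefetched.push(href);
        fetchUrl(href);
      }, delayMs);
    },
    // call on mouseout; cancels a prefetch that has not fired yet
    leave: function() {
      if (timer !== null) {
        timers.clear(timer);
        timer = null;
      }
    }
  };
}
```

In a page, enter() and leave() would be called from the mouseover and mouseout handlers, with fetchUrl wrapping the $.get call from above.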

I'm excited about this change because of a couple of reasons:

  1. On mobile, where you might be on a non-wifi data connection you don't want this. There you don't have the mouse event onmouseover triggering. So people on such devices don't "suffer" from this optimization.
  2. It only downloads the HTML which is quite light compared to static assets such as pictures but it warms up the server-side cache if need be.
  3. It's much more targeted than a general prefetch meta header.
  4. Most likely content will appear rendered to your eyes faster.

So I have a massive chunk of JSON that a Django view is sending to a piece of Angular that displays it nicely on the page. It's big. 674Kb actually. And it's likely going to be bigger in the near future. It's basically a list of dicts. It looks something like this:

>>> pprint(d['events'][0])
{u'archive_time': None,
 u'archive_url': u'/manage/events/archive/1113/',
 u'channels': [u'Main'],
 u'duplicate_url': u'/manage/events/duplicate/1113/',
 u'id': 1113,
 u'is_upcoming': True,
 u'location': u'Cyberspace - Pacific Time',
 u'modified': u'2014-08-06T22:04:11.727733+00:00',
 u'privacy': u'public',
 u'privacy_display': u'Public',
 u'slug': u'bugzilla-development-meeting-20141115',
 u'start_time': u'15 Nov 2014 02:00PM',
 u'start_time_iso': u'2014-11-15T14:00:00-08:00',
 u'status': u'scheduled',
 u'status_display': u'Scheduled',
 u'thumbnail': {u'height': 32,
                u'url': u'/media/cache/e7/1a/e71a58099a0b4cf1621ef3a9fe5ba121.png',
                u'width': 32},
 u'title': u'Bugzilla Development Meeting'}

So I thought one hackish simplification would be to convert each of these dicts into a list with a known sort order. Something like this:

>>> event = d['events'][0]
>>> pprint([event[k] for k in sorted(event)])
[None,
 u'/manage/events/archive/1113/',
 [u'Main'],
 u'/manage/events/duplicate/1113/',
 1113,
 True,
 u'Cyberspace - Pacific Time',
 u'2014-08-06T22:04:11.727733+00:00',
 u'public',
 u'Public',
 u'bugzilla-development-meeting-20141115',
 u'15 Nov 2014 02:00PM',
 u'2014-11-15T14:00:00-08:00',
 u'scheduled',
 u'Scheduled',
 {u'height': 32,
  u'url': u'/media/cache/e7/1a/e71a58099a0b4cf1621ef3a9fe5ba121.png',
  u'width': 32},
 u'Bugzilla Development Meeting']

So I converted my sample events.json file like that:

$ l -h events*
-rw-r--r--  1 peterbe  wheel   674K Aug  8 14:08 events.json
-rw-r--r--  1 peterbe  wheel   423K Aug  8 15:06 events.optimized.json

Excitingly the file is now 250Kb smaller because it no longer contains all those keys.

Now, I'd also send the order of the keys so I could do something like this in the AngularJS code:

 .success(function(response) {
   events = []
   response.events.forEach(function(event) {
     var new_event = {}
     response.keys.forEach(function(key, i) {
       new_event[key] = event[i]
     })
     events.push(new_event)
   })
 })

Yuck! Nested loops! It was just getting more and more complicated.
Also, if there are keys that are not present in every element, it means I'd have to replace them with None.
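
To make the round trip concrete, here is a minimal sketch of both directions in plain Javascript. The function names packEvents and unpackEvents are made up for this illustration, and the server side was really Python, so treat it as a sketch of the idea rather than the code I had. Note how a key that is missing from an event comes back as null instead of being absent.

```javascript
// Hypothetical helpers illustrating the "keys once, rows as arrays" idea.
function packEvents(events) {
  // collect every key that appears in any event, in a stable sorted order
  var seen = {};
  events.forEach(function(event) {
    Object.keys(event).forEach(function(key) { seen[key] = true; });
  });
  var keys = Object.keys(seen).sort();
  var rows = events.map(function(event) {
    return keys.map(function(key) {
      // a key missing from this event has to be padded with null
      return Object.prototype.hasOwnProperty.call(event, key) ? event[key] : null;
    });
  });
  return { keys: keys, events: rows };
}

function unpackEvents(response) {
  return response.events.map(function(row) {
    var event = {};
    response.keys.forEach(function(key, i) { event[key] = row[i]; });
    return event;
  });
}
```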

At this point I stopped and I could smell the hackish stink of sulfur of the hole I was digging myself into.
Then it occurred to me: gzip is really good at compressing repeated things, and a list of dicts, being a document-store type of data structure, has plenty of repetition.

So I packed them manually to see what we could get:

$ apack events.json.gz events.json
$ apack events.optimized.json.gz events.optimized.json

And without further ado...

$ l -h events*
-rw-r--r--  1 peterbe  wheel   674K Aug  8 14:08 events.json
-rw-r--r--  1 peterbe  wheel    90K Aug  8 14:20 events.json.gz
-rw-r--r--  1 peterbe  wheel   423K Aug  8 15:06 events.optimized.json
-rw-r--r--  1 peterbe  wheel    81K Aug  8 15:07 events.optimized.json.gz

Basically, all that complicated and slow hoopla for saving about 9Kb. No thank you.
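
If you want to try this kind of comparison yourself, here is a small sketch for Node using its built-in zlib module. The sample data is made up (it is not my events.json) and the field names are only loosely modelled on the real ones, so the exact numbers will differ; the point is just to print raw and gzipped sizes for both shapes side by side.

```javascript
var zlib = require('zlib');

// Build made-up sample data: many objects sharing the same keys.
function makeEvents(n) {
  var events = [];
  for (var i = 0; i < n; i++) {
    events.push({
      id: i,
      slug: 'event-' + i,
      status: i % 2 ? 'scheduled' : 'removed',
      privacy: 'public',
      archive_url: '/manage/events/archive/' + i + '/'
    });
  }
  return events;
}

// Size in bytes of the JSON, before and after gzip.
function sizes(value) {
  var raw = Buffer.from(JSON.stringify(value));
  return { raw: raw.length, gzipped: zlib.gzipSync(raw).length };
}

var events = makeEvents(500);
var keys = Object.keys(events[0]);
var packed = {
  keys: keys,
  events: events.map(function(e) {
    return keys.map(function(k) { return e[k]; });
  })
};

var asDicts = sizes({ events: events });
var asLists = sizes(packed);
console.log('list of dicts:', asDicts);
console.log('list of lists:', asLists);
```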

Thank you gzip for existing!

MozTrap is what's called a "test case management system". Basically, software QA people need a structure and pattern to their testing. What to test, what versions to test on and what hardware/operating system etc. are all part of a "test suite". That's what MozTrap manages.

So this project was built by Mozilla's automation and tools team. It is currently not an actively developed project. Not because it's not needed or used but because it basically maps all the features we need. A large part of the code base was originally written by a personal friend of mine who I respect wholeheartedly; Carl Meyer of Django/pip/virtualenv/etc fame. I'm grateful for the awesome documentation he left behind amongst many other things.

Together with the team we sat down and listed all the biggest pain points as of today. Basically, the number one thing is speed. Pages load too slowly. Normally when web developers worry themselves with web performance it's a matter of shaving milliseconds off a page where a client's perception equals lost or gained profits. Here it's not a problem of milliseconds but a problem of seconds. After some quick poking around on the production site and looking at some code the conclusion is simple: The site is so darn slow because the HTML sent from the server is way too MASSIVE. And baked into that is a mixture of the poor web server having to produce a massive HTML blob and it being sent over the wire.

One test run I made said it took 14 seconds to render a certain page.

Why is it so slow?

So how did this happen and why is it not Carl's fault? :) The reason it happened was because of the underestimated number of options added to the advanced filtering drop-downs. On a local dev version you never notice these things because you set up some options, for example tags, and the drop-down never gets larger than 10-20 options. For example, the "Creator" drop-down today has 1,664 different choices.

If you take all those choices and turn them into HTML like this: <option value="1">Adam</option>\n<option value="2">Bram</option>... etc. you get 66Kb of just HTML. However, MozTrap doesn't work like that. Instead it uses pretty drop-downs that don't look like regular HTML drop-downs. See for yourself; go to https://moztrap.mozilla.org/results/runs/ and click the "Advanced Filtering" button.
So, that means that the HTML for each option instead looks like this:

<li class="filter-item">
  <input name="filter-creator" data-name="creator" value="1" id="id-filter-creator-1" class="check" type="checkbox">
  <span class="onoff">
    <label for="id-filter-creator-1" class="onoffswitch">Adam</label>
    <span class="pinswitch"></span>
    <span class="content" title="creator: Adam">Adam</span>
  </span>
</li>

Now you get 620Kb of HTML just for the "Creator" field. Granted, that is the biggest field of all the drop-downs but lots of them are massive.

So, this makes the page weigh a total of about 1.1Mb just for the HTML. Not only is it a lot of work for the (Django) server to generate this but it's also a heck of a lot of data to send across the Internet on every page request.

So what was the solution?

An ideal solution would have been a significant re-write whereby many of the values on the page get loaded by later AJAX calls. I.e. load a skeleton that loads superfast, and then load some AJAX in the background. That AJAX could potentially be cached in the browser with localStorage or something so that you get something to show very quickly whilst you wait for the AJAX request to complete. But this would have been too big a change and the way the filtering works on these pages, you actually need all the options in the drop-downs on immediate load because of the way "pinned filters" work.

So the solution was to replace all the repeated HTML chunks with 1 JSON string and then a piece of Javascript template rendering. So, in the Django template code instead of:

{% for field in filters %}
  {% include "lists/_filter_group.html" with advanced=1 prefix="filter" pinable=1 %}
{% endfor %}

We now replace this with:

<script>
var FILTERS = {% filterset_to_json filters with advanced=1 prefix="filter" pinable=1 %}
</script>
<script id="filter_group" type="text/html">
<section class="filter-group {{ field.cls }}" data-name="{{ field.key }}">
  <h5 class="category-title">
    {{ _field_name_lower }}
    {{# field.switchable }}
    ...

That is basically some Mustache code that I use to generate the HTML DOM nodes and insert them into the page after load.
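
To show why this saves so many bytes, here is a minimal sketch of the idea without Mustache: a hand-rolled render function stands in for the template, and the data shape (key, options, id, label) is an assumption for illustration, not the real structure of FILTERS in MozTrap. The repeated markup lives in the Javascript once, and only the short JSON travels per option.

```javascript
// Hypothetical data shape; the real FILTERS structure in MozTrap may differ.
var FILTERS = [
  { key: 'creator', options: [{ id: 1, label: 'Adam' }, { id: 2, label: 'Bram' }] }
];

function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hand-rolled stand-in for the Mustache template: the repeated markup lives
// here once instead of being sent from the server for every option.
function renderOption(field, option) {
  var id = 'id-filter-' + field.key + '-' + option.id;
  var label = escapeHtml(option.label);
  return '<li class="filter-item">' +
    '<input name="filter-' + field.key + '" data-name="' + field.key + '"' +
    ' value="' + option.id + '" id="' + id + '" class="check" type="checkbox">' +
    '<span class="onoff">' +
    '<label for="' + id + '" class="onoffswitch">' + label + '</label>' +
    '<span class="pinswitch"></span>' +
    '<span class="content" title="' + field.key + ': ' + label + '">' + label + '</span>' +
    '</span></li>';
}

function renderField(field) {
  return field.options.map(function(option) {
    return renderOption(field, option);
  }).join('\n');
}
```

Comparing JSON.stringify(FILTERS).length with renderField(FILTERS[0]).length gives a feel for the difference in bytes per option.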

In conclusion

So basically nothing changes. Nothing of the Django view had to change. Visually there's no difference and the same actual user data is sent from the server to the client but just packed in a more optimal way.

There are multiple pages where these massive "Advanced Filtering" options exist but on one page I measured the whole page went from weighing 1.1Mb down to 132Kb.

On Friday I did a Show HN and got featured on the front page for HTML Tree.

Google Analytics
Amazingly, out of the 3,858 visitors (according to Google Analytics today) 2,034 URLs were submitted and tested on the app. Clearly a lot of people just clicked the example submission but out of those 1,634 were unique. Granted, some people submitted more than one URL but I think a large majority of people came up with a URL of their own to try. Isn't that amazing! What a turnout for a Friday afternoon hack (with some Sunday night hacking to make it into a decent looking website).

The lesson to learn here is that the Hacker News crowd is excellent for getting engagement. Yes, there is a lot of blather and almost repetitive submissions but by and large it's a very engaging community. Suck on that, those who make fun of HN!

angular-classy, by @DaveJ, is an interesting AngularJS module that you use to get some class structure into your controller. You can check out his page and the documentation there for some basic examples of what it does.

This appeals to me as a Python developer because now my angular code looks more like a Python class. I also like that there's an init function now (similar to python's __init__ I guess) and I also like that you can distinguish between "scope functions" and "local functions". To explain that, consider this:

  // somewhere in a controller 
  ...


  $scope.addSomething = function() {  // used in your template
    if ($scope.some_precondition) {
      reallyAddSomething($scope.first_name, $scope.last_name);
    }
  };

  function reallyAddSomething(first_name, last_name) {
    // can still use $scope in here
  }

And compare this with angular-classy:

  // somewhere in a controller 
  ...


  addSomething: function() {  // used in your template
    if (this.$scope.some_precondition) {
      this._reallyAddSomething(this.$scope.first_name, this.$scope.last_name);
    }
  },

  _reallyAddSomething: function(first_name, last_name) {
    // can still use this.$scope in here
  },

Basically, the _ prefix makes the function available on this but not attached to the scope. And I think that just makes sense!

So my gut feeling is all positive about angular-classy. But there is still one big caveat. The mythical "this" in Javascript. It's great but it's kinda clunky too because it rebinds in every sub-scope. The solution to that is to bind things. For example, for a success promise it now has to look like this:

this.$http.get('/some/url')
.success(function(response) {
  this.somethingElseInTheModule(response.something);
}.bind(this));
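
To see the rebinding problem outside of Angular, here is a tiny self-contained sketch. fakeGet is a made-up stand-in for an HTTP call and the controller object is hypothetical; the only real API in play is Function.prototype.bind.

```javascript
// Stand-in for an HTTP call; invokes the callback like a plain function.
function fakeGet(callback) {
  callback({ something: 42 });
}

var controller = {
  results: [],
  somethingElseInTheModule: function(value) {
    this.results.push(value);
  },
  loadUnbound: function() {
    fakeGet(function(response) {
      // `this` is no longer the controller in here
      this.somethingElseInTheModule(response.something);
    });
  },
  loadBound: function() {
    fakeGet(function(response) {
      // bind(this) pins `this` to the controller
      this.somethingElseInTheModule(response.something);
    }.bind(this));
  }
};

var unboundFailed = false;
try {
  controller.loadUnbound();
} catch (err) {
  unboundFailed = err instanceof TypeError;
}
controller.loadBound();
```

The unbound version throws a TypeError because the method can't be found on whatever this happens to be inside the callback, whilst the bound version ends up pushing the value onto controller.results.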

Anyway, let's compare the before and after of a real project.

Before

controllers.js

After

controllers.js

What do you think? Does it look better? Full diff here

I think I like it. But I need to let it "sink in" a bit first. I think the code looks neater with angular-classy but it's now a new dependency and it means that people who know angular but are not familiar with angular-classy would get confused when they are confronted with this code.

UPDATE

I merged the branch. So now this project is classy.

I have now closed issue #2 on github-pr-triage. So, now you can have a dashboard of every GitHub project whose pull requests you care about.

The old format of using just one repo (e.g. /owner/project) works too and should hopefully not break anybody's bookmarks. The new format for having multiple repos across (possibly) multiple owners is like this:

owner1:projectA,projectB;owner2:projectX,projectY,projectZ
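
As a rough sketch of how a string in that format can be taken apart, here is a small parser. The function name parseRepos and the shape of its return value are made up for this example and are not necessarily what github-pr-triage does internally.

```javascript
// Hypothetical parser for the multi-repo format; not the app's actual code.
// Groups are separated by ";", owner and projects by ":", projects by ",".
function parseRepos(spec) {
  var repos = [];
  spec.split(';').forEach(function(group) {
    if (!group) return;  // tolerate a trailing ";"
    var parts = group.split(':');
    if (parts.length !== 2) {
      throw new Error('Expected owner:project[,project...] but got ' + group);
    }
    var owner = parts[0];
    parts[1].split(',').forEach(function(project) {
      if (project) repos.push({ owner: owner, project: project });
    });
  });
  return repos;
}
```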

See screenshot:

A couple of different projects

To set yours up, here's a running instance available on https://prs.paas.allizom.org

grymt is a python tool that takes a directory full of .html, .css and .js and prepares the html for optimal production use.

For a teaser:

  1. Look at the "input"

  2. Look at the "output" (Note! You have to right-click and view source)

So why did I write my own tool and not use Grunt?!

Glad you asked! The reason is simple: I couldn't get Grunt to work.

Grunt is a framework. It's a place where you say which "recipes" to execute and how. It's effectively a common config framework. Like make.
However, I tried to set up a bunch of recipes in my Gruntfile.js and most of them worked well individually but it was a hellish nightmare to get it all to work together just the way I want it.

For example, the grunt-contrib-uglify is fine for doing the minification but it doesn't work with concatenation and it doesn't deal with taking one input file and outputting to a different file.
Basically, I spent two evenings getting things to work but I could never get exactly what I wanted. So I wrote my own and because I'm quite familiar with this kind of stuff, I did it in Python. Not because it's better than Node but just because I had it nearby and was able to build something more quickly.

So what sweet features do you get out of grymt?

  1. You can easily make an output file have a hash in the filename. E.g. vendor-$hash.min.js becomes vendor-64f7425.min.js and thus the filename is always unique but doesn't change in between deployments unless you change the files.

  2. It automatically notices which files already have been minified. E.g. no need to minify somelib.min.js but do minify otherlib.js.

  3. You can put $git_revision anywhere in your HTML and this gets expanded automatically. For example, view the source of buggy.peterbe.com and look at the first 20 lines.

  4. Images inside CSS get rewritten to have unique names (based on files' modified time) so they can be far-future cached aggressively too.

  5. You never have to write down any lists of file names in some Gruntfile.js equivalent file.

  6. It copies ALL files from a source directory. This is important in case you have something like this inside your javascript code: $('<img>').attr('src', 'picture.jpg') for example.

  7. You can choose to inline all the minified and concatenated CSS or javascript. Inlining CSS is neat for single page apps where you have a majority of primed cache hits. Instead of one .html and one .css you get just one .html and the amount of bytes is the same. Not having to do another HTTP request can save a lot of time on web performance.

  8. The generated (aka. "dist" directory) contains everything you need. It does not refer back to the source directory in any way. This means you can set up your apache/nginx to point directly at the root of your "dist" directory.
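
To illustrate feature 1 in the list above, here is a minimal sketch of content-based file naming for Node. grymt itself is written in Python, and the choice of MD5 and of 7 characters here is an assumption for the example, not a description of what grymt does internally. The point is that the name only changes when the content changes.

```javascript
var crypto = require('crypto');

// Hypothetical helper: replaces "$hash" in a filename template with a short
// digest of the file content. MD5 and the 7 character length are assumptions.
function hashedName(template, content) {
  var digest = crypto.createHash('md5').update(content).digest('hex');
  return template.replace('$hash', digest.slice(0, 7));
}
```

Calling hashedName('vendor-$hash.min.js', content) twice with the same content gives the same filename, so it is safe to serve it with far-future cache headers.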

So what's the catch?

  1. It's not Grunt. It's not a framework. It does only what it does and if you want it to do more you have to work on grymt itself.

  2. The files you want to analyze, process and output all have to be in a sub directory.
    Look at how I've laid out the files here in this project for example. ALL the files that you need are in one sub-directory called app. So, to run grymt I simply run: grymt app.

  3. The HTML files you throw into it have to be plain HTML files. No templates for server-side code.

How do you use it?

pip install grymt

Then you need a directory it can process, e.g. ./client/ (assumed to contain one or more .html files).

grymt ./client

For more options, check out

grymt --help

What's in the future of grymt?

If people like it and want to add features, I'm more than happy to accept pull requests. Some future potential feature work:

  • I haven't needed it immediately, yet, myself, but it would be nice to add things like coffeescript, less, sass etc into pre-processing hooks.

  • It would be easy to automatically generate and insert a reference to an appcache manifest. Since every file used and mentioned is noticed, we could very accurately generate an appcache file that is less prone to human error.

  • Spitting out some stats about the number of bytes saved and number of files reduced.

Screenshot
Buggy is a single-page webapp that relies entirely on the Bugzilla Native REST API. And it works offline. Sort of. I say "sort of" because obviously without a network connection you're bound to have outdated information from the bugzilla database but at least you'll have what you had when you went offline.

When you post a comment from Buggy, the posted comment is added to an internal sync queue and if you're online it immediately processes that queue. There is, of course, always a risk that you might close a bug when you're in a tunnel or on a plane without WiFi and when you later get back online the sync fails because of some conflict.
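
The queue idea can be sketched in a few lines. This is a simplified, synchronous illustration and not Buggy's actual code: createSyncQueue, send and isOnline are hypothetical names, and a real version would make asynchronous requests and persist the queue in the browser.

```javascript
// Hypothetical sketch; `send` and `isOnline` are injected stand-ins for the
// real network call and connectivity check.
function createSyncQueue(send, isOnline) {
  var queue = [];
  var failed = [];
  return {
    // queue the item and, if we're online, process the queue right away
    add: function(item) {
      queue.push(item);
      if (isOnline()) this.process();
    },
    process: function() {
      while (queue.length) {
        var item = queue.shift();
        try {
          send(item);
        } catch (err) {
          // e.g. a conflict because the bug changed while we were offline
          failed.push({ item: item, error: err });
        }
      }
    },
    pending: function() { return queue.slice(); },
    failures: function() { return failed.slice(); }
  };
}
```

When the connection comes back, calling process() drains whatever piled up while offline, and anything that could not be sent ends up in failures() so it can be shown to the user.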

The reason I built this was partly to scratch an itch I had ("What's the ideal way possible for me to use Bugzilla?") and also to experiment with some new techniques, namely AngularJS and localforage.

Live-search

So, the way it works is:

  1. You pick your favorite product and components.

  2. All bugs under these products and components are downloaded and stored locally in your browser (thank you localforage).

  3. When you click any bug it then proceeds to download its change history and its comments.

  4. Periodically it checks each of your chosen product and components to see if new bugs or new comments have been added.

  5. If you refresh your browser, all bugs are loaded from a local copy stored in your browser and in the background it downloads any new bugs or comments or changes.

  6. If you enter your username and password, an auth token is stored in your browser and you can thus access secure bugs.

I can has charts

Pros and cons

The main advantage of Buggy compared to Bugzilla is that it's fast to navigate. You can instantly filter bugs by status(es), components and/or by searching in the bug summary.

The disadvantage of Buggy is that you can't see all fields, file new bugs or change all fields.

The code

The code is of course open source. It's available on https://github.com/peterbe/buggy and released under an MPL 2 license.

The code requires no server. It's just an HTML page with some CSS and Javascript.

Everything is done using AngularJS. It's only my second AngularJS project but this is also part of why I built this. To learn AngularJS better.

Much of the inspiration came from the CSS framework Pure and one of their sample layouts which I started with and hacked into shape.

The deployment

YSlow
Because Buggy doesn't require a server, this is the very first time I've been able to deploy something entirely on CDN. Not just the images, CSS and Javascript but the main HTML page as well. Before I explain how I did that, let me explain about the make.py script.

I really wanted to use Grunt but it just didn't work for me. There are many positive things about Grunt such as the ease with which you can add plugins and I like how you just have one "standard" file that defines how a bunch of meta tasks should be done. However, I just couldn't get the concatenation and minification and stuff to work together. Individually each tool works fine, such as the grunt-contrib-uglify plugin but together none of them appeared to want to work. Perhaps I just required too much.

In the end I wrote a script in python that does exactly what I want for deployment. Its features are:

  • Hashes in the minified and concatenated CSS and Javascript files (e.g. vendor-8254f6b.min.js)
  • Custom names for the minified and concatenated CSS and Javascript files so I can easily set far-future cache headers (e.g. /_cache/vendor-8254f6b.min.js)
  • Ability to fold all CSS minified into the HTML (since there's only one page, there's little reason to make the CSS external)
  • A Git revision SHA into the HTML of the generated ./dist/index.html file
  • All files in ./client/static/ copied intelligently into ./dist/static/
  • Images in CSS to be given hashes so they too can have far-future cache headers

So, the way I have it set up is that, on my server, I have it run python make.py and that generates a complete site in a ./dist/ directory. I then point Nginx to that directory and run it under http://buggy-origin.peterbe.com. Then I set up an Amazon Cloudfront distribution to that domain and then lastly I set up a CNAME for buggy.peterbe.com to point to the Cloudfront distribution.

The future

I try my best to maintain a TODO file inside the repo. That's where I write down things to come. It also works as a changelog since I also use this file to write down what's been done.

One of the main features I want to add is the ability to add bugs that are outside your chosen products and components. It'll be a "fake" component called "Misc". This is for bugs outside the products and components you usually monitor and work in but perhaps bugs you've filed or been assigned to. Or just other bugs you're interested in in general.

Another major feature to work on is the ability to choose to see more fields and ability to edit these too. This will require some configuration on the individual users' behalf. For example, some people use the "Target Milestone" a lot. Some use the "Importance" a lot. So, some generic solution is needed to accommodate all these non-basic fields.

And last but not least, the Bugzilla team here at Mozilla is working on a very exciting project that allows you to register a certain list of bugs with a WebSocket and have it push to you as soon as these bugs change. That means that I won't have to periodically query bugzilla every 30 seconds if certain bugs have changed but instead get instant notifications when they do. That's going to be major! I confidently speculate that that will be implemented some time this summer.

Give it a go. What are you waiting for? :) Go to http://buggy.peterbe.com/, pick your favorite products and components and try to use it for a week.

For people familiar with AngularJS, it's almost frighteningly easy to make a live-search on a repeating iterator.

Here's such an example: http://jsfiddle.net/r26xm/1/

Out of the box it just works. If nothing is typed into the search field it returns everything.

A big problem with this is that the pattern matching isn't very good. For example, if you search for ter you get Teresa and Peter.
More realistically you want it to only match with a leading word delimiter. In other words, if you type ter you want it only to match Teresa but not Peter because Peter doesn't start with ter.
So, to remedy that we construct a regular expression on the fly with a leading word delimiter. I.e. \bter.

Here's an example of that: http://jsfiddle.net/f4Zkm/2/

Now, there's a problem. For every item in the list the regular expression needs to be created and compiled which, when the list is very long, can become incredibly slow.
To remedy that we use $scope.$watch to create a local regular expression which only happens once per update to $scope.search.

Here's an example of that: http://jsfiddle.net/f4Zkm/4/
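
Outside of Angular, the same compile-once pattern can be sketched like this. createNameFilter is a hypothetical stand-in: setSearch() plays the role of the $scope.$watch callback and matches() is what the filter would call for every item. The escaping and the case-insensitive flag are my assumptions for the sketch and may differ from the fiddles.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical stand-in for the $scope.$watch pattern: the regular
// expression is compiled once per change to the search term, not per item.
function createNameFilter() {
  var regex = null;
  var compiled = 0;
  return {
    setSearch: function(search) {
      if (search) {
        regex = new RegExp('\\b' + escapeRegExp(search), 'i');
        compiled++;
      } else {
        regex = null;  // empty search matches everything
      }
    },
    matches: function(name) {
      return regex === null || regex.test(name);
    },
    compileCount: function() { return compiled; }
  };
}
```

However long the list is, filtering it after one setSearch('ter') call only ever compiles the regular expression once.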

That, I think, is a really good pattern. We've left the simplicity behind but we now have something snappier.

Unfortunately the example is a little bit contrived because the list of names it filters on is so small but the list could be huge. It could also be that we want to make a more advanced regular expression. For example, you might want to allow multiple words to match so that ter ma should match Teresa Mayers, John Mayor and Maria Connor. Then you could make a regular expression with something like \b(ter|ma).
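
A minimal sketch of building such a multi-word regular expression from the search input could look like this. buildSearchRegex is a made-up name, and escaping the words plus matching case-insensitively are assumptions on my part.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: 'ter ma' becomes /\b(ter|ma)/i.
// Returns null when there is nothing to search for.
function buildSearchRegex(search) {
  var words = search.trim().split(/\s+/).filter(Boolean).map(escapeRegExp);
  if (!words.length) return null;
  return new RegExp('\\b(' + words.join('|') + ')', 'i');
}
```

With this, a name matches as soon as any of its words starts with any of the typed words.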

For seasoned Angularnauts this is trivial stuff but it really helped me make an app much faster and smoother. I hope it helps someone else doing something similar.

I looked around for Javascript libs that do automatic input formatting for credit card inputs.

The first one was formatter.js which looked promising but it weighs over 6Kb minified and also, when you apply it, the placeholder attribute you have on the input disappears.

So, in true software engineering fashion I wrote my own:

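function cc_format(value) {
  // keep digits only (this also drops whitespace)
  var v = value.replace(/[^0-9]/g, '');
  // take the first run of 4 to 16 digits
  var matches = v.match(/\d{4,16}/g);
  var match = matches && matches[0] || '';
  var parts = [];
  // split into groups of four
  for (var i = 0, len = match.length; i < len; i += 4) {
    parts.push(match.substring(i, i + 4));
  }
  if (parts.length) {
    return parts.join(' ');
  } else {
    // fewer than 4 digits: leave the input untouched
    return value;
  }
}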

And some tests to prove it:

assert(cc_format('1234') === '1234')
assert(cc_format('123456') === '1234 56')
assert(cc_format('123456789') === '1234 5678 9')
assert(cc_format('') === '')
assert(cc_format('1234 1234 5') === '1234 1234 5')
assert(cc_format('1234 a 1234x 5') === '1234 1234 5')

Check out the Demo