Skip navigation

Tag Archives: AppEngine

[ Note: this post was originally started in September/November, 2008. It’s been sitting in my drafts for quite a while and I decided it was time to finish it up. ]

We finished off a short 1 week development sprint last Friday [edit: sometime in Sep/Nov 2008] which culminated in the current live version of the VendAsta website.  Despite such a short dev-cycle, I feel we were able to accomplish most of our research goals.

One of the tools we used that I didn’t have much experience with prior was Yahoo Pipes. We had somewhat of a unique problem to solve; we wanted to be able to draw from a variety of Feed sources that would filter through to different sections of the site. Before I dove into using Pipes my gut feeling was just to use some sort of feed parsing library in Python that was compatible with AppEngine. One of the requirements for implementing the feed sources was that it be very easy to update should a feed need to be changed. If we stuck with the Python library implementation and statically coded some feed sources, changes would take significantly more overhead to implement.

We started investigating Pipes as a platform to aggregate all of the content we wanted to re-publish and were impressed by what we found. It turns out that the Pipes interface is much better at providing an easy-to-update set of feed sources for non-programmers. If you haven’t looked at the Pipes interface I highly recommend you do so. I think Pipes offers so much advanced filtering and splicing functionality in a really tight package, so I’ll attempt to break down some of the modules we used specifically for the VendAsta website.

The image above is a shot of the Pipes interface, and more specifically the Pipe that we use to aggregate all of the feed-based content on the VendAsta website. I should clarify, by feed-based content I’m referring to all of the official employee, product, and corporate blogs. Yes you read that right, employee blog posts (when tagged appropriately) end up on our corporate website. But let’s set aside the trust/ethical issues related to that and focus on the technology we’re using. There are a few components to the root pipe we use to collect all this content, and the Pipes documentation isn’t always entirely clear on the best way to use things, so I will attempt to decompose it and discuss each module in detail.

Feed Auto-Discovery

We have a URL mounted on VendAsta.com that provides only 1 function: spitting out a list of link tags in the following format:

<link rel="alternate" type="application/rss+xml" title="Allan Wolinski" href="http://www.orthodrome.ca">

Internally, we store a list of all employees with some meta information like blogURL, linkedinURL etc. This url is what we provide to the auto-discovery module so it can do it’s magic and go out to all of the rss+xml links and gather a list of feeds from the blogs. Here’s a sample of the output you get from this URL in the Pipes debug console.

Once we have this list of blogs and the links to them, we can pass the output from the auto-discovery module on to the Loop + Fetch Site Feed modules.

Loop + Fetch Site Feed

The loop module is pretty self explanatory, you use it when you want to iterate over a group of results and operate on them somehow. It’s kind of similar to providing an AJAX onSuccess callback; you send out a request for the content (the iteration) and you define a ‘callback’ by dragging a supporting pipes function into the available box to process each item. In our case, we’re using the Fetch Site Feed module from the toolbox to snag a feed from each site in our blogroll. We configure the module to emit all results to our next module, the Filter, which does most of the magic in determining which content ends up on the corporate website.

Pipes Loop + Fetch Site Feed Modules

Filter

The filter module works really well for grabbing a set of elements based on a certain set of criteria; in our case we use the Regexp option to explicitly allow any content with the tag/category ‘VendAsta’.

The Filter Module

Union, Sort, Rename + Pipe Output

The final layer of our pipe involves combining results from some other static sources independent of filters (VendAsta blog and the StepRep blog), sorting them based on the date published descending, renaming a specific RSS feed field (item.dc:creator) to ‘author’ to fix a problem in the feed reading library we’re using server side (feedparser.py) and finally the output of all these operations: ‘Pipe Output’ module. All these modules are really easy to string together with the Pipes interface and are clickable at any point during the creation which will toggle the debug output to stop at the currently selected module. This is very useful for seeing what the modules you are using do at each step of the process.

Union Sort Rename and Pipe Output Modules

Pipes as abstract components ties it all together

So now that we have a completed Pipes module, we can do a number of things with it. We can get the output in RSS or JSON format, define a custom url for the pipe, and even give it a name and make it public so other people can use it in their mashups. However, the coolest feature IMHO is that you can use any created pipes module as an abstracted pipes component on it’s own which can be part of another pipes component that you can then perform further operations on. We chose to separate a few of the layers in our feed hierarchy this way for simplicity sake and to keep it loosely coupled if we wanted to make changes to what feeds appear on certain pages. Here’s an example of our previous Blogroll pipe being used as input for further filtering and display on 1 section of the VendAsta website: the Projects page.

The blogroll pipe as input

Brett and I recently began refactoring a significant amount of the JavaScript that is currently in MyFrontSteps and Homebook. One of the areas we identified as needing improvement was controlling when scripts get loaded in the page; it’s a challenging subject especially when utilizing Django templates which can extend and include bits of HTML that are both static and dynamic. We’re not finished the refactoring quite yet but I thought it would be valuable to blog about the lessons we’ve learned early on about how to manage JavaScript loading without having script tags all over the place.

Manual Dependency Management is Hard

Not too long ago, Yahoo put out a list of guidelines that web developers can use to enhance the performance of their websites. I won’t get into it in this post but there is a Firefox plugin called YSlow available that can automate some of the performance checking. The recommendation I’m going to focus on here is the one regarding moving scripts as close to the bottom of the page as possible. I’ll just quote the Yahoo doc as they explain it very clearly:

The problem caused by scripts is that they block parallel downloads. The HTTP/1.1 specification suggests that browsers download no more than two components in parallel per hostname. If you serve your images from multiple hostnames, you can get more than two downloads to occur in parallel. While a script is downloading, however, the browser won’t start any other downloads, even on different hostnames.

In some situations it’s not easy to move scripts to the bottom. If, for example, the script uses document.writeto insert part of the page’s content, it can’t be moved lower in the page. There might also be scoping issues. In many cases, there are ways to workaround these situations.

In the case of the code we’re using on MyFrontSteps we had all kinds of Django templates with includes and templatetags that include other little bits of dynamic JavaScript that was adding markup to the DOM and manipulating DOM content. Some of this code would appear prior to certain dependent scripts being loaded which created a real nightmare for us as we had to manually track down the position of the dependent scripts in the page that was fed to the browser after a variable number of layers of templates and includes. We decided to research a better way to manage all these scripts and since we had been using the YUI library for many other parts of the website we settled on the YUI loader.

YUI Loader Makes Dependency Management Easy

The YUI Loader Utility is a client-side JavaScript component that allows you to load specific YUI components and their dependencies into your page via script. YUI Loader can operate as a holistic solution by loading all of your necessary YUI components, or it can be used to add one or more components to a page on which some YUI content already exists.

The great thing about the YUI Loader is that you can use it to manage dependency sorting for all of the YUI components you use on a page but the killer feature imho is the fact that it allows you to create custom modules to load your own JS and CSS libraries. The basic pattern we’re using in our root level Django template is as follows:

  1. Load all CSS Files at the top of the template
  2. Place all YUI Loader and other statically included scripts at the bottom of the page prior to the closing BODY tag
  3. Define the YUI Loader instance
  4. Define custom modules for the YUI loader
  5. Provide a Django template block to allow manipulation of the onSuccess callback
  6. Call the YUI Loaders .insert() method to trigger insertion of all dependency sorted scripts into the DOM

The advantage to using the loader in conjunction with our root Django template is clear; all of our scripts are loaded at the bottom of the page and CSS is loaded at the top. When a user requests the pages they will start receiving the content from the server immediately and not have to wait for scripts to process as the loader dynamically inserts them into the HEAD element after the page content has been rendered. Another trick we’re using to control when scripts get executed is by manipulating how the onSuccess callback of the loader works. Here’s the code:

// Instantiate and configure YUI Loader:
MFSLoader = new YAHOO.util.YUILoader({
    base: "",
    require: ["base","reset-fonts-grids","MFS"],
    loadOptional: false,
    combine: true,
    filter: "MIN",
    allowRollup: true,
    onSuccess: function() {
        while( MFSLoader.onReady.length )  {
            MFSLoader.onReady.shift()();
        }
    }
});

// Define a list of executables that we
// can add to prior to page load completion
MFSLoader.onReady = [];

// Custom Modules for Loader
MFSLoader.addModule({
    name: 'MFS', type: 'js', varName: 'MFS',
    path: '{% vurl "/static/script/MFS.js" %}',
    requires: ['jQuery', 'jQueryUI']
});

{# Override this block if you need script at the global level. #}
{% block global.script %}{% endblock %}
// Trigger insertion of all dependency sorted scripts into the DOM
MFSLoader.insert();

As you can see we have a few of the custom modules defined here, which include a ‘requires’ attribute in the configuration object. This lets the YUI Loader know that these files require the other modules to be loaded. The Loader then determines sort order for inclusion based on all required modules and goes to work inserting the scripts for you.

Tying it all Together

The key to making this work in the varying levels of Django templates is to manipulate the onSuccess callback. When the YUI Loader finishes loading all the dependency sorted scripts it will execute this callback and allow you to then execute any additional code that depends on the dynamically inserted scripts. We have a number of namespaced JS objects for MyFrontSteps (MFS.js, MFS.DataGrid.js etc..) and when we need to call functions defined in these objects we do so by extending the global.script block in our child template and pushing a new anonymous function onto the MFSLoader.onReady list we defined in the root template. Here’s a simple example:

{# A CHILD TEMPLATE #}
{% block global.script %}
    #include the parent templates global.script contents
    {{ block.super }} 

    // Load our required features
    MFSLoader.require( 'MFS.DataGrid', 'uploader', 'MFS.Uploader' );
    // Push an anonymous function onto the list
    // to get executed when all required scripts have been loaded
    MFSLoader.onReady.push( function()  {
        MFS.DataGrid.initPhotosGrid();
    });
{% endblock %}

Once the page is rendered and the YUI Loader has finished parsing the loaded scripts the code we have in the onSuccess callback pops all the anonymous functions off the list which causes them to be executed.

{# THE ONSUCCESS CALLBACK #}
onSuccess: function() {
    while( MFSLoader.onReady.length )  {
        MFSLoader.onReady.shift()();
    }
}

We still have a number of files to refactor, but the performance benefits we’ve seen so far using this approach have really made us confident that this is the way to go. Using a dependency manager that works for custom JS / CSS libraries is just so much more liberating than having to manually keep track of where scripts are getting executed. Because we’re also migrating all of our code to external library files it makes things that much easier to debug in Firebug as well. 🙂