Liminal Existence

Clouds in Iceland

Saturday, January 30, 2010

Hot Code Loading in Node.js

Reading through Fever today, this post by Jack Moffitt caught my eye. In it, he discusses a hack to allow a running Python process to dynamically reload code. While the hack itself, shall we say, lacks subtlety, Jack's post got me thinking. It's true, Erlang's hot code loading is a great feature, enabling Erlang's 99.9999999% uptime claims. It occurred to me that it wouldn't be terribly difficult to implement for node.js' CommonJS-based module loader.

A few hours (and a tasty home-made paella) later, here's my answer: Hotload node branch.

Umm… What does it do?

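var http = require('http');
var requestHandler = require('./myRequestHandler');

// Watch the module's source file; on change, drop the cached copy and re-require it.
// (module.unCacheModule comes from the hotload branch, not stock node.)
process.watchFile('./myRequestHandler.js', function () {
  module.unCacheModule('./myRequestHandler');
  requestHandler = require('./myRequestHandler');
});

// Look up requestHandler on every request, so new requests see the reloaded code.
var reqHandlerClosure = function (req, res) {
  requestHandler.handle(req, res);
};

http.createServer(reqHandlerClosure).listen(8000);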

Now, any time you modify myRequestHandler.js, the above code will notice and replace the local requestHandler with the new code. Any existing requests will continue to use the old code, while any new incoming requests will use the new code. All without shutting down the server, bouncing any requests, prematurely killing any requests, or even relying on an intelligent load balancer.
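To see why the reqHandlerClosure wrapper matters, here's a minimal sketch. The two handler objects are hypothetical stand-ins for the old and new versions of myRequestHandler.js, and the reassignment stands in for the reload. A function that looks up requestHandler at call time picks up the new code, while a reference captured earlier keeps pointing at the old code.

```javascript
// Hypothetical stand-ins for two versions of myRequestHandler.js.
const handlerV1 = { handle: (req) => 'v1:' + req };
const handlerV2 = { handle: (req) => 'v2:' + req };

let requestHandler = handlerV1;

// Looks up requestHandler on every call, so a later reassignment is seen.
const viaClosure = (req) => requestHandler.handle(req);

// Captures the v1 function once; the reassignment never reaches it.
const direct = requestHandler.handle;

requestHandler = handlerV2; // simulate the hot reload

console.log(viaClosure('a')); // v2:a
console.log(direct('a'));     // v1:a
```

Passing requestHandler.handle straight to http.createServer would behave like `direct` here, which is presumably why the original code wraps it.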

Awesome! How does it work?

Basically, all node modules are created as sandboxes, so that as long as you don't use global variables, you can be sure that any modules you write won't stomp on others' code, and vice versa, you can be sure that others' modules won't stomp on your code.

Modules are loaded by require()ing them and assigning the return to a local variable, like so:

var http = require('http');

The important insight is that the return value of require() is a self-contained closure. There's no reason it has to be the same each time. Essentially, require(file) says "read file, seal it in a protective case, and return that protective case." require() is smart, though, and caches modules so that multiple attempts to require() the same module don't waste time (synchronously) reading from disk. Those caches don't get invalidated, though, and even though we can detect when files change, we can't just call require() again, since the cached version takes precedence.
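The caching problem is easy to reproduce with a toy loader. This is only a sketch, not node's actual implementation: `moduleSources` stands in for files on disk, and `toyRequire` and `globalCache` are hypothetical names. After the "file" changes, a second call still returns the first, cached result.

```javascript
// Toy model of a caching loader; not node's real require().
// moduleSources stands in for files on disk (hypothetical).
const moduleSources = {
  './greeter': "exports.greet = function () { return 'hello'; };",
};
const globalCache = {};

function toyRequire(id) {
  if (globalCache[id]) return globalCache[id]; // the cached copy always wins
  const exports = {};
  // Evaluate the source inside its own function scope: the "protective case".
  new Function('exports', moduleSources[id])(exports);
  globalCache[id] = exports;
  return exports;
}

const before = toyRequire('./greeter');
// "Edit the file" after it has been loaded once.
moduleSources['./greeter'] = "exports.greet = function () { return 'hi'; };";
const after = toyRequire('./greeter');

console.log(after.greet());    // hello
console.log(before === after); // true
```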

There are a few ways to fix this, but the subtleties rapidly complicate matters. If the ultimate goal is to allow an already-executing module (e.g., an http request handler) to continue executing while new code is loaded, then automatic code reloading is out, since changing one module will change them all. In the approach I've taken here, I tried to achieve two goals:

  1. Make minimal changes to the existing node.js require() logic.
  2. Ensure that any require() calls within an already-loaded module will return functions corresponding to the pre-hot load version of the code.

The latter goal is important because a module expects a specific set of behaviour from the modules on which it depends. Hot loading only works so long as modules have a consistent view of the world.

To accomplish these goals, all I've done is move the module cache from a global one into the module itself. Reloading is minimised by copying the parent's cache into each child module (made fast and efficient thanks to V8's approach to variable handling). Any module can load a new version of any loaded module by first removing that module from its local cache. This doesn't affect any other modules (including dependent modules), but will ensure that any sub-modules are reloaded, as long as they're not in the parent's cache.
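A rough model of the per-module cache idea, under stated assumptions: `makeModule` and `files` are hypothetical, and only the `unCacheModule` name mirrors the call used earlier in this post. Each module gets a copy of its parent's cache, so removing an entry from one cache leaves every other module's view untouched.

```javascript
// Toy model of per-module caches; not the actual patch.
// files stands in for module source on disk (hypothetical).
const files = { './handler': 'exports.version = 1;' };

function makeModule(parentCache) {
  // A child starts with a shallow copy of its parent's cache.
  const cache = Object.assign({}, parentCache);
  return {
    cache,
    require(id) {
      if (!cache[id]) {
        const child = makeModule(cache);
        const exports = {};
        new Function('exports', 'require', files[id])(exports, child.require);
        cache[id] = exports;
      }
      return cache[id];
    },
    unCacheModule(id) { delete cache[id]; },
  };
}

const top = makeModule({});
top.require('./handler');                  // loads version 1
const longRunning = makeModule(top.cache); // e.g. a module created before the edit

files['./handler'] = 'exports.version = 2;'; // "edit the file"
top.unCacheModule('./handler');

console.log(top.require('./handler').version);         // 2
console.log(longRunning.require('./handler').version); // 1
```

The module that asked for the reload sees the new code, while the one that copied the cache earlier keeps its consistent, pre-reload view.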

By taking a relatively conservative approach to module reloading, I believe this is a flexible and powerful approach to hot code reloading. Most server applications have a strongly hierarchical code structure; as long as code reloading is done at the top-level, before many modules have been required, it can be done simply and efficiently.

While I hope this patch or a modified one will make it into node.js, this approach can be adapted to exist outside of node's core, at the expense of maintaining two require() implementations.

Tuesday, May 05, 2009

Easy Android

Let me start by saying that I'm very impressed with Android, and with the ease with which I was able to scratch an itch. The fact that I'm not locked into Apple's app store world is nice; I don't know what the specific terms are for Google's marketplace (I haven't signed up yet), but frankly I trust them more than I do Apple.

Full Disclosure: at Social Web Foo, Joshua very kindly gave me (and about 50 others) a free Android Dev Phone. He gave strict instructions to actually write something for it. I probably wouldn't have bought a dev phone otherwise (nor written <shudder> the Java against the emulator), so the fact that I've now written something means it worked out.

I hadn't written a line of Java before. Shock, horror. It's just as annoying as everyone always said, but developing for Android with Eclipse is pretty straightforward. The documentation is good. I started with the basics, and pretty quickly moved to example code, most notably the NotePad application that comes bundled with the Android SDK.

Once Eclipse and the SDK were downloaded and installed, it only took about five minutes to get my first app up and running. Getting a list view of hard-coded data took another ten minutes, and modifying the code to display a different view when one of the items was clicked took about two hours (keep in mind that I was learning Java with the patience of a Ruby programmer here).

The Bad

Honestly, there's not a lot that I was upset by. Coming from Rails, defining table structures and setters and getters and all the explicit typing is pretty annoying. Cutting and pasting code meant that I had a few mismatched data structure definitions, and the error messages were less than useful, since all that fancy type matching means that when something doesn't match up in your XML configuration file, Java can't tell you where to go to fix it. Figuring out where I had mismatched strings in XML config files easily took another couple of hours, which sucked, but it seems like the kind of thing that you'd get used to. Functional brain damage, I suppose.

The Good

Android is a developer's platform. The way that content, and well, everything is addressed is fantastic. There are hooks for everything, and the tutorials encourage you to do the right thing out of the box. The documentation really only makes sense once you get it, but for the really simple app I've been working on, the examples were more than sufficient to get things going.

Basically, every data source is addressable through a "content://" URI scheme. What that means is that any application can provide hooks (if they know about your data source) to view, edit, or list bits of data. I expose a set of short recipes (from @cookbook) at content://org.romeda.provider.Cookbook/recipes. That means that as long as, say, Tweetie knows that the provider exists, they could add a hook to allow people to add any tweet as a recipe. It also provides hooks for other applications to know when new recipes are published (or, for example, I could tie into a Twitter provider on the phone and piggyback discovery of new updates to that, rather than running my own polling process), in addition to hooks to view, edit, or any action you can imagine.

The best thing about this is that the whole system works the same way. You register an "Intent", and the OS lets you know when something in the system is relevant. The simplest (but seriously awesome) example of this is if you want to intercept an outgoing call and rewrite the number (or just make the call over a voip stack, for example). You just register your intent to handle ACTION_NEW_OUTGOING_CALL, and away you go. A simple data passing interface lets you receive and attach data to the messages.

The other thing that's great about Android that I noticed right away is that the default views are extremely simple to use and customize, and they save their own state. Without writing any special code to remember where a user is in the scroll buffer, and without doing any work to remember which view a user was in (e.g., list or item view, edit, etc), the default behaviour is to remember. It's the embodiment of everything Google's been doing on the web lately — don't save, ever, because saving is stupid. Either you've published/archived/sent/deleted something, or it's in a draft form. The draft is implicitly persistent, and avoids the user ever losing work. This is in stark contrast to the iPhone, where Safari's horrible constant reloading of pages boggles the mind, and burns through roaming data minutes like there's no tomorrow.

The Code

So it's not fancy, and it doesn't even fetch the recipes yet, but I'm posting the code here since it's pretty damned simple at this point, and demonstrates making an app with two views. I'll update it as time allows.

Simple Addressing for the Web, Part 1

Addressing is important. It's something that many people have tried to solve.

I'm interested in addressing because it's an important piece of web-scale messaging, and of the federated social networks that are an emergent property of verified cross-site communication. In order to communicate with someone, you need to be able to route your communications to them.

The URL is the thing. Except when it's not.

The URL was supposed to become the way that we negotiated identity. We were supposed to have a "home page," a place on the internet to call our own. It didn't quite work out that way, and at the same time as Geocities is shutting down, we're finally facing the need for a strong conception of identity on the web.

It goes without saying these days that everything we do, everything we interact with, has an associated URL. I can give you my blog URL so that you can read my posts, or my calendar URL so that you can invite me to events. However, for the vast majority of users, URLs aren't a viable option. Fundamentally, it's a lack of consistency (or, put another way, unbridled diversity) that makes URLs unusable as identity markers. Take the following URLs as a proof-by-example:

  • twitter.com/blaine
  • myspace.com/romeda
  • flickr.com/lattice
  • search.twitter.com/search?q=%22Swine+Flu%22+OR+Flu
  • home.myspace.com/index.cfm?fuseaction=user
  • blogger.com/post-create.g?blogID=6135683561277543562
  • amazon.com/gp/pdp/profile/A1GUHSGP27QA4W

All of the above are URLs which I see while interacting with sites on the web. Unlike postal addressing, phone numbers, or email, there's no consistency. The path part of the URL may as well be line noise in the latter four examples. By association, the pattern used by Flickr, MySpace, and Twitter is a fluke. Beyond that, my username doesn't match across the three social networking sites, and as such it's nearly impossible for a friend, relative, or co-worker to guess what my URL is, even given a domain.

I don't see a way to fix URLs across the web so that we can encourage people to use them as identifiers. OpenID has tried, and the results are nothing short of abysmal.

Back to the Future

If not URLs, what should our new web addressable identities look like? The simplest answer is "like an email address." They're universally recognizable. Billions of people have email addresses and know how to use them. All the major IM providers have moved towards email-like addresses as identifiers (gone are the integers of ICQ). Most importantly, email addresses are easy to construct and resolve.

The net result of this line of thought is that instead of @blaine for my Twitter address, I'd be blaine@twitter.com, and on Identi.ca I'd be blaine@identi.ca. I could share my Myspace identity as romeda@myspace.com, and on Facebook I could be blaine.cook@facebook.com.

The problem is that those addresses conflict with an already-existing namespace, specifically email. Which isn't surprising, but it is problematic. Can you send me an email at blaine@twitter.com or romeda@myspace.com? What happens when you do? Unfortunately, there aren't clear answers for those questions, and while some social networks might choose to make "Social Network Addresses" work as email addresses, it would be an uphill battle to convince all providers to do so.

Use What's Already There

I've been thinking about this problem a lot lately, and while the approach of re-using email semantics to provide human-readable web addresses/identities is very attractive, the proliferation of addresses (one for each network) and namespace collisions are less than ideal. After having extensive conversations with Alexis Richardson and Tony Garnock-Jones, the general approach for discovery became clear to me, but I didn't have a more generally applicable form for the addresses themselves.

Eventually, while I was talking over the problem with John Panzer and Breno de Medeiros at Social Web Foo, the solution appeared, blazing as bright as the California sun: Google Profiles means that Google is now providing links to all my social network profiles. They're also my email provider.

My email address is romeda@gmail.com. If you transform that to http://www.google.com/profiles/romeda, you get my profile data, and away we go. Every email provider these days has a website, and Eran's LRDD, new on the scene, provides a discovery mechanism that everyone (i.e., every mail provider, even if they're only hosting static content) can implement in just a few minutes.
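The transformation itself is tiny. Here's a sketch, assuming a hard-coded table of per-provider URL templates: only the Gmail mapping comes from the example above, and `profileTemplates` and `profileUrlFor` are hypothetical names. A real client would discover the template from the provider (for instance via LRDD) rather than hard-coding it.

```javascript
// Hypothetical per-provider templates. Only the Gmail entry comes from the
// text; a real client would discover these instead of hard-coding them.
const profileTemplates = new Map([
  ['gmail.com', 'http://www.google.com/profiles/{local}'],
]);

function profileUrlFor(address) {
  const at = address.lastIndexOf('@');
  if (at < 1) return null; // not an email-like address
  const local = address.slice(0, at);
  const domain = address.slice(at + 1).toLowerCase();
  const template = profileTemplates.get(domain);
  if (!template) return null; // provider unknown to this sketch
  return template.replace('{local}', encodeURIComponent(local));
}

console.log(profileUrlFor('romeda@gmail.com'));
// http://www.google.com/profiles/romeda
```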

This is an important, exciting transformation. Now, with one identifier, I can share all the social bits of myself to anyone I please.

Where are my photos? romeda@gmail.com.

Where's my calendar? romeda@gmail.com.

What's my phone number? Look it up with romeda@gmail.com, and I'll give you permission to see it and store it in your address book.

They're all the same.

If I want to share a different set of social interactions, say, my work identity, I can give my Osmosoft or BT addresses, blaine@osmosoft.com and blaine.cook@bt.com, respectively. Now just photos of conferences come up, and the calendar that you'll find is my work calendar, not my social calendar.

Talking about this problem with others unearthed a post from last year by Brad Fitzpatrick, as well as EAUT. Both were aimed at solving the OpenID problem, and both take the same approach as the one that I outline here. EAUT seems to have been lost in the swamps of XRDS-Simple, and Brad's post was probably too early to the races, in true Brad style (if you want to know what's coming to the internet in five years, just read his blog posts).

With a swift and general agreement-in-principle, there's been some very positive movement towards promoting this concept as a way to bring the power of strong identity that email provides to the web. John has an excellent post on the subject, and it seems like a name for the project has emerged: WebFinger.

Part Two (coming tomorrow) goes in depth about how this all works on the tech side. Bits on the wire, as Tim Bray says.
