
Liminal Existence

Clouds in Iceland

Monday, April 26, 2010

Identity

The web is facing a serious identity crisis. Many have written about it but, having thought a lot about this problem over the past few years, I can't help but feel that something important has been missed in most discussions.

Aza Raskin cuts to the heart of the matter:

“Your identity is too important to be owned by any one company.
Your friends are too important to be owned by any one company.”

I'll go one step further, and say that the centralisation of identity is stifling innovation on the web. Kellan recounts a quote from a friend, on the subject of Facebook's f8 announcements: “Well, [Facebook] gave Foursquare a 6 month reprieve.” This is not a hopeful view.

So what's the alternative? What's missing from the conversation? I think it's important to take a view from the perspective of usability; how are we going to use this new conception of identity? The answer is simple: exactly as we do today. The fundamentals of social networking haven't changed from day one. There's a website, you log on, and you add friends so that you can share content with them. This basic model applies to every successful social internet technology, from email to IM to Friendster to Facebook to Foursquare.

Logging On

Logging on is easy. Whether it's a password, OpenID, @anywhere, Firefox Contacts, or Facebook Connect, the principle remains the same: the user proves to their server that they are a particular individual. In that sense, any protocols or approaches beyond username and password are just icing on the cake. They're ways to make logging in easier, to increase conversion rates (at least, in theory), but they don't fundamentally change what we can do on the web.

So once we've logged on, how do we add friends?

Adding Friends

This is the part that's missing from the conversation. How does it work today? Well, you have two options: either you find someone's profile page and click "add" (which doesn't work across sites, or when you can't find your friend's profile page), or you find other users on the site by entering their email address. Often the latter approach is achieved by the site opening up your email address book and looking for friends in bulk, but fundamentally it boils down to "find your friend's email address, search the users database for that address, add friend."
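
In code, today's email-based flow boils down to something like this (a hypothetical sketch; findUserByEmail and addFriendship are made-up database helpers):

// Hypothetical sketch of the email-based friend lookup described above.
function addFriendByEmail(currentUser, friendEmail, db) {
  var friend = db.findUserByEmail(friendEmail);   // made-up helper
  if (!friend) {
    return null;   // friend has no account on this site: dead end
  }
  db.addFriendship(currentUser.id, friend.id);    // made-up helper
  return friend;
}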

OpenID doesn't help, because I don't even know my OpenID URL, let alone my friends' OpenID URLs. The "ID" in OpenID is a bit of a misnomer. OAuth doesn't help – Twitter and Facebook use OAuth under the covers for @anywhere and Facebook Connect, but that only helps me, the site using @anywhere/FB Connect, and Twitter and/or Facebook themselves. It also doesn't help us get away from Aza's point, that your identity remains in the hands of a single company.

The solution, as I said to Tim O'Reilly, is Webfinger. Indeed, webfinger was born out of struggling with exactly this problem of representing multi-faceted identity on the web in a way that can't be controlled by any one company. The approach is essentially to invert the currently-closed user databases, and put social network affiliation in the hands of the users, in our hands, all while keeping the user experience the same as with current and past social software.

I won't go into the technical details here, but in essence the workflow from a site builder's perspective looks like this:

  1. Kate (kate@inktopaper.com) meets Fiona (fiona_z_342@gmail.com) at APE, and the two exchange email addresses (which double as their webfinger addresses).
  2. Kate wants to stay in touch with Fiona, so she logs on to her social network of choice ("ZineSpace"), and enters Fiona's address.
  3. ZineSpace supports photos, microblogging, and calendaring, and discovers via a webfinger lookup that Fiona has a photostream at Sketchr, tweets at identi.ca, but doesn't share a calendar. ZineSpace uses PubSubHubbub to send subscribe requests to Sketchr and identi.ca on behalf of kate@inktopaper.com.
  4. Sketchr and identi.ca look up Kate's webfinger profile, and use her published photo to show the incoming friend/follow request to Fiona.
  5. Fiona uses her identi.ca microblog for work, so she declines that invitation, but approves the friend request on Sketchr, and adds Kate as a friend (asymmetric follow) on Sketchr.
  6. Even though she declined the identi.ca request, Fiona wants to keep up with Kate's life, and has a personal Tweetter account that's not published on her public webfinger profile. She logs in there and adds kate@inktopaper.com as a friend.
  7. Tweetter looks up Kate's webfinger profile, discovers her microblog at ZineSpace, and sends a subscribe request there on behalf of fiona_z_342@gmail.com. ZineSpace already knows that Kate wants to share with Fiona, so it auto-approves the request and sends a reverse-follow request (which Tweetter, in turn, automatically approves).
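
To make the workflow concrete, here's a rough sketch of steps 2 and 3 from ZineSpace's side. It assumes a webfinger lookup that returns a JSON document of rel-typed links; fetchJSON, pushSubscribe, the endpoint shape, and the rel URIs are all made up for illustration.

// Rough sketch of steps 2-3: discover Fiona's services, then subscribe.
function connectTo(friendAddress, subscriberAddress) {
  var domain = friendAddress.split('@')[1];
  var profile = fetchJSON('https://' + domain +
      '/.well-known/webfinger?resource=acct:' + friendAddress);   // hypothetical endpoint

  profile.links.forEach(function (link) {
    if (link.rel === 'http://rel.example.org/photostream' ||
        link.rel === 'http://rel.example.org/microblog') {
      // Ask the service hosting this facet of Fiona's identity for a
      // subscription, on behalf of the subscriber's webfinger address.
      pushSubscribe(link.href, subscriberAddress);
    }
    // No calendar link is published, so calendaring is simply skipped.
  });
}

connectTo('fiona_z_342@gmail.com', 'kate@inktopaper.com');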

Now, five things are important to keep in mind when thinking about this process:

  1. Shareable Addresses are what make this exchange possible. Kate and Fiona can't be reasonably expected to remember all of their various profile URLs, nor can they be expected to remember each other's profile URLs. Their webfinger addresses act as mnemonics for their distributed identities.
  2. Webfinger is just a discovery mechanism. HTTP remains the transport mechanism for this approach, which means that everyone can participate.
  3. We have not exposed any personal information about Kate or Fiona, and more importantly, we haven't exposed any information about their relationships. They can expose that if they wish (e.g., by linking to a FOAF or XFN profile from their webfinger profiles), but the approach doesn't make any assumptions about what information must or must not be shared.
  4. Subscription = Relationship. The underlying approach doesn't say what kind of relationship the two are creating, but rather allows protocols or data transports on top of the exchange to do so. Fiona could decide that she doesn't like Kate's sketches, or that Kate posts too much, and simply tell Sketchr to hide Kate's posts. As far as Kate's concerned, Fiona is still subscribed, and still viewing her photos. Alternatively, Fiona could send an unsubscribe request to Kate, signalling that the relationship no longer exists. The semantics are up to the application developer at either end.
  5. No Passwords are involved in the exchange. Kate and Fiona don't need to exchange PGP keys, either, or rely on some complicated "Web of Trust." All they need to do is trust the servers they use (ZineSpace, Sketchr, identi.ca, and their email providers); if they can't do that, they have bigger problems. This one's really important: RSS and Atom were meant to be the future of content exchange, but they have completely failed to enable private feeds, because private feeds require passwords. That sucks, and webfinger offers a workable solution to this persistent problem.

Sharing Content

Once relationships have been established, sharing content is entirely up to the individuals and sites involved. Obviously, we need common protocols for this, whether that's Atom and PubSubHubbub, a domain-specific protocol (e.g., Portable Contacts or ActivityStreams), or something we figure out as we go along. Direct or private messages are just a special form of generally-distributed content (i.e., the subscriber has asked to receive content from the publisher, whether that content was broadcast or directed).
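
As one concrete possibility, a PubSubHubbub subscription is just a form-encoded POST to the publisher's hub. A minimal sketch (the topic, hub, and callback URLs are made up):

// Minimal sketch of a PubSubHubbub subscription request.
var querystring = require('querystring');

var body = querystring.stringify({
  'hub.mode': 'subscribe',
  'hub.topic': 'http://zinespace.example/kate/microblog.atom',   // feed to follow
  'hub.callback': 'http://tweetter.example/push/fiona-callback', // where entries get delivered
  'hub.verify': 'async'
});

// POST `body` to the hub advertised by the topic feed, with
// Content-Type: application/x-www-form-urlencoded. The hub verifies the
// subscription by calling back hub.callback, then pushes new entries to it.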

There are a lot of things that I haven't covered in this post. The technology is simple, but not trivial, and it is still very new. There aren't yet tools that make this easy (if you'd like to work on building them, contact me!).

Saturday, January 30, 2010

Hot Code Loading in Node.js

Reading through Fever today, I came across this post by Jack Moffitt. In it, he discusses a hack to allow a running Python process to dynamically reload code. While the hack itself, shall we say, lacks subtlety, Jack's post got me thinking. It's true: Erlang's hot code loading is a great feature, and it underpins Erlang's 99.9999999% uptime claims. It occurred to me that it wouldn't be terribly difficult to implement something similar for node.js's CommonJS-based module loader.

A few hours (and a tasty home-made paella) later, here's my answer: the Hotload node branch.

Umm… What does it do?

var http = require('http');

var requestHandler = require('./myRequestHandler');

// Whenever the handler file changes on disk, drop the cached module and
// re-require it. unCacheModule() is provided by the hotload branch.
process.watchFile('./myRequestHandler', function () {
  module.unCacheModule('./myRequestHandler');
  requestHandler = require('./myRequestHandler');
});

// Look requestHandler up on every request, so new requests get the new code
// while in-flight requests keep using the version they started with.
var reqHandlerClosure = function (req, res) {
  requestHandler.handle(req, res);
};

http.createServer(reqHandlerClosure).listen(8000);

Now, any time you modify myRequestHandler.js, the above code will notice and replace the local requestHandler with the new code. Any existing requests will continue to use the old code, while any new incoming requests will use the new code. All without shutting down the server, bouncing any requests, prematurely killing any requests, or even relying on an intelligent load balancer.

Awesome! How does it work?

Basically, all node modules are created as sandboxes: as long as you don't use global variables, you can be sure that the modules you write won't stomp on others' code and, vice versa, that others' modules won't stomp on yours.
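
As a tiny, made-up illustration of that sandboxing:

// counter.js: `count` is local to this module; nothing outside can touch it.
var count = 0;
exports.increment = function () {
  return ++count;
};

// app.js
var counter = require('./counter');
counter.increment();   // returns 1
// There is no global `count`, so other modules can neither read it nor
// accidentally overwrite it.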

Modules are loaded by require()ing them and assigning the return value to a local variable, like so:

var http = require('http');

The important insight is that the return value of require() is a self-contained closure. There's no reason it has to be the same each time. Essentially, require(file) says "read file, seal it in a protective case, and return that protective case." require() is smart, though, and caches modules so that multiple attempts to require() the same module don't waste time (synchronously) reading from disk. Those caches don't get invalidated, though, and even though we can detect when files change, we can't just call require() again, since the cached version takes precedence.
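
Here's a deliberately simplified picture of why that cache defeats naive reloading (an illustration only, not node's actual implementation; loadFromDisk stands in for reading, compiling, and running the file):

// Simplified sketch of a require() cache.
var moduleCache = {};

function cachedRequire(path) {
  if (moduleCache[path]) {
    return moduleCache[path];        // cache hit: the previously loaded exports win
  }
  var exports = loadFromDisk(path);  // hypothetical: read, compile, and run the file
  moduleCache[path] = exports;
  return exports;
}

// Editing the file on disk changes nothing here until the cache entry for
// `path` is removed and cachedRequire() is called again.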

There are a few ways to fix this, but the subtleties rapidly complicate matters. If the ultimate goal is to allow an already-executing module (e.g., an http request handler) to continue executing while new code is loaded, then automatic code reloading is out, since changing one module will change them all. In the approach I've taken here, I tried to achieve two goals:

  1. Make minimal changes to the existing node.js require() logic.
  2. Ensure that any require() calls within an already-loaded module will return functions corresponding to the pre-hot load version of the code.

The latter goal is important because a module expects a specific set of behaviour from the modules on which it depends. Hot loading only works so long as modules have a consistent view of the world.

To accomplish these goals, all I've done is move the module cache from a single global cache into each module itself. Reloading is minimised by copying the parent's cache into child modules (made fast and efficient thanks to V8's approach to variable handling). Any module can load a new version of any loaded module by first removing that module from its local cache. This doesn't affect any other modules (including dependent modules), but will ensure that any sub-modules are reloaded, as long as they're not in the parent's cache.
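
A rough sketch of the idea (an illustration of the approach, not the actual patch; compileAndRun stands in for node's module loading):

// Each module carries its own cache, seeded with a copy of its parent's.
function Module(parent) {
  this.moduleCache = {};
  if (parent) {
    for (var key in parent.moduleCache) {
      this.moduleCache[key] = parent.moduleCache[key];
    }
  }
}

Module.prototype.require = function (path) {
  if (this.moduleCache[path]) {
    return this.moduleCache[path];             // this module's view of `path`
  }
  var child = new Module(this);
  var exports = compileAndRun(path, child);    // hypothetical loader
  this.moduleCache[path] = exports;
  return exports;
};

Module.prototype.unCacheModule = function (path) {
  // Forget only this module's copy; other modules keep the version they have.
  delete this.moduleCache[path];
};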

By taking a relatively conservative approach to module reloading, I believe this is a flexible and powerful approach to hot code reloading. Most server applications have a strongly hierarchical code structure; as long as code reloading is done at the top-level, before many modules have been required, it can be done simply and efficiently.

While I hope this patch or a modified one will make it into node.js, this approach can be adapted to exist outside of node's core, at the expense of maintaining two require() implementations.
