Liminal Existence

Sunday, April 22, 2007

Slides for Scaling Twitter talk, XTech next up.

The slides from the talk are available on SlideShare. They don't spell out the talk, so if you have questions, please do ask.

I'm really excited about XTech, coming up in a few weeks (May 15th-18th) in Paris. I'll be speaking with Kellan about using Jabber to build Social Software for Robots. There will be lots of awesome people and talks, so it should be a lot of fun.

Friday, April 13, 2007

Scaling Twitter, The Talk.

Simon Willison linked to an interview with Alex Payne, one of my co-workers on Twitter. This caused a bit of a stir, so apparently there's some interest in our experience scaling Twitter and Rails.

We've been extremely happy with Rails, and make use of the multitude of helpers that it offers us. As with any application on any stack, though, providing fast response times to a rapidly growing number of users is a challenge. The solutions are often tightly coupled to the application and its characteristics, and while scaling the most trafficked Rails site in the world, we've run into situations where existing solutions weren't enough.

This process has led us to build a number of tools that help us deal with our load, and just as soon as we find some spare time, we'll be releasing many of them. In the meantime, you can find out first what sorts of challenges we've encountered and solutions we've come up with at my talk at the SDForum Silicon Valley Ruby Conference next weekend (April 21-22nd).

I'll be focusing on ActiveRecord and database optimization, caching, and of course, Messaging. I'll also touch on some areas where we haven't had great successes (yet), and hopefully someone from the audience will shout out that there's some totally obvious and awesome thing that we haven't thought of, and it'll save us weeks of work (no, I'm serious. Does someone want to take bets?).

Sunday, April 01, 2007

MapReduce in 36 lines of Ruby

This has been burning a hole in my head since August, after Joel's post made it blindingly obvious that Ruby is the perfect language for distributed programming. I have some code that properly implements partitioning, etc, but never got around to finishing it sufficiently for a proper release. Here's the core idea; if anyone wants the partitioning code, ping me at romeda@gmail.com.

mapreduce_enumerable.rb:

require 'rubygems'
require 'ringy_dingy'
require 'ruby2ruby'

module Enumerable
  def dmap(&block)
    self.each_with_index do |element,idx|
      ring_server.write([:dmap, Process.pid, block.to_ruby, element, idx])
    end

    results = []
    # Count replies instead of checking results.size, which jumps ahead when a high idx arrives first.
    self.size.times do
      result, idx = ring_server.take([:dmap, Process.pid, nil, nil]).last(2)
      results[idx] = result
    end

    results
  end

  def ring_server
    return @ring_server if @ring_server

    ringy_dingy = RingyDingy.new nil
    @ring_server = ringy_dingy.ring_server
  end
end

mapreduce_runner.rb:

require 'rubygems'
require 'ruby2ruby'
require 'ringy_dingy'

ringy_dingy = RingyDingy.new nil
ring_server = ringy_dingy.ring_server

loop do
  pid, block, element, idx = ring_server.take([:dmap, nil, nil, nil, nil]).last(4)
  begin
    result = eval(block).call(element)
  rescue Object => err
    result = err
  end
  puts "Got #{result} from #{element} for #{pid}."
  ring_server.write([:dmap, pid, result, idx])
end

From the shell:

$ sudo gem install RingyDingy
$ sudo gem install ruby2ruby
$ ring_server &
$ ruby mapreduce_runner.rb &
$ ruby mapreduce_runner.rb &

From irb:

> require './mapreduce_enumerable'
> (1..100).to_a.dmap { |v| v * 2 }
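
If you'd like to poke at the tuple protocol without installing anything, here's a rough single-process sketch. It is not the code above: it assumes a local Rinda::TupleSpace (from Ruby's standard library) in place of the ring server, worker threads in place of separate runner processes, and a block passed as a source string in place of block.to_ruby. The names start_worker and local_dmap are made up for illustration.

```ruby
require 'rinda/tuplespace'

# Hypothetical worker, mirroring the loop in mapreduce_runner.rb:
# take a 5-element job tuple, eval the block source, write a 4-element reply.
def start_worker(space)
  Thread.new do
    loop do
      owner, source, element, idx = space.take([:dmap, nil, nil, nil, nil]).last(4)
      result = begin
        eval(source).call(element)
      rescue StandardError => err
        err # hand the error back as the result, as the runner does
      end
      space.write([:dmap, owner, result, idx])
    end
  end
end

# Hypothetical stand-in for Enumerable#dmap. The block travels as a source
# string (an assumption standing in for ruby2ruby's serialization).
def local_dmap(space, items, source)
  owner = Process.pid
  items.each_with_index do |element, idx|
    space.write([:dmap, owner, source, element, idx])
  end

  results = []
  # One take per job; replies may arrive in any order, so slot them by idx.
  items.size.times do
    result, idx = space.take([:dmap, owner, nil, nil]).last(2)
    results[idx] = result
  end
  results
end

space = Rinda::TupleSpace.new
2.times { start_worker(space) }
p local_dmap(space, (1..5).to_a, "proc { |v| v * 2 }")
```

Jobs and replies share the :dmap tag but differ in length (five elements versus four), and Rinda only matches a template against tuples of the same size, so workers never pick up replies and callers never pick up jobs.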
