MapReduce in 36 lines of Ruby
This has been burning a hole in my head since August, after Joel's post made it blindingly obvious that Ruby is the perfect language for distributed programming. I have some code that properly implements partitioning, etc., but I never got around to finishing it sufficiently for a proper release. Here's the core idea; if anyone wants the partitioning code, ping me at romeda@gmail.com.
mapreduce_enumerable.rb:
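require 'rubygems'
require 'ringy_dingy'
require 'ruby2ruby'

module Enumerable
  # Distributed map: hand each element to whichever runner takes it first.
  def dmap(&block)
    # One job tuple per element: [:dmap, pid, block source, element, index].
    self.each_with_index do |element, idx|
      ring_server.write([:dmap, Process.pid, block.to_ruby, element, idx])
    end

    # Take exactly one result per element. Results can arrive out of order,
    # so count the takes instead of testing results.size, which jumps to its
    # final value as soon as the highest index is filled in.
    results = []
    self.size.times do
      result, idx = ring_server.take([:dmap, Process.pid, nil, nil]).last(2)
      results[idx] = result
    end
    results
  end

  # Look up the ring server once and cache it on the receiver.
  def ring_server
    return @ring_server if @ring_server
    ringy_dingy = RingyDingy.new nil
    @ring_server = ringy_dingy.ring_server
  end
end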
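
Jobs go out as five-element tuples and results come back as four-element ones, so the length of a template, together with its nil wildcards, decides which tuples a take can see. Here is a minimal sketch of that matching. It assumes an in-process Rinda::TupleSpace from Ruby's standard library as a stand-in for the remote ring server; the pid 42 and the values are made up for illustration.

```ruby
require 'rinda/tuplespace'

# In-process stand-in for the ring server's tuple space (assumption).
space = Rinda::TupleSpace.new

# A job tuple (five fields) and a result tuple (four fields), both for pid 42.
space.write([:dmap, 42, 'proc { |v| v * 2 }', 7, 0])
space.write([:dmap, 42, 14, 0])

# nil matches anything, but the template length must equal the tuple length.
jobs    = space.read_all([:dmap, nil, nil, nil, nil]) # what a runner would see
results = space.read_all([:dmap, 42, nil, nil])       # what the caller would see
others  = space.read_all([:dmap, 99, nil, nil])       # a different pid: no match
```

read_all is used here only so nothing is removed while inspecting; take uses the same matching rules but removes the tuple it returns.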
mapreduce_runner.rb:
require 'rubygems'
require 'ruby2ruby'
require 'ringy_dingy'

ringy_dingy = RingyDingy.new nil
ring_server = ringy_dingy.ring_server

loop do
  # Block until a job tuple shows up: [:dmap, pid, block source, element, index].
  pid, block, element, idx = ring_server.take([:dmap, nil, nil, nil, nil]).last(4)
  begin
    result = eval(block).call(element)
  rescue Object => err
    # Send the error back as the result so the caller is not left waiting.
    result = err
  end
  puts "Got #{result} from #{element} for #{pid}."
  ring_server.write([:dmap, pid, result, idx])
end
From the shell:
$ sudo gem install RingyDingy
$ sudo gem install ruby2ruby
$ ring_server &
$ ruby mapreduce_runner &
$ ruby mapreduce_runner &
From irb:
> require 'mapreduce_enumerable'
> (1..100).to_a.dmap { |v| v * 2 }
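
To watch the whole round trip without installing anything, the same protocol can be sketched in a single process. This is only an approximation under stated assumptions: a local Rinda::TupleSpace stands in for the ring server, threads stand in for the runner processes, and the block is shipped as a source string instead of going through ruby2ruby's to_ruby. The names start_worker and local_dmap are hypothetical.

```ruby
require 'rinda/tuplespace'

# Worker loop, mirroring mapreduce_runner.rb but running in a thread.
def start_worker(space)
  Thread.new do
    loop do
      pid, source, element, idx = space.take([:dmap, nil, nil, nil, nil]).last(4)
      result = begin
        eval(source).call(element)
      rescue StandardError => err
        err # report the error as the result, as the runner does
      end
      space.write([:dmap, pid, result, idx])
    end
  end
end

# Caller side, mirroring Enumerable#dmap. `source` is the block as a string
# (assumption: it stands in for block.to_ruby).
def local_dmap(space, items, source)
  items.each_with_index do |element, idx|
    space.write([:dmap, Process.pid, source, element, idx])
  end
  results = []
  items.size.times do
    result, idx = space.take([:dmap, Process.pid, nil, nil]).last(2)
    results[idx] = result
  end
  results
end

space   = Rinda::TupleSpace.new
workers = Array.new(2) { start_worker(space) }
doubled = local_dmap(space, (1..10).to_a, 'proc { |v| v * 2 }')
workers.each(&:kill)
```

Whichever worker is free takes the next job, so results may come back in any order; the index carried in each tuple is what puts them back in place.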
Labels: mapreduce ruby code

6 Comments:
It's official. Blaine is the new Larry Page!
Like some earlier attempts, it's missing the reduce part, but you managed to add code mobility which is definitely a step forward :)
The reduce part is pretty trivial; the implementation is essentially the same as for dmap, but uses Ruby's inject method instead.
Likewise, one could create distributed each, find, etc., methods.
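
One possible reading of that, as a rough sketch rather than the partitioning code mentioned in the post: split the collection into slices, let each worker fold its slice with inject, then fold the partial results locally. It assumes an in-process Rinda::TupleSpace and threads in place of the ring server and runners, a block shipped as a source string, and an associative operation (such as + or max). The names start_reducer and local_dinject and the :dinject tag are hypothetical.

```ruby
require 'rinda/tuplespace'

# Worker: folds one slice with inject, using the shipped block source.
def start_reducer(space)
  Thread.new do
    loop do
      pid, source, slice, idx = space.take([:dinject, nil, nil, nil, nil]).last(4)
      space.write([:dinject, pid, slice.inject(&eval(source)), idx])
    end
  end
end

# Hypothetical dinject: reduce slices on the workers, then fold the partials.
# Only valid when the block is associative.
def local_dinject(space, items, source, slice_size = 4)
  slices = items.each_slice(slice_size).to_a
  slices.each_with_index do |slice, idx|
    space.write([:dinject, Process.pid, source, slice, idx])
  end
  partials = []
  slices.size.times do
    partial, idx = space.take([:dinject, Process.pid, nil, nil]).last(2)
    partials[idx] = partial
  end
  partials.inject(&eval(source))
end

space    = Rinda::TupleSpace.new
reducers = Array.new(2) { start_reducer(space) }
doubled  = (1..100).map { |v| v * 2 } # stands in for the dmap result
total    = local_dinject(space, doubled, 'proc { |acc, v| acc + v }')
reducers.each(&:kill)
```

Keeping the slice index in each tuple preserves order among the partials, which matters for operations that are associative but not commutative, such as string concatenation.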
missing .rb after the script name
ruby mapreduce_runner.rb &
is maybe the right one
It may be the most trivial thing in the world, but it isn't really MapReduce until it Maps and Reduces... is it?
I Googled for "MapReduce ruby" and ran into your post. I saw your picture and thought, "Hey wait a minute! I've seen that guy before!" I guess that's what happens when you interview at half the companies in Silicon Valley!
Happy Hacking!