
Liminal Existence

Clouds in Iceland

Monday, May 12, 2008

Scalability

Updated: Go read Steve's Dynamic Languages Strike Back. It's a longer read, but it's much more interesting, and he's much smarter than I am.

LOL. <-- this is a link. Read Ola's post, first.

For all those who don't get it, languages don't scale, architectures do.

Now, some languages are faster than others. That means that to complete a given operation, it costs less, everything else being equal. Costing less is a good thing. But developers also cost money, so if you have to spend money on developers' time porting from one language to another then you might not be saving any money at all, and really you're just treading water.
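A rough way to put numbers on that trade-off is a break-even calculation. The sketch below is only an illustration: the function name `break_even_months` and every figure in it are made up, and it assumes server cost falls in direct proportion to the speedup, which real systems rarely manage exactly.

```python
def break_even_months(port_cost, monthly_server_cost, speedup):
    """Months until a port pays for itself, or None if it never does.

    port_cost: one-time developer cost of the rewrite (hypothetical figure)
    monthly_server_cost: what serving the current load costs today
    speedup: how many times faster the new code is (2.0 = twice as fast)
    """
    if speedup <= 1.0 or monthly_server_cost <= 0:
        return None  # no savings, so the port never pays off
    # Assumption: a 2x faster implementation needs half the machines.
    monthly_savings = monthly_server_cost * (1 - 1 / speedup)
    return port_cost / monthly_savings


# Made-up numbers: a 120k port, 10k/month in servers, 2x faster code.
print(break_even_months(120_000, 10_000, 2.0))  # 24.0
```

Under those invented numbers the port takes two years to pay for itself, and nothing in the calculation says anything about whether the system can grow by adding machines.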

Once upon a time, shell scripts were used to write CGI applications. With the correct architecture, and enough money, you could build Google with tcsh. No, really. It wouldn't be fun, and you'd be dumb, because there are much cheaper ways to do it. But then again, if you stuck with it, perhaps you'd optimize tcsh to be really fast at spawning and serving up web requests. Faster than Java, faster than <insert your favourite language here>. Faster means cheaper; it doesn't mean more scalable.

I point to exhibit A. Perl used to be slow. Now it beats JoCaml with the bestest concurrency (re: “Scalability”) around. What was Perl built for? Parsing text. Lots of it. All the time. It's fast. Does it mean that you can't build Wide Finder with another language? Absolutely not. Does it mean that you couldn't build Wide Finder to scale out to a trillion documents with gawk? If you answered “yes”, go back to the start of this post and read again! :-) If you're still answering “yes,” try reading some more. Leonard, Ted, Joe, Cal, and Theo are good places to start.

If you answered “no,” congratulations! Pat yourself on the back for knowing what scalability means.
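One way to see why is to sketch the shape of a scaled-out Wide Finder. This is a minimal illustration, not anyone's actual entry: it assumes the job is counting requests per path in access-log lines, the `REQUEST` pattern and the function names are hypothetical, and the shards run one after another here where a real deployment would hand each one to a separate process or machine.

```python
from collections import Counter
import re

# Hypothetical pattern: pull the requested path out of a log line.
REQUEST = re.compile(r'"GET (\S+) ')


def count_shard(lines):
    """Work done by one worker: count hits per path in its own slice."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def merge(partials):
    """Combine per-shard counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


def top_paths(lines, shards, n=10):
    """Split the input into `shards` slices, count each, then merge."""
    slices = [lines[i::shards] for i in range(shards)]
    return merge(count_shard(s) for s in slices).most_common(n)
```

The split and the merge are the architecture. `count_shard` could be rewritten in Perl, gawk, or anything else, and the only thing that would change is how many machines you pay for.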


16 Comments:

Anonymous meangrape said...

*sniff* I used to use shell, awk, sed and Makefiles to generate some pretty decent websites. Good times, good times.

Monday, 12 May 2008 04:30:00 GMT+01:00  
Blogger Click-a said...

We think you did a great job and we thank you for Twitter, Blaine!

Monday, 12 May 2008 15:30:00 GMT+01:00  
Anonymous Anonymous said...

Yes, scaling and performance are orthogonal concerns. Congrats, you managed to achieve neither.

Monday, 12 May 2008 15:31:00 GMT+01:00  
Blogger Blaine said...

Anonymous: well, that's debatable. If you weren't an AC, I could fill you in on just how scalable it is. But alas, you're just a troll, and trolls don't get to learn.

FWIW for those reading this post (hi!), I'm not talking about Twitter here. If you follow the link to Ola's site, you'll notice that there's been an ongoing debate about whether or not languages can be scalable. For people building websites to be even debating that point is stupid and harmful. I'm the first to admit that Ruby is slow, but that has nothing to do with scalability.

Monday, 12 May 2008 15:52:00 GMT+01:00  
Anonymous Anonymous said...

Two points:

The blogosphere in general won't care, but from a programmer's perspective there is a difference between languages and frameworks. There is a difference between Ruby and Ruby on Rails: does the abstraction provided by Rails hinder scalability? Not trying to rehash an old debate about RoR specifically, but it should be kept in mind.

Also, you say paying programmers to port to another, ostensibly more scalable, language is just treading water. That doesn't make sense: the port is a one-time effort that has to be paid for, but going forward either team would have had to deliver the same new features anyway. Once the porting expense has been incurred, the overall cost decreases over a longer timeline as the system scales.

m2c,
Justin

Monday, 12 May 2008 16:14:00 GMT+01:00  
Blogger Blaine said...

Justin, I think you missed my point - you can't port to a “more scalable” language. There's no such thing, and it doesn't make sense to say that.

If the port isn't going to take very long, and you're porting from, say, Ruby to C, then you're going to get a cumulative cost savings that could very well be higher than the initial porting cost. Memcached is no longer written in Perl - Brad re-wrote it in C because memcache was used so much that the incremental performance improvement from Perl to C was significant as a total cost metric.

Regarding frameworks, you're right --- Ruby on Rails doesn't scale, out of the box. I've never said otherwise. However, one of Rails' strengths is that it's really easy to swap out components. You're not bound to using ActiveRecord to generate your queries.

Monday, 12 May 2008 16:53:00 GMT+01:00  
Anonymous Anonymous said...

I must have missed the part where Twitter was so scalable while yelling at yet another 500 error...

Languages are not inherently scalable, except maybe Erlang, but some platforms have VERY good packages pre-built to allow developers to write simple non-concurrent code and have the package deploy and manage it in a grid, for example, to make it linearly scalable. In much the same way that RoR makes simple web app development fast, those pre-built platforms make linearly scalable application development fast.

Also, at its heart Twitter is a messaging platform... WTF is RoR doing in there?

Monday, 12 May 2008 17:39:00 GMT+01:00  
Blogger Blaine said...

Anonymous #2: Erlang isn't inherently scalable for all problems.

Hint: Twitter is throwing errors on the web right now, but the messaging back-end (SMS & IM) is working fine.

As far as RoR + Twitter is concerned, two years ago before anyone but those of us working at Odeo had even imagined Twitter, it was a pretty out-there idea. Last year, most of the blog posts about Twitter were comments about how it was just stupid navel-gazing. Building in Erlang wouldn't have made any sense, because Twitter would have never been built if we'd tried. I point you to the XMPP PubSub Spec (XEP-0060) which describes all of Twitter's functionality and more. It was written years before Twitter was built.

Scaling Twitter as a messaging platform is pretty easy. See Mickaël Rémond's post on the subject. Scaling the archival side and the massive infrastructure concerns (think billions of authenticated polling requests per month) is not, no matter what platform you're on. Particularly when you need to take complex privacy concerns into account.

Monday, 12 May 2008 18:44:00 GMT+01:00  
Anonymous Anonymous said...

Languages may not scale much in the abstract, but practical issues with their dependencies have enormous eventual impact on uptime which is seen as an issue with "scale".

Libraries, runtimes, availability of drivers, etc., all affect architectural decisions. An immature "language" (really, a software package dependency graph) doesn't always have out-of-the-box solutions for common problems. Options are limited. The wheel is reinvented. Poor choices are forced. Workarounds for performance reasons are patched in. Soon, you have spaghetti. Then, bugs. Then, operational issues. Then small changes to keep up with growth lead to complex unintended side effects, and downtime. The site "scales", perhaps linearly, perhaps not, but it's brittle. Very brittle. Tools to diagnose and fix are even more immature, almost axiomatically. The hole gets deeper. Code base gets bigger. A few clever hacks here and there keep things going. Growth continues. Observability is nil, and boom, hard limit surprises are reached. More downtime. Then, fear of changing a brittle system brings paralysis and loathing.

Where did this all start? With Ruby.

Tuesday, 13 May 2008 02:39:00 GMT+01:00  
Blogger Blaine said...

Anonymous #3: For someone so well written, I'm disappointed you opted to remain anon.

To address your points, I'll start at the end. Scalability problems didn't start with Ruby, and won't end with Ruby. You cheapen your argument with childish, short-sighted claims like that.

I think your description of the process of scaling a site is pretty accurate, and describes a common experience. You're underplaying the importance of operational reliability; the point of scaling is to be able to add machines, not fix code. If you get into a state where you're constantly trying to eke out performance instead of focusing on building a reliable system, then of course things are going to seem brittle, because by definition they'll always be breaking.

As far as observability is concerned, you can't possibly argue that tools for Ruby are worse than those for PHP or Perl or Python, which have been used to build sites much larger than Twitter.

Scala's not a panacea, which was what Ola said in his post to start with. And he's a big fan of the JVM.

Tuesday, 13 May 2008 03:05:00 GMT+01:00  
OpenID Scott Kahler said...

To a certain extent I agree, but I also disagree. The latest technology panacea seems to be these damn frameworks: RoR, Django, Catalyst or whatever. These give the illusion that a developer need not be bothered with this whole architecture thing. It also seems a language gets equated with the framework, and thus the language in essence becomes the architecture. Using language A in many instances assumes you are using architecture B. I suppose you could go against the norm, but for your average joe the resources aren't there to fly far from the feeding grounds. For that reason I'd have to say language doesn't have to be tied to scalability, but it generally is.

Tuesday, 13 May 2008 07:11:00 GMT+01:00  
Anonymous Anonymous said...

Blaine,

Thanks for this post and for all the links regarding scale. I'm guilty of being one of those "RoR can't scale" people... I said this out of hearsay and not out of experience. Guilty. :(

I, for one, am taking this opportunity to learn and I appreciate your sharing.

Do I like RoR? No. RoR and the RoR "Community [Cult, really... like Digg]" does not gain a fan out of this mess... but it does get one less person saying "it can't scale".

Doesn't the fact that RoR is so slow (expensive) make it a really poor candidate for production architecture, however? Even for quick prototyping? PHP or Python are aptly suited for quick prototyping if a developer is keen on repurposing both function libraries and sql files... Wouldn't you agree?

Tuesday, 13 May 2008 16:52:00 GMT+01:00  
Blogger Blaine said...

Thanks for the comments --- All I can say is that web development sucked before frameworks. Writing code in Ruby and Rails or Python and Django or whatever you want is *fun*.

My original point: unless your framework makes it difficult to do new things (claim: Rails does not, and in fact mostly gets out of your way, particularly compared to that crappy code you wrote in PHP that does some SELECT statements against your shared MySQL server), it doesn't inhibit scaling.

Twitter is about 20% Rails at this point. The other 80% is Twitter, the application, the framework. Any big application looks like that. Now, you can say that Twitter didn't scale, sure. Something that's been said before, that I think bears repeating, is that sites are more than just the raw code that you run.

Tuesday, 13 May 2008 17:53:00 GMT+01:00  
Blogger hehe said...

"We've made some pretty significant progress towards scaling Twitter..."

"One of the consistent problems we've been facing is errant queries. We've been seeing (off and on) queries like" [here comes a reference to an AR bug]

http://romeda.org/blog/2007/06/select-from-everything-or-why-databases.html

so you admitted that scaling and certain characteristics of a webframework are related (or can be related). so then what are you talking about?

Friday, 16 May 2008 12:16:00 GMT+01:00  
Blogger cgerrish said...

Hey Blaine,

I just listened to your engagement on the Gillmor Gang. I want to give you props for getting into the conversation. It's a tough venue, but one of the best on the network.

The XMPP user space is the next frontier. The Web is becoming more real time and guys like you need to start thinking about the mufflers and plumbing.

Take care of yourself.

Saturday, 17 May 2008 04:46:00 GMT+01:00  
