Saturday, 22 September 2007

I Don't Get Jeff Atwood

I read quite a few blogs, on different topics, but there are (obviously) a whole bunch of computing ones in there. I read some of the big-name ones, like Joel on Software and The Old New Thing, as well as some that could possibly be regarded as second-tier in the blagosphere, like Intertwingly and Steve Yegge.

But one big-name blog I don't read is Coding Horror, by Jeff Atwood.

His posts appear pretty often on programming.reddit.com or digg.com, and friends (whom I respect) read him and recommend his articles, but I just can't figure him out. Sometimes his articles manage to completely miss the point, and other times he describes a simple concept reasonably succinctly.

For example, the post Building a Computer the Google Way contains an historically interesting photo of Google's first server. This server is interesting because it is fairly clearly a cobbled-together set of home-built servers with some custom-designed hardware. These servers were built by Google from the PCB under the motherboard on up. Sure, they used commodity parts, but I can guarantee there are some unique pieces in there. Probably in the vicinity of the interconnects.

This is an interesting example of how a successful company will control everything that possibly relates to its business. You can't afford to rely on off-the-shelf components for anything that might be technically critical. For Google, indexing and querying speed were hyper-important. So they built their own hardware. Amazon did something similar when they wrote their own web server. Always remember these examples when you're railing against Not-Invented-Here syndrome.

Unfortunately, Coding Horror proceeds to use the Google example as justification for why every good programmer should build their own PC instead of just buying one from Dell. I'm sorry, but I can't see how reading specs on PCI slot counts on a motherboard is going to lead to building the kind of server that could support Google's load, or perform their proprietary indexing any faster.

However, on other occasions, Coding Horror does actually manage to explain concepts accurately and clearly. Though I'm usually left thinking, this only occurred to you now? See Everything is Fast For Small n and You're Probably Storing Passwords Incorrectly.

He does, however, write well. And there lies a real skill. If only he could be brought up to speed on some actual modern computer science and software engineering concepts, we might have something.

7 comments:

Damana Madden said...

I used to read him but his inane comments gave me the sh*ts so I unsubscribed.

Testing Testguy said...

If you go to the Computer History Museum in Mountain View, you can view this bit of history yourself (along with almost every other computer ever made) in the Visible Storage area. It's an amazing treat.

> These servers were built by Google from the PCB under the motherboard on up. Sure, they used commodity parts, but I can guarantee there are some unique pieces in there

All the hardware you see in the Google rack could have been purchased from an online store in 1999-- CPUs, memory, motherboards, hard drives, etcetera. The only thing "custom" about it is the physical layout, which some would say is amateurish-- even sloppy. There's nothing unique about it, which is the whole point. Rather than pay Dell or IBM to build servers, they decided they could roll their own-- using nothing but commonly available commodity parts-- more cost effectively. Their servers are like Doritos: use all you want, we'll make more.

You can see a similar ethos at work with their software, which is built on commodity open source underpinnings. I'd argue their software is far more customized than the hardware, though.

Giles said...

Hi Jeff,

Back in 1999, when Google was doing their magic, I was a student with the research group that eventually evolved into this company. While there I was playing with indexing technology running on supercomputers. Our techniques weren't anywhere near as good as Google's, and our data set wasn't anywhere near as large. But to get any reasonable performance we had some very, very funky backplane interconnect networking going on. Thing is, from the front the machine just looked like a cluster of Sun SPARCs of some description - unfortunately, I forget which.

My main point about the original Google server was that building your own PC has about as much in common with Google building a server as changing your car oil yourself has with an F1 engineering team designing a new engine.

It's not enough to just buy off-the-shelf components and stick them together. Don't bother if you're just going to build something identical to what Dell would sell you. Only head down this path if you need to do something special, like add a fancy backplane network, or a custom clustered, column-based DBMS.

Giles

Testing Testguy said...

> Don't bother if you're just going to build something identical to what Dell would sell you

Google would have paid easily 5 times as much for the same hardware from Dell. The point was to use commodity parts-- lots of cheap, easily replaceable commodity parts-- instead of outlaying massive cash for complex, unnecessary server hardware from a company like Dell, or HP, or IBM.

Great things can come from humble commodity hardware. And for far less than you might think. That's the lesson. Have you seen this motley assortment of hardware?

http://www.codinghorror.com/blog/archives/000305.html

Here's a specific example. Look into Intel's so-called server-class "Xeon" chips, and you'll find they're identical to their Core Duo/Quad equivalents. The only difference, in fact, is the price tag, and the name etched on the top of the chip.

Giles said...

Hi Again Jeff,

I think we might be talking slightly at cross-purposes here.

Buying and using whatever cheap hardware you have hanging around is a brilliant way to save money. When you're starting a company that's what you want to be doing. I've been there. The last company I founded used an ancient Power Mac G3 running Linux as our email server...

To me, the more significant and interesting thing about the early Google hardware is the stuff that wasn't off the shelf: the custom networking, the custom file systems, etc. That's the sort of thing that gave them their competitive advantage.

Of course, I can't be certain that they were using custom pieces of hardware. But looking at Larry Page's history, it seems likely. And I do know that other search technologies at that time were using custom file systems.

> That's the lesson. Have you seen this motley
> assortment of hardware?
> http://www.codinghorror.com/blog/archives/000305.html


Thanks, I hadn't seen that before.

Regards,
Giles

Colin said...

I still enjoy reading Coding Horror. I don't always agree with what Jeff writes (nor do I read all of the topics), but I do find that many posts get me thinking.

BTW, you used to work on Funnelback tech? I know a few folks from there...

Giles said...

It wasn't Funnelback then. This was back when it was still just a CSIRO researcher (David Hawking) with a couple of Ph.D. candidates. They ran an Honours course in web search technology. The main focus of their research was on searching the 'dark web.'

For the course, they got us to try out a bunch of different web search algorithms. I implemented an indexer based on a technique from the University of Waterloo, in Canada. The novel aspect was that updates to the index were linear in the size of the additional documents, not the size of the entire corpus.

Anyway, with hindsight, the most interesting thing is that every single one of the ideas, algorithms and approaches has gone absolutely nowhere. Even though some of them were interesting and had value separate from searching the entire web.

Ground and obliterated under the Google hegemony.