WARNING: This article features ANCIENT code! I'm keeping it online because it's interesting to see what I was thinking 10+ years ago. But you DEFINITELY should not be using this code. Anything you're reading about on this page has changed significantly since this was written.
This is super old and super cool: a 2007 article from the New York Times about using Amazon EC2, S3, and Hadoop to produce static PDFs from 71 years (4 TB) of NYT archives. I don't think I had even heard of Hadoop in 2007.