

That way you don't need to do anything horrible or complicated, and you can just let servers do what they are good at (serving files), rather than writing something custom that pulls potentially large amounts of data out of a database every time the page is viewed.

Personally, I'd be inclined not to store it in the database at all. Instead, spawn a background job that pulls down the site, parses it, filters it with your readability port, and saves the result to the filesystem under a directory scheme that identifies it uniquely. The location can be public or non-public, depending on your needs; you can easily write an asset-serving controller to expose non-public static content.
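
To make the filesystem idea a bit more concrete, here is a rough sketch in Python (the language is only for illustration; the same shape applies in whatever stack you are using). Everything named here is hypothetical: `storage_path_for` and `archive_page` are made-up helpers, `fetch` stands in for whatever downloads the page, and `extract` stands in for your readability port. The directory scheme is one possible choice, hashing the URL and sharding on the first few hex characters so no single directory gets huge.

```python
import hashlib
from pathlib import Path
from typing import Callable


def storage_path_for(url: str, root: Path) -> Path:
    """Map a URL to a unique, stable file path under `root`.

    Assumption: the URL alone identifies the content. The SHA-256 hex
    digest is split into two levels of subdirectories (ab/cd/<digest>.html).
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return root / digest[:2] / digest[2:4] / f"{digest}.html"


def archive_page(
    url: str,
    root: Path,
    fetch: Callable[[str], str],
    extract: Callable[[str], str],
) -> Path:
    """Download, filter and save one page. Meant to run inside a background job.

    `fetch` and `extract` are injected placeholders: the first returns raw
    HTML for a URL, the second is the readability-style filter.
    """
    target = storage_path_for(url, root)
    target.parent.mkdir(parents=True, exist_ok=True)

    cleaned = extract(fetch(url))

    # Write to a temporary file first, then rename, so a reader never
    # sees a half-written page.
    tmp = target.with_suffix(".tmp")
    tmp.write_text(cleaned, encoding="utf-8")
    tmp.replace(target)
    return target
```

The background job would just call `archive_page(url, root, fetch, extract)` with your real downloader and filter. If `root` sits inside the web server's public directory, the file is served directly; if not, a small controller can look up the same path with `storage_path_for` and stream the file back.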
