I wish there were a way to tell the search engines to index only the “content” area of a page. For example, on a blog, only the text from the title through the last comment should be indexed. Everything else (site navigation, the ten most recent post titles, the ten most popular post titles, footer text) should not be indexed. On my theater site, the right sidebar lists the shows that are coming up in the next week. But when I search the site with Google for a show that is currently playing, every page on the site comes up, because every page carries that sidebar. This isn’t useful.
One of the web sites I used to manage used a Perl-based search engine, Fluid Dynamics Search Engine. You could put comments in your site templates to mark sections of a page that it should not index. This worked very well. Their FAQ has an entry on this: “How to prevent sections of your pages from being indexed.” I want this functionality in the big search engines as well.
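
To make the idea concrete, here is a rough Python sketch of how an indexer could honor such template comments. The marker strings `<!-- noindex:begin -->` and `<!-- noindex:end -->` and the function name `indexable_text` are made up for illustration; they are not the syntax Fluid Dynamics Search Engine actually uses, and the tag stripping is deliberately crude.

```python
import re

# Hypothetical marker comments, invented for this sketch.
# A real search engine would define its own syntax.
BEGIN = "<!-- noindex:begin -->"
END = "<!-- noindex:end -->"

# Match everything between a begin marker and the nearest end marker.
_EXCLUDED = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
# Very rough tag stripper; a real indexer would use an HTML parser.
_TAGS = re.compile(r"<[^>]+>")


def indexable_text(html: str) -> str:
    """Return the text of a page with the marked sections left out."""
    without_excluded = _EXCLUDED.sub(" ", html)
    text = _TAGS.sub(" ", without_excluded)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


page = """
<html><body>
<h1>Review: Our Town</h1>
<p>The cast was excellent.</p>
<!-- noindex:begin -->
<ul><li>Coming up: Hamlet</li></ul>
<!-- noindex:end -->
<p>First comment: loved it.</p>
</body></html>
"""

print(indexable_text(page))
```

With the sidebar wrapped in the markers, a search for “Hamlet” would no longer match this page, while the review and its comments stay searchable.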