Google Blog Search: When to Use Full Feed vs. Excerpts

(updated)

“Is this the death knell for Technorati, et al.?” asks Charlene about Google’s entry into the blog search space.
I suspect I know the long-term answer to that, but for now let’s look at what Technorati’s own Niall Kennedy thinks:

“Google is specifically restricting its search to feeds, and not using the HTML of the blog. Why? Googlebot is designed to swallow a page whole and not break the page up into individual entries or items. Feeds come prepackaged as individual items or entries allowing for easy digestion by parsers and indexers. Google would need to overhaul its indexer or design a new and separate indexer specific to blog posts if it would like to include more post content than it is currently pulling down from a page’s link alternate declared feed (this is based on a conversation I had with Google engineers in February about the indexer, I won’t blog the details, and things may have changed). Technorati indexes a blog’s HTML assisted by the declared RSS and Atom feed, so I am admittedly a bit biased.”
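To make the "prepackaged as individual items" point concrete, here is a minimal sketch in Python. It assumes a made-up two-post RSS 2.0 feed, and the helper name `split_feed_into_posts` is hypothetical. It is only an illustration of why a feed is easy to break into posts, not a description of how Google or Technorati actually index anything.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-post RSS 2.0 feed, invented for this illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Full text of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>Full text of the second post.</description>
    </item>
  </channel>
</rss>"""


def split_feed_into_posts(feed_xml):
    """Return one dict per <item>, so each post can be handled on its own."""
    root = ET.fromstring(feed_xml)
    posts = []
    # Every <item> element is already a self-contained post:
    # no guessing at post boundaries, titles, or links.
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "body": item.findtext("description", default=""),
        })
    return posts


if __name__ == "__main__":
    for post in split_feed_into_posts(SAMPLE_FEED):
        print(post["title"], "->", post["link"])
```

With an HTML page, the same job means guessing where one post ends and the next begins. Note also that if the `description` holds only a summary, a summary is all a feed-based indexer ever sees.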

Well, I’m not sure I’d consider Google’s use of feeds a disadvantage or weakness: reading the entire HTML may very well be the cause of some of Technorati’s problems, i.e. their parser getting “lost”, not finding post boundaries, associating posts with the titles and tags of neighboring posts, etc. (the previous link provides more details as well as a collection of other bloggers’ experiences with Technorati). If it’s so difficult to index the entire blog right, we might actually be better off with a feed-based search.

Which takes us to Jeff Clavier’s conclusion: “… bloggers publishing only a partial feed will be partially indexed (Aha, would that be the reason for full feeds to become the standard?)”

I could not agree more. Unless your blog is all about ad-revenue generation, in which case you need to attract readers to your site, there is no reason not to serve up the entire post in your feed. It’s really simple: in this world of infoglut, either you make reading your blog convenient, or you can expect to lose subscribers who are fed up with clicking and waiting. Serving up ‘bait’ in your feed defeats the purpose of RSS readers.

That brings me to a problem I find with my blog platform: there is not enough control over the smart use of excerpts. My preference would be:

  • Full post in the RSS feed
  • Auto-created excerpt (say, first 100 words) on the Blog Main Page, with manual override option
  • Hand-edited 2–3 line summary that other blogs can use in the trackback detail.
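As a rough sketch of what that wish list could look like in code (Python, with hypothetical names throughout: `auto_excerpt`, `render_views`, and the `excerpt` and `summary` fields are my inventions, not anything Blogware or another platform offers):

```python
def auto_excerpt(body, max_words=100):
    """Auto-created excerpt: the first max_words words of the post."""
    words = body.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " ..."


def render_views(post, max_words=100):
    """Pick the text shown in each place.

    `post` is a dict with a required 'body' and two optional,
    hand-written fields: 'excerpt' (main-page override) and
    'summary' (2-3 lines for trackbacks).
    """
    main_page = post.get("excerpt") or auto_excerpt(post["body"], max_words)
    return {
        "feed": post["body"],        # full post in the RSS feed
        "main_page": main_page,      # manual override, else auto excerpt
        # Assumption: with no hand-edited summary, reuse the main-page text.
        "trackback": post.get("summary") or main_page,
    }


if __name__ == "__main__":
    post = {"body": "word " * 250, "summary": "A two-line summary."}
    views = render_views(post)
    print(len(views["feed"].split()), "words in the feed")
    print(len(views["main_page"].split()), "tokens on the main page")
    print(views["trackback"])
```

The point is simply that the three outputs are chosen independently, so the feed never has to be shortened just because the main page is.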

My platform (Blogware via Blogharbor) does not support such selective use of excerpts, and I am not aware of others that do (?). Oh well, there is always a next release.

Update (9/14) Google’s new service already has a nickname: Bloogle (credit to BL Ochman).
“It’s game over in the blog search space,” declares Barry Ritholtz.

Update #2 (9/14) Steve Rubel’s take, in his test blog: “I have a big concern. Namely, according to Search Engine Watch, Google does not spider the full blog – only what’s in the site’s RSS feed. This presents a problem since many bloggers only publish a summary feed. As a result, the Google Blog Search engine may be missing a ton of important content.” True… but again, why look at the symptom and not the root cause? Providing full content in the feed takes care of the “problem” and keeps readers happy.

Update #3 (9/15) Planet OZH shares my views: Five Reasons Why Partial Content Feeds Suck. (He’s got cute baby pics, too.)

Update #4 (9/18): Business Blog Consulting agrees.

 


Comments

  1. Great writeup here. On full-text versus partial feeds, I will have to say that unless one is using some RSS metrics tool, it becomes that much more difficult for a blog author to get feedback on what readers like and don’t like as the reader base shifts to RSS.

  2. The 12th Mistake Business Bloggers Make ….

    My feed reader showed a promising title: The 11 Biggest Mistakes Small Business Bloggers Make. Hm… I normally prefer to read full feeds, but the title looks promising, let’s read it. Click, load, here we…

  3. Full Feeds or Nothing – but that’s just my vote

    The partial vs. full feed debate is back. Duncan at Blog Herald provides an overview of the debate.

    I’ve always made my preference for full feed clear, yet I am still reading your partial feeds, Duncan 🙂 Admittedly, it’…

  4. I agree whole-heartedly with the advantages of full RSS feeds.

    The only challenge I’ve found so far is that there seems to be an upper limit to the total size of a feed (for many feed tools and intermediaries). As such, having too much total content may cause a feed to fail validation.

  5. I had no clue – is it the length of an individual post, or the total filesize? Wow…
