Moderated by Gary Stein, senior analyst with JupiterMedia.
Mark Fletcher – Bloglines.com
Bloglines.com started in June 2003. Currently there are 600 million blog articles and 1.4 million feeds indexed with 2-3 million blog articles posted per day.
Bloglines has mainly been known as a web based aggregator where users can subscribe to and read multiple RSS feeds.
Mark gave an overview of the Bloglines search features, including content search of feeds and a citation search that shows who is referencing a blog. Bloglines also offers lists of the most popular feeds and the most popular links.
As part of Ask Jeeves, Teoma search technology will be implemented with Bloglines to improve its search capabilities.
Gary Stein – What prompted you to create a web based aggregator rather than use a client-side program?
Mark – He started with a bookmark list of 150 sites in 2003, which was not practical to manage, so he began investigating ways to manage that kind of information. A client-side aggregator did not work for him since he used multiple computers and operating systems. Plus, a server-based aggregator makes additional features possible that are not available client side.
Greg Linden – Findory.com
Findory is a blog and news search engine that learns, adapts and finds new and interesting content for users based on behaviors on the web site.
Content is personalized based on user’s use of the web site. The more information the site has on a user’s behavior, the better the recommendations.
Personalization helps when you don’t know exactly what you want.
Types of content on Findory.com: news, search, advertising.
With Findory everyone sees something different based on their behavior.
Gary – How do you avoid pitfalls of personalization, fickle users with changing interests?
Greg – The key is to look at the user’s most recent behaviors. It’s a combination of everything you’ve done and what you’re doing at the moment.
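One way to picture "everything you've done plus what you're doing now" is a recency-weighted interest profile. The sketch below is only an illustration and not Findory's actual method: the function names, the click-event format, and the 24-hour half-life are all assumptions. Each click counts for less the older it is, so a burst of recent activity can outweigh a long but stale history.

```python
from collections import defaultdict


def interest_profile(events, now, half_life_hours=24.0):
    """Score topics from (topic, clicked_at_hours) events.

    Hypothetical scheme: each click contributes 0.5 ** (age / half_life),
    so recent behavior dominates while older history still counts a little.
    """
    scores = defaultdict(float)
    for topic, clicked_at in events:
        age = max(0.0, now - clicked_at)
        scores[topic] += 0.5 ** (age / half_life_hours)
    return dict(scores)


def rank_topics(events, now, half_life_hours=24.0):
    """Return topics ordered from strongest to weakest current interest."""
    profile = interest_profile(events, now, half_life_hours)
    return sorted(profile, key=profile.get, reverse=True)


# Made-up history: five old sports clicks, two politics clicks in the last hours.
history = [("sports", 0.0)] * 5 + [("politics", 238.0), ("politics", 239.0)]
ranking = rank_topics(history, now=240.0)
print(ranking)
```

With a short half-life the two recent politics clicks outrank the five old sports clicks. With a very long half-life the ranking reverts to raw click counts.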
Scott Rafer – Feedster.com
Currently indexes 12.5 million feeds.
Rafer’s presentation was pretty much a pitch for Feedster.
Feedster performs just like a search engine. However, you can subscribe to search results as well as the individual posts.
Rafer talked about the value of advertising to users that subscribe to topic feeds, or keyword based feed subscriptions.
Feedster has run ad campaigns for Microsoft, Sun and AMD on Slashdot and Topix.
They are in the process of developing a system for smaller publishers.
Feedster does not crawl web pages, only RSS feeds. Feedster captures data on the feeds in its index so ads can be targeted.
Feedster is now providing blog search for AOL and is testing with Slashdot. The Boston Globe worked with Feedster to add content from blogs on the Boston Red Sox.
Gary – How do feeds get in and out of Feedster?
Scott – New feeds are discovered by crawling links and also through pinging services like pingomatic.com.
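For readers unfamiliar with pinging: blog tools commonly notify ping services with an XML-RPC call named `weblogUpdates.ping`, passing the blog's name and URL. The sketch below builds that request with Python's standard `xmlrpc.client`. The blog name and URL are placeholders, and the endpoint is left as a parameter because the right address should come from the ping service's own documentation. Nothing here describes how Feedster itself consumes pings.

```python
import xmlrpc.client


def build_ping_request(blog_name, blog_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def send_ping(endpoint, blog_name, blog_url):
    """Send the ping to a ping server (needs network access).

    `endpoint` is whatever XML-RPC URL the ping service documents.
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(blog_name, blog_url)


# Placeholder blog details, used only to show the request body.
payload = build_ping_request("Example Blog", "http://blog.example.com/")
print(payload)
```

Only `build_ping_request` runs here. `send_ping` is defined but not called, so the example works offline.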
Dave Sifry – Technorati.com
State of the blogosphere
Started in 2002 and doubling in size about every 5 months. Tracking 14.3 million blogs. On track to double in another 5.5 months.
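To make the doubling claim concrete, here is a small back-of-the-envelope projection. It reads the tracked count as 14.3 million (an assumed reading of the figure above) and uses the 5.5-month doubling period. The function name is made up, and steady exponential growth is itself an assumption rather than something the panel promised.

```python
def projected_blogs(current, months_ahead, doubling_months):
    """Project a count forward assuming steady exponential doubling."""
    return current * 2 ** (months_ahead / doubling_months)


current = 14.3e6  # assumed reading of the tracked-blog count
in_one_doubling = projected_blogs(current, 5.5, 5.5)
in_a_year = projected_blogs(current, 12, 5.5)
print(f"{in_one_doubling:,.0f} blogs after 5.5 months")
print(f"{in_a_year:,.0f} blogs after 12 months")
```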
80,000 new blogs per day. 13% of all blogs update weekly or more. 900,000 posts per day.
The “untold story”: blog growth in China has been significant via MSN Spaces and Blog China.
Technorati median time to indexing new content is 5 minutes. Side note: With a regular search engine, it could be days or weeks.
Almost a third of blog posts use tags or categories. Adopted by blog tools as well as publishers. While the use of tags has grown significantly, the number of new tags created has been constant.
Blogs are gaining on mainstream media in terms of influence.
There is a correlation between syndication (availability of an RSS feed) and blog popularity/authority.
Sifry then showed a Flash movie of tag usage since Jan of this year.
Gary: What about differences in meaning among tags, such as synonyms?
Dave: You can do both. If there is more than one way to tag something, you can tag both. Technorati will be able to analyze the most important associations with the different versions of tags. The users define the structure and taxonomy rather than a pre-defined ontology.
Jim Pitkow – Moreover.com
Moreover mines millions of feeds and combines them with editorial filtering.
Moreover gives access to what the user base thinks about an event by segmenting blogs into “influential” blogs (4,000) and non-influential blogs (millions).
Blogs are not treated as news; they are treated as a separate kind of information.
Moreover sources content from publishers including MSN Spaces, TypePad, Flickr, Yahoo 360, LiveJournal and Blogger.
All changing information can be a feed: blogs, press, eBay auctions, Netflix queues, Craigslist, podcasts, Evites, Flickr, horoscopes, news, calendars, weather, grocery lists, Amazon carts.
What’s being done today is the groundwork. These are the early days, and the infrastructure will need to adapt.
Challenges for blogosphere growth:
– Scale, 250m updates per day
– Spam resistance
– Technical standards
– What are the business models that will succeed?
Gary – What are some of the things blog owners should be aware of to ensure relevant rankings within blog search engines?
Jim – Relevant to who? First step is inclusion. Use Feedmesh and Pingomatic for inclusion. Focus on content and be a member of a community.
Other responses: Update consistently and speak with a real voice.
Q – Is there a blog version of Wordtracker?
A – Not really. Technorati displays daily top keyword phrases. You can also query terms to discover search volume on specific phrases on Feedster to gain a measure of popularity.
Q – Are blog search engines going to play a role in helping to thwart stealing content via RSS?
A – They are working on various solutions, such as employing duplicate detection.
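The panel did not say how duplicate detection would be done. One common textbook approach is to compare posts by their overlapping word sequences ("shingles") and flag pairs with high Jaccard similarity. The sketch below illustrates that general technique only: the function names, the 3-word shingle size, and the 0.8 threshold are assumptions, not any panelist's implementation.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences in text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_probable_duplicate(post_a, post_b, threshold=0.8, k=3):
    """Flag two posts as likely copies when their shingle sets mostly overlap."""
    return jaccard(shingles(post_a, k), shingles(post_b, k)) >= threshold


# Made-up sample posts.
original = "Blog search engines index new posts within minutes of publication."
scraped = "Blog search engines index new posts within minutes of publication!"
unrelated = "Tags let readers define their own taxonomy for blog content."

print(is_probable_duplicate(original, scraped))
print(is_probable_duplicate(original, unrelated))
```

Comparing every pair of posts this way does not scale to millions of feeds, so a production system would need some form of indexing or hashing on top of it.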
Overall the distinction between the major web search engines and blog search engines is that traditional search engines may be most useful for discovery search, but blog search engines will provide access to what’s happening right now.