
April 21, 2004 2:50 PM

Nutch Update

The recent announcement of Mozdex, which is leveraging the Nutch open source engine, reminded me to ping Doug Cutting and see how things were going with the Nutch project. He replied that while Mozdex only crawls a few million pages, it's a start, and he was pleased to see folks starting to use Nutch. He also pointed to ObjectsSearch, another site that uses Nutch.

But Doug said that his focus with Nutch these days is not on launching a major open source alternative to Google or Yahoo, though that remains a long-term goal. Instead, he reports:

I'm opting for organic growth: get some users and developers
will follow.

In this vein, I put together a demonstration a few weeks ago for Oregon
State University. They love it. It's at:

http://devjr.cws.oregonstate.edu:8080/en/search.html

Compare this to their Google appliance at:

http://search.oregonstate.edu/web/

The quality is pretty close, and the price a lot less. It took me about
20 steps to build that demo, I want to reduce that to just a couple, to
put it within the grasp of any campus webmaster. Then I'll turn it over
to them to operate themselves.

I'm also contracting to build a Nutch-based search engine for the
Creative Commons, searching everything which uses one of their licenses.

Meanwhile, folks at a few universities are starting to use Nutch as a
platform for larger-scale search experiments.

Combined, these efforts should continue to push Nutch's scalability at
the same time as build an installed base, all without having to first
find a sugar daddy.


Comments

I'm also putting together an install of Nutch. I've got a few servers dedicated to the experiment. I'm going to go for more of the niche vertical market segment and see what happens. I'm mostly interested in tweaking algorithms for specific niche industries based on the demographics of the typical "searcher" in that segment.

I've said it before: users don't want to jump through a bunch of hoops and complicated steps to get high-quality results. They want to punch in a URL, type their keywords and hit search. It's that simple.

I'm just thankful that guys like Doug Cutting are working on a project that's going to save all of us a lot of time and energy. I'll be contributing as much as time permits to Nutch because I see it as an equalizer for the small guy.

I'm all about Power to the People!

Interesting to see that Akamai is now offering a search service based on Lucene and their network of edge servers. (Lucene is the open-source search library that Nutch uses.)

http://www.akamai.com/en/html/services/edgecomputing_search.html

I am writing regarding any NEW information, if there is any, on the present status of Nutch. My searches have only turned up information ending in the 2004-2005 time frame. Any help finding current (as of July 2006) information on Nutch's status would be appreciated.
