October 12, 2004 10:31 AM

Google's Web 2 Demo and the UI Plunge

As many have already noted, last week at Web 2.0 Peter Norvig, Google's director of search quality, demonstrated word clustering, "named entities," and machine translation technology to the audience. The translation software was impressive, but somehow lacked zing - "good enough" translation doesn't seem like much of a revelation anymore. That in itself is an extraordinary achievement - Norvig showed translations from Arabic and Chinese, both languages markedly different from English. Google already has translation features built into its engine (from a third party), but this hand-rolled stuff was far more powerful, it seemed to me.

In any case, the demos that really got the audience going (and me, to be honest) were the named entities and the clustering technology. Seeing anything behind the veil of Google's real research and development is of course a revelation, but seeing something that was so clearly ready for prime time felt rather close to a declaration of where Google is heading, particularly given the recent moves in the personalization and clustering space from Amazon, Ask, Vivisimo, and Yahoo.

"Named entity extraction" is a relatively new project, which Norvig said Google had been working on for about six months. As Norvig explained the concept - essentially identifying semantically important concepts and the meaning wrapped around them - I couldn't help but think of WebFountain and my wish (near the end of the post) that Google would add a bit of IBM's semantic peanut butter into its PageRank chocolate.

Norvig also showed an entertaining (and live) demo of clustering, which he claimed was the "largest Bayesian database of clusters" extant. Hmmm.

From the eWeek story covering the news:

For example, Norvig said, researchers are looking for ways to break down sentences by looking for a phrase like "such as" and grabbing the names that follow it. The goal is to not only pull out the name but also its clusters, so that a name such as "Java" can be associated both with the computer language and with language in general, Norvig said.

"We want to be able to search and find these [entities] and the relationships between them, rather than you typing in the words specifically," Norvig said.
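To make the "such as" idea concrete, here is a rough sketch of that kind of pattern matching. It is only an illustration under simplifying assumptions, not what Google actually runs: the regular expression, the `extract_entities` helper, and the sample sentences are all made up for this example, the category is assumed to be the single lowercase word before "such as", and names are assumed to be capitalized words.

```python
import re
from collections import defaultdict

# Hypothetical pattern: "<category> such as <Name>, <Name> and <Name>".
# Assumes the category is one lowercase word and each name is one capitalized word.
SUCH_AS = re.compile(
    r"(?P<category>[a-z]+),? such as "
    r"(?P<names>[A-Z]\w*(?:(?:, and |, or |, | and | or )[A-Z]\w*)*)"
)


def extract_entities(text):
    """Map each capitalized name to the set of categories it was listed under."""
    categories = defaultdict(set)
    for match in SUCH_AS.finditer(text):
        # Split the captured run of names on commas and on "and" / "or".
        names = re.split(r",? and |,? or |, ", match.group("names"))
        for name in names:
            categories[name].add(match.group("category"))
    return dict(categories)


if __name__ == "__main__":
    sample = (
        "He writes in languages such as Java, Python and Perl. "
        "She visited islands such as Java and Sumatra."
    )
    for name, found in sorted(extract_entities(sample).items()):
        print(name, sorted(found))
```

In this toy input "Java" ends up under two categories, which is the sort of ambiguity a cluster of meanings around a name would have to sort out.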

This has potentially interesting implications in next-generation ranking methodologies, for one, but combined with clustering, it signals that Google is serious about taking what one might call the UI plunge.

What do I mean by that? Well, of all the major engines, only Google has strictly maintained what might be called the C prompt interface to search: put in yer command, get out yer list of results (Google Local is a departure, but it's still in beta). Yahoo, Ask, A9 and others have begun to twiddle in pretty significant ways with evolved interfaces which - by employing your search history, your personal data, clustering, and other tricks - deliver more filtered and intentional results (though it is still arguable whether they are more relevant). I sense it's only a matter of time before Google takes this approach as well, and Norvig's demo certainly points that way. After all, it's not that often Google decides to give us a glimpse behind the curtain, and coupled with Google board member John Doerr's semi-announcement the day before (he told the audience that Google would become "the Google that knows you"), I think the UI plunge might come sooner than we all expect.

If you want to know more about how Google is thinking about clustering, here's a paper written by a Google team, courtesy of a link from Don Park.

Update: Lazy linking on my part, the clustering paper is about hardware (though it is really interesting...)


TrackBack

Listed below are links to weblogs that reference Google's Web 2 Demo and the UI Plunge:

» Named entity extraction from alex wright
John Battelle describes an intriguing glimpse of Google's forthcoming search clustering technology, from last week's Web 2.0 conference: "Named entity extraction" is a relatively new project called which Norvig said Google had been working on for about... [Read More]

» Named entity extraction from Stone
Google continues to do really cool stuff.... [Read More]

» Web Search Clustering from Microsoft (and other Clustering Tools) from Search Engine Watch Blog
Yesterday, we blogged about a discovery that allows you to receive MSN Search Beta results via RSS. It will be interesting to see what Bill G. and company does with this feature in the future. Today, even more MS search news. An excellent post on the S... [Read More]

Comments

IMHO, I think that the 'The Google Cluster Architecture' PDF document you link has nothing to do with the technology Peter Norvig explained.

The document explains how the X,000 Google servers are clustered in order to run quickly and help with users' queries.

And what Peter explained on 'Web 2.0' was how Google plans to cluster search results by learning the meaning of the web pages.

You're right! Sorry about that. I should not rushlink, as I did to that page. My bad.

Never mind.

Your mistake is a good sample of the use of clustering (one word, several meanings). Perhaps you used Google to search "Google+cluster", and the first result is the document you linked. But it wasn't the "cluster" you looked for.

It's good to see some technical hints from Google, although not much to go on.

Machine Translation - destined to be damned with faint praise. Chinese isn't hard to translate; most of the relative difficulty of the language lies in learning the characters. Japanese and Korean have a far more distinctive grammatical structure. No idea on Arabic.

Named Entity Extraction (as opposed to unnamed entities? nouns?) - This sounds like Google Sets. IIRC they were scanning text for list-type noun phrases (eg North, South, East and West), and building up associations from them.
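A rough sketch of the list-scanning idea described above, assuming nothing about how Google Sets really works: the `LIST_PATTERN` regex and the `build_associations` and `expand` helpers are hypothetical names invented here, and a "list" is simplified to three or more capitalized words joined by commas and a final "and".

```python
import re
from collections import Counter, defaultdict

# Hypothetical list pattern: "A, B and C" or "A, B, C and D", capitalized words only.
LIST_PATTERN = re.compile(r"[A-Z]\w*(?:, [A-Z]\w*)+,? and [A-Z]\w*")


def build_associations(texts):
    """Count how often two items appear together in the same list."""
    together = defaultdict(Counter)
    for text in texts:
        for match in LIST_PATTERN.finditer(text):
            items = re.split(r",? and |, ", match.group(0))
            for item in items:
                for other in items:
                    if other != item:
                        together[item][other] += 1
    return together


def expand(seeds, together, limit=5):
    """Suggest the items that most often share a list with the seed items."""
    scores = Counter()
    for seed in seeds:
        for other, count in together.get(seed, Counter()).items():
            if other not in seeds:
                scores[other] += count
    return [item for item, _ in scores.most_common(limit)]


if __name__ == "__main__":
    texts = [
        "The compass points are North, South, East and West.",
        "Winds blow from North, East and South.",
        "We like Red, Green and Blue.",
    ]
    print(expand(["North"], build_associations(texts)))
```

Given a seed such as "North", the sketch suggests the other words it has shared a list with, most frequent first, and ignores items from unrelated lists.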

Clustering - sounds like they've rediscovered data mining. It's not clear what algorithm they're using, but I can think of a few that might work with a large inverted index.

There's a new paper from Google in OSDI 2004

MapReduce: Simplified Data Processing on Large Clusters
http://www.usenix.org/events/osdi04/tech/dean.html
http://people.cs.vt.edu/~gback/MapReduce.pdf

I think that Google's new context translation is a great thing.
