October 12, 2004 10:31 AM
Google's Web 2 Demo and the UI Plunge
As many have already noted, last week at Web 2.0 Peter Norvig, Google director of search quality, demonstrated word clustering, "named entities," and machine translation technology to the audience. The translation software was impressive, but somehow lacked zing - "good enough" translation doesn't seem like much of a revelation anymore. That in itself is an extraordinary achievement - Norvig showed translations from Arabic and Chinese, both of which differ significantly from English. Google already has translation features built into its engine (from a third party), but this hand-rolled stuff was far more powerful, it seemed to me.
In any case, the demos that really got the audience going (and me, to be honest) were the named entities and the clustering technology. Seeing anything behind the veil of Google's real research and development is of course a revelation, but seeing something that was so clearly ready for prime time felt rather close to a declaration of where Google is heading, in particular given the recent moves in the personalization and clustering space from Amazon, Ask, Vivisimo, and Yahoo.
"Named entity extraction" is a relatively new project, which Norvig said Google had been working on for about six months. As Norvig explained the concept - essentially identifying semantically important concepts and the meaning wrapped around them - I couldn't help but think of WebFountain and my wish (near the end of the post) that Google would add a bit of IBM's semantic peanut butter into its PageRank chocolate.
Norvig also showed an entertaining (and live) demo of clustering, which he claimed was the "largest Bayesian database of clusters" extant. Hmmm.
From the eWeek story covering the news:
For example, Norvig said, researchers are looking for ways to break down sentences by looking for a phrase like "such as" and grabbing the names that follow it. The goal is to not only pull out the name but also its clusters, so that a name such as "Java" can be associated both with the computer language and with language in general, Norvig said.
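To make the "such as" idea concrete, here is a minimal sketch in Python. It is emphatically not Google's implementation: the regular expression, the function name extract_such_as, and the sample sentences are all my own illustrative assumptions, and it only handles capitalized single-word names listed after a one-word category.

```python
import re
from collections import defaultdict

# Hypothetical pattern (an assumption, not Google's method):
# "<category> such as <Name>, <Name> and <Name>"
# The category is the single lowercase word before "such as";
# names are capitalized single-word tokens.
PATTERN = re.compile(
    r"\b(?P<category>[a-z]+),? such as "
    r"(?P<names>[A-Z]\w*(?:(?:, (?:and |or )?| and | or )[A-Z]\w*)*)"
)

def extract_such_as(text):
    """Map each name to the set of categories it was listed under."""
    clusters = defaultdict(set)
    for match in PATTERN.finditer(text):
        category = match.group("category")
        # Split the list of names on commas and on "and" / "or".
        names = re.split(r",\s*(?:and |or )?|\s+(?:and|or)\s+", match.group("names"))
        for name in names:
            if name:
                clusters[name].add(category)
    return dict(clusters)

# Made-up sample text for illustration only.
sample = (
    "She writes in programming languages such as Java, Python and Perl. "
    "They visited islands such as Java and Sumatra."
)
result = extract_such_as(sample)
for name, categories in sorted(result.items()):
    print(name, sorted(categories))
```

In this toy example "Java" ends up attached to both "languages" and "islands", which is the kind of one-name, several-clusters association the quote describes. A real system would need far more robust phrase detection than a single regular expression.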
"We want to be able to search and find these [entities] and the relationships between them, rather than you typing in the words specifically," Norvig said.
This has potentially interesting implications in next-generation ranking methodologies, for one, but combined with clustering, it signals that Google is serious about taking what one might call the UI plunge.
What do I mean by that? Well, of all the major engines, only Google has strictly maintained what might be called the C prompt interface to search: put in yer command, get out yer list of results (Google Local is a departure, but it's still in beta). Yahoo, Ask, A9 and others have begun to twiddle in pretty significant ways with evolved interfaces which - by employing your search history, your personal data, clustering, and other tricks - deliver more filtered and intentional results (though it is still arguable if they are more relevant). I sense it's only a matter of time before Google takes this approach as well, and Norvig's demo certainly points that way. After all, it's not that often Google decides to give us a glimpse behind the curtain, and coupled with Google Board member John Doerr's semi-announcement the day before (he told the audience that Google would become "the Google that knows you") I think the UI plunge might come sooner than we all expect.
If you want to know more about how Google is thinking about clustering, here's a paper written by a Google team, courtesy of a link from Don Park.
Update: Lazy linking on my part; the clustering paper is about hardware (though it is really interesting...)
- Posted by John Battelle on October 12, 2004 10:31 AM
TrackBack
Listed below are links to weblogs that reference Google's Web 2 Demo and the UI Plunge:
» Named entity extraction from alex wright
John Battelle describes an intriguing glimpse of Google's forthcoming search clustering technology, from last week's Web 2.0 conference: "Named entity extraction" is a relatively new project called which Norvig said Google had been working on for about... [Read More]
- Tracked on October 12, 2004 12:37 PM
» Named entity extraction from Stone
Google continues to do really cool stuff.... [Read More]
- Tracked on October 13, 2004 12:39 PM
» Web Search Clustering from Microsoft (and other Clustering Tools) from Search Engine Watch Blog
Yesterday, we blogged about a discovery that allows you to receive MSN Search Beta results via RSS. It will be interesting to see what Bill G. and company does with this feature in the future. Today, even more MS search news. An excellent post on the S... [Read More]
- Tracked on June 6, 2005 6:10 AM
Comments
IMHO, I think that the 'The Google Cluster Architecture' PDF document you link has nothing to do with the technology Peter Norvig explained.
The document explains how the X,000 Google servers are clustered in order to run quickly and help with users' queries.
And what Peter explained on 'Web 2.0' was how Google plans to cluster search results by learning the meaning of the web pages.
You're right! Sorry about that. I should not rushlink, as I did to that page. My bad.
Never mind.
Your mistake is a good sample of the use of clustering (one word, several meanings). Perhaps you used Google to search "Google+cluster", and the first result is the document you linked. But it wasn't the "cluster" you looked for.
It's good to see some technical hints from Google, although not much to go on.
Machine Translation - destined to be damned with faint praise. Chinese isn't hard to translate; most of the relative difficulty of the language lies in learning the characters. Japanese and Korean have a much more unique grammatical structure. No idea on Arabic.
Named Entity Extraction (as opposed to unnamed entities? nouns?) - This sounds like Google Sets. IIRC they were scanning text for list-type noun phrases (eg North, South, East and West), and building up associations from them.
Clustering - sounds like they've rediscovered data mining. It's not clear what algorithm they're using, but I can think of a few that might work with a large inverted index.
There's a new paper from Google in OSDI 2004
MapReduce: Simplified Data Processing on Large Clusters
http://www.usenix.org/events/osdi04/tech/dean.html
http://people.cs.vt.edu/~gback/MapReduce.pdf
I think that Google's new context translation is a great thing.