I had a very interesting meeting about a week and a half ago with Robert Cook, the co-founder of Metaweb, i.e. the people behind Freebase. By sheer coincidence, we know someone (non-technical) in common, and he was visiting New York, so it all worked out. I certainly learned a good amount. For one thing, it was a pleasant surprise to find out that he’s a very friendly and personable guy. The meeting also cleared up some misconceptions I had about Freebase and its future plans.
I had always thought of Freebase and Semantic MediaWiki as rivals - friendly rivals, perhaps, but still creators of similar products, possibly competing for some of the same customers. And if Wikipedia ever started using SMW, I imagined we’d become pretty much direct competitors, since the other co-founder of Metaweb, Danny Hillis, has referred to Freebase as “Wikipedia for data”. But it turned out that, far from fearing or being skeptical of Wikipedia adopting Semantic MediaWiki, Robert was very excited about the idea, and wanted to know what he could do to help.
As I found out, Metaweb sees Freebase more as an aggregator of data than an original source of it (that’s my understanding, anyway). In other words, though users can add information directly to Freebase through the form interface, the much more important sources are sites like Wikipedia, MusicBrainz, EDGAR, etc. Freebase’s strengths lie in matching up entities (i.e., knowing that records about a book in two different databases refer to the same book), as well as in querying and browsing - they have an extremely fast storage and querying system for their millions of items of data, and some slick interfaces for browsing through it all (see Parallax).
So a two-part solution suggests itself: Wikipedia, with some sort of semantic capability, handles the entry and display of data, along with basic aggregation, like lists and tables (and possibly maps and timelines, etc.); while Freebase takes in the data, then handles the complex browsing and querying that Wikipedia probably couldn’t allow, for performance reasons. Other sites could allow for querying and browsing of Wikipedia’s data as well, of course, but Freebase looks to be in a unique position to handle it all.
There’s also Freebase’s entity match-up, which is at the heart of Freebase’s new Common Tag effort. The idea is that, instead of using plain-text tags for blog posts, news articles, etc., you use Freebase entity IDs - so that there won’t be ambiguity about what a tag means. It’ll be interesting to see whether this initiative takes off - as Robert noted, it’s not a substitute for true semantic triples, but it beats having “an ambiguous relationship to an ambiguous entity” (my recollection of how he described current tags).
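To make the difference concrete, here is a minimal Python sketch of plain-text tags versus entity-ID tags. It is only an illustration of the idea as I understood it, not the actual Common Tag format: the `EntityTag` class, the `posts_about` helper and the `example:` identifiers are all made up for this post and are not real Freebase IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityTag:
    """A tag that points at a specific entity instead of a bare string."""
    entity_id: str  # hypothetical identifier, NOT a real Freebase ID
    label: str      # human-readable text shown to readers


# Plain-text tags: both posts carry the string "jaguar",
# even though one is about the animal and one is about the car maker.
plain_tags = {"post-1": "jaguar", "post-2": "jaguar"}

# Entity tags: same label, but different (made-up) entity IDs.
JAGUAR_ANIMAL = EntityTag("example:jaguar_animal", "Jaguar")
JAGUAR_CAR = EntityTag("example:jaguar_cars", "Jaguar")
entity_tags = {"post-1": JAGUAR_ANIMAL, "post-2": JAGUAR_CAR}


def posts_about(tag, tagged):
    """Return the sorted IDs of posts whose tag equals the given tag."""
    return sorted(post for post, t in tagged.items() if t == tag)


# The plain-text lookup cannot tell the two meanings apart.
print(posts_about("jaguar", plain_tags))
# The entity lookup only returns posts about that specific entity.
print(posts_about(JAGUAR_ANIMAL, entity_tags))
```

Note that the tag still only says a post is somehow related to the entity; it does not say how. That is the gap Robert was pointing at when he said this is no substitute for true semantic triples.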