Freebasing Info

Tim O’Reilly blogged about this last March, and I’ve been patiently waiting for my invite in the interim. It came (along with one from Spock) while I was under the dark side of the Internet moon the past few days. After viewing the initial tutorial video to get my bearings, I’m ready to jump into this communal database project as a fall side project.

Freebase (“Free + Database”) follows the basic wiki model: a communal editorial staff of writers feeds the site. Unlike a wiki, though, the content is captured in a structured database. Members don’t compose open prose about a place, event or person; they enter small cells of data that are structured to ease retrieval and re-use. One of the big motivations for this project is not consumption of data at the Freebase site but rather use of that data in third-party applications built on the Freebase API.
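To make the "small cells of data" idea concrete, here is a minimal sketch in Python of what structured entries buy you over free prose. Everything in it is my own illustration: the field names (`name`, `type`, `state`) and the `find` helper are hypothetical and are not the Freebase schema or its API.

```python
# Hypothetical, simplified stand-in for structured topics.
# Field names are illustrative assumptions, not the Freebase schema.
topics = [
    {"name": "Bloomington", "type": "town", "state": "Indiana"},
    {"name": "Woodstock", "type": "town", "state": "Illinois"},
    {"name": "Bloomingpedia", "type": "website"},
]


def find(records, **criteria):
    """Return every record whose fields equal all of the given criteria."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Because each fact lives in its own cell, retrieval is a simple field match
# rather than a text search through paragraphs of prose.
towns = find(topics, type="town")
indiana_towns = find(topics, type="town", state="Indiana")
print([t["name"] for t in towns])
print([t["name"] for t in indiana_towns])
```

The same question asked of a wiki article would mean parsing sentences; here it is just a filter on typed fields, which is roughly the re-use a third-party application would be after.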

As with Wikipedia, contributing members spew forth information into the community site about films, sports, politics, music, science and anything peripherally connected to that knowledge. The structured data not only allows for easy retrieval but also is a nice way to explore and improve the site content. Since everything is cross-linked by content and data type, a navigable network forms as you edit. All Freebase data is licensed under Creative Commons, so its use only costs an attribution link.

At the moment, the young project boasts data profiles on 356K people, 60K towns, 23K films, 9K books, 305K musical acts and just 600 websites. These numbers are likely the result of targeted data scrapes and dataset uploads, as well as community “data mob” projects, like the current Hometown Pride effort to expand the depth of member locales. Still, it is impressive that there are already 902 types of data and some 2.4 million topics created.

The Freebase project is an extension of MetaWeb, a San Francisco company dedicated to building a better infrastructure for the Web. The MetaWeb team includes people with experience working on Netscape, Alexa, Intel and Broderbund. More information is available on their FAQ.

On the surface, Freebase seems like a latecomer challenging Wikipedia to become the open repository of knowledge for the globe. It might also be viewed as redundant with the Open Directory Project, a comprehensive human-edited directory of the Web, or even AboutUs, a one-year-old wiki of websites. However, Freebase is in many ways an extension of these tools or perhaps a bridge between them. It would be wonderful if duplication of data in MetaWeb could allow for more and better hooks into these other tools by connecting the things that do overlap.

What will be most interesting to observe as Freebase grows is how its content ultimately reflects third-party use of the MetaWeb data through the API. If a car enthusiast site, for example, suddenly started leveraging MetaWeb, then the Freebase site might become a leading resource for information about both cars and the people excited about taking care of them. Locally, if Bloomingpedia (celebrating its third year of existence detailing our community history) were to find a way to port its data into MetaWeb, the view of the world as seen through Freebase could have Bloomington, Indiana as the center of the universe. The inevitable attraction of spammers and legitimate business marketers will also be a sign of growth and acceptance. As an informaticist, I think connecting use of the API with the body of data might make an interesting study.

As with most Alpha tools, there are glitches and aggravations. I wasn’t able to successfully upload an avatar image for my profile. A search completed with no results, yet the “Search in progress…” message remained on screen. The link to Woodstock, Illinois worked (though it took a long time to load) but the one to Bloomington, Indiana did not. The feedback tool also generated a “transient server” error, so I couldn’t even use the site to let MetaWeb know about these things. Growing pains, all. They don’t detract from the promise this new community might have in organizing Internet content.

Related news: There was apparently an explosion in downtown San Francisco yesterday that knocked out power, affecting Freebase a little. No data was lost, except whatever was in the Sandbox at the time. Yet another reason I am glad I could fly directly to San Jose.