Linked Data, Free Pictures, and Markets for Semantic Data
Paul A Houle
Database Animal
Ontology2
http://ontology2.com/
 


 

Thursday, June 7, 2012
10:45 AM - 11:30 AM
Level:  Case Study

Location:  Imperial B

Ookaboo is a collection of about 1,000,000 Creative Commons images gathered from social media and linked to 500,000 Linked Data concepts from Freebase and DBpedia. Ookaboo’s semantic API and RDF dump let applications connect topics such as people, places, species and things to free pictures with almost perfect precision.

To create Ookaboo’s photo collection and user interface, I had to extensively clean Linked Data and construct a knowledge base about “commonsense” topics such as grammar, the relative importance of things, offensiveness, and the categorization and naming of things. Had this knowledge been commercially available, I could have spent more time acquiring images and building a community.

Although free Linked Data defines a shared vocabulary that enables interoperation, next generation text analysis, data integration, and content generation systems will depend on reusable knowledge bases that take resources and specialized skills to create – a market in semantic data will fill this need.


Paul spent most of the 1990s getting a PhD in physics, using Java for computational physics and educational applets. Since then, he has spent a decade developing e-publishing, e-commerce, e-community and e-business applications with a focus on database-backed web sites and Rich Internet Applications – he’s been a developer, sysadmin, DBA and webmaster for hundreds of sites including the Global Performing Arts Database and the arxiv.org Open Access publishing effort. Today Paul specializes in the construction and exploitation of large knowledge bases derived from Linked Data sources such as Freebase and DBpedia. In recent years he’s developed Ookaboo.com, a large collection of images associated with Linked Data concepts, and has helped Xen integrate data from Freebase into a social media platform. His current interests are in social-semantic systems, intelligent extract-transform-load, and the use of world knowledge in language processing.


   