The Web of Data and Ontologies
One of the things the architects of the internet have been concerning themselves with lately is the semantic web. Some authoritative links are here, here and here. The idea is that a great deal of what is out there is data, but the web mostly deals in documents. The world will be a better place when computers can make connections between these pieces of data. This involves two concepts: the thing and the connection.
Things on the Web
Not everything is on the web. I, for example, am sitting in my living room and am definitely not on the web. Therefore, to refer to me on the web, I need some kind of identifier. These are called Uniform Resource Identifiers (URIs). Mine could be something like http://davebridges.github.com#davebridges. URIs need to be unique and they need to be available on the internet. Anything could have a URI, and something could have several URIs. The key is that a URI should not belong to more than one thing. Things which have multiple URIs can be cross-referenced with specific vocabularies (e.g. owl:sameAs).
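As a rough sketch of how that cross-referencing could work, here is a bit of plain Python that treats each owl:sameAs statement as a (subject, predicate, object) tuple and collects every URI that names the same thing. The second URI for me is made up for illustration, and the `aliases` helper is my own invention, not part of any library.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical statements: the example.org URI is invented for illustration.
same_as_statements = [
    (
        "http://davebridges.github.com#davebridges",
        OWL_SAME_AS,
        "http://example.org/people/dave-bridges",
    ),
]


def aliases(uri, statements):
    """Return every URI that names the same thing as `uri`, including itself."""
    neighbours = defaultdict(set)
    for subject, predicate, obj in statements:
        if predicate == OWL_SAME_AS:
            # sameAs is symmetric, so record the link in both directions.
            neighbours[subject].add(obj)
            neighbours[obj].add(subject)

    # Walk the sameAs links so that chains of aliases are followed too.
    seen = {uri}
    stack = [uri]
    while stack:
        current = stack.pop()
        for other in neighbours[current]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen
```

Calling `aliases` with either URI gives back the same set of two, which is the point: it should not matter which identifier you start from.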
Connections (Ontologies)
Once things are on the internet, the basis of linked data is how these things relate to one another. For example, this blog post was created by me. If there were an explicit statement of that connection, any computer could figure out that I wrote this post or, inversely, that this post was written by me. The connections are defined by specific vocabularies, or ontologies. For example, Dublin Core is a vocabulary about documents and includes the term "creator". One could therefore create a link between me and this post by writing something like this:
This Blog Post has a Creator named Dave Bridges
The important thing is that the ontology specifically defines the relationship between two URIs. Given this knowledge, a computer could find the creator of the page, or all pages created by me.
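To make that concrete, here is a minimal sketch in plain Python, with the statement stored as a (subject, predicate, object) tuple. The predicate is the Dublin Core "creator" term; the URI for this post is a placeholder I made up, and the two lookup functions are hypothetical helpers rather than anything from a real linked data library.

```python
DC_CREATOR = "http://purl.org/dc/terms/creator"
ME = "http://davebridges.github.com#davebridges"
# Hypothetical URI standing in for this blog post.
THIS_POST = "http://example.org/blog/web-of-data-and-ontologies"

# "This Blog Post has a Creator named Dave Bridges" as a single triple.
triples = [
    (THIS_POST, DC_CREATOR, ME),
]


def creators_of(document, triples):
    """Find who created a given document."""
    return [o for s, p, o in triples if s == document and p == DC_CREATOR]


def created_by(person, triples):
    """Find every document a given person created (the inverse lookup)."""
    return [s for s, p, o in triples if p == DC_CREATOR and o == person]
```

Both questions are answered from the same single statement, which is why it only needs to be written down once.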
How Would This Work in Science
What got me thinking about this was how great it would be to have defined vocabularies for describing experimental results. For example, if there were an ontology that described a protein-protein interaction (there is; it's at http://bioportal.bioontology.org/ontologies/39508), one could use, for example, two PubMed links as URIs to indicate a molecular interaction between the two proteins. Given a large enough catalog of these statements, it would be possible to get a list of all molecular interactions for a particular protein.
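Here is a small sketch of what that lookup might look like, again in plain Python. Everything in it is a placeholder: the protein URIs and the `interactsWith` predicate are invented for illustration and are not terms taken from the ontology linked above. I am also assuming that an interaction reads the same in both directions.

```python
# Hypothetical predicate and protein URIs, invented for illustration.
INTERACTS_WITH = "http://example.org/vocab#interactsWith"
PROTEIN_A = "http://example.org/protein/A"
PROTEIN_B = "http://example.org/protein/B"
PROTEIN_C = "http://example.org/protein/C"
PROTEIN_D = "http://example.org/protein/D"

# A tiny made-up catalog of interaction statements.
interactions = [
    (PROTEIN_A, INTERACTS_WITH, PROTEIN_B),
    (PROTEIN_C, INTERACTS_WITH, PROTEIN_A),
    (PROTEIN_B, INTERACTS_WITH, PROTEIN_D),
]


def interaction_partners(protein, triples):
    """List every protein recorded as interacting with `protein`."""
    partners = set()
    for subject, predicate, obj in triples:
        if predicate != INTERACTS_WITH:
            continue
        # Assumption: the interaction is symmetric, so check both ends.
        if subject == protein:
            partners.add(obj)
        elif obj == protein:
            partners.add(subject)
    return sorted(partners)
```

With a real catalog the list of statements would be far too large to scan like this, but the question being asked would be the same.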
What About Non-Canonical Findings
I might talk about this later, but one important thing would be to obtain not just a list of interactions but also links to the specific data supporting (or refuting) each one. Ideally this would go deeper than a link to the paper, perhaps to a separate URI describing a particular experiment.