The next big thing after Google: the Semantic Web

Imagine that the whole internet were one big organized database. Instead of being a chaotic jumble of disorganized information, it would be structured and disciplined. Instead of simple keyword searches in Google, you could enter commands like: “List all the companies in Mexico that export sombreros and have more than $10M in revenue.”
This is the goal of what has been called the “Semantic Web”, the brainchild of Tim Berners-Lee, the inventor of the World Wide Web. As he points out, shock headlines such as “Google could be superseded” (The Times) actually miss the point: Google, of course, is right at the forefront of this transition.
The Web is a vast collection of human knowledge and information which is almost completely unstructured: search engines index its words, but not the facts those words express. In order to make the Semantic Web work, metadata (information about information) has to be added to web pages. For example, an article about a corporate acquisition could have structured lists of facts attached to it, such as “Company A acquired Company B on this date, for this amount”. Any given page would likely contain multiple facts.
These facts can belong to different categories: geography, industry sector, language, dates, people and so on. So a single event, such as the appointment of a new CEO at an airline in Germany in August 2008, would contain facts in multiple categories.
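To make the idea of categorized facts concrete, here is a minimal Python sketch. It assumes each fact is stored as a simple (subject, predicate, value) record. The company names, predicate names and the `subjects_matching` helper are all hypothetical and chosen purely for illustration; real Semantic Web data would use a standard vocabulary.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str    # what the fact is about
    predicate: str  # the category of the fact (country, date, ...)
    value: str      # the value in that category


# Hypothetical facts that might be attached to a couple of news articles.
FACTS = [
    Fact("ExampleAir", "type", "airline"),
    Fact("ExampleAir", "country", "Germany"),
    Fact("ExampleAir", "ceo_appointed", "2008-08"),
    Fact("OtherCo", "type", "retailer"),
    Fact("OtherCo", "country", "Germany"),
]


def subjects_matching(facts, **criteria):
    """Return the subjects that carry every predicate=value pair in criteria."""
    by_subject = {}
    for fact in facts:
        by_subject.setdefault(fact.subject, {})[fact.predicate] = fact.value
    return sorted(
        subject
        for subject, props in by_subject.items()
        if all(props.get(key) == wanted for key, wanted in criteria.items())
    )


# A structured query instead of a keyword search.
airlines_in_germany = subjects_matching(FACTS, type="airline", country="Germany")
print(airlines_in_germany)
```

Because every fact names its category explicitly, a query can combine categories (sector and geography here) in a way that a keyword search cannot.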
The impact of the Semantic Web on business will be phenomenal! The whole internet could be mined for information to support decision-making or to find new markets. One example is the use of geographic data to display information in the form of maps, such as average company growth in 2008 viewed across North America. The potential unreliability of any individual source would be offset, at least in part, by the statistical reliability of large numbers of sources.
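As a rough sketch of that kind of aggregation, the Python below averages growth figures per region, which is the number a map would then display. The regions and percentages are made-up placeholders, not real data, and `average_growth_by_region` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (region, growth in percent) reports gathered from many pages.
REPORTS = [
    ("Ontario", 4.0),
    ("Ontario", 6.0),
    ("Texas", 3.0),
    ("Texas", 5.0),
    ("Texas", 10.0),
]


def average_growth_by_region(reports):
    """Group growth figures by region and average each group."""
    grouped = defaultdict(list)
    for region, growth in reports:
        grouped[region].append(growth)
    return {region: mean(values) for region, values in grouped.items()}


averages = average_growth_by_region(REPORTS)
print(averages)
```

The more reports there are per region, the less any single unreliable one moves the average.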
So how soon are we going to see this happening? The mechanisms, such as RDF (Resource Description Framework), have already been defined, and tools are becoming available. OpenCalais, an experimental free service from Thomson Reuters, will scan news stories and web pages and attempt to extract facts from them automatically in RDF format. Website frameworks such as Drupal have already been extended to both send and receive this kind of information.
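RDF expresses each fact as a subject-predicate-object triple. The Python sketch below writes two such triples in N-Triples, one of the plain-text RDF serializations, using only the standard library. The `http://example.org/` namespace, the property names and the date are placeholders invented for this illustration; they are not what OpenCalais or Drupal actually emit.

```python
BASE = "http://example.org/"  # placeholder namespace, an assumption


def iri(name):
    """Wrap a local name as a full IRI in N-Triples angle-bracket syntax."""
    return f"<{BASE}{name}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# "Company A acquired Company B on this date", with a made-up date.
triples = [
    (iri("CompanyA"), iri("acquired"), iri("CompanyB")),
    (iri("CompanyA"), iri("acquisitionDate"), literal("2008-08-01")),
]

document = to_ntriples(triples)
print(document)
```

In practice a library such as rdflib would handle parsing and serialization; the hand-rolled version here is only meant to show how little there is to a single triple.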
Just as some companies made fortunes and some were left behind in the first wave of the internet, the same will happen again as the new web emerges.
- Andrew Fountain's blog

