Here’s my beef:

The semantic web is the future of the Internet – smart, understanding applications that can grasp meaning from context. This is the dream of AI researchers: conscious understanding of semantic meaning from unstructured data.

That dream implies reasoning, intuitive understanding and learning; it has fueled AI research for decades and is still going strong. Unfortunately, the term ‘Web 3.0’ and the idea of the semantic web have been bastardized: the onus of data organization has been put on humans to produce machine-readable content, only to make it human-readable again!

It is in the nature of man to be expressive; that expression comes in the form of design, art, architecture and implied, subjective meaning that can only be hinted at through educated analysis. It’s all around us, in our print media, our online media and in the things and artifacts we hold in our hands. Meaning is ascribed to these objects in ways that are unstructured and implied.

Why should we perform this ludicrous process:

  1. Create content
  2. Re-structure the content for machines
  3. Use machines to re-structure the content for people
  4. Read content?

It seems a brazen waste of resources – and worse, instead of moving towards an implicitly understood web of semantics, we are being asked to place the ontology and organizing principles of data on the content creator, data entry operator or programmer.

I believe meaning is derived in relation to the perceiving entity – all things are directly relevant to the perceiver in ways that define the object’s usefulness as a tool. Essentially, I believe the problem we should be addressing is self-programming applications: machines that adapt their functionality in order to comprehend meaning and to develop tools based on the data they are presented with.

If applications are throw-away and ‘die’ (exit(0)), yet the data they gather is stored centrally, then mistakes can in fact be corrected through continued self-analysis and recursive mutation.

Naturally, teaching a machine to program is tricky – yet perhaps it wouldn’t be. Perhaps the ‘mutations’ should simply be sets of algorithms which can be turned into code and brought to the fore when the time requires it. The only organizing principle required would be to ensure the tools are loosely coupled as pure input-output boxes that can be re-shuffled to eventually produce quality data extraction, something like the sketch below.
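To make that concrete, here is a minimal Python sketch of the idea, a toy rather than a real system: each tool is a pure text-in, text-out function, a ‘mutation’ reshuffles or extends the pipeline, and a crude fitness score keeps the best pipeline while throw-away candidates are discarded. The tool names (lowercase, strip_markup, keep_keywords) and the scoring rule are invented purely for illustration.

```python
import random
import re

# Each 'tool' is a pure input -> output box: a function from text to text.
# These particular tools are illustrative stand-ins, not a real extraction toolkit.
def lowercase(text):
    return text.lower()

def strip_markup(text):
    return re.sub(r"<[^>]+>", " ", text)

def keep_keywords(text):
    return " ".join(w for w in text.split() if len(w) > 4)

TOOLBOX = [lowercase, strip_markup, keep_keywords]

def mutate(pipeline):
    """Produce a new candidate pipeline by reshuffling or extending the tools."""
    candidate = list(pipeline)
    if candidate and random.random() < 0.5:
        random.shuffle(candidate)
    else:
        candidate.append(random.choice(TOOLBOX))
    return candidate

def run(pipeline, text):
    for tool in pipeline:
        text = tool(text)
    return text

def score(output, verified_terms):
    """Toy fitness: how many externally 'verified' terms survive extraction."""
    return sum(term in output for term in verified_terms)

def evolve(text, verified_terms, generations=50):
    """Throw-away generations: run a candidate, keep the best centrally, discard the rest."""
    best, best_score = [lowercase], -1
    for _ in range(generations):
        candidate = mutate(best)
        s = score(run(candidate, text), verified_terms)
        if s > best_score:
            best, best_score = candidate, s
    return best
```

The point of the sketch is the shape, not the specifics: because every box is a pure function, reshuffling can never corrupt anything, it can only produce a pipeline that scores better or worse.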

This would also require a teacher of some sort – however, I believe that social media and bookmarking sites such as digg and del.icio.us will take care of that, as their APIs and tag clouds can be tapped to gather ‘hard verification’ of meaning.
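As a hypothetical illustration of that kind of ‘teacher’, the snippet below scores an extractor’s output against the tags humans have already attached to a URL. fetch_tags_for_url is a placeholder: the real digg / del.icio.us API calls and response formats are not reproduced here.

```python
def fetch_tags_for_url(url):
    """Placeholder: would query a bookmarking service's API for human-applied tags."""
    raise NotImplementedError("wire this up to a real tagging API")

def verification_score(extracted_terms, url):
    """Fraction of extracted terms that humans also used as tags - the 'hard verification'."""
    human_tags = {t.lower() for t in fetch_tags_for_url(url)}
    if not extracted_terms:
        return 0.0
    hits = sum(1 for term in extracted_terms if term.lower() in human_tags)
    return hits / len(extracted_terms)
```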

I am no AI expert, but I do feel that the way to go is to follow the evolutionary model of organization through chaos and adaptation.

This evolutionary approach to data acquisition and understanding is ideal for the web, since it is a vast environment of ecosystems where certain applications are ‘dominant’ in how they present data; self-developing AIs would therefore evolve specialized tools to acquire data from those sources (xkcd has a funny interpretation of this principle).

This adaptive approach would encourage platform / framework vendors to keep developing innovative applications without the hindrance of extreme standardization (everything must fit inside a box!), allowing new data representations to thrive while the AI adapts.

Developing a web spider / bot that can crawl links and data in this way and build a semantic vocabulary statistically (much like Google’s search engine) would produce a newer, faster way of indexing the net in an understandable, machine-interpretable way without forcing users to standardise their content.
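A rough sketch of what such a spider could look like, using only Python’s standard library: it crawls a handful of pages and counts which words co-occur on the same page, so associations emerge statistically rather than from author-supplied markup. This is a toy under obvious assumptions (no robots.txt handling, no politeness delays, naive HTML stripping).

```python
import re
import urllib.request
from collections import Counter, defaultdict

def fetch(url):
    """Download a page as text; errors and non-HTML content are not handled here."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="ignore")

def extract_links(html):
    """Crude absolute-link extraction via regex, good enough for a sketch."""
    return re.findall(r'href="(https?://[^"]+)"', html)

def extract_words(html):
    """Strip tags and keep lowercase words of four letters or more."""
    text = re.sub(r"<[^>]+>", " ", html)
    return [w.lower() for w in re.findall(r"[a-zA-Z]{4,}", text)]

def crawl(seed_urls, max_pages=20):
    """Build a statistical 'semantic vocabulary': word -> counts of co-occurring words."""
    cooccurrence = defaultdict(Counter)
    seen, frontier = set(), list(seed_urls)
    while frontier and len(seen) < max_pages:
        url = frontier.pop(0)
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        words = set(extract_words(html))
        for w in words:
            cooccurrence[w].update(words - {w})
        frontier.extend(extract_links(html))
    return cooccurrence

# Usage: vocab = crawl(["http://example.com"]); vocab["semantic"].most_common(5)
```

No author ever tags anything here; the vocabulary falls out of which words keep turning up together, which is the whole point.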