Continuing the theme of the Future of ECM … trend #7 …
In this future trend of ECM, I talk about how semantic technology is likely to play an increasing role in facilitating the discovery of information, connecting it with the people that need it, often before they even know that they need it, thereby giving it greater value. This introduces the concept of semantonomics (semantic economics), the art of deriving value from information.
Infoglut
As the volume of digitally available information continues to grow, the problems associated with “information overload” are becoming more prevalent. In everyday life, people have more and more information pushed at them from every angle. Even if all of that information were perfectly classified and ordered, people simply don’t have time to parse through it all. The volumes are prohibitive, and reading and sifting through vast quantities of information to find the content of interest is impractical. In fact, the method that most people use to survive this “infoglut” is to ignore most of it. This is a particular problem on mobile devices, which have much smaller screens, and where the information that is relevant and important to you needs to be displayed in a manner that catches your attention very quickly.
So we have a problem: a significant amount of the information produced in most organisations is not read by all of the people who ought to read it, or who would certainly benefit from reading it. This can fundamentally devalue the information. Therefore, to maximise its value, information needs to be targeted at, and connected to, a qualified audience.
It is worth noting that social collaboration tools are also likely to play an essential and overlapping role here. By analysing social activity, they can dynamically discover people’s interests and skills, providing the additional intelligence needed to connect relevant information to them. This is discussed in the ‘connecting the dots’ section of my blog The Collaborative Office.
Discover
Performing a Google search for the term “Oasis” will return over 4.5 million results, including the Oasis clothing store, Oasis the rock band, the web standards body OASIS (Organization for the Advancement of Structured Information Standards), the Oasis Beauty and Day Spa, and an oasis in the sense of a fertile spot in the middle of a desert. If I were a producer in the record industry and specifically interested in Oasis the rock band, then the vast majority of the search results would be irrelevant to me. I could, of course, add keywords to refine the search query, but fundamentally the search engine doesn’t understand the meaning of “Oasis”, nor what it means in my individual context, and that will always limit the quality and relevance of the search results.
In the next evolution of the web, Web 3.0 (described in my blog Evolutionary Road), semantic technology will mature, enabling the meaning and context of information (such as documents, web pages and blogs) to be truly understood by providing insight into the relationships between words, and into the ambiguities that words and phrases can sometimes present. This will facilitate a web of connected data that has been semantically enriched with sufficient metadata for machines to interpret it, permitting them to find, share and integrate information more easily and automatically. For example, a semantic search might be “I need to find somewhere for lunch for myself and my 3-year-old son, preferably not too noisy. I’ve got a budget of £30 and need to be back home by 3pm. Summarise my options.”
Semantic technology represents a complete step change in terms of how we currently search and discover information on the web. It is envisaged that every user will have a unique web profile, tailored based on their browsing experience, interaction on social collaboration and networking sites, with different weightings given to information that is more of interest to them, etc. This means that different people will get different search results, even though they might search for exactly the same thing.
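To make the idea of profile-weighted results a little more concrete, here is a minimal sketch in Python. It is purely illustrative: the result list, the topic tags, the `build_profile` and `rank` functions and the scoring rule are all my own assumptions, not how any real search engine works. It simply shows how two people searching for “Oasis” could see different results at the top.

```python
from collections import Counter

# Hypothetical search results as (title, topic tags). Illustrative data only.
RESULTS = [
    ("Oasis clothing store", {"fashion", "retail"}),
    ("Oasis (rock band)", {"music", "band"}),
    ("OASIS standards organisation", {"standards", "technology"}),
    ("Oasis Beauty and Day Spa", {"beauty", "spa"}),
    ("Oasis (fertile spot in a desert)", {"geography", "desert"}),
]


def build_profile(activity):
    """Turn a list of topic tags from past activity into normalised interest weights."""
    counts = Counter(activity)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def rank(results, profile):
    """Order result titles by the summed interest weight of their tags, highest first."""
    def score(item):
        _, tags = item
        return sum(profile.get(tag, 0.0) for tag in tags)

    # sorted() is stable, so results with equal scores keep their original order.
    return [title for title, _ in sorted(results, key=score, reverse=True)]


# Two assumed users with different activity histories.
producer = build_profile(["music", "music", "band", "technology"])
geographer = build_profile(["geography", "desert", "geography"])

print(rank(RESULTS, producer)[0])    # Oasis (rock band)
print(rank(RESULTS, geographer)[0])  # Oasis (fertile spot in a desert)
```

A real profile would draw on far richer signals than a handful of tags, but the principle is the same: the query is identical and the weighting is personal.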
Semantic technology will play an increasing role in facilitating the discovery of information, connecting it with the people that need it, often before they even know that they need it, thereby giving it greater value. There will be a much greater focus on the concept of semantonomics (semantic economics), the art of deriving value from information.
The underlying standards of the semantic web, such as RDF (Resource Description Framework – the grammar), OWL (Web Ontology Language – relationships between terms), and SPARQL (SPARQL Protocol and RDF Query Language – the query language), which are commonly serialised as XML, make the semantic search example above possible by allowing information to be read across the web by machines. However, there are a number of obstacles to overcome before the semantic web reaches a tipping point where it can go mainstream. For example, there is a lot of work to be done to express data as RDF and to create detailed ontologies and rules around that data. Even within a single organisation, let alone the wider web (with over 30 billion web pages), this can be a considerable task.
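As a rough illustration of why enriched data matters, the sketch below models a few facts as subject/predicate/object triples (the shape RDF uses) and answers a cut-down version of the lunch question with a simple pattern match. It is plain Python, not real RDF or SPARQL, and every identifier in it (the `ex:` names, `match`, `value`, `lunch_options`) is a hypothetical of mine rather than part of any standard vocabulary.

```python
# Facts as (subject, predicate, object) triples. All names are hypothetical.
TRIPLES = {
    ("ex:CafeRosa", "rdf:type", "ex:Restaurant"),
    ("ex:CafeRosa", "ex:childFriendly", True),
    ("ex:CafeRosa", "ex:averageLunchPriceGBP", 12),
    ("ex:TheLoudGrill", "rdf:type", "ex:Restaurant"),
    ("ex:TheLoudGrill", "ex:childFriendly", False),
    ("ex:TheLoudGrill", "ex:averageLunchPriceGBP", 14),
    ("ex:ChezLuxe", "rdf:type", "ex:Restaurant"),
    ("ex:ChezLuxe", "ex:childFriendly", True),
    ("ex:ChezLuxe", "ex:averageLunchPriceGBP", 40),
}


def match(pattern, triples=TRIPLES):
    """Return triples matching the pattern; None is a wildcard, loosely like a query variable."""
    return [
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


def value(subject, predicate):
    """Look up the single object stored for a subject/predicate pair."""
    return match((subject, predicate, None))[0][2]


def lunch_options(budget_gbp, people):
    """Child-friendly restaurants where lunch for `people` fits within the budget."""
    options = []
    for subject, _, _ in match((None, "rdf:type", "ex:Restaurant")):
        affordable = value(subject, "ex:averageLunchPriceGBP") * people <= budget_gbp
        if value(subject, "ex:childFriendly") and affordable:
            options.append(subject)
    return sorted(options)


print(lunch_options(budget_gbp=30, people=2))  # ['ex:CafeRosa']
```

The query itself is trivial. The hard part, as noted above, is getting the world's information into this kind of structured, machine-readable form in the first place.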
Nevertheless, inroads have been made through the use of text mining/analytics software that can automatically parse a document, for example, identify key concepts, context, meaning and entities (people, places, events), and categorise these according to a taxonomy, enriching the document with intelligent, semantic metadata. This represents a transformational step towards discovering the business value in “unstructured” information and connecting it with the people who need it.
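A deliberately naive sketch of this enrichment step is shown below. It assumes a tiny hand-made taxonomy and treats any run of two or more capitalised words as an entity; commercial text analytics tools use far more sophisticated linguistic and statistical models. The taxonomy, the sample text and the function names are all hypothetical.

```python
import re

# Hypothetical taxonomy: category -> indicative terms.
TAXONOMY = {
    "Finance": {"invoice", "budget", "payment"},
    "Legal": {"contract", "clause", "liability"},
    "HR": {"recruitment", "appraisal", "payroll"},
}


def extract_entities(text):
    """Naive entity spotting: runs of two or more capitalised words."""
    return re.findall(r"\b(?:[A-Z][a-z]+ )+[A-Z][a-z]+\b", text)


def classify(text):
    """Return the categories whose terms occur in the text, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = {
        category: sum(word in terms for word in words)
        for category, terms in TAXONOMY.items()
    }
    ordered = sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
    return [category for category, count in ordered if count > 0]


def enrich(text):
    """Attach simple semantic metadata (categories and entities) to a piece of text."""
    return {"categories": classify(text), "entities": extract_entities(text)}


sample = (
    "The contract with Acme Holdings sets out a liability clause. "
    "Jane Smith approved the budget."
)
print(enrich(sample))
# {'categories': ['Legal', 'Finance'], 'entities': ['Acme Holdings', 'Jane Smith']}
```

Once a document carries metadata like this, it can be matched against people's interests rather than waiting to be found by keyword.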
Visualise and explore
An essential aspect of information discovery is to be able to visualise and explore the information in a manner that brings it to life and maximises its value. Significant advances are being made in this field all the time, three examples of which are:
- Microsoft Pivot (www.microsoft.com/silverlight/pivotviewer) is an innovative, highly visual user interface (powered by Silverlight) to allow you to explore and arrange large collections of information, discovering patterns and relationships between information that would otherwise be difficult to spot through standard browsing techniques. An excellent demonstration of Pivot can be viewed at www.ted.com/talks/lang/eng/gary_flake_is_pivot_a_turning_point_for_web_exploration.html.
- Google Squared (www.google.com/squared) is the first significant effort by Google to understand and extract information from across the web about a particular term/phrase, teasing out structure from unstructured data, looking at the semantics and relationships between information, and presenting a summary of what it has discovered about the selected term/phrase (currently in a table-like format). It still has quite a way to go before it is genuinely useful, but it is an interesting start.
- Concept Searching (www.conceptsearching.com) provides functionality to automatically identify and extract concepts from content, intelligently classifying the content and dynamically building a taxonomy over it. Although it is primarily focused on SharePoint, it can also be pointed at documents in a file system (that perhaps won’t be migrated into an ECM system) to classify them automatically and build a taxonomy on top of them, making information discovery far more effective. Active Navigation (www.activenav.com) provides a similar tool.