Getting from information to knowledge – stop missing the links
Espen Andersen

Information is now so freely available that many suffer from "information overload"—200 e-mail messages waiting every morning, an exponentially growing World Wide Web, and bookshelves and magazines filled with interesting and relevant material one just never gets around to reading.  Turning the torrential information stream into a manageable and nourishing knowledge feed is becoming increasingly important, as companies discover that most competitive advantages can be rapidly copied by the competition, and that the only long-term competitive advantage lies in the ability to learn and innovate faster than everyone else.

Getting from information to knowledge—information in context—is done by applying judgement:  judgement about the quality of a piece of information, about the contexts in which it is relevant, and about where to look for related information.  Organizations can, in principle, apply this judgement in two ways: judging the content itself, or judging the way it is used.

Content evaluation: Have a judge decide
Judging and categorizing content itself is best done by humans, who read information and categorize it into some system which others can use to locate relevant information. Most companies do this by having a corporate library or knowledge center.  Yahoo (http://www.yahoo.com), one of the most popular sites on the Web, uses people to read and categorize Web pages into a hierarchy somewhat similar to the classifications in a regular library, where the user can look under topics and sub-topics to locate relevant Web resources.  This is an important and useful service, especially in the early stages of research, when someone is trying to understand what is important within a category of knowledge.
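The directory idea can be sketched in a few lines of code.  This is only an illustration, not Yahoo's system: the topic names and entries below are invented, and a real directory would of course be far larger and maintained by many cataloguers.

```python
# A minimal sketch of a human-curated directory: a hand-built topic
# hierarchy (all entries hypothetical) that a user drills into by path.
directory = {
    "Computers": {
        "Internet": {
            "Search Engines": ["AltaVista", "Yahoo"],
        },
        "Software": {
            "Agents": ["Firefly"],
        },
    },
}

def lookup(tree, path):
    """Follow a list of topic names down the hierarchy; return what is there."""
    node = tree
    for topic in path:
        node = node[topic]
    return node

print(lookup(directory, ["Computers", "Internet", "Search Engines"]))
# A KeyError here would signal a topic the cataloguers have not (yet) filed.
```

The value, and the cost, is the human step: every page in the lists above had to be read, judged and filed by a person.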

However, human judging carries a number of scale and scope limitations.  The people doing the judging must be knowledgeable about the topic, must classify it correctly, and must be willing and able to describe the content in a useful and succinct manner.  This may be possible when classifying plants and species, where evolution is rich but happens slowly.  It is not so easy in the world of information, where new concepts are created and old ones pass into oblivion almost daily.  Who hasn’t found hilariously misclassified library books and experienced reams and streams of spurious “hits” when searching Altavista (http://www.altavista.digital.com) or other information databases?  Computers are excellent at sifting through lots of information, but notoriously bad at making correct associations between elements of knowledge.

Value evaluation: You are what you read
In the value-based approach, a computer judges the value of pieces of information by monitoring how humans use it.  This is the approach developed by Firefly and used by Amazon.com to recommend books to visitors.  An application monitors how users behave, and then recommends information to other users by matching usage patterns.  For example, the Firefly Website recommends music and films. When a new visitor first arrives at the web page, he or she is asked to rate 20 different films on a scale of 1 (worst) to 7 (best).  Firefly compares this visitor's ratings with those already on the system, and recommends five more films which were rated highly by other visitors with similar tastes.  This simple idea is surprisingly powerful and versatile, and can easily be applied to a corporate Intranet, where the computer could track what pages people are reading and assign ratings either by explicit valuations or just by how often a particular item is used.
The limitation of the value-based approach is that it may overfocus on quantity rather than quality.  Used uncritically, it may influence the corporate Intranet the same way Nielsen ratings have influenced American television: towards the lowest common denominator.
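The ratings-matching idea described above can be sketched quite directly.  To be clear, this is a toy illustration of the general technique (often called collaborative filtering), not Firefly's actual algorithm: the users, films and similarity measure are all invented.

```python
# A hedged sketch of ratings-matching: find the existing user whose tastes
# most resemble the newcomer's, then recommend what that user rated highly.
from math import sqrt

ratings = {  # user -> {film: rating on the 1 (worst) to 7 (best) scale}
    "alice":   {"Casablanca": 7, "Alien": 3, "Vertigo": 6,
                "Brazil": 5, "Metropolis": 4},
    "bob":     {"Casablanca": 6, "Alien": 2, "Vertigo": 7, "Brazil": 7},
    "charlie": {"Casablanca": 2, "Alien": 7, "Vertigo": 1, "Gremlins": 6},
}

def similarity(a, b):
    """Inverse distance over films both users rated (higher = more alike)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dist = sqrt(sum((a[f] - b[f]) ** 2 for f in shared))
    return 1.0 / (1.0 + dist)

def recommend(newcomer, n=5):
    """Recommend films the most similar existing user rated highly."""
    best = max(ratings, key=lambda u: similarity(newcomer, ratings[u]))
    unseen = {f: r for f, r in ratings[best].items() if f not in newcomer}
    return sorted(unseen, key=unseen.get, reverse=True)[:n]

# A newcomer with tastes close to alice's gets alice's other favourites.
print(recommend({"Casablanca": 7, "Alien": 2, "Vertigo": 6}))
```

Note that nothing here requires understanding the films themselves—the computer judges purely by correlating human behaviour, which is exactly the point of the approach.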

Technology to the rescue: Meet your friendly agent
For both the content-based and the value-based approach, agent technology holds promise.  'Agent' in this setting means a piece of software (the actual implementation may vary) which interacts with a user and an application.  The user specifies the task to the agent—find a cheap CD, say, or locate information on a certain subject—and the agent goes out and interacts with one or more applications to do the task, reporting back to the user.  In addition, an agent observes the direct interactions between the user and the application, and suggests further actions to the user based on those interactions.  For instance, it may notice that the user tends to read certain pieces of information first, and so try to find similar information (see http://live.excite.com for an application of this).
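The observe-and-suggest loop described above can be sketched as follows.  The pages, topic tags and matching rule are all invented for illustration; a real agent would get its metadata from the applications it watches and use a far richer notion of "similar".

```python
# A toy agent that observes which topics a user reads and then suggests
# unread items on the user's most-read topic.
from collections import Counter

pages = {  # page -> topic tag (assumed supplied by the application)
    "cd-prices.html": "music",
    "new-releases.html": "music",
    "jazz-history.html": "music",
    "budget-report.html": "finance",
}

class ReadingAgent:
    def __init__(self):
        self.topic_counts = Counter()
        self.seen = set()

    def observe(self, page):
        """Watch one direct user-application interaction."""
        self.seen.add(page)
        self.topic_counts[pages[page]] += 1

    def suggest(self):
        """Propose unread pages on the user's most-read topic."""
        if not self.topic_counts:
            return []
        favourite = self.topic_counts.most_common(1)[0][0]
        return [p for p, t in pages.items()
                if t == favourite and p not in self.seen]

agent = ReadingAgent()
agent.observe("cd-prices.html")
agent.observe("new-releases.html")
print(agent.suggest())  # the music pages the user has not read yet
```

The essential pattern is the split between the two methods: `observe` runs silently alongside normal use, while `suggest` turns the accumulated observations into advice.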

The really interesting application of agents, especially on corporate Intranets, will come when standards for inter-agent communication are established.  When agents can talk to agents, novices in one topic can have their agents learn by talking to an expert's agent, and transferring the expert's reading and viewing preferences.  This process is similar to when someone walks into a colleague's office while the colleague is on the phone, picks up a book from the desk and browses it.  The book is potentially interesting because a respected colleague is looking at it.

Study after study has shown that human decision-makers rely not on formal information, but on the judgement of friends and peers:  You ask your friends before making an important decision, because you trust them and their judgement on parts of your decision.  An agent-based, connected communications infrastructure where agents talk to agents is a simulation of what goes on in a real organization.  It helps humans to be knowledgeable by helping them do more of what they have always been good at—linking things together.

Espen Andersen is Associate Professor of Information Management at the Norwegian School of Management.  He is still trying to figure out what all this stuff really means.  If you have a suggestion, go to his Website at http://www.espen.com to help him out.

Further reading:


THE TAMING OF THE WEB (from Edupage)
A team of researchers at Cornell University and IBM's Almaden Research Center has developed a way to narrow down the responses to a Web search inquiry, based on hotlinks rather than just words in a text.  Links embedded in a Web page provide "precisely the type of human judgement we need to identify authority," says Cornell's Jon Kleinberg.  His software program conducts a standard search based on text only, which is then expanded to include all the pages to which those documents are linked.  Then, ignoring the text, the program looks at the links and ranks each page based on the number of links to and from it.  After several iterations, the compilation is boiled down to an essential list of information sources on the topic. IBM has applied for a patent on the underlying algorithm.  (Science News 2 May 98)
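The iterative link-ranking step described above can be sketched on a toy graph.  This is a simplified rendering of Kleinberg's hub/authority idea, not IBM's patented implementation; the four-page link graph is invented, and a real system would first build the graph from actual search results.

```python
# A sketch of the iteration: a page's authority grows with the hub scores
# of the pages linking to it, and a page's hub score grows with the
# authority of the pages it links to.
links = {  # page -> pages it links to (hypothetical graph)
    "survey1": ["paper-a", "paper-b"],
    "survey2": ["paper-a", "paper-b"],
    "paper-a": [],
    "paper-b": ["paper-a"],
}

hub = {p: 1.0 for p in links}
auth = {p: 1.0 for p in links}

for _ in range(20):  # "several iterations", as the article says
    auth = {p: sum(hub[q] for q in links if p in links[q]) for p in links}
    hub = {p: sum(auth[q] for q in links[p]) for p in links}
    # Normalise so the scores converge instead of growing without bound.
    a_norm = sum(v * v for v in auth.values()) ** 0.5
    h_norm = sum(v * v for v in hub.values()) ** 0.5
    auth = {p: v / a_norm for p, v in auth.items()}
    hub = {p: v / h_norm for p, v in hub.items()}

best = max(auth, key=auth.get)
print(best)  # the most-cited page emerges as the top authority
```

On this graph the page everyone links to ends up with the highest authority score, while the survey pages that link to it score as the best hubs—the "essential list" the article describes.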

Last updated: April 15, 1999


Copyright © 1998 Espen Andersen
This page at http://www.espen.com/papers/misslink.htm
Contact information at http://www.espen.com/contact.html.