Friday, April 15, 2005

Text Mining Value Slowly Emerging

One of the sessions I attended at the recent DCI Portals, Collaboration and Content Management Conference in Scottsdale, Arizona, "The Word on Text Mining" with Seth Grimes of Alta Plana Corporation, shed additional light on the emergence of viable applications for this intriguing technology. Text mining, combined with more sophisticated entity extraction techniques, has moved from theoretical to practical. Law enforcement and intelligence agencies have adopted it to sift through masses of seemingly unrelated data for meaningful patterns, though little information is available on the actual techniques. The same techniques can be applied to fraud detection and risk management, where they yield objective scores. Call centers are also experimenting with the same type of analysis to extract meaning from random events buried in large volumes of telephone calls.

Applications in the publishing world remain elusive; the most promising is drug discovery, where analysis of the medical literature may yield new insights into chemical compounds and clinical results. More customers are demanding text mining rights when purchasing premium content, particularly journals, in anticipation of their organizations' future needs. Publishers are reluctant to grant these rights. They do not appreciate that usage will be slow to develop, or that innovative applications of traditional content need to be nurtured if they are to yield future value for both the customer and the publisher. It's time to take another look at some old models that have yielded high value, specifically citation analysis in the scientific literature, where value is discovered by documenting which authors cite each other in scholarly research, a business built by the Institute for Scientific Information (ISI). Google PageRank came out of the same construct, analyzing links between websites rather than citations between papers. Watch this area for new advances as the technology is applied in new contexts.
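
For readers who want to see the citation idea in concrete terms, here is a rough sketch in Python. It is not ISI's or Google's actual method; it is a simplified, PageRank-style calculation under my own assumptions. The author names, the toy citation data, the function names and the 0.85 damping value are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each author maps to the authors they cite.
CITATIONS = {
    "Author A": ["Author C"],
    "Author B": ["Author C"],
    "Author C": ["Author D"],
    "Author D": [],
}


def citation_counts(citations):
    """Classic citation analysis: count how often each author is cited."""
    counts = defaultdict(int)
    for cited_list in citations.values():
        for cited in cited_list:
            counts[cited] += 1
    return dict(counts)


def rank_scores(citations, damping=0.85, iterations=50):
    """Simplified PageRank-style scoring (an illustrative assumption).

    A citation from a highly ranked author is worth more than one
    from a rarely cited author.
    """
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start with every author holding an equal share of the total score.
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = citations.get(node, [])
            if targets:
                # Split this author's score among the authors they cite.
                share = damping * scores[node] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # An author who cites no one spreads their score evenly.
                share = damping * scores[node] / n
                for target in nodes:
                    new_scores[target] += share
        scores = new_scores
    return scores


if __name__ == "__main__":
    print(citation_counts(CITATIONS))
    ranked = sorted(rank_scores(CITATIONS).items(), key=lambda kv: -kv[1])
    for author, score in ranked:
        print(f"{author}: {score:.3f}")
```

The simple count treats every citation equally, while the second function weights each citation by the standing of the author who made it, which is the step that links the old citation model to link analysis on the web.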