Thursday, June 24, 2004

SIIA Brown Bag Panel - Enterprise Search: Adding Value to Content Behind the Firewall & Beyond

It was an honor to chair and moderate the panel for this SIIA event yesterday at the offices of Holland & Knight: we had about 70 attendees listening intently to contributions from the senior management of ClearForest, Factiva, FAST Search & Transfer, Endeca, ISYS and MarkLogic, all representing unique perspectives on the enterprise search industry. Being in the moderator's seat, it was a little difficult for me to capture my usual blow-by-blow notes, and harder yet to squeeze in all of the questions that I would have liked to ask, given the size of the panel and time constraints. Here, though, are some of my quick takes on the discussion, both what was said and what was not said:

Interface is king as much as source materials and algorithms. Getting at content via search mechanisms has moved, to some degree, beyond the "my algorithm is better than yours" phase: algorithms still matter, but the point now is that people are getting usable content at their fingertips based on a query. That usability is wrapped up strongly in interface design and in the content navigation capabilities built into an interface. Thus you have ClearForest's interface, an excellent tool for sophisticated visual analysis of content; Endeca's Guided Navigation, which gives people drill-down categories highly tuned to the search results; and other approaches that assume documents are not merely text blobs lined up in a row but objects with extractable and sophisticated attributes that take their content beyond mere text and into the realm of salient answers.

Taxonomies matter, but it's still a struggle to implement them cost-effectively. Karin Borchert of Factiva named taxonomies as the linchpin of search capabilities two years from now in our visionaries' "lightning round" of questions; with many of the advances being seen in taxonomy development and caretaking, I'd have to agree that taxonomies have yet to come into their fullest use and fruition. With corporate compliance efforts and complex analysis tasks pushing institutions to classify more content than ever, it's likely that taxonomies will continue to be highly important tools. But as Lee Phillips of FAST pointed out, taxonomies oftentimes need a lot of care and feeding to be fully effective, unless you want to take the "inch deep, mile wide" approach of some providers. Good work for information professionals and corporate librarians, to be sure, but hard for them to keep in sync with what technology can provide to them. When taxonomies are highly adaptive to the content in their purview they can do wonders, but without good semantic processing and componentization of content from mere text into an array of usable content objects their power is limited.

The "Magic Box" has believers - in users. During the panel discussion there were some interesting exchanges on the "magic box" approach to search favored by open Web search engines such as Yahoo! and Google: the single source of answers that can, one hopes, interpret human needs and express them in search results. People believe in the magic box, even when what they need may not be the most popular thing that appears at the top of a list of answers. Sometimes, as with a product such as ClearForest's, the answers lie not in a list but in seeing relationships amongst entities. Sometimes, as in Endeca's approach to navigation, the answer may come through exploring search-contextual categories as much as it does through seeing a listing of documents. But even when these approaches do provide value, the essential problem that most search technologies grapple with is that most people don't give a hoot about using sophisticated technology to get answers to their questions. A machine's job is to understand us, not vice versa. The unspoken truth is that most approaches to search today are still very weak in trying to understand the human aspects of what people are looking for in a query, and will remain so for quite some time to come. In the meantime companies have products to sell, so "magic box plus" solutions will be the norm for the foreseeable future.

Google isn't the enemy, they're just a media company (not). Jeff Cutler was kind enough to ask a pointed question about the presence of Google in the corporate space, given their efforts with search, Gmail and their enterprise-oriented search appliance. The strong consensus of the panel was that they're a nonplayer for the most part, a media company trying to make some interesting things happen but not a serious player in enterprise search. On the face of it this is an accurate answer: while Google makes money on its hardware/system-driven search appliances often enough, their approach to enterprise solutions selling is a non-starter from a practical standpoint, with a sales approach that has all the sensitivity of a county coroner. Google is not an I.T. company from this perspective, to be sure. Yet there are a number of highly successful content companies that have succeeded in the enterprise without wooing I.T. managers to get their products in the door - Bloomberg L.P.'s eponymous and originally "black box" financial content systems relied on sales to individuals to get a foot in the door and build what is today arguably the most successful financial content company in the world. What Google represents is an extremely well funded company, with a search interface that the world loves, that is doing its very best to romance each and every desktop in the world with an understanding of content from a highly human perspective. No, you won't see Google succeed for a long time in a big way with enterprise search: they don't have to. All they need to do is to drain off enough of the dollars being spent on "pretty good" solutions at the enterprise level to keep potential competitors from getting enough strength to take them on at the enterprise desktop level with solutions that don't require I.T. budgets for the most part - yet.
The space between Google and more entrenched players in enterprise search such as Verity and Autonomy is going to get more narrow rather quickly unless search specialists decide how they want to address this march towards personal knowledge management. Players such as ISYS and ClearForest are in a very interesting position to capitalize on this movement - if they can get the right pieces in place.

XML's day has barely begun. I sense that some of the comments from Paul Pederson of MarkLogic regarding the importance of XML in search may not have penetrated as deeply as they could have - no fault of his, just a matter of awareness, I assume. No matter, they were very salient, for the surge of content creation towards XML-based formats has just begun. Even humble weblogs are based on these highly manageable content transport standards, making it easier than ever to represent content "under the bonnet" in a highly useful manner while allowing end users to consume that content in any number of useful formats. The metaphor of a text document as an analog of a piece of paper is coming closer to running its course as the requirements for searching and storing content become ever more sophisticated. In this environment every piece of content will have the potential for highly flexible reuse in any number of venues, something that Microsoft understands very well at the macro level, even if their real-world implementations are lagging in exploiting this capability effectively. As Lee Phillips of FAST put it, "Don't limit yourself to search, think like a publisher." XML is one of the keys to help people focused on search to think about how content can be made far more useful to people in a much more flexible and contextual manner. Bottom line, search IS all about publishing content, and making that content valuable to specific audiences in specific venues is the key to future profitability for these companies.

My best wishes to all of the panelists: you all have very interesting products that will help to shape the face of vContent for quite some time to come. Also our great thanks to Jeff Cutler, President of the SIIA Content Division, for calling on us to help pull this together. Let's do it again some time!