Wednesday, March 31, 2004

Time for Taxonomies?

We all know that it's tax season, but perhaps we should declare it taxonomy season, too? With so much focus on optimizing Web sites for search engine rankings and paid search listings, little attention is devoted to the importance of implementing robust taxonomies to define standardized and consistent metadata for one's content collection. The major Web search engines recognize the need for tagging content found on disparate Web sites in order to improve contextual matching and personalization algorithms, but is a top-down generalized search engine approach going to be able to tackle the task of developing a taxonomy for everything on the Web?

Last week I attended a seminar on taxonomies that was sponsored by Verity, Inc. Taxonomies are a hot topic. Maybe not exciting, but definitely hot. The room was filled with IT and information professionals who are faced with the tricky task of implementing a taxonomy across their institutions' content collections. With the help of a knowledge expert or an information specialist/librarian, creating a robust and dynamic taxonomy doesn't have to be difficult, particularly if tools and prebuilt taxonomies are available. However, all panelists at this seminar agreed that the trickiest aspect of building and implementing an organization-wide taxonomy involves pulling together and gaining consensus from the people in the various departments whose data, image files, sound, video or other types of content need to be assembled and categorized.

Not surprisingly, Verity specializes in software tools for mapping content into categories--what they call thematic mapping--and in providing prebuilt taxonomies, usually built on an industry standard, such as the MeSH vocabulary used in Medline for medical information, or Factiva's company and industry taxonomy. Today's headline story about Verity's work on homeland security was referenced in the seminar, in large part because Verity had built a custom taxonomy in conjunction with the Department of Homeland Security for the project.

A couple of key take-aways from the seminar, which apply to both individual and institutional Web site publishers, as well as Web search engine companies:

--To achieve a quality result in creating and using a taxonomy to tag content, a combination of machine tools (i.e., software) and human review is necessary. One speaker put it this way: "people for quality; tools for quantity." (A rough sketch of this division of labor follows this list.)

--A major benefit of having an underlying taxonomy is the ability to create browse categories that enhance text search. Note to search interface designers: user tests show that the best approach includes a text search window as a first step, followed by related browse categories that can be used to refine a search. Alternatively, navigation paths could be devised based on the browse categories that apply to the initial text search.
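For the technically inclined, here is a minimal sketch of that "people for quality; tools for quantity" workflow: software tags everything, humans review only the low-confidence calls. The taxonomy, keywords and threshold are invented for illustration; commercial tools like Verity's use far more sophisticated classification.

```python
# Hypothetical keyword-scored taxonomy; real systems use richer models.
TAXONOMY = {
    "cardiology": {"heart", "cardiac", "arrhythmia"},
    "oncology": {"tumor", "cancer", "chemotherapy"},
}
REVIEW_THRESHOLD = 0.5  # below this score, a human reviews the tag

def auto_tag(text):
    """Score each category by the fraction of its keywords found in the text."""
    words = set(text.lower().split())
    scores = {cat: len(words & kws) / len(kws) for cat, kws in TAXONOMY.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def route(documents):
    auto_tagged, review_queue = [], []
    for doc in documents:
        category, confidence = auto_tag(doc)
        if confidence >= REVIEW_THRESHOLD:
            auto_tagged.append((doc, category))   # tools for quantity
        else:
            review_queue.append((doc, category))  # people for quality
    return auto_tagged, review_queue
```

The design point is simply that the threshold is a dial: raise it and humans see more documents (higher quality, higher cost); lower it and the software handles more volume.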

When the subject is taxonomies, it is easy to get bogged down in terminology: what is a taxonomy vs. a thesaurus vs. an ontology. The Verity speaker did a superb job of differentiating between the terms. I won't try to replicate her definitions. Instead, I'll close with this thought: As institutions increasingly apply system-wide taxonomies to their internal and externally-shared content collections, we will be well along the path toward the Semantic Web, where it will be far easier to determine context of individual sites on the Web and to identify communities of related content. Especially if some standards are adopted.

Newsmap for Google News: One Step Closer to Automated News Layouts?

WebProNews notes the debut of a new tool called Newsmap developed by Marcos Weskamp and his associates. Newsmap provides a graphical representation of headlines for various news sectors that are color-toned and sized based on presumed importance - similar in some ways to heatmaps developed for financial services, such as SmartMoney's Map of the Market. The tool is more an experiment than anything else at this point - you'd have to be a pretty patient person to make use of some of the font sizes and angles used to represent relative relevance - but it's a very interesting exercise in automated news layout, more sophisticated in concept than the layouts used by Google, MSN and Yahoo! for their news services. Will there be a time when we pull up an automatically generated news page on our desktops or E Ink PDAs that has a newspaper-like front page and section layout with banner headlines for major items, based on relevance to our specific needs and interests? Editors out there, don't be aghast - this kind of tool may aid layout as a human art as much as replace it - but this nascent concept should be watched closely for far more practical applications.
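For the curious, here is a toy illustration of the sizing principle behind Newsmap-style displays: a headline's visual weight grows with the number of sources carrying the story. The headlines, article counts and font-size range below are invented for the example; Newsmap's actual treemap layout is considerably more involved.

```python
# Invented sample data: headline -> number of related articles.
headlines = {
    "Verity Wins Homeland Security Contract": 120,
    "Sony to Launch E Ink Reader": 45,
    "EU Fines Microsoft": 310,
}

MIN_PT, MAX_PT = 8, 36  # assumed font-size bounds in points

def font_sizes(counts):
    """Scale each headline's font size linearly with its article count."""
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo or 1
    return {
        title: MIN_PT + (n - lo) / span * (MAX_PT - MIN_PT)
        for title, n in counts.items()
    }

for title, pt in font_sizes(headlines).items():
    print(f"{pt:5.1f}pt  {title}")
```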

Tuesday, March 30, 2004

OneSource Announces New Business Content Quality Initiative: Is Quality Control the Future of Aggregation?

DM Review picks up on business content solutions provider OneSource's new quality control initiative, which uses multiple techniques, including automated monitoring, parsing and tagging of news, content vendor management, corporate Web site reviews, primary research and rapid data correction. As OneSource finds its content increasingly a part of institutional portals that need this information to be as up to date and accurate as possible for immediate action, the initiative places it in a good position to leverage its strategic use of Web services integration to make those portal presences a seamless and highly reliable part of its clients' operations. It comes also at a time when D&B-backed rival Hoover's is stumbling with data quality issues and when flashy online alternatives such as Eliyon are failing to prove that they have industrial-strength content integrity. As content distribution technology advantages fall away from aggregators, cost-effective quality control is one of the primary strengths on which any enterprise-level distribution strategy must rest - and perhaps becomes the core function for content that may come from licensed sources, public sources and client-supplied sources. Many have promised strong QA for business information, but few have delivered to their clients' needs and expectations, so OneSource is placing itself in a strong position to improve its market penetration on the top end. Now if only it can do something about the bottom end of the market...

Monday, March 29, 2004

More Search Heat: MSN Testing Weblogs Search, NewsBot

Holy escalation, Batman, when is this all going to end? It's getting down to hand-to-hand combat in the search wars, as The Mercury News picks up on Microsoft's plans for a new weblog search service slated for later this fall and on Redmond's nascent answer to Google News, dubbed NewsBot. No real specifics on the weblogs service, other than that it's going to be selective (yes, that's good...), but of equal interest is the maturity of NewsBot, which offers both preset pages and a search tool, similar in scope and effect to Google's well-established beta news service and Yahoo!'s new news service. Overall results in quick tests are pretty good in terms of search relevance; we'll be testing it in more detail over the next few weeks. While I am sure that Google has some concerns about this increased competition, if I were a major news portal, I would be very worried about these developments. Automated news aggregation is poised to become the default "front page" for a new generation of maturing online readers, providing a level of objectivity in news gathering that editor-guided news services cannot replicate easily. Newspapers still treat the printing press as the central argument for their editorial power, but the quickly evolving powers of Web search and content presentation tools are becoming the central news gathering metaphor. No need to short your New York Times or Gannett stock just yet, but it's a quickly evolving landscape that's only begun to get interesting.

Google Gets (Kind Of) Personal with Beta of New Personalized Search Service

As noted by CNET News, Google has launched yet another beta in the escalating search wars frenzy, this time taking on search results personalization. In true Google fashion, the interface provides a relatively simple way to specify how much personalization one wants in search results: specify categories within a simple taxonomy that interest you, and then, when you get search results, slide a simple control from left to right to specify how much personalization you want included in the results. As you slide the control, new items appear in the results list, with some friendly-looking "Google balls" indicating that they were added by the personalization controls. In some informal search tests the results do change, but since we cover a lot of areas at Shore that are both specific and broad, I found myself checking off lots of categories, resulting in personalization that really didn't do much to help the quality of results. The taxonomy-driven approach to setting up a personalization profile seems somewhat awkward and contradictory to other Google philosophies. Getting concierge-like results in content services requires responding to actual interests, not potential categories of interest. It's all about customer service, not library science - ask Amazon. Much better to have a little box when one completes a search that says something like "include this search in my profile," which would have the added benefit of tying into text ads more effectively. While more effective than the far cruder personalization services found at Yahoo! and other portals, this is one beta that's still in need of some additional thinking.
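To make the slider concept concrete, here is a guess at the kind of score blending such a control might perform: mix a result's base relevance with its match against the user's checked-off categories. This is an assumed model for illustration, not Google's actual algorithm.

```python
def personalized_score(base_relevance, profile_match, slider):
    """slider in [0, 1]: 0 = no personalization, 1 = fully personalized."""
    return (1 - slider) * base_relevance + slider * profile_match

results = [
    # (title, base relevance, overlap with the user's chosen categories)
    ("Generic news story", 0.9, 0.1),
    ("Story in a followed category", 0.6, 0.95),
]

# As the slider moves right, the personalized item climbs the ranking.
for slider in (0.0, 0.5, 1.0):
    ranked = sorted(results,
                    key=lambda r: personalized_score(r[1], r[2], slider),
                    reverse=True)
    print(slider, [title for title, _, _ in ranked])
```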

In Premium Weblogs: Caplin Feeds Exchange Data to Instinet, MarketAxess and ICAP Blend Voice and eBrokering

Shore Senior Analyst Jack McConville reports on how the Caplin move will favor its own Siphon distribution technology, and how MarketAxess and ICAP are leveraging their joint capabilities for the treasuries markets. Registration required. more...

Friday, March 26, 2004

Elsevier & Cadmus Partner with Usable DRM Solution for Medical Journal Reprints

Digital Rights Management (DRM) systems have been notoriously user-unfriendly, with no perceived value to the reader. But new applications are evolving. In an announcement this week, Elsevier has licensed RapidRights from Cadmus Communications to manage its sales of electronic reprints of health science journal articles. Sales of article reprints from major peer-reviewed journals are an attractive business, and Elsevier has prestigious titles, including The Lancet and the American Journal of Cardiology. Another publisher, the American Medical Association, receives significant revenue from printed reprints of individual articles from its flagship title, JAMA (Journal of the American Medical Association). These articles are used as sales and customer service collateral by pharmaceutical and medical equipment manufacturers. The process is clunky, as is all print sales collateral. However, as described in DRM Watch, the RapidRights approach is based on the purchase of a fixed number of copies, and each download simply decrements a counter of the copies remaining. No registration and no plug-ins are required, two major drawbacks of other DRM approaches, and the format is standard PDF.
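For those wondering just how simple such a scheme can be, here is a minimal sketch of a decrementing-counter license. The class and identifiers are hypothetical, not Cadmus's actual implementation.

```python
# Sketch of the counter-based model described above: a reprint purchase
# grants N copies, and each download simply decrements the count.

class ReprintLicense:
    def __init__(self, article_id, copies_purchased):
        self.article_id = article_id
        self.copies_remaining = copies_purchased

    def download(self):
        if self.copies_remaining <= 0:
            raise PermissionError("No copies remaining on this license")
        self.copies_remaining -= 1
        # The delivered file is ordinary PDF: no registration, no plug-ins.
        return f"reprint-{self.article_id}.pdf"

lic = ReprintLicense("lancet-2004-0317", copies_purchased=500)
pdf = lic.download()  # 499 copies now remain
```

The appeal is exactly the simplicity: all the enforcement lives server-side in one integer, so the reader's experience is just a plain download.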

Medical content is rapidly becoming more electronic and readily accessible to medical professionals, with a PDA becoming standard equipment for younger doctors. Simplicity is the key to making this work, and this application meets that criterion. And the purchasers, those medical products companies buying reprints, will welcome faster and easier distribution!

Thursday, March 25, 2004

Sony to Debut E Ink Technology in LIBRIe device, Opening Door to Electronic Paper eBooks

As noted in CNET News and other outlets, Sony will release an eBook reading device next month using a display developed by Royal Philips Electronics with technology from partners E Ink and Toppan Printing. The device, dubbed LIBRIe, will retail at around $375 according to CNET, and will use Sony Memory Sticks (of course) to store between 20 and 500 eBooks, depending on the account. The key factor in this new product is the E Ink technology that allows for a display with the resolution, reflective properties and overall look of newsprint. The device is somewhat larger than a paperback book and sports a QWERTY keyboard, presumably riding on top of a Palm OS similar to that in Sony's Clie PDA products. I don't have one in my hands yet, but what I am sensing from all of the clips I've been going through is that this very well could be a "gotta have" gadget for technoholics and readers alike. The form factor is a little weird, but there's something to be said for being able to show people the latest and greatest in display capabilities in an affordable (PDA/premium cell phone range) package, with a screen that's far more readable than the typical PDA and far more usable and private than a tablet PC (it comes with a handy flip-close cover). As we've been telling folks in our weblogs and research, eBooks are a quiet revolution underway, and this highly affordable and techno-sexy unit may be just the thing to pump up the volume for eBooks - and steal a march on Microsoft in the process.

Wednesday, March 24, 2004

EU Fines Microsoft: Will the Content Productivity Logjam Break Loose?

The European Union's $613 million judgment against Microsoft, with its requirement to unbundle Media Player, is just the beginning of the EU's actions against the company: according to PC Pro, competitors in both the instant messaging and mobile phone markets have accused Microsoft of using Windows XP to strengthen its position. It's all too likely that Microsoft's heavy-handed tactics for eBooks will also come under scrutiny as its Reader software gets bundled into new versions of Windows, as Internet Explorer was. Content vendors cozy up to Microsoft as a friend that can help them penetrate the desktop market more effectively, but at the end of the day these vendors are digging their own grave for high-margin profits. Expensive licensing for Microsoft products, along with other monopolistic I.T. investments that don't pay back in productivity gains proportionate to their cost, is handcuffing institutional implementors who would otherwise have money to spend on innovative ways to consume content. With more and more content revenues dependent on such innovation, it behooves professional content vendors to promote the most open market possible for underlying I.T. infrastructure, so that they can become the leaders in vContent that their clients require them to be.

Tuesday, March 23, 2004

While All Eyes Are On Google, Innovation in Online Advertising Occurs Elsewhere

Borrowing the headline "All Eyes on Google" from the March 29 issue of Newsweek, it's fair to say that Google is the talk of the town these days. Search engine marketing and contextual ads served by Google and Yahoo/Overture are hogging the spotlight at the expense of other advancements in Web publishing and online monetization models. But there are other developments in online advertising that merit attention. One notable example is the news that Procter & Gamble (P&G) is expanding its custom Web publishing initiatives with a new health-care related Web magazine called HealthExpressions.com. This Web site follows P&G's earlier site, HomeMadeSimple.com. While these early publications are chock-a-block with advertisements for P&G products (which makes it very clear who is sponsoring the site), these content-centered advertising vehicles illustrate a trend toward advertisers creating their own context in which to place their ads. Offline (i.e., print) examples are appearing on the scene, too, such as a Wal-Mart magazine planned with Time Inc. that will be targeted at the Wal-Mart shopper demographic. So, while Google, Overture, Kanoodle and others are busy improving their technology and bid-rate models for placing ads on contextually-relevant Web sites, some top advertisers are taking matters into their own hands and creating the content that is contextually relevant to the demographic they seek. Through this Web-centric publishing and marketing strategy, consumer product companies like P&G can create direct relationships with their customers and learn what products and promotions click with various consumer segments [pun intended]. What can B2B publishers learn from these developments? Perhaps most important, P&G's content-heavy Web sites demonstrate how advancements in institutional publishing technology, along with general acceptance of the Web as an information, news and entertainment medium, are fundamentally changing the models for distributing content and associated advertisements. Trade publishers beware!

Monday, March 22, 2004

Catching up, Take Two: Factiva Launches iWorker, Toolbar Feature

As noted in our weblog two weeks ago, Factiva has been readying a new single-box search interface for its premium content, announced today and billed as iWorker, a technology that maps keyword queries into Factiva's multi-tier taxonomy to further refine results. Translation: our back-end technology works rather differently than typical search engines, which is good in some ways but also indicative of the difficulties of mixing content from unstructured sources with the culled and categorized sources found in a typical premium aggregator's database. Along with iWorker comes a new Factiva-specific search box widget that can be embedded in one's browser toolbar, similar to tools developed by Google and AtHoc. Good stuff, and again a step in the right direction, but it's indicative of how far behind most premium aggregators are these days in providing content value. Taxonomies are great, but they're not going to justify the cost of a premium database of 9,000 sources in the long run when public search engines are learning how to integrate premium sources into their own algorithms. Aggregators must accept that the ultimate premium content solution will always go beyond the content that they can license to their users. A painful fact, but one that will be coming home to roost soon enough anyway.
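To illustrate the general idea of mapping a single-box query onto a multi-tier taxonomy, here is a sketch under assumed data. The taxonomy paths and matching rule are invented for the example and say nothing about Factiva's actual technology.

```python
# Hypothetical multi-tier taxonomy: path tuple -> associated keywords.
TAXONOMY = {
    ("Industries", "Automotive", "Hybrid Vehicles"): {"hybrid", "prius", "fuel"},
    ("Companies", "Toyota Motor Corp"): {"toyota", "prius"},
    ("Regions", "Asia", "Japan"): {"japan", "tokyo", "toyota"},
}

def map_query_to_categories(query, max_hits=3):
    """Rank taxonomy nodes by how many query terms they share."""
    terms = set(query.lower().split())
    hits = [(len(terms & kws), path) for path, kws in TAXONOMY.items()]
    hits = [(n, path) for n, path in hits if n > 0]
    hits.sort(reverse=True)
    return [" > ".join(path) for _, path in hits[:max_hits]]

print(map_query_to_categories("toyota prius sales in japan"))
# -> category paths ranked by term overlap, e.g. 'Regions > Asia > Japan'
```

The matched categories can then be used to filter or re-rank results, which is the refinement step the announcement describes in broad strokes.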

The Trials of Midlist Authors: Why Not eBook it?

A recent Salon article [PREMIUM] chronicles the sad but all too familiar story of one of the legions of authors who struggle to make a living writing fiction and general interest books. Given that it's written by a pretty capable author, it turns out to be an engrossing and touching story: a promising first book falls short of the mark commercially, dooming the author to years of semi-success and heart-wrenching anguish, finally culminating in - gulp! - getting a real job to pay the bills. Hey, it happens. In the meantime, as outlined at the eBooks in the Public Libraries Conference [PREMIUM WEBLOG COVERAGE - REGISTRATION REQUIRED], publishers are beginning to explore the E-P-E cycle for releasing titles: start electronically, go to print when volumes warrant, cut back to electronic and on-demand printing when they start to fade. So much more sensible - and less frustrating for authors who want to make progress in the marketplace. Instead of trying to get PR attention in the general media, then, authors will find themselves using libraries as the minor leagues in which to gain the recognition they need to reach their much-desired stardom. Publishing is about to undergo some very welcome and overdue transformations thanks to eBooks and library outlets, providing a correction of supply and demand that better meets the needs of both the marketplace and authors eager to make their mark. It may be that editors will be able to get back to the real work of developing talent for commercial success, rather than sweating over which mega-hit is going to make the quarterly earnings look flush.

Friday, March 19, 2004

In Premium Weblogs: Blocking Ads Draining Away Content Value, eBooks in Libraries Stimulating New Commercial Models

In our Content eCommerce premium weblog, Shore Senior Analyst Janice McCallum reflects on how Norton Firewall's proclivity for blocking virtually any useful advertising, including contextual text ads, is eliminating valuable contextual content in many instances. more...
In our eResources Marketplace premium weblog, Shore President and Senior Analyst John Blossom covers the recent OeBF conference on eBooks in the public libraries. Libraries appear to be the lever that eBooks have been looking for to take off, offering many intriguing possibilities for content commercialization that can easily spill over into other sectors. more...

Thursday, March 18, 2004

All the News That's Fit to Push: News Organizations On the Edge of Online Transformation

The USC Annenberg Online Journalism Review covers a recent comprehensive report by the Project for Excellence in Journalism, an industry effort to profile every aspect of American journalism on an annual basis. As PEJ Director Tom Rosenstiel sees it, "...journalism is in the middle of an epochal transformation, as momentous probably as the invention of the telegraph or television." It's a transformation that puts journalism at risk in the view of PEJ, as the eyeballs are surging towards an online model that is not yet producing the revenues required to support top-notch journalism operations. The stats show that online news is getting youngsters' attention, something that TV and print just haven't cracked, so there's no doubt where the future of news is going. But how quickly? While the study shows that only 15 percent think of online outlets as their primary source of news, 26 percent looked at online news within the last day - about half of today's online population. Will news outlets be able to bridge the revenue gap effectively in an era of rapid transition? As noted in our earlier news analysis, there is less and less to hold together traditional news operations as other kinds of content aggregation - including user desktop integration via RSS-fed weblogs - become increasingly powerful. The issue may not be whether news organizations as they exist today will be able to make the transition so much as whether they will be anything like they are today once they do. It's not just the technology that's changing but the very nature of how news is formed. The social networking and hyperlinking provided by weblogs, for example, provide much of the source validation supplied by traditional journalism, an organic kind of quality validation that bypasses many traditional editorial controls. It may not be that journalism quality is declining in the face of online forces so much as that online journalism is redefining how quality is achieved, and for what purposes. News formation is a key focal point for vContent today, a concentration of the very human aspects of content with the truly revolutionary aspects of humble but world-changing technologies.

Time Warner's Woes with AOL: The Online Emperor Has No Clothes

London's The Times reports on yesterday's New York Post scoop that AOL parent Time Warner is mulling over a spinoff of the profitable but shrinking online service. The Times article speculates that possible suitors for a purchase may include Yahoo! and Barry Diller's InterActiveCorp. AOL suffers from any number of ills well analyzed elsewhere, but from our own perspective it's interesting to see how the marriage of media and online content interests continues to be rocky at best. The Web is not something that works best with forced selections of interest: its underlying premise is that it's okay for anyone to create any kind of content for anyone and for people to get it any way they want with near-zero distribution costs. Traditional media packaging falls apart in this environment. AOL's relentless commercials blare forth from televisions trying to sell us the Web as if it were a box of laundry detergent. Sorry, folks, but we all know that there's nothing in the box anymore. Now that AOL has abandoned content creation as an activity, all that's really left is an incredibly dumb user interface and a connectivity business that was commoditized years ago. Its only truly valuable content assets are its email and instant messaging accounts - certainly valuable to Yahoo!, MSN or others, as these are sources of valuable personal content that are difficult to migrate without acquisition. Fearing Microsoft, TW will probably go with Yahoo! The remaining online assets could be merged with its cable assets as a complementary home portal for its cable modem subscribers, similar to the alliances that Yahoo! has struck with SBC and other broadband operators. Unless one controls the distribution technology, one does not have a distribution business. Unless one controls the content, one does not have a publishing business. Time Warner appears to be facing these facts head on now and moving on to search for the soul of a 21st century media company in its more tangible assets.

Wednesday, March 17, 2004

Dialog Moves in Right Direction, But Is It Enough?

Dialog, a Thomson company, has announced the fruits of three years of labor in two guises this month: one, SmartTerms(TM), an internally-developed taxonomy of company names, industry names, geographic locations and subject terms; the other, an integrated platform for company reports, analyst reports, market research reports and news (i.e., Dialog Profound and Dialog NewsRoom). These are encouraging moves by a company that provides access to hundreds of databases via multiple services, some of which require separate passwords and use different search interfaces inherited through acquisitions. In his article in Information Today, Matthew McBride posits that perhaps the "technology tide [is] turning for Dialog?" Indeed, with these announcements, it is clear that Dialog is investing resources in adding value to its database collection through improved functionality.

However, two key issues confront Dialog as it tries to maintain its position in the era of search engines and Web distribution. The first is alternatives for contributing publishers. With publishers increasingly providing direct access to their content via the Web--whether free or for-fee--it becomes more difficult for them to justify sharing revenue according to royalty models that were established before direct distribution was feasible. The second is pricing. Dialog adds fees on top of document charges. Again, with new, more efficient models of aggregation developing on the Web, will customers continue to pay a premium for access via Dialog?

OeBF eBooks in the Public Library Conference: The Laboratory of Change in Publishing Starts to Bubble


[More complete coverage of this event is provided on our eResources premium weblog. Registration required.]

The Open eBooks Forum (OeBF) conference on eBooks in the public libraries held yesterday in New York City was one of those events where you can feel the surge of a movement beginning to realize its true strength in real time. After several years of false starts, eBooks are starting to take hold in public libraries and are now providing library patrons in numerous major cities growing access to electronic materials. Numerous success stories at the conference pointed to rationalizing technologies, improving availability of content and an economic environment that has forced libraries to find more cost-effective methods of servicing patrons as leading factors in eBook lending growth. Will this be the breakout year for eBooks? Probably not, as balky rights management, entrenched library staffs and antiquated cataloging systems still pose significant challenges to a complete eBooks takeover any time soon. But where some see a continuing evolution of eBooks, it's clear that this is a quiet revolution in the making, with irresistible forces beginning to compel libraries to embrace eBooks far more rapidly than may have been imagined. Not the least of the pressure for change is coming from the publishing industry, which looks to the strengthening success of rights-protected eBooks in libraries as a test bed for experimenting with new ways of deploying and monetizing premium content, in models that can help define profits somewhere between the trusty but aging "one book, one user" model and the melee of open Web access. Community libraries offer publishers an environment in which they can work within a comfortably familiar distribution model to sort out the details of how eBooks and other premium rights-protected content can best serve users. And then? The unmentioned factor at the conference was the Web itself, where eBooks already enjoy healthy sales. Content ecommerce portals could easily extend current commercial models to include lending and other mixed-use models that tie into local library cards, corporate IDs or other forms of access subsidization. Just as corporate librarians were caught in the downdraft of technological and economic change that rendered many of them redundant, public and institutional librarians feeling the pinch of budgets and of patrons going "Web first" for answers may find that publishers, whose revenues depend on those patrons finding answers, will not hesitate to get content to them by any means necessary if libraries fail to bring publishers' products to their markets in a manner that sustains publishers' profits. The power of local communities as a component of the Web's strength is likely to keep public libraries in the mix for a long time, but competition to service their patrons without their services will only increase unless they decide to go from evolution to revolution fairly soon.

Mutual Funds Pulling Plug on Investment Bank Research Fees

The New York Times reports on MFS Investment Management's decision to seek "execution only" pricing from investment banks and brokers, who typically justify their premium commission fees for institutional trading based in part on the value of the research reports provided with them. Robert C. Pozen, the current nonexecutive chairman of MFS, was quoted by the Times as saying, "We are valuing their research at zero." Mutual fund houses find themselves struggling with repercussions from accountability scandals that place pressure on their bottom lines, even as U.S.-based investment banks and brokers adjust to higher accountability and segregation requirements for developing their research products in the wake of their own deceptions, so both sides are struggling to come up with new execution models. The bloom has been off the rose for broker research for quite some time, and investment banks will be hard pressed to stop this trend once it starts. As surely as the brass bull stands on Broadway, the herd mentality will push this trend along very rapidly. What options for broker research? This certainly doesn't bode well for outlets like TheMarkets.com, a portal supported by numerous investment banks to provide common access to broker research and encourage trades with participants. With broker research and analytics freely available from numerous vendor sources, and with electronic crossing networks driving margins on institutional trades ever lower, it will take a far more sophisticated approach to packaging content as a valuable part of a transaction to reverse this trend. U.S. Fair Disclosure rules make this ever harder to accomplish, but the future of premium financial content for securities clearly lies in defining its specific and contextual value at the time of execution. This push from mutual funds may be just the pressure required to send both investment banks and financial content vendors back to the drawing boards.

Friday, March 12, 2004

Search Advances Along Parallel Fronts

CNET's article on competition in the Web search engine segment provides a very good overview of new players that are gaining attention due to specialized capabilities, such as personalization, improved display of search results, local search, and search specialization by content type or industry. Red Herring [registration required] categorizes the emerging (and re-emerging) specialty search companies a bit differently, but combined, the two articles provide a fairly thorough picture of the various businesses that focus on Web search technology. While there is a need for many types of companies that attack search relevancy from different angles, the piecemeal approach we're seeing - some companies focusing on display, some on behavioral patterns, some on localized search, some on contextual relevancy - makes it clear that consolidation of efforts is required to provide a better result for consumers and advertisers. Presumably the heightened press about the up-and-comers will facilitate Google's job of identifying which companies to buy - whether or not it has IPO cash available. But even with an acquisitive Google, there remains room for specialized search companies that can provide a deep understanding of research or business applications for particular functional areas or other well-defined communities of users. In some cases, the specialized search technology can also serve as a contextual advertising platform for targeted advertisers. iPhrase and IndustryBrains are two such companies mentioned in the above-referenced articles. Interestingly, both were cited for specialized search and contextual advertising in the financial information segment.

MetaCarta Mapping Out vContent in Public and Private Sectors

Sometimes you start in one direction and go in another, only to come back to where you started: MetaCarta is a company that originally had ideas back in the dot-com boom for relating content to geographic locations for major consumer portals, but research funds from DARPA, the U.S. government's defense research arm, helped it nurture its capabilities by focusing on more strategic needs. After numerous successful government installations, it appears as if those capabilities have matured to the point where it may be time to take a stab at the consumer market again. Directions Magazine provides some compelling examples of MetaCarta's ability to relate search terms to specific geographic coordinates using its own language processing algorithms and Web services-based search modules. Certainly useful for tracking down bad guys in obscure places and finding reports and research that relate to oil exploration sites, but how about for tracking down hard-to-find widgets in Manhattan or pumping Krispy Kreme coordinates automatically into a GPS-equipped PDA? Geographic context has been exploited in broad terms already in Web formats and on handhelds via online news outlets, directories and guides, but pinning content down to contexts that define the intersection of very broad content sets and very specific locations is still a growing opportunity in content markets. Getting content down to very finely defined geographic contexts is one key avenue for vContent aspirants, one of many "war benefits" from government-funded efforts that is benefitting both individuals and institutions in the private sector.
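At its simplest, the geoparsing concept works something like the toy sketch below: spot place names in text and resolve them to coordinates via a gazetteer. MetaCarta's language processing is far richer than simple string matching, and the gazetteer here is a tiny hypothetical sample.

```python
# Invented gazetteer: place name -> (latitude, longitude).
GAZETTEER = {
    "manhattan": (40.7831, -73.9712),
    "houston": (29.7604, -95.3698),
    "caspian sea": (41.0, 51.0),
}

def geotag(text):
    """Return (place, (lat, lon)) for each gazetteer entry found in the text."""
    lowered = text.lower()
    return [(place, coords) for place, coords in GAZETTEER.items()
            if place in lowered]

doc = "New exploration reports cover fields near the Caspian Sea."
print(geotag(doc))  # [('caspian sea', (41.0, 51.0))]
```

The hard parts a real system must solve - disambiguating "Houston the city" from "Houston the surname," for instance - are exactly where the language processing algorithms earn their keep.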

Wednesday, March 10, 2004

More Google-bashing in the News

A couple of stories this week point out chinks in Google's armor as the leading search engine and ad network. First, Forrester Research put out a press release that addresses the question "Where is Google Headed?" Charlene Li, principal analyst at Forrester, predicts that Google will maintain its strong "position as a general search utility, [and] it will become the dominant pay-for-performance ad network..." The other story, from BusinessWeek, recounts the recent hubbub about Google's removal of an ad from the environmental group Oceana, which Google claims ran counter to its policy of "not accept[ing] advertising if the ad or site advocates against other individuals, groups, or organizations." Juxtaposing these two stories underscores how difficult it is to be in the number one position. In the BusinessWeek article, John Palfrey of Harvard's Berkman Center for Internet & Society fears that "Google can make choices about what people see and what they don't see, and how it's ordered. As more and more people use Google to access the Internet, that definitely raises some important policy questions." Li points to Google's strength in the pay-for-performance ad market and its popularity as a general search engine as the two areas in which Google will continue to have success. But she points out that competitors will have the edge in securing a greater share of the specialized search segment - that is, searches of highly structured databases or other non-textual content, such as directories or catalogs. [Note that the same sentiment has already been expressed in the Shore Content eCommerce premium Weblog.] One theme connecting these two stories relates to how critical it is for Google to maintain its position as the preferred starting point for a majority of Web searchers. The BusinessWeek article provides a hint at how quickly Google could fall out of favor if it doesn't provide the links to what people most want to see. The Forrester piece implies that since Google can't be the best search engine for all purposes, it should focus on becoming the dominant ad network, which is a far easier task if it maintains its position as the most extensive and most-used general search engine - with the associated traffic, ad inventory and participating Web sites. In order to maintain that focus, Google will have to turn away some opportunities that detract from or conflict with its primary mission.

AIIMexpo: Process Efficiency Prepares the Way for a Content Revolution

This year's AIIMexpo Conference and Exhibition featured the usual collection of hardware and software tools to help streamline business processes for records management and workflow-oriented professionals. Last year's Sarbanes-Oxley mania has given way to a recognition that "SOX in a box" solutions were rather shortsighted given the broadening array of regulatory requirements that organizations must respond to (Rich Buchheim, Senior Director of Product Management at Oracle, noted in one panel that his company needs to track and respond to no fewer than 19 regulatory requirements for content retention and management), and that implementing internal policies is at least as important as meeting external requirements. What firms seem to be realizing is that they need just the right retention and management policies: too much retention can lead to legal discovery processes finding more than is desired, too little to regulatory exposure. This more systematic approach flowed through most product presentations and discussions, and has pushed content management suppliers such as Vignette and Stellent ever further into the world of integrated business solutions. In addition to these suppliers, though, there was an increasing presence of content organization tools traditionally associated with Knowledge Management efforts: Convera had a major display pushing its taxonomy development and management capabilities, TheBrain EKP showed how its content visualization capabilities could assist in mapping organizational content policies, and Autonomy was at least talking about its new alliance with content capture experts Captiva to provide enhanced categorization of incoming documents and forms. As more organizations take on a hyper-efficient content capture, creation and management environment, they are laying the groundwork for a revolution in publishing that will make more content repurposable for the right audiences in the right venues than ever before. In this year's economy and regulatory environment it's the hyper-efficiency that gets the nod, but as these systems take hold, prepare yourself for a true revolution in institutional publishing capabilities.

Monday, March 8, 2004

Factiva Moving Towards Google-like Search Simplicity

"Factiva in Bold Revamp," bellows the headline in the VNUnet story, revealing that later this month the Dow Jones/Reuters content aggregation alliance will be unveiling a new factiva.com subscriber home page that starts with a very simple, single-box search option, with its current more sophisticated search saved for an "advanced search" feature. Bold? Well, if it's a matter of a major aggregator deciding to do for its users what public Web search services have been doing for a decade, I suppose that we may have to adjust the dictionary a bit. If there's any boldness in the move it's that Factiva is beginning to focus in on the real battle at hand for content aggregators: how to appeal to a generation of users who expect content services to be as convenient as possible and to anticipate their needs and interests whenever feasible. The tailoring of desired content via sophisticated search interfaces is giving way increasingly to the concept of a "content concierge," a range of services that rely on input from users and subject experts as much as they do the capabilities of pure searching power. In the likes of both public engines like Google, Yahoo! and AskJeeves, as well as in newer services such as HighBeam and KeepMedia focused at individual researchers, aggregators have competition on all fronts for the attention of individuals who need fast answers from quality sources. The real "bold" move yet to be taken: when will a major aggregator meld in search results from sources other than their own content or a customer's content? When people want answers, they really don't care where they come from, as long as they're the right answers. Content concierges of the world, start your engines, the race is on.

Friday, March 5, 2004

Content Creation Goes Mainstream

A fascinating report from the Pew Internet and American Life Project found that "44% of Internet users have created content for the online world through building or posting to Web sites, creating blogs, and sharing files." This translates to 53 million adult Internet users. Admittedly, their definition of content creation is broad, and includes such activities as posting photographs to the Web (21%), posting written material to the Web (17%), maintaining their own Web sites (13%), and posting comments to an online newsgroup (10%), as well as contributing to organization and business Web sites. Interestingly, only 2% of the Internet users reported writing a blog, with 11% visiting blogs written by others - lower than would be expected given the press devoted to the blogging phenomenon. The data from this survey is almost a year old, so these numbers could be higher in 2004. The personal nature of the content creation is woven throughout this thoughtful study, with descriptions of Power Creators (young and trying new technologies, searching for a job or place to live), Older Creators (experienced, highly educated, many retirees, and interested in family history), and Content Omnivores (full-time workers, with children, using a wide variety of Internet services to juggle their too-full lives). It is, after all, the actively engaged people who will shape the future of the Internet and drive the activities that will thrive in this online world.

Web Services: Delivery Beyond Browser-Based Portals

The death of the Web browser has long been predicted, and a recent report by ZapThink highlighted by Integration and Developer News and other outlets calls today's browser-defined Web portals "wholly inadequate" for meeting the needs of today's increasingly sophisticated, standards-based content delivery mechanisms, Web services chief amongst them. The ZapThink paper sees these new capabilities creating demand for what it terms "rich clients," a capability that falls between the "thin client" browser and the "thick client" of locally installed software. This will not necessarily eliminate the browser-accessed portal, though, any more than automobiles eliminated the need for carriage houses: the technology changes, but the need for a familiar framework remains. What Web services' XML-based framework offers is the ability to deliver content in a way that's less rigidly coupled to a specific application's presentation of that content, enabling both browser-based delivery and delivery via more function-defined applications such as spreadsheets and word processors. Objects providing both content and functionality are key to the future of content for that very reason, providing the kind of useful and personalizable tools that people will want to have on their desktops for contextual use in a wide variety of settings and platforms. Content display application ju-jitsu will be an ongoing story over the next several years, but at the heart of it all will be the growing class of content objects that will be glad to oblige whoever satisfies the needs of a specific group of users most effectively.
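A small example of what that decoupling looks like in practice: one XML payload, as a Web service might return it, rendered two ways by different clients. The schema here is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload from a Web service; the schema is an assumption.
payload = """
<quote symbol="XYZ">
  <last>24.15</last>
  <change>+0.42</change>
</quote>
"""

quote = ET.fromstring(payload)

# "Thin client" view: format the data as an HTML snippet for a browser portal.
html = (f"<p>{quote.get('symbol')}: {quote.findtext('last')} "
        f"({quote.findtext('change')})</p>")

# "Rich client" view: expose the same data as a spreadsheet-style row.
row = [quote.get("symbol"), float(quote.findtext("last")),
       quote.findtext("change")]

print(html)
print(row)
```

The point is that neither client owns the content: the same structured payload feeds a portal page, a spreadsheet or whatever function-defined application comes next.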

Wednesday, March 3, 2004

Enter the NewsMaster: the InfoPro or Newsroom of the 21st Century?

The Robin Good weblog, edited by Luigi Canali De Rossi, had an interesting piece recently looking at the sea of information on the Web that search engines largely fail to sort out usefully for specific ongoing interests - a gap filled to some degree by today's webloggers. Webloggers do through their own intelligence what technology services have largely failed to achieve: provide an intelligent and proactive filtering and spinning of available content that's channeled via RSS syndication to audiences that appreciate a blogger's outlook. With some bravado and flair, Robin Good sees this as the beginning of a new kind of information professional called a "NewsMaster," defined as "...an individual capable of personally crafting RSS-based specialized information channels by utilizing technologies that allow [him or her] to select, aggregate, filter, exclude and identify quality news, information, content, tools and resources from the whole universe of content, news and information available on the Internet." This is, of course, what many info pros and librarians have been doing for centuries via other technologies, but of late these professionals have been overshadowed by the capabilities of search engines to answer specific ad hoc demands. At the same time, today's commercial news publications try to answer these needs via their own reporting and editorial functions, leaving out huge swaths of online sources that could be extremely valuable. The future of content filtering and shaping for specific audiences certainly has a lot to learn from today's webloggers, but tomorrow's "content concierge" products are likely to be far more sophisticated than today's combination of far-reaching but dumb search engines and brilliant but inefficient webloggers, even if RSS continues to be a distribution channel of choice. There will be opportunities for technologists, journalists and info pros alike in this new arena - opportunities in which they will probably benefit from mutual cooperation.
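For a sense of how little machinery a bare-bones NewsMaster channel requires, consider this sketch using the open-source feedparser package: pull several RSS feeds, keep only the items matching a beat's keywords, and emit a filtered channel. The feed URLs and keywords are placeholders.

```python
import feedparser  # third-party Universal Feed Parser package

FEEDS = [
    "http://example.com/technology.rss",   # placeholder feed URLs
    "http://example.org/publishing.rss",
]
BEAT_KEYWORDS = {"taxonomy", "ebooks", "rss"}  # the channel's "beat"

def build_channel(feeds, keywords):
    """Aggregate feeds, keeping only entries that match the beat keywords."""
    channel = []
    for url in feeds:
        for entry in feedparser.parse(url).entries:
            text = (entry.get("title", "") + " " +
                    entry.get("summary", "")).lower()
            if any(kw in text for kw in keywords):
                channel.append((entry.get("title"), entry.get("link")))
    return channel

for title, link in build_channel(FEEDS, BEAT_KEYWORDS):
    print(title, "->", link)
```

Of course, the NewsMaster's real value lies in the judgment the keyword list can't capture - which is exactly the gap between today's dumb engines and brilliant webloggers described above.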

Yahoo! Goes for the Deep Web - for a Fee

Reuters and numerous other sources report on portal provider Yahoo!'s plans to provide access to "Deep Web" sources not previously crawled by its search engine. Non-commercial sites may be crawled by Yahoo! without paying a fee, with plans already out for audio made available via Northwestern University from National Public Radio, and for content from the U.S. Library of Congress, the New York Public Library and the U.S. Supreme Court, to appear in Yahoo! search results. But commercial sources of content can expect to pay a fee for the crawling of their deep content, based on a proportional formula that includes database size and click-through rates. Many content sources stored in databases are not crawled by public Web search engines, so this represents a great potential gain of high-value content that could position Yahoo! more favorably with searchers who are seeking more valuable sources. The implication from some reports, though, is that while the Yahoo! search engine will not favor these results in terms of search engine ranking, other sources may not be visited as often if they don't cough up the "deep" crawling fee - shades of GoTo.com, progenitor of Overture, Yahoo!'s new contextual ad division. It's easy to imagine how this could degrade into the kind of war for paid search engine ranking that many thought was put aside with contextual ads. But in all likelihood what's really taking shape is the outline of what a premium content aggregator may look like in the future, collecting listing and click-through fees from publishers as part of a greater content technology service rather than directly databasing content, as mentioned in this week's news analysis. It's far from clear that Yahoo!'s strategy will help it in the short run against Google, which already provides a wide range of deep Web content without charging its suppliers, but in the long run it may be part of a framework that begins to make sense for giving high-value and premium sources the context they need in the search services most used by consumers and professionals.

Monday, March 1, 2004

In Premium Content: Hybrid Books, Thomson Takes Wachovia, ICAP Taps Moneyline, SEC's NMS Plans, Credit Changes for DJ and Reuters

In our eResources Marketplace premium weblog, Shore Senior Analyst Jean Bedord reviews the rise of new academic titles that are composed of multiple content media and formats. more...

In our Financial Content and Technologies premium weblog, Shore Senior Analyst Jack McConville reviews new deals for financial market data at Wachovia and Interdealer broker ICAP, the SEC's plans for a new National Market System to eliminate discrepancies between electronic and human-driven market pricing, downgrades to Dow Jones credit and upgrades to Reuters', a new peer group analysis tool from Thomson and new securities fraud investigations. more...

Boom with a Hook: Local Languages Thrive Where English Doesn't Dominate

Chinadotcom Corporation's annual earnings announcement is impressive not just for the growth in earnings for its china.com portal and other enterprises (103 percent over 2002) but also for its healthy margins (46 percent, versus 37 percent a year earlier) and its first year of internationally recognized profitability - all on a portal where the words "china.com" in the logo are about the only English in sight. English may be the lingua franca of the moment, but where it is only one of many choices there is an increasing push for both content and tools that manage local-language content more effectively. IT Web notes today, for example, how South Africa is pushing standards to have all eleven of its indigenous languages supported in government-produced documents, and how Verity and other vendors are responding to these requirements. Countries such as South Africa that sit at the edge of many languages and cultures are trying to preserve culture and heritage in the face of ubiquitous English content - requirements that are increasingly cost-effective and necessary as content solutions reach beyond globalized users into the lives of people trying to make content work in their own communal frameworks. The Bible's Tower of Babel was made possible through a unified language, which was scattered into a thousand tongues to protect us from building our aspirations too high. Increasing localization of content may be having somewhat the same effect, requiring content providers to "speak the language" in both a cultural and linguistic context to serve the true needs of increasingly sophisticated local markets.