Metadata is one of those terms that's likely to get traditional publishers' eyes glazing over before you've even finished saying it, but it happens to be the content that's going to determine much of what powers profitability in publishing over the next decade. Broadly speaking, metadata is the categorization and tagging of content that enables it to be referenced easily and to reference other content easily. If the easy money to be made in searching documents on the Web has already been made by Google, the next generation of publishing services will provide tools that enable more structure to be added to content, both to produce richer content that search engines will like and to provide enough richness that people looking at a metadata-enriched Web page won't have to go hunting via search engines for related content.
Reuters recently launched its Calais open API initiative, which holds great promise for making the company a major player in metadata generation and for putting it at the heart of increasingly structured Web content. Calais provides tools that enable publishers and application developers to pass their content through a content analysis engine, powered by Reuters' ClearForest semantic content processing tool, and to get well-structured metadata returned for free. What's the payback for Reuters? To be the first to have this information, of course. With a tradition of breaking news and real-time market data stretching back more than 150 years, Reuters is far from being a stranger to the value of being the first to obtain critical information.
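To make the "content in, structured metadata out" idea concrete, here is a minimal sketch in Python. It is not the Calais API and makes no claims about how ClearForest works: the lookup table, the function name tag_entities and the shape of the returned records are all assumptions made for illustration, standing in for what a real semantic analysis engine would do far more thoroughly.

```python
import json
import re

# Hypothetical lookup table standing in for a real semantic analysis engine.
# A service such as Calais would recognize entities it has never been told about;
# this toy version only knows the names listed here.
KNOWN_ENTITIES = {
    "Reuters": "Company",
    "Google": "Company",
    "Julius Reuter": "Person",
}


def tag_entities(text):
    """Return one metadata record per known entity mentioned in the text."""
    records = []
    # Check longer names first so the output lists the most specific names first.
    for name in sorted(KNOWN_ENTITIES, key=len, reverse=True):
        # Word boundaries keep "Reuters" from matching inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        offsets = [match.start() for match in re.finditer(pattern, text)]
        if offsets:
            records.append(
                {"name": name, "type": KNOWN_ENTITIES[name], "offsets": offsets}
            )
    return records


sample = "Julius Reuter founded Reuters long before Google indexed the Web."
metadata = tag_entities(sample)

# The structured result is what a publisher would attach to the page as metadata.
print(json.dumps(metadata, indent=2))
```

The point of the sketch is the shape of the exchange: plain text goes in, and a list of typed, located entities comes back that a page, a search engine or an ad service can use without rereading the text.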
By helping the Web to gain semantic structure, Reuters can in theory use Calais to become the player best suited to help people take advantage of that structure. Will this become a reality? While it's not likely to take off quickly, I think that Calais may enjoy a very comfortable position as a pioneer in open metadata generation for some time. The longer Reuters can build up metadata without much opposition (lots of people will still be in the "old media" mindset of trying to quantify short-term profits for such a move), the more time it will have to build value-add services that draw on both the information's value as a real-time update stream and its value as a tool for making sense of an ever-expanding Web. Metadata also helps search engines and contextual ad services to match content to queries more effectively, so the What's-In-It-For-Me might be very valuable to publishers, especially publishers of social media who don't have the budget for their own semantic metadata generation systems.
Publishers place a lot of emphasis on copyright, but as the financial market data business has shown through the years, copyright is of little value if you can't get your content to the right people in time for it to make a difference to them. Focusing on metadata will enable Reuters to start indexing the Web in a more organized manner and to use that indexing to develop information products that will in time become at least as valuable as those that it has developed for the financial securities marketplace. It's no accident that Reuters is using a silhouette of a pigeon in the logo for Calais. Julius Reuter made his first stab at electronic publishing by using carrier pigeons to carry stock quotes across the gap between telegraph stations. Sometimes filling the gaps in content services that others are waiting for someone else to fill can have profound consequences.