Improving the Productivity of Knowledge Workers
Importance of automated content classification

What Is Content Classification
The term content classification is best understood in an enterprise information context, defined by the following concepts.

Taxonomy is the hierarchical representation of topics of interest. For example, a basic taxonomy might consist of a class called "Transport," which might have subclasses "Air Transport" and "Land Transport." Then "Land Transport" might in turn have subclasses "Bus" and "Car." This hierarchy means that a "Car" is a type of "Land Transport," and is also a type of "Transport."
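The "is a type of" reasoning above can be sketched in a few lines of code. This is only an illustration: the child-to-parent dictionary and the function names `ancestors` and `is_a` are assumptions made for this example, not part of any product discussed in this article.

```python
# Hypothetical taxonomy stored as child -> parent links,
# using the class names from the Transport example.
PARENT = {
    "Air Transport": "Transport",
    "Land Transport": "Transport",
    "Bus": "Land Transport",
    "Car": "Land Transport",
}


def ancestors(topic):
    """Return every broader class of `topic`, nearest first."""
    chain = []
    while topic in PARENT:
        topic = PARENT[topic]
        chain.append(topic)
    return chain


def is_a(topic, candidate):
    """True if `topic` is `candidate` itself or one of its descendants."""
    return topic == candidate or candidate in ancestors(topic)


print(ancestors("Car"))          # ['Land Transport', 'Transport']
print(is_a("Car", "Transport"))  # True
```

Storing only the parent of each class keeps the hierarchy in one place; every broader class is then derived by walking upward rather than being recorded by hand.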

Ontology defines the relationships between the topics of interest.

Content classification is the process of analyzing a document and adding metadata 'tags' that describe it, with the tags drawn from a taxonomy or another form of controlled vocabulary.
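As a rough illustration of tagging against a controlled vocabulary, the sketch below matches whole words in a document against trigger phrases for each taxonomy term. The vocabulary, the trigger words, and the `classify` function are hypothetical and chosen only for this example; commercial classifiers use far richer rules and linguistics.

```python
import re

# Hypothetical controlled vocabulary: taxonomy term -> trigger words.
VOCABULARY = {
    "Car": ["car", "sedan", "hatchback"],
    "Bus": ["bus", "coach"],
    "Air Transport": ["aircraft", "airline", "flight"],
}


def classify(text):
    """Return the sorted taxonomy terms whose trigger words occur in `text`."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        term for term, triggers in VOCABULARY.items() if words & set(triggers)
    )


doc = "The airline cancelled the flight, so we took a coach instead."
print(classify(doc))  # ['Air Transport', 'Bus']
```

Because tags can only come from the vocabulary, two authors who write "coach" and "bus" end up with the same tag, which is what makes the metadata usable for search and navigation.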

Content Classification in Enterprises
In today's enterprises, 80% of data is unstructured. There is a tremendous amount of intelligence and insight held in this mass of unstructured data. However, most enterprises depend on their information workers' knowledge to bring meaning to it. In most enterprises, relevancy is therefore subjective: only the individual performing a search can judge how relevant a particular piece of information is to what they are trying to discover.

Evidently, enterprises need to augment their knowledge workers with insights that go beyond individual human expertise, so that they can find and analyze the topics of interest and enrich them further for end customers.

A typical enterprise application of content classification is the analysis of warranty claims and customer complaints to improve product quality. Without classification, this analysis runs into several problems:

  • Problems occur in different geographies, and the same problem is described differently in each
  • Problems are often not grouped into a larger category, because the problem area lacks a taxonomy
  • Similar problems cannot be associated with each other without human intervention

The result is lost opportunities to identify true problem areas, or misclassified problems, which ultimately hurts product quality and leads to product recalls and lost market share.

The following are some of the vendors and products that support content classification. Adopting these or similar products helps enterprises make the most of their information workers, improving productivity and reducing manual work. It also lets enterprises retain core knowledge in automated business rule processing systems rather than only in the heads of individuals.

Smartlogic Semaphore Content Intelligence Platform
Semaphore, the Content Intelligence Platform from Smartlogic, works with an enterprise's existing search and content management systems. It organizes business-critical content by automatically tagging and categorizing it, enabling precise searching, guided navigation, and effective management and governance.

Semaphore consists of four core modules:

  1. Ontology Server & Manager - allows multiple users to collaborate on the development and management of ontologies which capture the essential topics, resources and vocabulary for the business.
  2. Advanced Linguistics Pack - provides text mining and entity extraction based on part-of-speech tagging.
  3. Classification Server - a rules-based semantic classification engine providing accurate metadata tagging of content in 26 languages.
  4. Semantic Enhancement Server / Search Application Framework - enhances search engines (e.g., Microsoft SharePoint Search, Microsoft FAST, Lucene/Solr, Google Search Appliance, etc.)

Using these core modules, Semaphore captures an organization's subjects and topics in a taxonomy or ontology model. It enhances traditional information management systems such as search, content management, and business workflow engines by adding advanced content classification, metadata enrichment, and navigation capabilities, to deliver a more complete enterprise information management experience.

Additional information about the product is available on the Smartlogic website.

IBM ECM - Classification Module
IBM's content management portfolio includes features for content classification. Documents can be categorized by using IBM Classification Module. The Classification Module annotator uses the capabilities of Classification Module to classify content into categories and to generate metadata that can be used as facets or keywords in Content Analytics.

Much like the Smartlogic Platform, IBM Classification Module has the following core components.

  • Classification Workbench:
      - Taxonomy Proposer
  • Classification Module server:
      - Management Console
      - Client APIs
  • IBM FileNet P8 integration asset:
      - Classification Center
      - Content Extractor

The Taxonomy Proposer, which is installed with the Classification Workbench, allows you to discover new categories in an uncategorized or partially categorized body of documents. The Taxonomy Proposer uses custom clustering algorithms to analyze and group similar documents to help you to create a taxonomy for your content.
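The clustering algorithms inside the Taxonomy Proposer are not described here, so the following is not IBM's method. It is a generic sketch of the underlying idea, grouping similar documents so that candidate categories emerge, using a simple word-overlap (Jaccard) measure. The function names, the 0.3 threshold, and the sample complaints are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lower-cased word set, ignoring words of three letters or fewer."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def jaccard(a, b):
    """Overlap between two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def propose_groups(docs, threshold=0.3):
    """Greedy single-pass grouping: place each document in the first group
    whose seed document is similar enough, otherwise start a new group."""
    groups = []  # list of (seed_tokens, [documents])
    for doc in docs:
        words = tokens(doc)
        for seed, members in groups:
            if jaccard(seed, words) >= threshold:
                members.append(doc)
                break
        else:
            groups.append((words, [doc]))
    return [members for _, members in groups]


# Hypothetical customer complaints, worded differently for the same fault.
complaints = [
    "Brake pedal feels soft after rain",
    "Soft brake pedal in wet weather",
    "Radio display goes blank",
    "Blank radio display after start",
]

for group in propose_groups(complaints):
    print(group)
```

With this sample input the two brake complaints land in one group and the two radio complaints in another. A reviewer could then name each group and add it to the taxonomy as a new category.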

As is typical of IBM products, plenty of Redbooks and other materials are available for going deeper. Further information can be found on the IBM website.

Most enterprises want to differentiate themselves through a unique value proposition for their information delivery. They invest heavily in knowledge workers or information analysts to provide meaningful insight into their unstructured data; however, this process is not repeatable and is prone to failure. Automated content classification solutions such as those described above can help enterprises become more efficient.

About Srinivasan Sundara Rajan
Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).
