Wednesday, October 06, 2010

IBM Classification Module and Content Extractor

IBM® Classification Module
IBM® Classification Module helps organize unstructured content by analyzing the full text of documents and e-mails and applying rules that automate classification decisions.
By organizing information accurately and automatically, IBM Classification Module reduces the amount of manual decision making that employees have to do.
It has built-in natural language processing and semantic analysis capabilities.
CLASSIFICATION WORKBENCH
An application that is used to create and analyze knowledge bases and decision plans. With Classification Workbench, you can also evaluate system performance by importing analysis data and viewing reports and graphical diagnostics.
KNOWLEDGE BASE
A single file that encapsulates the data the Classification Module requires for accurate content-based classification.
DECISION PLAN
A collection of rules built in Classification Workbench that determine how the Classification Module classifies content items such as documents or e-mails. Each rule consists of one trigger and one or more actions.
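
The rule structure described here (one trigger, one or more actions) can be illustrated with a minimal Python sketch. This is not the Classification Module API or the Classification Workbench file format. The names `Rule`, `run_decision_plan`, and `set_field`, and the dictionary used to represent a content item, are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation of a content item (document or e-mail).
Item = Dict[str, str]


@dataclass
class Rule:
    """Illustrative rule: exactly one trigger and one or more actions."""
    name: str
    trigger: Callable[[Item], bool]
    actions: List[Callable[[Item], None]]

    def __post_init__(self) -> None:
        if not self.actions:
            raise ValueError("a rule needs at least one action")


def run_decision_plan(rules: List[Rule], item: Item) -> List[str]:
    """Apply the rules in order and return the names of the rules that fired."""
    fired = []
    for rule in rules:
        if rule.trigger(item):
            for action in rule.actions:
                action(item)
            fired.append(rule.name)
    return fired


def set_field(key: str, value: str) -> Callable[[Item], None]:
    """Build an action that sets one field on the item (assumed action type)."""
    def action(item: Item) -> None:
        item[key] = value
    return action


# A one-rule plan: if the text mentions "invoice", file the item under Finance.
rules = [
    Rule(
        name="invoices",
        trigger=lambda it: "invoice" in it["text"].lower(),
        actions=[set_field("category", "Finance"), set_field("reviewed", "no")],
    )
]

email = {"text": "Please find the attached Invoice."}
fired = run_decision_plan(rules, email)
```

In this sketch the trigger is a plain keyword test. In the real product a trigger could also depend on the categories suggested by a knowledge base, which is not modeled here.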
CLASSIFICATION CENTER
A Web application provided with the IBM FileNet P8 integration that is used to manage the classification processes. You can use the Classification Center to determine the content to be classified, specify classification options (such as the decision plan to use and various runtime preferences), monitor classification activity, and view the classification results.
 
Content Extractor
A command-line tool provided with the IBM FileNet P8 integration that is used to extract the content from an IBM FileNet P8 object store. You can import the extracted content into Classification Workbench and use it to train a knowledge base or provide test data for a decision plan.

The tool reads a properties file in which you specify what to extract, and it writes the extracted documents in XML format.
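
As a rough illustration of what a properties file looks like, here is a minimal Python sketch that reads simple `key = value` lines. The option names `objectStore` and `outputDir` are made up for this example and are not the Content Extractor's actual options; check the product documentation for the real ones. The parser also handles only the basic form of the format (no line continuations or escapes).

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines; lines starting with '#' or '!' are comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that have no '=' separator
        props[key.strip()] = value.strip()
    return props


# Hypothetical option names, for illustration only.
SAMPLE = """\
# What to extract and where to put it
objectStore = ExampleStore
outputDir = ./extracted
"""

options = parse_properties(SAMPLE)
```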
