Publisher: DBI Technologies
Primary Category: Search
Product Type: Component / Managed/Unmanaged Code - without COM / DLL
Powerful text summarization engine. Extractor consumes documents (text, HTML, email) and, using a patented genetic extraction algorithm (GenEx), analyzes the recurrence of words and phrases, their proximity to one another, and their uniqueness to a particular document. The engine returns a list of the key words and phrases found in the document, each with its relative ranking (how many times the word or phrase occurs in the document) and contextual links back to its positions in the document itself.
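To make the shape of that output concrete, here is a minimal Python sketch of a frequency-based keyword list with counts and character positions. It is not GenEx and does not use the Extractor API: the function name `extract_keywords`, the tiny stop-word list, and the single-word tokenization are all assumptions made for illustration, and the sketch ignores phrase detection, proximity, and per-document uniqueness.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "for"}

def extract_keywords(text, top_n=5):
    """Return (word, count, positions) tuples, most frequent first.

    positions are character offsets into text, standing in for the
    "contextual links" back to each occurrence.
    """
    positions = defaultdict(list)
    for match in re.finditer(r"[A-Za-z']+", text):
        word = match.group().lower()
        if word not in STOP_WORDS:
            positions[word].append(match.start())
    # Rank by descending count, breaking ties alphabetically.
    ranked = sorted(positions.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(word, len(offsets), offsets) for word, offsets in ranked[:top_n]]

if __name__ == "__main__":
    sample = "Search engines index text. Text search is fast."
    for word, count, offsets in extract_keywords(sample, top_n=2):
        print(word, count, offsets)
```

Each tuple pairs a keyword with its occurrence count and the offsets where it appears, which an application could use to jump back to the keyword in context.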
Add instant searching of terabytes of text and file format support to your Linux applications. dtSearch Text Retrieval Engine for Linux lets C++ and Java developers incorporate dtSearch text retrieval functions into their Linux applications. No dtSearch end-user products are currently available for Linux, so some C++ or Java programming is needed to use the Linux version. dtSearch Engine for Linux includes all features of the Windows version except the COM interface and support for indexing Microsoft Access databases via ODBC.