dtSearch Publish - Summary

by dtSearch Corp. - Product Type: Application

Summary

Quickly publish an instantly searchable document collection or mirror an existing Web site to CD/DVD. dtSearch Publish brings dtSearch’s powerful search tools to CD/DVD publishing. Over a dozen indexed and fielded data search options. Highlights hits in HTML, XML and PDF, while displaying links and images. Converts “Office,” ZIP, etc. files to HTML with highlighted hits. For end-users, running the CD/DVD requires no installation on the user’s hard drive.

dtSearch Product Line Overview

Over two dozen text search options

Most indexed searches take less than a second, even through multiple terabytes of text (unindexed searching also available)

Automatically recognizes popular file types: word processor, database, email, spreadsheet, ZIP, PDF, HTML, XML, and more

Highlights hits in retrieved files (for HTML and PDF, display includes highlighted hits and embedded links and images)

Free Upgrade to Version 7.0 for all current purchasers, including built-in spider, Unicode support and much more

dtSearch's proprietary indexing and searching algorithms allow for fast indexing and searching performance even over extremely large databases and other diverse collections of documents. The algorithms are engineered to maintain consistent indexing speeds regardless of the size of the document set. Indexed search speed is generally less than a second, even through multiple terabytes of text. dtSearch products also provide unindexed search options.

The dtSearch product line was developed to make efficient use of system resources under the Windows platform. Because dtSearch was designed to use limited system resources without compromising performance or searching capabilities, it is ideal for a wide range of environments -- ranging from high-traffic Web sites on multi-processor servers to handheld Windows CE devices -- where efficient use of memory is critical. (Version 7.0 also contains a Linux version of the dtSearch Text Retrieval Engine.)

dtSearch has over two dozen text search options, producing a combination unparalleled for intelligent searching. In addition to basic search options such as Boolean (and/or/not), proximity, wildcard, segment, numeric range, and phonic, the dtSearch product line has the following special search capabilities:

Fuzzy Searching: dtSearch's proprietary fuzzy searching uses a unique algorithm to find search terms even if they are misspelled. Search fuzziness adjusts from 0 to 10 to correspond to the level of typographical or OCR errors in files. With a fuzziness level of 1, a search for "alphabet" would find "alphaqet." With a fuzziness level of 3, a search for "alphabet" would find not only "alphaqet" but also "alpkaqet." Note: fuzziness is not hardwired into the index, so the same index can handle both fuzzy and non-fuzzy searches. (Unindexed searches can also be fuzzy.)
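The fuzzy algorithm itself is proprietary, so the following is only an illustrative sketch of how an adjustable fuzziness level can behave. It substitutes plain Levenshtein edit distance for the real algorithm and assumes the fuzziness level is the maximum number of single-character edits allowed; the names edit_distance and fuzzy_match are hypothetical and are not part of any dtSearch API.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete a character from a
                            curr[j - 1] + 1,       # insert a character into a
                            prev[j - 1] + cost))   # substitute (or keep) a character
        prev = curr
    return prev[-1]


def fuzzy_match(term: str, words: list[str], fuzziness: int) -> list[str]:
    """Return the words within `fuzziness` edits of `term` (assumed semantics)."""
    if not 0 <= fuzziness <= 10:
        raise ValueError("fuzziness must be between 0 and 10")
    return [w for w in words if edit_distance(term, w) <= fuzziness]


words = ["alphabet", "alphaqet", "alpkaqet", "zebra"]
level_1 = fuzzy_match("alphabet", words, 1)
level_3 = fuzzy_match("alphabet", words, 3)
```

Because the tolerance is applied when the query runs rather than when the word list is built, the same list serves both exact (level 0) and fuzzy lookups, which mirrors the note that fuzziness is not hardwired into the index.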

Concept/Synonym/Thesaurus Searching: dtSearch can perform automatic query expansion using a comprehensive semantic network of the English language with variable levels of expansion (user-defined synonyms, built-in synonyms, or built-in synonyms + related words).
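As a rough sketch of what query expansion at different levels might look like, the example below uses three tiny hand-made dictionaries in place of a real semantic network. The dictionaries, the level names, and the expand function are all hypothetical; the product's actual thesaurus data and settings are not shown here.

```python
# Toy stand-ins for a thesaurus; a real semantic network would be far larger.
USER_SYNONYMS = {"car": {"auto"}}
BUILTIN_SYNONYMS = {"car": {"automobile"}}
RELATED_WORDS = {"car": {"vehicle", "truck"}}


def expand(term: str, level: str) -> set[str]:
    """Expand a search term according to the chosen (assumed) expansion level."""
    expanded = {term}
    if level == "user":
        expanded |= USER_SYNONYMS.get(term, set())
    elif level == "builtin":
        expanded |= BUILTIN_SYNONYMS.get(term, set())
    elif level == "builtin+related":
        expanded |= BUILTIN_SYNONYMS.get(term, set())
        expanded |= RELATED_WORDS.get(term, set())
    else:
        raise ValueError(f"unknown expansion level: {level}")
    return expanded


widest = expand("car", "builtin+related")
```

A search would then match a document containing any of the expanded terms, so a wider level trades precision for recall.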

Relevancy-Ranked Natural Language Searching: Natural language searches, also known as query-by-example, look for all words in a search request and return results based on automatic term weighting. Using the "Vector Space" method, dtSearch's relevancy ranking takes into account the frequency of hits, relative frequency of the search terms in the index, and hit density in retrieved documents.
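To make the vector space idea concrete, here is a minimal sketch using textbook TF-IDF weighting and cosine similarity. It is not dtSearch's actual formula, which is not given here: the smoothed IDF, the whitespace tokenizer, and the function names are assumptions chosen for illustration. Term frequency divided by document length stands in for hit density, and IDF stands in for the relative frequency of a term in the index.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def rank(query: str, docs: list[str]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Rarer terms in the collection receive a larger weight (smoothed IDF).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vector(tokens: list[str]) -> dict[str, float]:
        counts = Counter(tokens)
        total = len(tokens) or 1
        # Length-normalised term frequency times IDF; unknown terms are dropped.
        return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}

    def cosine(u: dict[str, float], v: dict[str, float]) -> float:
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    q = vector(tokenize(query))
    scores = [(i, cosine(q, vector(toks))) for i, toks in enumerate(tokenized)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = ["green apple green apple", "green banana and orange juice", "blue sky"]
results = rank("green apple", docs)
```

In this toy collection the first document ranks highest because both query terms are dense in it, the second ranks lower because only "green" matches, and the third scores zero.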

Variable Term Weighting: dtSearch provides not only the automatic relevancy ranking in a natural language search request, but also the ability to specify relative weights. These weights can be positive or negative. For example, a user might assign a positive weight of 3 to the word "green" and a negative weight of 5 to the word "orange."
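The effect of positive and negative weights can be sketched with a simple additive score. This is only an assumed model (each occurrence of a term contributes its weight to the document's score); the weighted_score function is hypothetical and the product's real scoring and query syntax are not reproduced here.

```python
from collections import Counter


def weighted_score(text: str, weights: dict[str, int]) -> int:
    """Sum weight * occurrence count for every weighted term in the text."""
    counts = Counter(text.lower().split())
    # Counter returns 0 for terms that do not occur, so they contribute nothing.
    return sum(weight * counts[term] for term, weight in weights.items())


# Weights from the example above: favour "green", penalise "orange".
weights = {"green": 3, "orange": -5}
docs = ["green tea", "green green orange", "orange juice"]
ranked = sorted(docs, key=lambda d: weighted_score(d, weights), reverse=True)
```

Under this model a document mentioning only "green" outranks one that mentions "green" twice alongside "orange", because the negative weight outweighs part of the positive contribution.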

Field Searching: dtSearch automatically recognizes and indexes fielded data in such file formats as MS Word, Excel, PowerPoint, HTML, PDF and XML, making these fields separately searchable by field name (as well as accessible for full-text searching). Version 7.0 adds support for searching based on nested field criteria in XML documents.

Unicode Support: Version 7.0 adds Unicode support, which expands supported character sets to include Chinese and Japanese, while enhancing support for European language character sets.

dtSearch's search algorithms uniquely allow dtSearch to seamlessly handle multiple levels of search complexity. For example, a single search request can contain multiple levels of Boolean, proximity, fuzzy, synonym, phonic, numeric range, etc., elements.

The dtSearch product line includes built-in file parsers that can index, search and display with "hit" highlighting a wide range of file types. Supported file types for the dtSearch product line include word processor, spreadsheet, database, RTF, PowerPoint, email message stores, ZIP, PDF, HTML and XML. Support includes Office 2000 files, as well as many legacy file types.

The dtSearch product line can display a retrieved Web page (HTML or PDF) with all embedded links and images intact. For Web use, the dtSearch product line also has "on-the-fly" Web conversion to HTML for documents in supported non-HTML formats, including "hit" highlighting.

Version 7.0 of the dtSearch product line also includes a Web spider for retrieving remote Web pages. The spider can index and search remote Web pages to any specified level of depth. dtSearch can then display Web pages retrieved in a search with highlighted "hits" and all embedded links and images intact.
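Crawling "to any specified level of depth" is commonly implemented as a breadth-first traversal that stops following links once a page is the maximum number of hops from the start page. The sketch below shows that general technique on an in-memory link graph rather than real HTTP requests; the crawl function, the links dictionary, and the paths in it are hypothetical and say nothing about how the dtSearch spider is built.

```python
from collections import deque


def crawl(start: str, links: dict[str, list[str]], max_depth: int) -> dict[str, int]:
    """Return {page: depth} for pages reachable from `start` within `max_depth` hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        depth = seen[page]
        if depth == max_depth:
            continue  # do not follow links from pages already at the depth limit
        for target in links.get(page, []):
            if target not in seen:  # skip pages already visited (handles cycles)
                seen[target] = depth + 1
                queue.append(target)
    return seen


# Toy site: "/" links to "/a" and "/b", "/a" to "/c", "/c" to "/d", "/b" back to "/".
links = {"/": ["/a", "/b"], "/a": ["/c"], "/c": ["/d"], "/b": ["/"]}
one_level = crawl("/", links, 1)
two_levels = crawl("/", links, 2)
```

A real spider would fetch each page and extract its links at the point where this sketch looks them up in the dictionary; the depth bookkeeping stays the same.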

What's New in dtSearch V7.5x?

New 64-bit version: The new release includes a native 64-bit version of the dtSearch Engine for Win & .NET (for .NET 2.0/3.0) for developers to integrate into web-based and other applications. The 64-bit version provides full API access to dtSearch's terabyte indexer and search functionality, file format and database support (including SQL BLOB data).

International language enhancements: dtSearch products include international language support through Unicode, covering hundreds of international languages. The new version adds improved searching of Chinese, Japanese and Korean text presented without spaces between words. The new version also offers improved developer API integration with third-party international language morphological analyzers like those from Basis Technology.

What's New in dtSearch V7.43?

Fixed bug in PDF file parser affecting decoding of CID fonts in PDF files

Fixed error extracting item from TAR file to hit-highlight after search

Added detection of the following file types with missing or incorrect filename extensions: Microsoft Word 2003 XML files, Microsoft Excel 2003 XML files.

Fixed error indexing using data source API under WebSphere

Fixed extra spacing in output when HTML converted to UTF-8 text

What's New in dtSearch V7.40?

Automatic recognition of dates, email addresses, and credit card numbers in text

Support for Vista XMP metadata

Support for PowerPoint 2007 (*.pptx). (The product line already supports Word 2007 (*.docx) and Excel 2007 (*.xlsx))

Support for Vista XML Paper Specification (*.xps) documents

A new IndexCache object in the .NET 2.0 API, and dtsIndexCache object in the C++ API of the dtSearch Engine. The new objects enable much faster searching when a series of searches must be done against a small number of indexes

What's New in dtSearch V7.30?

Enhancements (All products)

Added preliminary support for Word 2007 (*.docx) and Excel 2007 (*.xlsx) based on the current Office 2007 beta and available documentation.

Added support for JPG and TIFF metadata, including EXIF and IPTC fields.

The Unicode filtering file parser can handle individual documents larger than 2 GB, and the extext.exe utility now supports files larger than 2 GB.

Improved handling of partially inaccessible email files. In previous versions, if an email had encrypted or corrupt data (for example, an encrypted attachment), the whole email was reported as encrypted or corrupt. In this version, the readable portion of the message is indexed and the unreadable portion is separately reported as a partially encrypted or partially unreadable file. This change applies to Outlook messages, TNEF files, .eml files, MBOX archives, and .msg files.

Enhancements (dtSearch Engine)

Beta x64 (64-bit) versions of the dtSearch Indexer and dtSearch Engine (dtIndexer64.exe, dtengine64.dll, and dtSearchNetApi2.dll). The index format and APIs (C++, COM, and .NET) are identical to the 32-bit version. The 64-bit components are in a separate download file (dtSearch64_730.exe) with the same installation password as the dtSearch Engine SDK.

Added alternative PDF highlighting mechanism for client-based applications (see "Highlighting Hits in PDF files" in the API Overviews section for details)

Added ListIndexJob object to the .NET 2.0 API to list files, words, or fields in an index (see dtSearchNetApi2.chm for API reference)

Added dtsListIndexIncludeDocId flag for dtsListIndexJob and ListIndexJob to provide a quick way to list all documents in an index and the doc id for each document

C++ API Changes to support 64-bit file sizes in dtsInputStream (added size64 and seek64), dtsInputStreamReader, dtsFileInfo (added size64), dtsSearchResultsItem (added size64). These changes preserve binary compatibility for the dtSearch Engine DLL, but some C++ code may trigger new warnings when compiled because of 64-bit values returned.

Added dtsIndexKeepExistingDocIds flag to specify that, when compressing an index, the indexer should not remap document ids, so document ids will be unmodified in the index once compression is done.

Fixes and minor enhancements

What's New in dtSearch V7.20?

New file parsers for OpenOffice documents, spreadsheets, and presentations (*.sxw, *.sxc, *.odt, *.ods, etc.), covering OpenOffice version 1 and OpenOffice version 2 (the "Open Document Format for Office Applications")

New file parsers for the Microsoft Office XML formats (Microsoft Word 2003 XML and Microsoft Excel 2003 XML)

Added "Opening containing folder" in right-click menu for retrieved items

Improved reporting of errors that occur when copying files in Edit > Copy File(s)

dtindexer.exe: added /caf and /cat command-line options to cache text (/cat) or cache original files (/cad), when creating indexes using the command line, and /recog to recognize an index.

Added Help > Check For Updates feature to automatically download new versions

The new release includes major enhancements to the dtSearch product line's display of MS Word, Excel and PowerPoint documents. The new release also includes enhancements for indexing and searching Outlook message stores. Finally, the new release includes an additional feature for forensics usage.

PartNumbers: PC-505905-169350 505905-169350 PC-505905-169352 505905-169352

PurchaseOptions: dtSearch Publish V7.54 (includes 1 year support and updates) Search-only functionality for up to 250 CD/DVDs, dtSearch Publish and Web Combo V7.54 (includes 1 year support and updates) Search-only functionality for up to 250 CD/DVDs and 1 Server License

Resources: Browse the dtSearch V7.0 Press Release, Read the dtSearch Case Studies document - Requires Acrobat Reader, Read the Integrating Query of Relational and Textual Data in Clinical Databases Article, Read the dtSearch Features Map Document - Requires Acrobat Reader, Read the dtSearch Reviews Document - Requires Acrobat Reader, Read the dtSearch Distributed Searching white paper - Requires Acrobat Reader, Read the dtSearch Alternative Information Distribution white paper - Requires Acrobat Reader, Read the dtSearch Text Query white paper - Requires Acrobat Reader, Read the dtSearch Web Based Data Management white paper - Requires Acrobat Reader, Read the Crossing the Full-Text Search / Fielded Data Divide from a Development Perspective White Paper, Read the dtSearch Unicode and Text Retrieval White paper - Requires Acrobat Reader, Read the dtSearch User Manual - Requires Acrobat Reader, Read the dtSearch Adds Basis Technology API Support for Enhanced Chinese, Japanese and Korean Language Text Retrieval Press Release, Download the dtSearch V7.54 evaluation on to your computer - Expires After 30 Days

Operating System for Deployment: Windows Vista, Windows XP, Windows ME, Windows 2000, Windows 98, Windows NT 4.0, Windows 95

Architecture of Product: 32Bit

Product Type: Application

General: Supports Apartment Model Threading, Microsoft Transaction Server Compatible (MTS)

Compatible Containers: Microsoft Visual Studio .NET 2003, Microsoft Visual Studio .NET, Microsoft Visual Basic .NET 2003, Microsoft Visual Basic .NET, Microsoft Visual C++ .NET 2003, Microsoft Visual C++ .NET, Microsoft Visual C# .NET 2003, Microsoft Visual C# .NET, Microsoft Office 2000, Microsoft Office 97, Microsoft Internet Explorer 5.0, Microsoft Internet Explorer 4.0, C++Builder 4, C++Builder 3, Delphi 5.0, Delphi 4.0, Delphi 3.0, .NET Framework 1.1, .NET Framework 1.0

Product Class: Component Development Tools, .NET Development Tool

Search Items: New Version Jun 03, New Version Dec 03, New Product Aug 04, New Product Sep 04, New Product Feb 05, New Product Aug 05, New Product Sep 05
