by dtSearch Corp. - Product Type: Component / Application / .NET Class / ActiveX OCX / ActiveX DLL / DLL / VC++ Class Library / JavaBean / Java Class / ASP
dtSearch Web with Spider (32-bit / 64-bit) by dtSearch Corp.
Add search functionality to your Web site. dtSearch Web with Spider quickly publishes a wide variety of content to an IIS Web site with instant text searching. Most searches take less than a second, even through multiple terabytes of text. Over a dozen indexed & fielded data search options. Highlights hits in HTML, XML & PDF, while displaying links & images. Converts word processor, database, spreadsheet, ZIP, etc. files to HTML, with highlighted hits. Operating through dtSearch Web, the spider can expand the scope of the searchable database beyond a site's own data to content on third-party Web sites (both publicly available and secure), including support for a wide variety of content (HTML, XML, ASP.NET, MS CMS, SharePoint, etc.) and WYSIWYG hit-highlighted displays. An optional API for SQL, Java, C++ and .NET is available. Now includes a 64-bit version. dtSearch supports Microsoft Access, Excel (*.xls, *.xlsb, *.xlsx), Word (*.doc, *.docx, *.rtf), and PowerPoint (*.ppt, *.pptx) files created by Office 2010.
dtSearch Web with Spider (32-bit / 64-bit) Main Features
Publish instantly searchable data to your website (supports IIS server)
25+ fielded & full-text search options (supports hundreds of international languages)
File parsers/converters highlight hits in popular file types
Spiders static & dynamic web data; hit-highlighted WYSIWYG displays
Fast, precision searching
Over two dozen text search options
Most indexed searches take less than a second, even through very large databases
Also has unindexed searching
Automatically recognizes word processor, database, spreadsheet, email, PDF, ZIP, HTML, XML, Unicode files & more
FindPlus distributed searching extends the reach of a single search request to remote enterprise servers
Point and click setup
Highlights hits in HTML and PDF while keeping embedded links and images intact
Converts other file types to HTML for display with highlighted hits
The new release adds FindPlus distributed searching, a Web spider, enhanced XML support and Unicode support, to improve access to information throughout an organization. The new release also offers API enhancements, expanding the utility of the dtSearch developer components for use with a wide variety of programming languages.
dtSearch products offer instant indexed (and slower unindexed) searching of large document collections. Proprietary indexing and searching algorithms maintain a fast rate of indexing and virtually instantaneous searching over very large document collections.
Over two dozen text search options can work alone or in combination for unmatched intelligent searching. Search features include: fuzziness adjustable from 0 to 10, synonym/concept/thesaurus, boolean, phrase, wildcard, proximity, stemming, numeric range, natural language relevancy-ranked by hit density and rarity, variable term weighting, indexed and unindexed searching, and more.
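To make the idea of adjustable fuzziness concrete, here is a minimal Python sketch. It is not dtSearch's algorithm, which is proprietary and not described in this listing. It assumes, purely for illustration, that each fuzziness level tolerates one more single-character edit (Levenshtein distance); the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def fuzzy_matches(term, vocabulary, fuzziness):
    """Return vocabulary words within `fuzziness` edits of `term`.

    Assumption for this sketch: fuzziness 0 means exact match only, and
    each higher level allows one more edit.
    """
    term = term.lower()
    return [w for w in vocabulary if edit_distance(term, w.lower()) <= fuzziness]


words = ["alphabet", "alphabit", "alpaca", "beta"]
print(fuzzy_matches("alphabet", words, 0))
print(fuzzy_matches("alphabet", words, 1))
```

At level 0 only the exact word is returned; at level 1 a one-letter misspelling such as "alphabit" is also matched.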
dtSearch products automatically recognize a wide variety of document types, including word processor, database, spreadsheet, ZIP, XML and more. The products highlight hits in HTML and PDF while keeping all embedded links and images intact. The products have built-in file converters to convert other popular file types to HTML for display with highlighted hits.
FindPlus distributed searching is an integrated feature of dtSearch Desktop, Web and Network that conveniently allows a single search request to span everything from local drives to remote servers. Operating through a single user interface, FindPlus enables indexed searching of files and other data throughout an organization, without the need to collect the data in a monolithic repository. Because FindPlus uses an XML-based protocol for exchanging and aggregating search information, developers using the dtSearch Engine can also easily incorporate this capability into their own applications.
In addition, enhanced XML support provides a way to combine data from any source, while retaining the ability to search on field and table information. XML is increasingly becoming a universal data format. However, other search engines do not fully incorporate the hierarchical structures in XML data, effectively reducing XML to "flat" text. In contrast, dtSearch can perform indexed searches using the full range of dtSearch features across an entire XML database, or limited to a specific combination of fields or sub-fields, with no sacrifice in speed.
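The difference between "flat" text searching and field-aware XML searching can be sketched in a few lines of Python. This uses only the standard library and does not call the dtSearch Engine; the sample data, the field-path notation (`book/author/last`) and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<library>
  <book>
    <title>Indexing Basics</title>
    <author><first>Ana</first><last>Search</last></author>
  </book>
  <book>
    <title>Search Patterns</title>
    <author><first>Ben</first><last>Index</last></author>
  </book>
</library>
"""


def field_entries(xml_text):
    """Yield (record_number, field_path, text) for every leaf element."""
    root = ET.fromstring(xml_text.strip())
    for number, record in enumerate(root):
        stack = [(record, record.tag)]
        while stack:
            element, path = stack.pop()
            children = list(element)
            if children:
                # Keep the hierarchy by extending the path for each child.
                stack.extend((child, f"{path}/{child.tag}") for child in children)
            elif element.text and element.text.strip():
                yield number, path, element.text.strip()


def search(xml_text, word, field=None):
    """Return record numbers containing `word`, optionally limited to a
    field path or any of its sub-fields."""
    word = word.lower()
    hits = set()
    for number, path, text in field_entries(xml_text):
        in_scope = field is None or path == field or path.startswith(field + "/")
        if in_scope and word in text.lower().split():
            hits.add(number)
    return sorted(hits)


print(search(SAMPLE, "search"))                 # whole document, "flat" search
print(search(SAMPLE, "search", "book/title"))   # limited to one field
print(search(SAMPLE, "search", "book/author"))  # field plus its sub-fields
```

A flat search for "search" finds both records, while restricting the query to `book/title` or `book/author` narrows it to the record where the word appears in that part of the hierarchy.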
Other new features include:
A web indexing spider, providing a way to use other web sites as instantly searchable resources
Unicode support, enabling indexing and searching of text data in nearly any language
New developer features include:
Java support through a JNI interface
More sample code in C++, Visual C++, Visual Basic and Delphi
Improved multithreaded operation for use with ASP and .NET
More sample source code to dtSearch Web, for both ASP and ISAPI-based versions
Improved indexing and searching of ActiveX and other data sources (such as SQL databases), with hit highlighted search results display
Search results serialization as an XML or URL-encoded stream
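As a rough illustration of what serializing search results as an XML or URL-encoded stream can look like, the Python sketch below uses only the standard library. The element names, the `itemN.field` key scheme and the function names are hypothetical and do not reflect the actual dtSearch stream format.

```python
from urllib.parse import urlencode, parse_qsl
import xml.etree.ElementTree as ET


def to_xml(results):
    """Serialize a list of result dicts as a small XML document."""
    root = ET.Element("SearchResults")
    for item in results:
        node = ET.SubElement(root, "Item")
        for key, value in item.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_url_encoded(results):
    """Serialize results as key=value pairs joined by '&', with
    reserved characters escaped."""
    pairs = []
    for number, item in enumerate(results):
        for key, value in item.items():
            pairs.append((f"item{number}.{key}", str(value)))
    return urlencode(pairs)


def from_url_encoded(stream):
    """Rebuild the list of result dicts (all values come back as strings)."""
    items = {}
    for key, value in parse_qsl(stream):
        prefix, name = key.split(".", 1)
        items.setdefault(int(prefix[len("item"):]), {})[name] = value
    return [items[number] for number in sorted(items)]


results = [{"filename": "a b.doc", "hits": 3}, {"filename": "c.pdf", "hits": 1}]
print(to_xml(results))
print(to_url_encoded(results))
print(from_url_encoded(to_url_encoded(results)))
```

The URL-encoded form is convenient for passing results between pages in a query string or form post, while the XML form is easier to inspect and to transform.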
What's New in dtSearch 7.64?
Enhancements (dtSearch Engine)
Added dtsSearchLanguageAnalyzerSynonyms flag to enable using a language analyzer to generate morphological variations on a search term at search time. When this flag is set, the language analyzer is called for each word or phrase in the search request. The flag dtsLaInputIsSearchTerm is passed to the language analyzer in dtsLaJob.flags, so the language analyzer knows why it is being called.
Added dtssGetWordBreaker API function to provide direct access to the dtSearch Engine's internal word breaker using the language analyzer API. For sample code demonstrating how to use this API, see the WordBreak example in examples\vc8\WordBreak.
Added more structural information to the output generated by conversion to the it_ContentAsXml file format.
Added to COM interface: WordListBuilder.ListFieldValues, WordListBuilder.SetFilter, and IndexJob.EnumerableFields.
Added dtsListIndexSkipNoiseWords flag for ListIndexJob to list words in an index without including any noise words.
Added dtsoFfSkipDataSourceFields flag for Options.FieldFlags to prevent DocFields values from appearing in FileConverter output
Fixes and minor enhancements
Fixed incorrect display of CreationDate and ModDate properties in PDF files
Fixed incorrect hit highlighting when the Unicode Filtering options at search time differ from the options used to index a file. To ensure consistent options, Unicode Filtering options are stored in the index when the index is created, in the index_a.ix file.
Fixed error updating index when directory specified for temporary files is inaccessible.
Fixed index merge bug causing "Inconsistent doc ids from target index" error during merge.
Fixed two search report bugs causing incorrect hit highlighting.
Improved formatting of documents converted from Ami Pro and Quattro Pro to HTML
Added automatic detection of gb2312 and JIS encoding.
Added automatic detection of XyWrite, XBase, WordStar 3.x, and WordPerfect 4.2 and TAR files.
Improved reporting of file types by FileConverter.DetectedTypeId, providing much more specific information about Microsoft Word versions and adding type detection for additional file formats
Added support for text extraction from Adobe Framemaker MIF, XFA form templates in PDF files, and Visio XML files
Fixed "Excessive nesting" error indexing OpenOffice document due to bug parsing table structure
Fixed RTF file parser bug affecting handling of the \upr tag
Other file parser bug fixes affecting Multimate, Lotus 1-2-3, PDF, Word, PowerPoint
What's New in dtSearch 7.63?
Added IndexFileInfo.UserFields in .NET API to provide access to stored fields through the IIndexStatusHandler callback interface during indexing
Added dtsnIndexDeletedFileRemoved, dtsnIndexListedFileRemoved, and dtsnIndexListedFileNotRemoved notifications to the indexing status callbacks to notify the calling application when files are removed from the index during indexing or when an attempt to remove a listed file fails
What's New in dtSearch 7.62?
Regular Expression searching extended to support TR1 regular expressions
Added new cmap files for PDF extraction
Reduced memory use for searches that retrieve large numbers of documents with a relatively small MaxFilesToRetrieve value
What's New in dtSearch 7.61?
Added new user interface appearance options and updated toolbar icons
What's New in dtSearch 7.5?
New dtSearch Desktop with Spider 64-bit version: The new release includes a native 64-bit version of the dtSearch Engine for Win & .NET (for .NET 2.0/3.0) for developers to integrate into web-based and other applications. The 64-bit version provides full API access to dtSearch's terabyte indexer and search functionality, file format and database support (including SQL BLOB data).
International language enhancements: dtSearch products include international language support through Unicode, covering hundreds of international languages. The new version adds improved searching of Chinese, Japanese and Korean text presented without spaces between words. The new version also offers improved developer API integration with third-party international language morphological analyzers like those from Basis Technology
What's New in dtSearch 7.43?
Fixed bug in PDF file parser affecting decoding of CID fonts in PDF files
Fixed error extracting item from TAR file to hit-highlight after search
Added detection of the following file types with missing or incorrect filename extensions: Microsoft Word 2003 XML files, Microsoft Excel 2003 XML files.
Fixed error indexing using data source API under WebSphere
Fixed extra spacing in output when HTML converted to UTF-8 text
What's New in dtSearch 7.40?
Automatic recognition of dates, email addresses, and credit card numbers in text
Support for Vista XMP metadata
Support for PowerPoint 2007 (*.pptx). (The product line already supports Word 2007 (*.docx) and Excel 2007 (*.xlsx))
Support for Vista XML Paper Specification (*.xps) documents
A new IndexCache object in the .NET 2.0 API, and dtsIndexCache object in the C++ API of the dtSearch Engine. The new objects enable much faster searching when a series of searches must be done against a small number of indexes
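The automatic recognition of email addresses and credit card numbers listed above can be approximated with pattern matching plus a checksum. The Python sketch below is an illustration only, not the dtSearch implementation: the regular expressions are simplified assumptions, date recognition is left out, and the Luhn checksum is a standard way to discard digit runs that cannot be card numbers.

```python
import re

# Simplified patterns, assumed for this sketch.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits, optional separators


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, subtract 9
    from results above 9, and require the total to be divisible by 10."""
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def recognize(text):
    """Return email addresses and Luhn-valid card numbers found in text."""
    cards = []
    for match in CARD.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            cards.append(digits)
    return {"emails": EMAIL.findall(text), "cards": cards}


sample = "Contact sales@example.com, card 4111 1111 1111 1111, ref 1234 5678 9012 3456."
print(recognize(sample))
```

In the sample, the well-known test number 4111 1111 1111 1111 passes the checksum, while the reference number 1234 5678 9012 3456 has the right shape but fails it and is ignored.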
What's New in dtSearch 7.30?
Enhancements (All products)
Added preliminary support for Word 2007 (*.docx) and Excel 2007 (*.xlsx) based on the current Office 2007 beta and available documentation.
Added support for JPG and TIFF metadata, including EXIF and IPTC fields.
Unicode filtering file parser can handle individual documents larger than 2 GB, and support for files larger than 2 GB added to the extext.exe utility
Improved handling of partially inaccessible email files. In previous versions, if an email had encrypted or corrupt data (for example, an encrypted attachment), the whole email was reported as encrypted or corrupt. In this version, the readable portion of the message is indexed and the unreadable portion is separately reported as a partially encrypted or partially unreadable file. This change applies to Outlook messages, TNEF files, .eml files, MBOX archives, and .msg files.
Enhancements (dtSearch Engine)
Beta x64 (64-bit) versions of the dtSearch Indexer and dtSearch Engine (dtIndexer64.exe, dtengine64.dll, and dtSearchNetApi2.dll). The index format and APIs (C++, COM, and .NET) are identical to the 32-bit version. The 64-bit components are in a separate download file (dtSearch64_730.exe) with the same installation password as the dtSearch Engine SDK.
Added alternative PDF highlighting mechanism for client-based applications (see "Highlighting Hits in PDF files" in the API Overviews section for details)
Added ListIndexJob object to the .NET 2.0 API to list files, words, or fields in an index (see dtSearchNetApi2.chm for API reference)
Added dtsListIndexIncludeDocId flag for dtsListIndexJob and ListIndexJob to provide a quick way to list all documents in an index and the doc id for each document
C++ API Changes to support 64-bit file sizes in dtsInputStream (added size64 and seek64), dtsInputStreamReader, dtsFileInfo (added size64), dtsSearchResultsItem (added size64). These changes preserve binary compatibility for the dtSearch Engine DLL, but some C++ code may trigger new warnings when compiled because of 64-bit values returned.
Added dtsIndexKeepExistingDocIds flag to specify that, when compressing an index, the indexer should not remap document ids, so document ids will be unmodified in the index once compression is done.
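Why keeping document ids stable during compression matters can be shown with a toy model. The Python sketch below is not the dtSearch index format or API: it assumes an index is simply a mapping from doc id to filename, with `None` marking a deleted document, and the function name and parameter are hypothetical.

```python
def compress_index(documents, keep_existing_doc_ids=False):
    """Drop deleted entries from a toy index.

    documents: dict mapping doc id -> filename, with None for deleted documents.
    By default the surviving documents are renumbered 1..n; with
    keep_existing_doc_ids=True they keep their original ids.
    """
    live = [(doc_id, name) for doc_id, name in sorted(documents.items())
            if name is not None]
    if keep_existing_doc_ids:
        return dict(live)
    return {new_id: name for new_id, (_, name) in enumerate(live, start=1)}


index = {1: "a.doc", 2: None, 3: "c.pdf"}
print(compress_index(index))                              # ids remapped
print(compress_index(index, keep_existing_doc_ids=True))  # ids preserved
```

With remapping, "c.pdf" moves from id 3 to id 2, so any doc id stored outside the index (for example in an application database) would point at the wrong document. Preserving ids avoids that at the cost of leaving gaps in the id range.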
Fixes and minor enhancements
What's New in dtSearch 7.20?
New file parsers for OpenOffice documents, spreadsheets, and presentations (*.sxw, *.sxc, *.odt, *.ods, etc.), covering OpenOffice version 1 and OpenOffice version 2 (the "Open Document Format for Office Applications")
New file parsers for the Microsoft Office XML formats (Microsoft Word 2003 XML and Microsoft Excel 2003 XML)
Added "Opening containing folder" in right-click menu for retrieved items
Improved reporting of errors that occur when copying files in Edit > Copy File(s)
dtindexer.exe: added /caf and /cat command-line options to cache text (/cat) or cache original files (/cad) when creating indexes using the command line, and /recog to recognize an index.
Added Help > Check For Updates feature to automatically download new versions
The new release includes major enhancements to the dtSearch product line's display of MS Word, Excel and PowerPoint documents. The new release also includes enhancements for indexing and searching Outlook message stores. Finally, the new release includes an additional feature for forensics usage.
Operating System for Deployment: Windows 7, Windows Vista, Windows XP, Windows ME, Windows 2000, Windows 98, Windows NT 4.0, Windows 95
Architecture of Product: 32Bit, 64Bit
Product Type: Component, Application
Component Type: .NET Class, ActiveX OCX, ActiveX DLL, DLL, VC++ Class Library, JavaBean, Java Class
Application Type: ASP
Built Using: MFC V4.2 / V6.0, ActiveX Template Library (ATL), Java 2 SDK (JDK 1.2)
General: Internet Enhanced, Supports Apartment Model Threading, Supports Free Threading, Microsoft Transaction Server Compatible (MTS)
Compatible Containers: Microsoft Visual Studio 2010, Microsoft Visual Studio 2008, Microsoft Visual Studio 2005, Microsoft Visual Studio .NET 2003, Microsoft Visual Studio .NET, Microsoft Visual Studio 6.0, Microsoft Visual Studio 97, Microsoft Visual Basic 2010, Microsoft Visual Basic 2008, Microsoft Visual Basic 2005, Microsoft Visual Basic .NET 2003, Microsoft Visual Basic .NET, Microsoft Visual Basic 6.0, Microsoft Visual Basic 5.0, Microsoft Visual C++ 2010, Microsoft Visual C++ 2008, Microsoft Visual C++ 2005, Microsoft Visual C++ .NET 2003, Microsoft Visual C++ .NET, Microsoft Visual C++ 6.0, Microsoft Visual C++ 5.0, Microsoft Visual C# 2010, Microsoft Visual C# 2008, Microsoft Visual C# 2005, Microsoft Visual C# .NET 2003, Microsoft Visual C# .NET, Microsoft Visual InterDev 6.0, Microsoft Visual InterDev 1.0, Microsoft Visual FoxPro 6.0, Microsoft Office 2010, Microsoft Office 2007, Microsoft Office 2003, Microsoft Office XP, Microsoft Office 2000, Microsoft Office 97, Microsoft Access 2007, Microsoft Access 2003, Microsoft Access 2002, Microsoft Access 2000, Microsoft Access 97, Microsoft Outlook, Microsoft Internet Information Server 6.0, Microsoft Internet Information Server 5.0, Microsoft Internet Information Server 4.0, Microsoft FrontPage, Microsoft Internet Explorer 8.0, Microsoft Internet Explorer 7.0, Microsoft Internet Explorer 6.0, Microsoft Internet Explorer 5.5, Microsoft Internet Explorer 5.0, Microsoft Internet Explorer 4.0, C++Builder 5, C++Builder 4, C++Builder 3, CodeGear C++ (formerly Borland), CodeGear C++ 5.0 (formerly Borland), JBuilder 4, JBuilder 3.5, JBuilder 3, JBuilder 2, JBuilder 1, .NET Framework 2.0, .NET Framework 1.1, .NET Framework 1.0
Product Class: User Interface Components
Keywords: dtSearch Web, dtSearch 7 Web Spider