<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>ComponentSource Topic | dtSearch Corp.</title><link>http://www.componentsource.com/topics/dtsearch/index.html</link><description></description><language>en-us</language><lastBuildDate>Fri, 10 Feb 2012 00:00:07 GMT</lastBuildDate><copyright>(C) Copyright 1996-2012 ComponentSource.</copyright><atom:link href="http://www.componentsource.com/topics/dtsearch/rss.xml" rel="self" type="application/rss+xml"/><item><title>dtSearch excludes noise words</title><link>http://www.componentsource.com/news/2010/05/20/dtsearch-publish.html?rc=ni_4107</link><description>&lt;div class="image"&gt;&lt;img src="http://ftp.componentsource.com/res/pub/media/3/2059/default_w350.png?rc=ni_4107" alt="A document and search results in the dtSearch boolean search sample."/&gt;&lt;p&gt;&lt;small&gt;A document and search results in the dtSearch boolean search sample.&lt;/small&gt;&lt;/p&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Version 7.64 adds dtsListIndexSkipNoiseWords to list words in an index without including any noise words.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The dtSearch product line can instantly search terabytes of text across a desktop, network, Internet or Intranet site. dtSearch products also serve as tools for publishing large document collections, with instant text searching, to Web sites or portable media. 
Developers can embed dtSearch's instant searching and file format support into their own applications.&lt;/p&gt;&lt;p&gt;The following editions are available:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.componentsource.com/products/dtsearch-desktop-spider/index.html?rc=ni_4107"&gt;dtSearch Desktop with Spider&lt;/a&gt;&lt;br/&gt;Instantly search your desktop, as well as selected Spidered Web sites.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.componentsource.com/products/dtsearch-network-spider/index.html?rc=ni_4107"&gt;dtSearch Network with Spider&lt;/a&gt;&lt;br/&gt;Instantly search the many forms of data that exist across a large enterprise network; Spider adds remote Web sites to a searchable database. &lt;/li&gt;&lt;li&gt;&lt;a href="http://www.componentsource.com/products/dtsearch-publish/index.html?rc=ni_4107"&gt;dtSearch Publish&lt;/a&gt;&lt;br/&gt;Quickly publish instantly searchable document collections or web site content to portable media (CDs, DVDs, external hard drives, etc.) &lt;/li&gt;&lt;li&gt;&lt;a href="http://www.componentsource.com/products/dtsearch-text-retrieval-engine-win-net/index.html?rc=ni_4107"&gt;dtSearch Text Retrieval Engine for Win and .NET (32-bit / 64-bit)&lt;/a&gt;&lt;br/&gt;Add dtSearch search features and built-in format support to your application; API supports .NET, C++, Java, SQL, etc. 
.NET Spider API.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.componentsource.com/products/dtsearch-web-spider/index.html?rc=ni_4107"&gt;dtSearch Web with Spider (32-bit / 64-bit)&lt;/a&gt;&lt;br/&gt;Quickly publish a wide variety of file types to a Web site; Spider adds local or remote web sites (including dynamically-generated content) to a site's searchable database.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Updates in V7.64&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Enhancements (dtSearch Engine)&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Added dtsSearchLanguageAnalyzerSynonyms flag to enable using a language analyzer to generate morphological variations on a search term at search time. When this flag is set, the language analyzer is called for each word or phrase in the search request. The flag dtsLaInputIsSearchTerm is passed to the language analyzer in dtsLaJob.flags, so the language analyzer knows why it is being called.&lt;/li&gt;&lt;li&gt;Added dtssGetWordBreaker API function to provide direct access to the dtSearch Engine's internal word breaker using the language analyzer API. 
For sample code demonstrating how to use this API, see the WordBreak example in examples\vc8\WordBreak.&lt;/li&gt;&lt;li&gt;Added more structural information to the output generated by conversion to the it_ContentAsXml file format.&lt;/li&gt;&lt;li&gt;Added to COM interface: WordListBuilder.ListFieldValues, WordListBuilder.SetFilter, and IndexJob.EnumerableFields.&lt;/li&gt;&lt;li&gt;Added dtsListIndexSkipNoiseWords flag for ListIndexJob to list words in an index without including any noise words.&lt;/li&gt;&lt;li&gt;Added dtsoFfSkipDataSourceFields flag for Options.FieldFlags to prevent DocFields values from appearing in FileConverter output&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Fixes and minor enhancements&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Fixed incorrect display of CreationDate and ModDate properties in PDF files&lt;/li&gt;&lt;li&gt;Fixed incorrect hit highlighting when Unicode Filtering options at search time differ from the options used to index a file. To ensure consistent options, Unicode Filtering options are stored in the index when the index is created, in the index_a.ix file.&lt;/li&gt;&lt;li&gt;Fixed error updating index when directory specified for temporary files is inaccessible.&lt;/li&gt;&lt;li&gt;Fixed index merge bug causing &amp;quot;Inconsistent doc ids from target index&amp;quot; error during merge.&lt;/li&gt;&lt;li&gt;Fixed two search report bugs causing incorrect hit highlighting.&lt;/li&gt;&lt;li&gt;Improved formatting of documents converted from Ami Pro and Quattro Pro to HTML&lt;/li&gt;&lt;li&gt;Added automatic detection of gb2312 and JIS encoding.&lt;/li&gt;&lt;li&gt;Added automatic detection of XyWrite, XBase, WordStar 3.x, WordPerfect 4.2, and TAR files.&lt;/li&gt;&lt;li&gt;Improved reporting of file types by FileConverter.DetectedTypeId, providing much more specific information about Microsoft Word versions and adding type detection for additional file formats&lt;/li&gt;&lt;li&gt;Added support for text extraction from 
Adobe FrameMaker MIF, XFA form templates in PDF files, and Visio XML files&lt;/li&gt;&lt;li&gt;Fixed &amp;quot;Excessive nesting&amp;quot; error indexing OpenOffice documents due to bug parsing table structure&lt;/li&gt;&lt;li&gt;Fixed RTF file parser bug affecting handling of the \upr tag&lt;/li&gt;&lt;li&gt;Other file parser bug fixes affecting Multimate, Lotus 1-2-3, PDF, Word, PowerPoint&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;About dtSearch Corp.&lt;/h3&gt;&lt;p&gt;A leading supplier of text retrieval software, &lt;a href="http://www.componentsource.com/features/dtsearch/index.html?rc=ni_4107"&gt;dtSearch Corp.&lt;/a&gt; develops, manufactures and sells the dtSearch text retrieval product line. dtSearch products have been the smart choice for Text Retrieval since 1991. The dtSearch product line is known for its &amp;quot;industrial-strength&amp;quot; (PC Magazine) ability to instantly search terabytes of text. The dtSearch product line includes end-user, enterprise and developer text retrieval products. The dtSearch product line also includes publishing capabilities, for publishing large document collections to Web sites or CD/DVD, and Spidering capabilities, for remote site and distributed searching access. dtSearch products have received multiple awards and hundreds of excellent press reviews. Fortune 500 companies and others with some of the most demanding document search needs in the world rely on dtSearch. 4 out of 5 of Fortune Magazine's most profitable companies have dtSearch developer or multi-user licenses. 
Typical corporate uses of dtSearch products include general information retrieval, Internet/Intranet site searching and access to technical documentation.&lt;/p&gt;</description><category>32 Bit</category><category>C++Builder</category><category>Delphi</category><category>Dev Tools &amp; IT Utilities</category><category>dtSearch Corp.</category><category>Embarcadero / CodeGear</category><category>Feature Releases</category><category>Internet Explorer</category><category>Microsoft</category><category>Office</category><category>Search</category><category>Visual Basic .NET</category><category>Visual C# .NET</category><category>Visual C++ .NET</category><category>Visual Studio .NET</category><category>Windows 2000</category><category>Windows 9X / ME</category><category>Windows Dev Tools</category><category>Windows NT</category><category>Windows Vista</category><category>Windows XP</category><guid isPermaLink="false">http://www.componentsource.com/news/2010/05/20/dtsearch-publish.html?rc=ni_4107</guid><pubDate>Thu, 20 May 2010 17:52:00 GMT</pubDate></item><item><title>dtSearch updates Language Packs</title><link>http://www.componentsource.com/news/2009/10/21/dtsearch-language-extension-packs.html?rc=ni_1288</link><description>&lt;div class="image"&gt;&lt;img src="http://ftp.componentsource.com/res/pub/media/1/736/default_w350.png?rc=ni_1288" alt="Highlighting search results in dtSearch."/&gt;&lt;p&gt;&lt;small&gt;Highlighting search results in dtSearch.&lt;/small&gt;&lt;/p&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Now includes Bulgarian and Slovenian, plus bilingual French/English and German/English stemming.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.componentsource.com/products/dtsearch-text-retrieval-engine-win-net/index.html?rc=ni_1288"&gt;dtSearch Text Retrieval Engine&lt;/a&gt; and &lt;a href="http://www.componentsource.com/products/dtsearch-web-spider/index.html?rc=ni_1288"&gt;dtSearch Web with Spider&lt;/a&gt; are supplied with stemming rules and a 
noise-word file for English (US). Stemming is the only search expansion option which is 'on' by default in the dtSearch end-user products; the reason for this is that stemming is almost always useful when making a search, and adds little to the time required to make a search. Unlike some other search engines, dtSearch applies stemming at search time, so there is no need to build indexes specifically to apply stemming and no need to build separate indexes for each language in use.&lt;/p&gt;&lt;p&gt;With the stemming option selected, dtSearch will find plurals and many other variations; for example, a search on print will automatically find printers, printing, and printed. However, if you are searching documents written in other languages, the English stemming rules will cause you to miss many word variations which do not occur in English (e.g. verb and noun changes with gender), and you may find that words which are unrelated are found in error. Furthermore, the English noise word list, which is designed to remove unwanted English words from your index to keep the index size small, is not suitable for other languages; your indexes may contain many words which will not be useful in searches and which will add to the size of your indexes.&lt;/p&gt;&lt;p&gt;The solution is to use language-specific files in place of the default US English files. These are supplied in the form of &lt;a href="http://www.componentsource.com/products/dtsearch-language-extension-packs/index.html?rc=ni_1288"&gt;Language Extension Packs&lt;/a&gt; which contain files for many languages. 
All files are in Unicode format.&lt;/p&gt;&lt;h3&gt;Updates&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;Bulgarian language available in Eastern European Group extension pack&lt;/li&gt;&lt;li&gt;Slovenian language available in Eastern European Group extension pack&lt;/li&gt;&lt;li&gt;French/English stemming available in Western European Group extension pack&lt;/li&gt;&lt;li&gt;German/English stemming available in Western European Group extension pack&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;About dtSearch Corp.&lt;/h3&gt;&lt;p&gt;A leading supplier of text retrieval software, &lt;a href="http://www.componentsource.com/features/dtsearch/index.html?rc=ni_1288"&gt;dtSearch Corp.&lt;/a&gt; develops, manufactures and sells the dtSearch text retrieval product line. dtSearch products have been the smart choice for Text Retrieval since 1991. The dtSearch product line is known for its &amp;quot;industrial-strength&amp;quot; (PC Magazine) ability to instantly search terabytes of text. The dtSearch product line includes end-user, enterprise and developer text retrieval products. The dtSearch product line also includes publishing capabilities, for publishing large document collections to Web sites or CD/DVD, and Spidering capabilities, for remote site and distributed searching access. dtSearch products have received multiple awards and hundreds of excellent press reviews. Fortune 500 companies and others with some of the most demanding document search needs in the world rely on dtSearch. 4 out of 5 of Fortune Magazine's most profitable companies have dtSearch developer or multi-user licenses. 
Typical corporate uses of dtSearch products include general information retrieval, Internet/Intranet site searching and access to technical documentation.&lt;/p&gt;</description><category>32 Bit</category><category>Access</category><category>ActiveX .NET Ready</category><category>ActiveX Components</category><category>ActiveX DLL</category><category>ActiveX OCX</category><category>C++ / MFC Class Libraries</category><category>C++Builder</category><category>Components</category><category>Delphi</category><category>Dev Tools &amp; IT Utilities</category><category>DLL</category><category>dtSearch Corp.</category><category>Embarcadero / CodeGear</category><category>Feature Releases</category><category>FrontPage</category><category>Internet Explorer</category><category>Java Class</category><category>Java Components</category><category>JavaBean</category><category>JBuilder</category><category>Microsoft</category><category>Office</category><category>Search</category><category>SQL Server Tools</category><category>Visual Basic</category><category>Visual Basic .NET</category><category>Visual C# .NET</category><category>Visual C++</category><category>Visual C++ .NET</category><category>Visual FoxPro</category><category>Visual Studio</category><category>Visual Studio .NET</category><category>Windows 2000</category><category>Windows 9X / ME</category><category>Windows Dev Tools</category><category>Windows NT</category><category>Windows XP</category><guid isPermaLink="false">http://www.componentsource.com/news/2009/10/21/dtsearch-language-extension-packs.html?rc=ni_1288</guid><pubDate>Wed, 21 Oct 2009 00:00:00 GMT</pubDate></item></channel></rss>
