dtSearch Language Extension Packs - Summary

by dtSearch Corp. - Product Type: Component / Application / ActiveX OCX / ActiveX DLL / DLL / VC++ Class Library / JavaBean / Java Class

Summary

Multilingual searching. dtSearch Engine/Web is supplied with stemming rules and a noise-word file for English (US) only. If you search documents written in other languages, plurals and other word variations may be missed and noise words may not be filtered out. ElectronArt Language Extension Packs are available for Eastern and Western European languages to improve search results for languages other than English (US).

Language Extension Pack series 400 (ElectronArt)

For dtSearch Text Retrieval Engine or Web 6.5 or later.

dtSearch Engine/Web is supplied with stemming rules and a noise-word file for English (US). Stemming is the only search expansion option which is 'on' by default in the dtSearch end-user products, because stemming is almost always useful when making a search and adds little to the time the search takes. Unlike some other search engines, dtSearch applies stemming at search time, so there is no need to build indexes specifically to apply stemming and no need to build separate indexes for each language in use.

The problem

With the stemming option selected, dtSearch will find plurals and many other variations; for example, a search on print will automatically find printers, printing and printed. However, if you are searching documents written in other languages, the English stemming rules will cause you to miss many word variations which do not occur in English (e.g. verb and noun changes with gender), and unrelated words may be matched in error.
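The effect described above can be illustrated with a minimal sketch of search-time stemming. The suffix lists and the stem and search functions are hypothetical toy examples written for this illustration; they are not dtSearch's stemming rule format or API. The sketch shows English rules working on English words and then missing German variations that a German rule set would catch.

```python
# Toy suffix lists for illustration only; NOT dtSearch's actual rule files.
ENGLISH_SUFFIXES = ["ers", "ing", "ed", "er", "s"]
GERMAN_SUFFIXES = ["en", "er", "t"]


def stem(word, suffixes):
    """Strip the first matching suffix, keeping a root of at least 3 letters."""
    word = word.lower()
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(term, indexed_words, suffixes):
    """Stem at search time: the index stores plain words, and the rules
    are applied only when comparing the search term with indexed words."""
    target = stem(term, suffixes)
    return sorted(w for w in indexed_words if stem(w, suffixes) == target)


english_index = {"print", "printers", "printing", "printed", "prince"}
german_index = {"drucken", "druckt", "drucker"}

# English rules on English text: the variations are found.
print(search("print", english_index, ENGLISH_SUFFIXES))

# English rules on German text: only the exact word is found.
print(search("drucken", german_index, ENGLISH_SUFFIXES))

# German rules on the same, unchanged index: the variations are found.
print(search("drucken", german_index, GERMAN_SUFFIXES))
```

Because the rules are applied when the search runs, switching from the English list to the German list changes the results without rebuilding the index.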

Furthermore, the English noise-word list, which is designed to keep unwanted English words out of your index and so keep the index small, is not suitable for other languages; your indexes may contain many words which are not useful in searches and which add to the size of your indexes.
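A rough sketch of the index-size effect follows. The noise-word sets and the build_index function are hypothetical and written only for this illustration; they are not the contents or format of the files supplied with dtSearch or the Language Extension Packs.

```python
# Hypothetical noise-word lists, for illustration only.
ENGLISH_NOISE = {"the", "and", "of", "a", "in"}
FRENCH_NOISE = {"le", "la", "les", "et", "de", "un", "une"}


def build_index(text, noise_words):
    """Map each word that is not a noise word to its positions in the text."""
    index = {}
    for position, word in enumerate(text.lower().split()):
        if word not in noise_words:
            index.setdefault(word, []).append(position)
    return index


french_text = "le chat et le chien de la maison"

# The English list filters nothing out of French text.
with_english = build_index(french_text, ENGLISH_NOISE)

# A French list leaves only the words worth searching for.
with_french = build_index(french_text, FRENCH_NOISE)

print(len(with_english), len(with_french))
```

With the English list every French word ends up in the index, including le, et, de and la; with a French list only chat, chien and maison remain.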

The solution

Use language-specific files in place of the default US English files. These are supplied in the form of Language Extension Packs, which contain files for many languages (see the list below). All files are in Unicode format.

Language Extension Packs 400 Series

Western European Group (LEP402)

Danish

Dutch

English

Finnish

French*

German*

Italian

Norwegian

Portuguese

Spanish

Swedish

* LEP400 and LEP402 also include unique bilingual French/English and German/English stemming and noise-word files, which enable search expansion on indexes and documents containing a mix of French or German and English text.

Eastern European Group (LEP403)

Belarusian

Bulgarian (new)

Czech

Estonian

Greek

Hungarian

Latvian

Lithuanian

Polish

Russian

Slovak

Slovenian (new)

Turkish

Ukrainian

Language Packs include:

Stemming rule files and noise-word files for each supported language.

Test files to check the operation of stemming in all the supplied languages.

Stemming Language Selector application, which lets you change the stemming rules from the Windows Start menu.

One year of on-line technical support and updates.

PartNumbers: PC-515281-70484 515281-70484 PC-515281-70482 515281-70482 PC-515281-70483 515281-70483

Publisher PartNumbers: LEP402

PurchaseOptions: dtSearch Language Extension Pack Western + Eastern European (LEP400) 1 Server License, dtSearch Language Extension Packs Western European (LEP402) 1 Server License, dtSearch Language Extension Packs Eastern European (LEP403) 1 Server License

Operating System for Deployment: Windows XP, Windows ME, Windows 2000, Windows 98, Windows NT 4.0, Windows 95

Architecture of Product: 32Bit

Product Type: Component, Application

Component Type: ActiveX OCX, ActiveX DLL, DLL, VC++ Class Library, JavaBean, Java Class

General: Internet Enhanced, Supports Apartment Model Threading, Supports Free Threading, Microsoft Transaction Server Compatible (MTS)

Compatible Containers: Microsoft Visual Studio .NET, Microsoft Visual Studio 6.0, Microsoft Visual Studio 97, Microsoft Visual Basic .NET, Microsoft Visual Basic 6.0, Microsoft Visual Basic 5.0, Microsoft Visual C++ .NET, Microsoft Visual C++ 6.0, Microsoft Visual C++ 5.0, Microsoft Visual C# .NET, Microsoft Visual InterDev 6.0, Microsoft Visual InterDev 1.0, Microsoft Visual FoxPro 6.0, Microsoft Office 2000, Microsoft Office 97, Microsoft Access 2000, Microsoft Access 97, Microsoft SQL Server 7.0, Microsoft SQL Server 6.5, Microsoft Outlook, Microsoft Internet Information Server 5.0, Microsoft Internet Information Server 4.0, Microsoft FrontPage, Microsoft Internet Explorer 5.0, Microsoft Internet Explorer 4.0, CodeGear C++ 5.0 (formerly Borland), CodeGear C++ (formerly Borland), C++Builder 5, C++Builder 4, C++Builder 3, Delphi 5.0, Delphi 4.0, Delphi 3.0, JBuilder 4, JBuilder 3.5, JBuilder 3, JBuilder 2, JBuilder 1, .NET Framework 1.0

Search Items: New Product July 04, New Product Aug 04, New Product Mar 05
