GroupDocs.Text for .NET V16.11.0

New - Use the advanced document text extraction API to extract raw and formatted text from different document formats.
November 28, 2016
New Product

Features

GroupDocs.Text for .NET is a document text extraction API. It extracts text and metadata from Microsoft Word, Excel, and PowerPoint documents, email messages, container files that hold other files (such as ZIP archives), plain text files, and HTML, without requiring any document reader to be installed. The text extractor API is built for accurate, fast extraction. It also provides tools to detect encodings such as UTF-32 LE, UTF-32 BE, UTF-16 LE, UTF-16 BE, and more.
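To make the idea of encoding detection concrete, here is a minimal sketch of one common technique: checking for a byte order mark (BOM) at the start of the data. It is written in Python with the standard library only and is not the GroupDocs.Text API. The function name `detect_bom_encoding` is hypothetical, and the product may well use a different or more thorough method.

```python
import codecs

# Candidate BOMs, longest first. Order matters: the UTF-32 LE BOM
# (FF FE 00 00) starts with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom_encoding(data: bytes):
    """Return the encoding name implied by a leading BOM, or None.

    Hypothetical helper for illustration; it only inspects the BOM and
    does not attempt any statistical detection for BOM-less input.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A file without a BOM returns `None` here, so a real detector would need a fallback such as a default encoding or content-based heuristics.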

  • Advanced Document Text Extraction API Features
    • Extract raw and formatted text.
    • Extract metadata.
    • Extract text from containers that hold other files, such as ZIP archives.
    • Extract formatted text from TXT, Markdown and HTML files.
    • Support for encoding detection.
    • Support for media type detectors.
  • Text and Metadata Extractors - GroupDocs.Text provides various metadata and text extractors for different file types.
  • Container Text Extractor - Work with files that contain other documents, such as ZIP archives.
  • Supported Formats
    • DOCX : OOXML Document.
    • DOCM : OOXML Macro Enabled Document.
    • DOC : Word Document 97-2003.
    • RTF : Rich Text Format.
    • ODT : OpenDocument Text.
    • XLSX : OOXML Workbook 2007-2010.
    • XLSM : OOXML Macro Enabled Workbook.
    • XLSB : OOXML Binary Workbook.
    • XLS : Excel Workbook 97-2003.
    • CSV : Comma Separated Values.
    • ODS : OpenDocument Spreadsheet.
    • PPTX : OOXML Presentation.
    • PPSX : OOXML SlideShow.
    • PPSM : OOXML Macro Enabled Presentation.
    • PPT : PowerPoint Presentation 97-2003.
    • PPS : PowerPoint SlideShow 97-2003.
    • ODP : OpenDocument Presentation.
    • TXT : Plain text.
    • HTML (.xhtml, .htm) : Hypertext Markup Language document.
    • MHTML (.mht) : Web Archive Single File.
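
The container text extraction feature listed above can be illustrated with a short sketch: open an archive, walk its entries, and pull text out of the ones that are recognized. The example uses Python's standard `zipfile` module rather than the GroupDocs.Text API. The function name `extract_text_from_zip` is hypothetical, and handling only `.txt` entries with a caller-supplied encoding is a simplifying assumption.

```python
import io
import zipfile


def extract_text_from_zip(data: bytes, encoding: str = "utf-8") -> dict:
    """Return {entry name: decoded text} for every .txt entry in a ZIP archive.

    Hypothetical helper for illustration: a real container extractor would
    dispatch each entry to a format-specific extractor instead of
    skipping everything that is not plain text.
    """
    texts = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything that is not a .txt file.
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            raw = archive.read(info)
            texts[info.filename] = raw.decode(encoding, errors="replace")
    return texts


if __name__ == "__main__":
    # Build a small archive in memory so the example needs no files on disk.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("notes/readme.txt", "hello")
        archive.writestr("image.bin", b"\x00\x01")
    print(extract_text_from_zip(buffer.getvalue()))
```

Working on bytes in memory keeps the sketch independent of the file system; the same loop applies to an archive opened from a path.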
