This page has been archived and is no longer updated.

We no longer supply this product.

PDFlib TET PDF IFilter

Extract text and metadata from PDF documents.

Published by PDFlib
Distributed by ComponentSource since 2003

Version: 5.5 Updated: Jan 12, 2024

Please note: PDFlib TET PDF IFilter was officially retired as of December 19th 2024. If you are interested in this product, consider PDFlib instead.

PDFlib TET PDF IFilter Releases

Released: Jan 12, 2024

Updates in 5.5

Features

  • Security and performance updates of third-party components.
  • Enhanced all language bindings and updated them for the latest language versions, including Microsoft .NET 8, PHP 8.3, Perl 5.38 and Ruby 3.2.

Released: Dec 20, 2022

Updates in 5.4

Features

  • Security and performance updates of third-party components.
  • Enhanced all language bindings and updated them for the latest language versions, including Microsoft .NET 6/7, PHP 8.1/8.2, Perl 5.34/5.36 and Ruby 3.1.
  • Added support for ARM64/x86_64 bindings on Apple macOS.
  • Improved TIKA and MediaWiki connectors.

Fixes

  • Minor bug fixes and improvements.

Released: Nov 19, 2021

Updates in 5.3 (maintenance release)

Features

  • Added support for Microsoft Windows 11.

Released: May 4, 2021

Updates in 5.3

Features

  • Optimized PDF resource handling to improve performance for documents with excessive numbers of images, patterns or other resources.
  • Security and performance updates of all third-party components.
  • Hardened processing of damaged and illegal PDF documents by testing against the full Issue Tracker PDF corpus, which contains tens of thousands of stressful PDF files.
  • Expanded platform and CPU support including macOS on ARM64 and Linux on ARM64.
  • Timeout can be specified to limit processing time for large...

Released: Jul 25, 2019

Updates in 5.2

Features

  • Added support for SharePoint Server 2019.
  • Improved table detection with row and column span identification.
  • Mark Artifacts (irrelevant text and images) in TETML and the API.
  • Extract text and images from annotations and patterns.
  • Support for inline images and images in soft masks.
  • Security updates for third-party libraries.
  • Optionally retrieve Separation and DeviceN text colors in the simpler alternate color space instead of the rather complex native color space.

Released: Jun 1, 2017

Updates in 5.1

Features

  • Text in bookmarks, annotations (comments) and form fields is now indexed by default.
  • Additional controls for indexing metadata properties.
  • Extended samples for custom metadata properties.
  • Streamlined output in the Windows event log.
  • Auxiliary tool for querying Windows metadata properties.
  • Additional predefined properties (XMP and PDF-based).

 

Released: Nov 9, 2015

Updates in this release

Updates in PDFlib TET 5

  • Retrieve fill and stroke color of text.
  • Improved page and table layout recognition.
  • Support vertical font metrics for CJK text.
  • Significantly enhanced merging of fragmented images.
  • Extract image masks and soft masks.
  • Merge and convert JPEG 2000-compressed images.
  • Preserve named spot colors in extracted TIFF images.
  • Honor layers and clipping paths.
  • Check whether an area on the page is empty, e.g. before placing a stamp or barcode.
  • TET's XML output called TETML includes...

Released: Aug 19, 2010

Updates in this release

Updates in V4.0

New features in PDFlib TET 4.0:

  • Performance enhancements: faster for many classes of documents
  • Higher speed and smaller memory consumption for very large documents up to hundreds of thousands of pages
  • Extract right-to-left and bidirectional text for Arabic, Hebrew, etc.
  • Unicode post-processing:
      • Foldings preserve, remove or replace characters
      • Decompositions replace a character with an equivalent sequence, e.g. replace narrow or vertical Japanese characters with their standard...