PDFlib TET Improves Language Binding Support

New version adds support for PHP 5.6, Perl 5.20, Python 3.4, Ruby 2.1 and 2.2.
March 2, 2015
New Feature

PDFlib TET (Text Extraction Toolkit) reliably extracts text, images and metadata from any PDF file. It is available as a component or as a command-line tool. It reads text contents as Unicode strings or structured XML and includes detailed glyph and font information. With PDFlib TET you can retrieve the corresponding Unicode values for text in a PDF document, as well as its position on the page.

Updates in 4.4

  • Language binding support for PHP 5.6, Perl 5.20, Python 3.4, Ruby 2.1 and 2.2.
  • Image extraction
    • Correct color output for certain CMYK JPEG images.
    • Enhanced merging of fragmented images.
    • Enhanced filtering of small images.
    • Reduced memory requirements.
  • Enhanced dropcap detection.
  • Workarounds for malformed fonts and PDFs.
  • Improved results for text in right-to-left languages.

About PDFlib

Munich-based PDFlib, founded in 2000, develops and sells leading-edge components for server-centric generation and processing of PDF documents. PDFlib customers use the software for automated, high-volume generation and processing of PDF documents in business and prepress workflows and in online billing systems. PDFlib GmbH sells worldwide, with its main markets in North America, Germany and Japan.

Extracted text from a PDF using PDFlib TET.


