PDFlib TET updated

June 10, 2014 - 10:01
Patch release

PDFlib TET (Text Extraction Toolkit) is software for reliably extracting text from any PDF file. It is available as a component or as a command-line tool. It returns text content as Unicode strings or structured XML, together with detailed glyph and font information. With PDFlib TET you can retrieve the Unicode values for text in a PDF document, as well as its position on the page.

Updates in 4.3

  • Support for TIFF image resolution information.
  • Workarounds for various kinds of malformed PDFs.
  • Enhanced robustness against non-conforming input.
  • Improved word boundary detection for certain glyph spacings.

About PDFlib

Munich-based PDFlib, founded in 2000, develops and sells leading-edge components for server-centric generation and processing of PDF documents. PDFlib customers use the software for automated, high-volume generation and processing of PDF documents in business and prepress workflows and in online billing systems. PDFlib GmbH sells worldwide, with its main markets in North America, Germany and Japan.

Extracted text from a PDF using PDFlib TET.
