This page has been archived and is no longer updated.

We do not supply this product anymore.

PDFlib pCOS

PDF Information Retrieval Tool.

Published by PDFlib
Distributed by ComponentSource since 2003

PDFlib pCOS 4 reached its end-of-life in August 2018. Customers who accessed the pCOS programming interface via the pCOS product can continue to use it, but should be aware that there will be no further updates or maintenance.

PDFlib PLOP 5 is the recommended successor to PDFlib pCOS 4.

Customers who wish to keep their pCOS application up to date can switch to one of the following products, each of which includes the pCOS programming interface:

  • PDFlib+PDI
  • PDFlib Personalization Server (PPS)
  • PDFlib TET
  • PDFlib PLOP
  • PDFlib PLOP DS
  • PDFlib TET PDF IFilter


The core pCOS functionality offered by the pCOS interface within these other products will continue to be fully supported.

Features of PDFlib pCOS

Supported Input
PDFlib pCOS supports all relevant flavors of PDF input:

  • All PDF versions up to Acrobat XI, including ISO 32000
  • Encrypted documents (password may be required)
  • Damaged PDF input documents will be repaired if possible


Information Retrieval
PDFlib pCOS offers a simple query interface, without the need for low-level parser programming. With PDFlib pCOS you can extract a variety of interesting items, such as:

  • Document info entries and XMP metadata
  • General information: linearization and tagged PDF status, encryption details and permission settings, number of pages and fonts
  • Fonts with name, embedding status, etc.
  • Image data, such as bit depth, color space, compression, XMP
  • Color space details
  • Target URLs and coordinates of Web links
  • Bookmarks and the corresponding page numbers, e.g. to create a table of contents
  • Form field data: full field names, contents, position, etc.
  • Page size, CropBox, page rotation
  • Status of ISO standards: PDF/X, PDF/A, PDF/UA, PDF/E, and PDF/VT
  • Geospatial reference information
  • List or extract file attachments
  • Layer names, page labels, article threads
  • Annotation details
  • List all comments along with the reviewer's name
  • Digital signature details: name of signature field(s), signed/unsigned, name of signer, date and reason of signature
  • Extract ICC output intent profiles from PDF/X or PDF/A documents
  • Block properties for PDFlib Personalization Server
  • JavaScript on document, page, annotation, or field level
  • Retrieve XML invoice data from ZUGFeRD documents
  • Properties of PDF Packages/Portfolios


Output Formats
PDFlib pCOS can create output for different purposes:

  • Plain text output
  • Tabular output for processing with a spreadsheet/database
  • Binary data for reuse, e.g. ICC profiles or file attachments
  • Unicode text output in UTF-8 or UTF-16 formats
  • User-defined output formats for custom post-processing


pCOS Paths – Simple Syntax for PDF Objects
Instead of getting bogged down in complex tree structures, e.g. for bookmarks or form fields, you can access PDF objects directly with the simple pCOS path syntax. It offers convenient shortcuts for commonly used PDF objects, such as pages, fonts, bookmarks, form fields, etc.
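
To give a feel for what a path-based query interface saves you, here is a minimal Python sketch of a toy path resolver. It is not the pCOS API and does not read PDF files: the `DOC` structure, its values, and the `resolve` function are invented for illustration, and the path grammar is a heavily simplified assumption modeled on paths of the form `pages[0]/width` and `length:pages`. The authoritative syntax is defined in the pCOS documentation.

```python
import re

# Toy stand-in for a parsed document: nested dicts and lists.
# All keys and values here are invented for illustration.
DOC = {
    "pages": [
        {"width": 595.0, "height": 842.0},
        {"width": 612.0, "height": 792.0},
    ],
    "fonts": [{"name": "Helvetica", "embedded": False}],
    "bookmarks": [{"title": "Introduction", "destpage": 1}],
}

# One path step: a name followed by zero or more [index] suffixes.
_STEP = re.compile(r"^([A-Za-z_]\w*)((?:\[\d+\])*)$")


def resolve(doc, path):
    """Resolve a simplified path such as 'pages[0]/width' against doc.

    A 'length:' prefix returns the number of entries instead of the value.
    """
    want_length = path.startswith("length:")
    if want_length:
        path = path[len("length:"):]

    node = doc
    for step in path.split("/"):
        match = _STEP.match(step)
        if match is None:
            raise ValueError(f"bad path step: {step!r}")
        node = node[match.group(1)]            # descend by name
        for idx in re.findall(r"\[(\d+)\]", match.group(2)):
            node = node[int(idx)]              # descend by array index
    return len(node) if want_length else node


if __name__ == "__main__":
    for query in ("length:pages", "pages[1]/width", "fonts[0]/name"):
        print(query, "=", resolve(DOC, query))
```

The point of the sketch is the calling side: each query is a single short string, and the caller never walks the tree by hand.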

pCOS Library or Command-Line Tool?
pCOS is available as a programming library (component) for various development environments, and as a command-line tool for batch operations. Both offer similar features, but are suitable for different deployment tasks.

The pCOS programming library is used for integration into desktop or server applications. Examples of using the library with all supported language bindings are included in the pCOS package. A variety of additional examples are available in the pCOS Cookbook on the PDFlib Web site.

The pCOS command-line tool is suited for batch processing PDF documents. It doesn’t require any programming, but offers powerful command-line options which can be used to integrate it into complex workflows. The pCOS command-line tool extends the features of the library:

  • Simple retrieval of common PDF elements, such as bookmarks, annotations, metadata, form fields, etc.
  • Extended mode for querying more complex objects and customizing the output format
  • Extract data items, such as file attachments, ICC profiles, etc.
  • Emit information as comma-separated values or a user-defined format for import into a spreadsheet or database
  • Recursion feature for dumping composite PDF objects, such as dictionaries and arrays
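
The idea behind recursively dumping composite objects can be sketched in a few lines of Python. This is only an illustration of the general technique on nested dictionaries and lists: the `dump` function, the `SAMPLE` data, and the tab-separated output layout are all assumptions made for this example, not the behavior or output format of the pCOS command-line tool.

```python
def dump(node, path=""):
    """Yield (path, value) pairs for every leaf below node, depth-first."""
    if isinstance(node, dict):
        for key, value in node.items():
            # Join dictionary keys with '/', without a leading separator.
            yield from dump(value, f"{path}/{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            # Array elements are addressed with a [index] suffix.
            yield from dump(value, f"{path}[{index}]")
    else:
        yield path, node


# Invented sample data standing in for a composite object.
SAMPLE = {"pages": [{"width": 595.0, "rotate": 0}], "pdfversion": 17}

if __name__ == "__main__":
    for leaf_path, value in dump(SAMPLE):
        print(f"{leaf_path}\t{value}")
```

Emitting one path and value per line keeps the result easy to load into a spreadsheet or to filter with standard text tools.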


Supported Development Environments
PDFlib pCOS is everywhere – it runs on practically all computing platforms. We offer variants for all common flavors of Windows, Mac OS, Linux and Unix.
The pCOS core is written in highly optimized C code for maximum performance and small overhead. Via a simple API (Application Programming Interface) the pCOS functionality is accessible from a variety of development environments:

  • COM for use with VB, ASP, and many other languages
  • C and C++
  • Java, including servlets and Java Application Server
  • .NET for use with C#, VB.NET, ASP.NET, etc.
  • Perl
  • PHP