Big Faceless PDF Library v2.29.5

Released: Apr 9, 2026

Updates in v2.29.5

Features

  • Optimized memory use so that, when profiling or working with the document structure, the footprint for large documents is about 20% lower.
  • Made several improvements to the HtmlDerivation class, including some adjustments to the way options are specified in the API. Figures are now rasterized to images, and ActualText is used from Elements.
  • ICC profiles used for PDF/A conversion no longer require B2A tables.
  • Made some subtle tweaks to the tag hierarchy rules allowed in PDF/UA-1, and to the exact meaning of the rules over table header "Scope" after cross-testing and discussion with other validators.
  • Added the ability to allow a LayoutBox to be drawn twice within a tagged document.
  • Adjusted PageExtractor to improve merging of superscript/subscript text with the main row of text, rather than splitting into two rows.
  • Updated Arlington model to the latest release (as of 2026-08).
  • Viewer: Whenever a PropertyChangeEvent is fired, the "propagation id" property of that event is now the AWT event that triggered the change. See the new PDF.setAWTPropertyChangeCause static method for the mechanism.
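For the Viewer change above, a listener can read the triggering event back through the standard java.beans API. The following is a minimal sketch rather than library code: the class name, the helper causeOf and the "page" property name are hypothetical, and the event is built by hand so the example runs without the viewer.

```java
import java.awt.AWTEvent;
import java.awt.event.ActionEvent;
import java.beans.PropertyChangeEvent;

final class PropagationIdDemo {
    // Returns the AWT event carried as the propagation id, or null if there is none.
    static AWTEvent causeOf(PropertyChangeEvent event) {
        Object id = event.getPropagationId();
        return (id instanceof AWTEvent) ? (AWTEvent) id : null;
    }

    public static void main(String[] args) {
        Object source = new Object();
        // Hand-built stand-in for an event fired by the viewer ("page" is a made-up property name).
        PropertyChangeEvent event = new PropertyChangeEvent(source, "page", 1, 2);
        event.setPropagationId(new ActionEvent(source, ActionEvent.ACTION_PERFORMED, "next"));

        AWTEvent cause = causeOf(event);
        System.out.println(cause != null ? cause.getClass().getSimpleName() : "no AWT cause");
    }
}
```

In a real listener the same check would go inside propertyChange, where a null result means the change was not triggered by an AWT event.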

Fixes

  • Fixed issue, present in the previous release, with digitally signing linearized documents.
  • Fixed issue where, when a single glyph in a font has multiple Unicode values, the wrong value was sometimes written to the ToUnicode map, causing later text extraction to return the wrong values.
  • When using the LayoutBox to place mixed right-to-left and left-to-right text on the page, text is now written to the page in logical order so text extraction now works properly even for untagged content.
  • When extracting Arabic text from a PDF, presentation forms are now converted back to nominal forms. Extraction is now correct for most tagged content like PDF/UA, although there may be regressions for non-tagged content where text is rendered in reverse order.
  • Fixed issue for PDFs generated by PDFKit that incorrectly share the same PDF object node for both the AF array and an EmbeddedFile name tree node; this is invalid, but when one was repaired it broke the other. The shared node is now split on repair.
  • Fixed issue for PDFs generated by PDFReactor that place "null" in the IDTree.
  • Fixed issue where the Arlington model would sometimes repeat a warning for the same object if that object is reachable by multiple paths through the PDF.
  • Fixed issue where the recovery of a damaged file containing multiple XRef streams was failing in some situations.
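
To illustrate what converting Arabic presentation forms back to nominal forms means in the extraction fix above, here is a minimal sketch using the JDK's Unicode normalizer. It is an approximation of the idea, not the library's implementation: NFKC also folds other compatibility characters (Latin ligatures, full-width forms), which a real extractor may not want, and the class and method names are hypothetical.

```java
import java.text.Normalizer;

final class PresentationForms {
    // NFKC replaces Arabic presentation forms (U+FB50..U+FDFF, U+FE70..U+FEFF)
    // with the nominal letters they stand for.
    static String toNominal(String extracted) {
        return Normalizer.normalize(extracted, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = "\uFEFB"; // ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
        String nominal = toNominal(ligature);
        // Print code points rather than glyphs so the result is readable in any console.
        for (int i = 0; i < nominal.length(); i++) {
            System.out.printf("U+%04X%n", (int) nominal.charAt(i));
        }
        // prints U+0644 then U+0627 (LAM, ALEF)
    }
}
```

Text in nominal form matches what a user would type, so searching extracted text works as expected.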