PDF News

Programmatically Extract Text from a PDF using C#

May 20, 2024
Use MESCIUS Document Solutions for PDF to automate the extraction of text from PDFs for indexing, searching, and more.

Document Solutions for PDF (DsPdf) is a high-speed, feature-rich, server-side PDF API Library for .NET with no dependencies on Adobe Acrobat. DsPdf allows developers to programmatically create, manipulate, import/export, and deploy PDF documents, including AcroForms, across desktop and web applications at scale. With full .NET support, you can generate, load, modify, and convert PDFs directly within your .NET, Mono, Xamarin.iOS, and Xamarin.Android apps. It also includes a fast JavaScript-based client-side viewer/editor that lets users view and, optionally, edit PDF documents in desktop and web applications.

In this blog post, MESCIUS Product Marketing Specialist Mackenzie Albitz demonstrates how to use DsPdf to unlock PDF content seamlessly by parsing and extracting text from PDFs for a variety of scenarios, including:

  • Extracting all text from a PDF file
  • Extracting text from a specific PDF page
  • Extracting text from predefined bounds in a PDF
  • Extracting fonts from a PDF 

Sample code is included, along with links to a Quick Start Demo and a complete sample application to help you get started.

Read the full blog to learn how to unlock PDF content.

Document Solutions for PDF is licensed per developer and is available in several license options for differing distribution needs. Team licenses are also available for multiple developers within the same organization. See our Document Solutions for PDF licensing page for full details.

Learn more on our Document Solutions for PDF product page.

Help Users Find Information with PDF Search

May 17, 2024
Choose a PDF component with built-in search support to enhance accessibility, increase engagement, and improve user experience.

Incorporating a PDF component with search functionality into your application offers significant advantages. Users can locate specific information within complex documents with ease, streamlining their experience and enhancing productivity. This translates to a more user-friendly and efficient app, reducing frustration and allowing users to find what they need quickly, without needing to install additional software.

Several WPF PDF Viewer controls allow you to search for text in PDF files, including:

  • DevExpress WPF PDF Viewer enables in-application text search for PDFs, highlighting matches and navigating for efficient information retrieval.
  • Telerik UI PDF for WPF can effortlessly search PDFs for keywords and phrases, highlighting results for quick information retrieval.
  • PDFView4NET WPF Edition by O2 Solutions facilitates searching for text within PDF documents, offering both user-driven search bars and programmatic control for integration.
  • Syncfusion WPF PDF provides text search functionality including highlighting matches and programmatic search options.

For an in-depth analysis of features and price, visit our WPF PDF Viewer controls comparison.

Flatten PDFs for Seamless Sharing and Printing

May 17, 2024
Aspose.PDF for .NET 24.5 adds the ability to flatten a layered PDF file, ensuring consistent presentation and preventing unintended modifications.

Aspose.PDF for .NET empowers developers to seamlessly integrate PDF generation, manipulation, and conversion functionalities within their applications. It eliminates the dependency on Adobe Acrobat by providing a programmatic interface for working with PDF documents, allowing for tasks like creating PDFs from scratch, editing existing ones, adding elements like images and watermarks, and managing security features.

Aspose.PDF for .NET V24.5 adds the ability to flatten a layered PDF file, streamlining document sharing and archiving by converting editable layers into a static, unalterable format. This ensures consistent presentation and prevents unintended modifications.

To see a full list of what's new in V24.5, see our release notes.

Aspose.PDF for .NET is offered as Developer Small Business, Developer OEM, Site Small Business, and Site OEM licenses, catering to a range of business needs. Licenses are perpetual and include one year of support and maintenance. Subscription renewals are also available. See our Aspose.PDF for .NET licensing page for full details.

Add Polyline Annotations to PDFs in .NET Apps

May 16, 2024
Highlight specific areas, connect elements visually, or trace paths within a PDF document to promote better communication and clarity for readers.

Polyline annotations in PDFs are interactive elements that enable the creation of free-form shapes defined by connected straight line segments. This functionality offers a versatile tool for reviewers or editors to precisely highlight specific areas, connect elements visually, or trace paths within the PDF document, promoting better communication and clarity for readers. They are particularly beneficial for creating dynamic and interactive documents as they help in embedding more complex graphical data, which can improve the usability and readability of PDFs in technical fields such as engineering, architecture, and design.

Several .NET PDF components allow you to move, resize, remove, or edit the appearance of polylines in PDF files, including:

  • DevExpress PDF Document API (part of DevExpress Office File API) lets you add and manage interactive polylines for highlighting or connecting elements in PDFs.
  • Document Solutions for PDF by MESCIUS offers robust polyline drawing tools to improve visual data representation and document interaction.
  • Aspose.PDF for .NET empowers developers to create and manipulate polyline annotations within PDF documents using C#.
  • Syncfusion WPF PDF (part of Syncfusion Essential Studio Enterprise) allows developers to implement polylines with advanced graphical control within user-friendly interfaces.

For an in-depth analysis of features and price, visit our comparison of .NET PDF components.

Easily Remove Sensitive Information from PDFs

May 14, 2024
IronPDF for .NET 2024.5.2 lets you sanitize and scan PDF files, improving the safeguarding of user data and enhancing application security.

IronPDF for .NET empowers developers with a user-friendly C# library to generate, edit, and manage PDFs. It leverages a familiar HTML/CSS foundation for effortless PDF creation, while also offering robust features like text extraction, OCR, signing, and more. This comprehensive solution simplifies complex PDF development tasks, saving time and boosting productivity for .NET projects.

The IronPDF for .NET 2024.5.2 update introduces the IronPdf.Cleaner API, designed to improve the security and reliability of PDF handling in applications. The new API lets developers sanitize and scan PDF files, mitigating the risks of processing content from untrusted sources. It removes potentially malicious code and sensitive information, such as metadata, comments, or embedded objects, that could compromise the application or user data.

To see a full list of what's new in 2024.5.2, see our release notes.

IronPDF for .NET is licensed based on the number of developers, organization locations, and projects, and is available as a perpetual license with one free year of product updates and support services. See our IronPDF for .NET licensing page for full details.
