Covers almost any document-related need

Robert Gransson, 5 stars

I highly recommend the Aspose product family. It covers almost any document-related need or feature one can have. Judging by their Twitter feed and newsletters, the company delivers constant updates and keeps pace with the rest of the technology world. But the biggest point is the time saved on my project: having just one set of APIs to deal with makes this a money saver for me.

We needed to read common document formats (PDF, DOCX), extract text from them, and recompress and save them as PDFs. (This includes OCR of images, possibly extracted from PDFs.) We had to do this for an initial batch of about 200,000 to 500,000 documents, then at a more moderate pace. Three components (Aspose.Words, Aspose.PDF, and Aspose.OCR) would cover these needs, but buying the entire package costs just slightly more than those products separately, and the additional features (spreadsheets, email, imaging, barcode) add value to my project.

Setting up the Aspose products is as easy as: 1) acquiring a license; 2) including the dependencies in your project; and 3) instantiating a component's license class and setting the license file (which can be done from a memory stream if you don't use file storage). Aspose has great documentation that covers the entire API, as well as common use cases with examples.

The OCR results have been satisfactory. We compared them to Google's open source project Tesseract, and on documents with regular-sized text Aspose gives us good results. Given the overhead of maintaining a separate component for OCR, and the "unknown" of relying on projects that may be abandoned, I prefer the Aspose product.

Finally, Aspose.PDF can handle so-called *iref-streams* (cross-reference streams). This is a feature of PDF documents above version 1.4 and is a MUST if you need to read PDFs produced in modern times. (It certainly was a "must have" feature for us!) Many competing products don't offer this. It's very easy to gloss over requirements or take them for granted. For example, if a component can read document format "X", it does NOT mean that it can write format X, or convert from format X to Y. This is a problem that I haven't found with Aspose products.
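
To judge how much this requirement matters for a given batch, it can help to check which PDFs actually use cross-reference streams. Below is a minimal sketch in plain Python (standard library only, no Aspose calls). The function name `xref_kind` is my own, and the sketch assumes a well-formed file whose last `startxref` offset is correct. It ignores hybrid files and damaged PDFs, so treat the result as a rough indicator only.

```python
import re


def xref_kind(pdf_bytes: bytes) -> str:
    """Classify the last cross-reference section of a PDF.

    Returns "table" for a classic xref table, "stream" for a
    cross-reference stream, or "unknown" if neither is recognized.
    """
    # The last "startxref" keyword is followed by the byte offset
    # of the most recent cross-reference section.
    offsets = re.findall(rb"startxref\s+(\d+)", pdf_bytes)
    if not offsets:
        return "unknown"
    offset = int(offsets[-1])
    if offset >= len(pdf_bytes):
        return "unknown"

    # Only inspect the start of the section (assumption: the stream
    # dictionary, if any, fits within the first 1024 bytes).
    section = pdf_bytes[offset:offset + 1024]

    # A classic table begins with the keyword "xref".
    if section.lstrip().startswith(b"xref"):
        return "table"

    # A cross-reference stream is an indirect object ("N G obj")
    # whose dictionary contains /Type /XRef.
    is_object = re.match(rb"\s*\d+\s+\d+\s+obj", section) is not None
    has_xref_type = re.search(rb"/Type\s*/XRef", section) is not None
    if is_object and has_xref_type:
        return "stream"
    return "unknown"
```

Running this over a sample of the input files, for example with `xref_kind(open(path, "rb").read())`, shows how many of them a component without cross-reference stream support would probably fail to read.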