About Aspose.OCR for .NET

Add OCR/OMR functionality to your .NET applications.

Aspose.OCR for .NET is an optical character recognition API for adding OCR functionality to .NET applications. The API is extensible, compact, and easy to use, with a small set of classes for controlling character recognition. It supports commonly used image formats, recognizes characters in a variety of fonts as well as bold and italic styles, provides noise removal filters, and can scan either the whole image or any selected part of it.
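As a rough illustration of the "small set of classes" mentioned above, the following is a minimal sketch of recognizing a single image. It assumes a recent release of the Aspose.OCR NuGet package in which the AsposeOcr, OcrInput, InputType and RecognitionResult types are available; the file name source.png is a placeholder. Check the class and member names against the API reference for the version you install.

```csharp
using System;
using Aspose.OCR;

class Program
{
    static void Main()
    {
        // Create the recognition engine (assumed type: Aspose.OCR.AsposeOcr).
        AsposeOcr engine = new AsposeOcr();

        // Build a batch containing one image.
        // "source.png" is a placeholder path, not a file shipped with the library.
        OcrInput input = new OcrInput(InputType.SingleImage);
        input.Add("source.png");

        // Recognize with default settings and print the text found in each image.
        foreach (RecognitionResult result in engine.Recognize(input))
        {
            Console.WriteLine(result.RecognitionText);
        }
    }
}
```

The sketch uses default settings throughout; language selection, preprocessing filters and region-based recognition are configured separately and are not shown here.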

Supported File Formats

Images

  • JPEG
  • PNG
  • TIFF
  • BMP
  • GIF

Batch OCR

  • Multi-page PDF
  • DjVu
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • HTML
  • RTF
  • ePub
  • JSON
  • XML

Advanced .NET OCR API Features

  • Photo OCR - Extract text from smartphone photos with scan-level accuracy.
  • Searchable PDF - Convert any scan into a fully searchable and indexable document.
  • URL recognition - Recognize an image from a URL without downloading it locally.
  • Bulk recognition - Read all images from multi-page documents, folders and archives.
  • Any font and style - Identify and recognize text in all popular typefaces and styles.
  • Fine-tune recognition - Adjust every OCR parameter for best recognition results.
  • Spell checker - Improve results by automatically correcting misspelled words.
  • Find text in images - Search for text or a regular expression within a set of images.
  • Compare image texts - Compare the text in two images, regardless of case and layout.
  • 140+ recognition languages - Aspose's .NET OCR library is a universal solution for document processing, data extraction, and content digitization on a global scale. You can recognize documents written in mixed languages, such as Chinese/English, Arabic/French or Cyrillic/English. The following languages are supported:
    • Extended Latin: English, Spanish, French, Indonesian, Portuguese, German, Vietnamese, Turkish, Italian, Polish, and 80+ more.
    • Cyrillic alphabet: Russian, Ukrainian, Kazakh, Bulgarian, including mixed Cyrillic/English texts.
    • Arabic, Persian, and Urdu, including texts mixed with English.
    • Chinese, Korean, Japanese, and languages written in Devanagari and Dravidian scripts, including Hindi, Marathi, Tamil, and others.