About Aspose.OCR for Python via Java

Optical character recognition API for Python.

Aspose.OCR for Python via Java adds optical character recognition (OCR) to your cross-platform Python notebooks and applications. With its intuitive, high-speed API, you can extract text from scanned images, scanned PDFs, screenshots, web links, and smartphone photos, and save the results in popular document formats, ready for consolidation, analysis, or storage. Advanced pre-processing filters handle rotated, skewed, and noisy images. To improve performance, recognition can be offloaded to the GPU.

Supported File Formats

Images

  • PDF
  • JPEG
  • PNG
  • TIFF
  • GIF
  • Bitmap

Batch OCR

  • Multi-page PDF
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • HTML
  • RTF
  • ePub
  • JSON
  • XML

Features and Capabilities

  • Photo OCR - Extract text from smartphone photos with scan-level accuracy.
  • Searchable PDF - Convert any scan into a fully searchable and indexable document.
  • URL recognition - Recognize an image from a URL without downloading it locally.
  • Bulk recognition - Read all images from multi-page documents, folders, and archives.
  • Any font and style - Identify and recognize text in all popular typefaces and styles.
  • Fine-tune recognition - Adjust every OCR parameter for best recognition results.
  • Spell checker - Improve results by automatically correcting misspelled words.
  • Find text in images - Search for text or a regular expression within a set of images.
  • Compare image texts - Compare the texts of two images, regardless of case and layout.
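
The last two features can be illustrated on text that has already been recognized. The sketch below is not the Aspose.OCR API: it uses only the Python standard library, and the helper names (normalize, texts_match, find_in_texts) and the sample file names and strings are hypothetical. It assumes that "regardless of case and layout" means ignoring letter case and differences in whitespace and line breaks.

```python
import re
from typing import Dict, List


def normalize(text: str) -> str:
    """Case-fold the text and collapse every run of whitespace to one space."""
    return " ".join(text.casefold().split())


def texts_match(first: str, second: str) -> bool:
    """Compare two recognized texts, ignoring case and whitespace layout."""
    return normalize(first) == normalize(second)


def find_in_texts(texts: Dict[str, str], pattern: str) -> List[str]:
    """Return the names of the texts that contain a match for the pattern."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return [name for name, text in texts.items() if compiled.search(text)]


# Hypothetical recognition results, keyed by source image name.
recognized = {
    "invoice.png": "Invoice No. 2041\nTotal:   118.50 EUR",
    "receipt.jpg": "RECEIPT\nthank you for your purchase",
    "invoice_copy.png": "INVOICE NO. 2041 total: 118.50 eur",
}

# Images whose text contains an invoice number.
matches = find_in_texts(recognized, r"invoice\s+no\.\s*\d+")

# True when both images carry the same text apart from case and layout.
same = texts_match(recognized["invoice.png"], recognized["invoice_copy.png"])
```

In practice the strings in recognized would come from the OCR step. The search and comparison logic is independent of how the text was obtained.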