Image to Text (OCR) — Extract Text from Any Image

Extract text from images using Tesseract.js OCR. Supports 100+ languages, handles photos, screenshots, and scanned documents. Runs entirely in your browser.

How to Use

Extract text from any image in four steps using Tesseract.js optical character recognition, running entirely in your browser with no server uploads.

Step-by-step guide

Upload your image — Drag and drop a JPEG, PNG, WebP, BMP, GIF, or TIFF image into the drop zone, or click to browse your files. High-resolution images (300 DPI or above) produce the most accurate results. Screenshots, scanned documents, and photographs of printed text all work well.
Select the language — Choose the primary language of the text in your image from the dropdown menu. The OCR engine uses language-specific character models, so matching the correct language significantly improves accuracy. English is selected by default. For images containing multiple languages, select the dominant one.
Toggle image enhancement — The "Enhance image" option is enabled by default and applies grayscale conversion plus contrast boosting. This preprocessing step improves accuracy on low-contrast images, faded documents, and photos taken in poor lighting. Disable it if your image already has crisp black text on a white background.
Click "Extract Text" — The Tesseract.js engine will download language data (cached after first use) and begin recognition. A progress bar shows real-time status. Processing time depends on image size and complexity: a typical screenshot takes 2-5 seconds, while a full-page scanned document may take 10-20 seconds.
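The recognition step can be sketched roughly as follows. This is a minimal illustration, not this tool's actual source: OcrWorker is a hypothetical interface modelled on the worker object that Tesseract.js provides, and extractText is an illustrative name. The worker is passed in as a parameter so the sketch does not depend on how the library is loaded.

```typescript
// Hypothetical minimal shape of an OCR worker, modelled on the worker object
// that Tesseract.js provides. Only the fields used below are declared.
interface OcrWorker {
  recognize(image: unknown): Promise<{ data: { text: string; confidence: number } }>;
}

interface OcrResult {
  text: string;       // recognized text with surrounding whitespace removed
  confidence: number; // overall confidence score, 0-100
}

// Runs recognition on one image and returns the text plus its confidence.
async function extractText(worker: OcrWorker, image: unknown): Promise<OcrResult> {
  const { data } = await worker.recognize(image);
  return { text: data.text.trim(), confidence: data.confidence };
}
```

In recent Tesseract.js releases, a worker of this shape is typically created with createWorker("eng") and released with worker.terminate() once recognition is finished.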

Working with results

After extraction completes, the recognized text appears in an editable text area with a confidence score badge. A score of 80% or higher (green) indicates reliable recognition. You can edit the text directly to fix any errors, then copy it to your clipboard with the Copy button or download it as a .txt file.
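The badge logic described above could look something like this sketch. Only the 80% threshold and the green color come from the description; the "needs-review" label and the function name are assumptions made for illustration.

```typescript
// Threshold taken from the description above: 80% or higher counts as reliable.
const RELIABLE_THRESHOLD = 80;

// "needs-review" is a hypothetical label for any score below the threshold.
type ConfidenceBadge = "green" | "needs-review";

// Maps a 0-100 confidence score to a badge label.
function confidenceBadge(confidence: number): ConfidenceBadge {
  return confidence >= RELIABLE_THRESHOLD ? "green" : "needs-review";
}
```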

Tips for best accuracy

Use images at 300 DPI or higher resolution.
Ensure text is horizontal and not rotated. Use the Image Rotate tool first if needed.
Crop images to contain only the text region for faster processing and higher accuracy.
For photos of documents, ensure even lighting without shadows across the text.
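To make the 300 DPI guideline concrete, the arithmetic below relates physical page size to pixel dimensions. It is a small sketch with illustrative function names; the page sizes in the usage note are examples, not requirements of this tool.

```typescript
// Pixel dimensions needed to capture a physical page at a given DPI.
function pixelsAtDpi(
  widthInches: number,
  heightInches: number,
  dpi: number,
): { width: number; height: number } {
  return {
    width: Math.round(widthInches * dpi),
    height: Math.round(heightInches * dpi),
  };
}

// Effective DPI of an existing scan, given its pixel width and the
// physical width of the page it was scanned from.
function effectiveDpi(pixelWidth: number, widthInches: number): number {
  return pixelWidth / widthInches;
}
```

For example, a US Letter page (8.5 x 11 inches) at 300 DPI works out to 2550 x 3300 pixels, while a 1275-pixel-wide scan of the same page is only 150 DPI and may benefit from rescanning.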

About This Tool

How OCR technology works

Optical Character Recognition (OCR) is a technology that converts images of text into machine-readable character data. A classic OCR pipeline has four stages: image preprocessing (noise reduction, binarization, deskewing), layout analysis (detecting text regions, lines, and word boundaries), character segmentation (isolating individual characters), and pattern recognition (matching each character against trained models to produce text output). Modern LSTM-based engines, including recent versions of Tesseract, recognize whole lines of text at once rather than isolated characters.
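As an illustration of the binarization stage, the sketch below uses Otsu's method, a common way to pick a single global threshold that best separates dark text from a light background. It is a simplified example under that assumption and is not Tesseract's internal code; the function names are illustrative.

```typescript
// Returns the Otsu threshold (0-255) for an array of 8-bit grayscale values.
// The threshold maximizes the variance between the dark and light classes.
function otsuThreshold(gray: Uint8ClampedArray): number {
  const histogram = new Array<number>(256).fill(0);
  for (let i = 0; i < gray.length; i++) histogram[gray[i]]++;

  let totalSum = 0;
  for (let v = 0; v < 256; v++) totalSum += v * histogram[v];

  let bestThreshold = 0;
  let bestVariance = -1;
  let darkCount = 0;
  let darkSum = 0;

  for (let t = 0; t < 256; t++) {
    darkCount += histogram[t];
    if (darkCount === 0) continue; // no pixels at or below t yet
    const lightCount = gray.length - darkCount;
    if (lightCount === 0) break; // every pixel is already in the dark class

    darkSum += t * histogram[t];
    const darkMean = darkSum / darkCount;
    const lightMean = (totalSum - darkSum) / lightCount;
    const diff = darkMean - lightMean;
    const variance = darkCount * lightCount * diff * diff;
    if (variance > bestVariance) {
      bestVariance = variance;
      bestThreshold = t;
    }
  }
  return bestThreshold;
}

// Maps every pixel to pure black (0) or pure white (255).
function binarize(gray: Uint8ClampedArray): Uint8ClampedArray {
  const threshold = otsuThreshold(gray);
  const out = new Uint8ClampedArray(gray.length);
  for (let i = 0; i < gray.length; i++) out[i] = gray[i] > threshold ? 255 : 0;
  return out;
}
```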

About Tesseract.js

This tool uses Tesseract.js, a JavaScript library that runs the Tesseract OCR engine compiled to WebAssembly. Tesseract was originally developed at Hewlett-Packard between 1985 and 1994, open-sourced in 2005, and then sponsored by Google from 2006 for more than a decade. It remains one of the most widely used open-source OCR engines, supporting over 100 languages with LSTM (Long Short-Term Memory) neural network models.

Because the engine is compiled to WebAssembly, the full Tesseract engine runs directly in your browser at close to native speed. Language data files (typically 2-15 MB each) are downloaded once and cached by your browser, so later runs skip the download and start recognition sooner. No server infrastructure is required, and your images remain entirely on your device throughout the process.

Image preprocessing pipeline

When the "Enhance image" option is enabled, this tool applies a two-stage preprocessing pipeline before passing the image to Tesseract. First, the image is converted to grayscale using the standard luminance formula (0.299R + 0.587G + 0.114B), which weights green most heavily to approximate human brightness perception. Second, adaptive contrast enhancement pushes pixel values away from the mean brightness, making dark text darker and light backgrounds lighter. This preprocessing can recover readable text from faded receipts, low-contrast photos, and washed-out scans.
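The two stages can be sketched as follows. The grayscale weights come from the formula above; the linear stretch around the mean and the default factor of 1.5 are assumptions chosen for illustration, since the exact contrast formula this tool uses is not specified here. Input is assumed to be RGBA pixel data such as a canvas ImageData buffer.

```typescript
// Stage 1: convert RGBA pixel data to one grayscale value per pixel,
// using the luminance weights 0.299R + 0.587G + 0.114B.
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(Math.floor(rgba.length / 4));
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    // Uint8ClampedArray rounds and clamps the result to 0-255 on assignment.
    gray[i] = 0.299 * r + 0.587 * g + 0.114 * b;
  }
  return gray;
}

// Stage 2: push each value away from the mean brightness.
// `factor` is an assumed parameter; values above 1 increase contrast.
function stretchContrast(gray: Uint8ClampedArray, factor: number = 1.5): Uint8ClampedArray {
  let sum = 0;
  for (let i = 0; i < gray.length; i++) sum += gray[i];
  const mean = gray.length > 0 ? sum / gray.length : 0;

  const out = new Uint8ClampedArray(gray.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = mean + factor * (gray[i] - mean); // clamped to 0-255 on assignment
  }
  return out;
}
```

With a factor of 2, grayscale values of 100 and 150 (mean 125) become 75 and 175: values below the mean get darker and values above it get lighter.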

For related image processing tasks, see the EXIF Metadata Viewer for inspecting photo metadata, or the Image Compressor for reducing file sizes before sharing.

Why Use This Tool

Common use cases

Digitizing printed documents — Convert scanned contracts, invoices, receipts, and letters into searchable, editable text. This eliminates manual retyping and creates digital archives that can be indexed and searched. For organizations that handle large volumes of paper, automating this step can save substantial manual effort.
Extracting text from screenshots — Copy error messages, code snippets, terminal output, or chat transcripts from screenshots when the original text is not selectable. Developers frequently use OCR to extract stack traces from mobile device screenshots or copy configuration values from documentation images.
Accessibility and content repurposing — Make image-based text accessible to screen readers by converting it to selectable text. Social media posts that are text-as-images, infographics with embedded data, and presentation slides can all be converted to formats that assistive technologies can read. Extracted text is also a useful starting point for the text alternatives that WCAG calls for on images of text, though it should be reviewed for accuracy first.
Research and data collection — Extract tabular data from photographed charts, survey forms, and research papers. Researchers working with historical archives, government records, and printed datasets use OCR to build structured databases from physical documents. Combined with the JSON Formatter, extracted data can be structured for programmatic analysis.
Multilingual document processing — Handle documents in over 100 languages without switching tools. International businesses processing invoices, customs forms, and contracts in multiple languages benefit from a single OCR pipeline that adapts its character recognition models to each language. For Japanese, Chinese, and Korean text, specialized LSTM models handle the complexity of thousands of unique characters.

Privacy-first processing

Unlike cloud OCR services from Google, AWS, or Microsoft that upload your images to remote servers for processing, this tool runs the complete Tesseract engine locally in your browser via WebAssembly. Your confidential documents, medical records, financial statements, and personal photos never leave your device. The only network activity is the one-time download of language model files, which are cached by your browser. You can verify this by monitoring the Network tab in your browser's developer tools during processing.

FAQ

What image formats does the OCR tool support?

The tool accepts all common image formats including JPEG, PNG, WebP, BMP, GIF, and TIFF. Any image your browser can display will work. For best OCR accuracy, use high-resolution images (300 DPI or higher) with clear, well-lit text and minimal background noise.
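A simple way to check an uploaded file against the formats listed above is to compare its MIME type with an allow-list. This is a hypothetical sketch: the function name is illustrative, and the tool's real validation may differ.

```typescript
// Standard MIME types for the formats listed above.
const ACCEPTED_IMAGE_TYPES: readonly string[] = [
  "image/jpeg",
  "image/png",
  "image/webp",
  "image/bmp",
  "image/gif",
  "image/tiff",
];

// Returns true when the MIME type (for example, the `type` field of a
// browser File object) is one of the accepted image formats.
function isAcceptedImageType(mimeType: string): boolean {
  return ACCEPTED_IMAGE_TYPES.includes(mimeType.toLowerCase());
}
```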

How accurate is the text recognition?

Accuracy depends on image quality, font clarity, and contrast. Clean screenshots and scanned documents at 300+ DPI typically achieve 95-99% accuracy. Handwritten text, stylized fonts, and low-resolution photos may produce lower accuracy. Enabling the "Enhance image" preprocessing option improves results on low-contrast images by converting the image to grayscale and boosting its contrast.

Which languages does the OCR support?

The tool supports over 100 languages via Tesseract.js trained data, including English, Spanish, French, German, Chinese (Simplified and Traditional), Japanese, Korean, Arabic, Hindi, Russian, Portuguese, and many more. Select the correct language before processing for optimal accuracy, as the OCR engine uses language-specific character models.
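Tesseract identifies each language's trained data by a short code. The sketch below maps the languages named above to their standard Tesseract codes; the lookup function and its fallback to English (the default mentioned in the guide) are illustrative assumptions rather than this tool's actual code.

```typescript
// Tesseract trained-data codes for the languages named above.
const LANGUAGE_CODES = new Map<string, string>([
  ["English", "eng"],
  ["Spanish", "spa"],
  ["French", "fra"],
  ["German", "deu"],
  ["Chinese (Simplified)", "chi_sim"],
  ["Chinese (Traditional)", "chi_tra"],
  ["Japanese", "jpn"],
  ["Korean", "kor"],
  ["Arabic", "ara"],
  ["Hindi", "hin"],
  ["Russian", "rus"],
  ["Portuguese", "por"],
]);

// Looks up the code for a display name, falling back to English
// when the name is not in the table.
function languageCode(displayName: string): string {
  return LANGUAGE_CODES.get(displayName) ?? "eng";
}
```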

Can I copy and edit the extracted text?

Yes. The extracted text appears in a fully editable text area, so you can correct any OCR errors before using it. Use the Copy button for one-click clipboard copying, or download the text as a .txt file.

Is my image uploaded to a server for processing?

No. All OCR processing runs entirely in your browser using the Tesseract.js WebAssembly engine. Your images never leave your device. The only network request is the initial download of the Tesseract language data file (2-15 MB depending on language), which is cached by your browser for subsequent uses.