How to Implement Optical Character Recognition in Python
Introduction
Optical Character Recognition (OCR) converts images of printed or handwritten text into machine-readable text, and Python has good library support for it. Many real-world applications rely on OCR. In this tutorial, we will go through a complete overview of Optical Character Recognition in Python.
How to create an Optical Character Recognition service in Python?
Let’s use the “pytesseract” library to build the OCR engine. The engine takes in photos and scans them for text. It lives in a module named “ocr.py”, which produces the output text. The “process_image” function sharpens the image before the text is recognized. The view function and route handler are added to the “app.py” application.
Let’s check out the route handler code and the OCR engine code.
Route Handler Code:
    # ROUTE HANDLER
    @app.route('/v{}/ocr'.format(_VERSION), methods=["POST"])
    def ocr():
        try:
            url = request.json['image_url']
            if 'jpg' in url:
                output = process_image(url)
                return jsonify({"output": output})
            else:
                return jsonify({"error": "only .jpg files, please"})
        except Exception:
            # Missing key, malformed JSON, or a failed download all end up here
            return jsonify(
                {"error": "Did you mean to send: {'image_url': 'some_jpeg_url'}"}
            )
OCR Engine Code:
    # OCR ENGINE
    from io import BytesIO

    import pytesseract
    import requests
    from PIL import Image
    from PIL import ImageFilter


    def process_image(url):
        image = _get_image(url)
        # filter() returns a new image, so keep the sharpened result
        image = image.filter(ImageFilter.SHARPEN)
        return pytesseract.image_to_string(image)


    def _get_image(url):
        # Download the image bytes and open them as a PIL image
        return Image.open(BytesIO(requests.get(url).content))
In “app.py”, add an API version number and update the imports.
    import os
    import logging
    from logging import Formatter, FileHandler
    from flask import Flask, request, jsonify
    from ocr import process_image

    _VERSION = 1  # API version
Here the route handler returns the result of “process_image()”, the OCR engine function, in a JSON response. JSON carries the data going into and out of the API. We use the “Image” module from PIL to open the content of the downloaded response, wrapped in a file-like object, as an image that pytesseract can read.
The code above accepts .jpg images only. With a library that handles more image formats, other kinds of images could be processed in the same way. If you are writing this code on your own, make sure PIL is installed (it is distributed as the Pillow package).
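The route handler's check only tests whether the string 'jpg' appears anywhere in the URL. A minimal sketch of a stricter check is shown here, assuming we want to accept a few more formats. The helper name is_supported_image and the SUPPORTED_EXTENSIONS set are hypothetical and not part of the original code.

```python
import os
from urllib.parse import urlparse

# Hypothetical set of extensions to accept; adjust it to what your setup handles.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(url):
    """Return True if the URL path ends with a supported image extension."""
    path = urlparse(url).path  # drops the query string and fragment
    extension = os.path.splitext(path)[1].lower()
    return extension in SUPPORTED_EXTENSIONS
```

In the route handler, the condition "if 'jpg' in url" could then be swapped for "if is_supported_image(url)". This only inspects the URL; it does not verify that the downloaded bytes really are an image.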
Start the server by running the “app.py” application.
    $ cd ../home/flask_server/
    $ python app.py
Now open another terminal and run the following:
    $ curl -X POST http://localhost:5000/v1/ocr -d '{"image_url": "some_url"}' -H "Content-Type: application/json"
Let’s consider an example:
    $ curl -X POST http://localhost:5000/v1/ocr -d '{"image_url": "https://besanttechnologies.com/images/blog_images/ocr/ocr.jpg"}' -H "Content-Type: application/json"
    {
      "output": "ABCDE\nFGH I J\nKLMNO\nPQRST"
    }
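The same request can also be sent from Python instead of curl. The sketch below uses only the standard library. The helper names build_ocr_request and send_ocr_request are hypothetical, and the default endpoint assumes the server from this tutorial is running locally on port 5000.

```python
import json
import urllib.request


def build_ocr_request(image_url, endpoint="http://localhost:5000/v1/ocr"):
    """Build a POST request that carries {"image_url": ...} as JSON."""
    body = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_ocr_request(image_url):
    """Send the request and return the decoded JSON reply.

    This needs the Flask server to be running, so it is not called here.
    """
    with urllib.request.urlopen(build_ocr_request(image_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting the building and sending steps makes it possible to inspect the request body and headers without a server.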
Let me explain this with an image.
Input Image: The following is the input image to be converted to digital text.
Output Image: The following is the output we receive.
Applications of Optical Character Recognition
Optical Character Recognition is used in many applications. Here are two examples.
Ticket counters use Optical Character Recognition to detect and scan the important data on a ticket, such as the commuter's details and the route. OCR is also used to digitize paper documents: a camera captures high-resolution images of the pages, and Optical Character Recognition helps turn them into a PDF or Word document.
OCR in Python is supported by tools such as “Ocrad” and “Tesseract,” which are powerful and versatile. They make the code simpler to design and let developers spend more of their time on other important parts of their projects. Apart from the examples above, plenty of other applications make use of Optical Character Recognition.
Now let’s look at another, more in-depth example of implementing Optical Character Recognition in Python.
How to read PDF content using OCR in Python
Python provides different libraries to convert a PDF to text. Let’s look at the process in detail. To convert a PDF to text, we first convert the PDF pages to images, then use Optical Character Recognition to read the content of each image, and finally store the result as a text file.
We need the following installations.
- pip3 install Pillow
- pip3 install pytesseract
- pip3 install pdf2image
- sudo apt-get install tesseract-ocr
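Before moving on, it can help to confirm that everything in the list above is actually available. The following is a small sketch using only the standard library. The helper names missing_modules and missing_programs are hypothetical. It assumes that the Pillow package is imported under the name PIL and that the tesseract-ocr package provides a program called tesseract.

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_programs(names):
    """Return the program names that are not on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # PIL is the import name of the Pillow package.
    print("Missing modules:", missing_modules(["PIL", "pytesseract", "pdf2image"]))
    print("Missing programs:", missing_programs(["tesseract"]))
```

An empty list in both lines of output means the environment is ready for the rest of this section.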
The program consists of two processes.
Process 1:
The first part deals with the conversion of the PDF to images. Every PDF page is stored as an image file. The images are named as follows:
PDF page 1 → page_1.jpg
PDF page 2 → page_2.jpg
...
PDF page n → page_n.jpg
Here is the implementation of Process 1:
    # Import libraries
    from PIL import Image
    import pytesseract
    import sys
    from pdf2image import convert_from_path
    import os

    # Path of the PDF
    PDF_file = "f.pdf"
Converting PDF to images
    # Store all the pages of the PDF in a variable (500 is the DPI)
    pages = convert_from_path(PDF_file, 500)

    # Counter used to name the image of each PDF page
    image_counter = 1

    # Iterate through all the pages stored above
    for page in pages:
        # PDF page 1 -> page_1.jpg
        # PDF page 2 -> page_2.jpg
        # ....
        # PDF page n -> page_n.jpg
        filename = "page_" + str(image_counter) + ".jpg"

        # Save the page as a JPEG image
        page.save(filename, 'JPEG')

        # Increment the counter so that the next filename is updated
        image_counter = image_counter + 1
Process 2:
The second part deals with recognizing the text in the converted image files and then storing it in a text file. Once we have the text as a string variable, we can apply any text processing we like to it.
For example, when a word does not fit at the end of a line, it is split with a hyphen (-) and continued on the next line, and it should still be read as one continuous word.
Consider a sentence such as the following, in which a long word may be split across two lines of the PDF:
I am a programmer with a good knowledge of different programming languages. I am ready to face any interview.
For this kind of word, we apply a pre-processing step: the hyphen and the newline that follows it are removed, so the two halves join into a complete word. Once the pre-processing is completed, the text is stored in a text file. The input PDF file used in the code is f.pdf.
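The pre-processing step described above can be tried on its own, without running OCR. This is a minimal sketch. The function name join_hyphenated is hypothetical, and the sample string is a made-up variant of the sentence above with two words split across lines.

```python
def join_hyphenated(text):
    """Join words that were split by a hyphen at the end of a line."""
    return text.replace("-\n", "")


# Made-up sample: "programmer" and "programming" are split across lines.
sample = (
    "I am a program-\nmer with a good knowledge of different program-\n"
    "ming languages."
)
print(join_hyphenated(sample))
```

One limitation of this simple replacement is that a genuinely hyphenated compound that happens to break at the end of a line also loses its hyphen.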
Here is the implementation of the second part. It continues the script from Process 1:
Identifying text from the images using OCR
    # Continues the Process 1 script; image_counter is defined there
    filelimit = image_counter - 1  # total number of pages

    outfile = "out_text.txt"  # text file that receives the output

    # Open the file in append mode so that the text of every image
    # is added to the same file; it is closed when the block ends
    with open(outfile, "a") as f:
        # Iterate over all the pages, starting from page 1
        for i in range(1, filelimit + 1):
            # The image files are page_1.jpg, page_2.jpg, ..., page_n.jpg
            filename = "page_" + str(i) + ".jpg"

            # Recognize the text in the image
            text = pytesseract.image_to_string(Image.open(filename))

            # Join words that were hyphenated across a line break
            text = text.replace('-\n', '')

            # Write the processed text to the text file
            f.write(text)
Input file:
Output file:
Benefits and Drawbacks of OCR Engine:
Handwriting recognition is one of the important applications of OCR in Python. OCR is also used to convert PDFs to text and to keep the extracted text in variables for further processing. As for the drawbacks, OCR does not guarantee 100% accuracy. Even with AI-based techniques, the results can be poor in some cases, particularly on low-quality images. Handwriting recognition results vary with factors such as page color, image contrast, writing style, and image resolution.
I hope you are now clear about implementing Optical Character Recognition in Python. If you have any queries, let us know in the comment section below.