Extract ROI from image with Python and OpenCV

Extracting a ROI (Region of Interest) with OpenCV and Python is not as hard as it may sound.

 

Source image:

 

We begin by importing our modules and loading the source image:

 

import cv2 
import numpy as np 

#import image 
image = cv2.imread('C:\\Users\\Link\\Desktop\\image.png')

 

We are going to do some simple image manipulation: convert the image to grayscale, binarize it, and dilate it with a custom kernel.

Note: with a 1x1 kernel the dilated image looks the same as the second (binarized) image. We will come back to dilation later.

 

#grayscale 
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY) 
cv2.imshow('gray', gray) 
cv2.waitKey(0) 

#binary 
ret,thresh = cv2.threshold(gray,127,255,cv2.THRESH_BINARY_INV) 
cv2.imshow('second', thresh) 
cv2.waitKey(0) 

#dilation 
kernel = np.ones((1,1), np.uint8) 
img_dilation = cv2.dilate(thresh, kernel, iterations=1) 
cv2.imshow('dilated', img_dilation) 
cv2.waitKey(0)

 

process

 

By changing the kernel size you can change what this program selects, from a single ROI up to a full line of text.

 

#find contours 
# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x (three return values) and 4.x (two)
ctrs, hier = cv2.findContours(img_dilation.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:] 

#sort contours 
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])

 

With these lines we find the contours of each ROI and then sort them from left to right, using the x coordinate of each bounding box.
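
The sort key in the snippet uses only the x coordinate of each bounding box, so the order is strictly left to right. If you also need rows ordered top to bottom, one possible sketch is to group boxes into horizontal bands first. The function name sort_reading_order and the row_height tolerance are assumptions for illustration and would need tuning to your text size.

```python
def sort_reading_order(boxes, row_height=20):
    """Sort (x, y, w, h) boxes top to bottom, then left to right.

    row_height is an assumed tolerance in pixels: boxes whose top edges fall
    into the same band of that height are treated as one text row.
    """
    return sorted(boxes, key=lambda box: (box[1] // row_height, box[0]))


# Four made-up boxes: two on an upper row, two on a lower row
boxes = [(120, 52, 30, 18), (10, 5, 40, 20), (15, 50, 35, 20), (90, 8, 25, 15)]
ordered = sort_reading_order(boxes)
```

With real contours the input could be built as [cv2.boundingRect(c) for c in ctrs]. Band grouping is a simplification: a row whose boxes straddle a band boundary can be split, so treat this as a starting point.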

 

for i, ctr in enumerate(sorted_ctrs): 
    # Get bounding box 
    x, y, w, h = cv2.boundingRect(ctr) 
    
    # Getting ROI (copy, so the rectangle drawn below does not end up in the saved crop) 
    roi = image[y:y+h, x:x+w].copy() 

    # show ROI 
    #cv2.imshow('segment no:'+str(i),roi) 
    cv2.rectangle(image,(x,y),( x + w, y + h ),(0,255,0),2) 
    #cv2.waitKey(0) 
    if w > 15 and h > 15: 
        cv2.imwrite('C:\\Users\\Link\\Desktop\\output\\{}.png'.format(i), roi)

 

Alright, this is the core of it all. We run a for loop that saves every box the program finds.

The if condition at the end of the loop saves a box only when both its width and its height are bigger than 15 pixels. This keeps debris out of the output (with handwritten text it’s extremely useful). You can also see how each saved file gets its own name: string formatting. DigitalOcean has a great article on that, take a look. Finally, all the images are saved to the output folder on the Desktop. Be careful with the paths: the output folder must already exist, because cv2.imwrite does not create it.

 

cv2.imshow('marked areas',image) 
cv2.waitKey(0)

 

Let’s see the final result…

 

roi_final

 

By modifying kernel values…

 

kernel = np.ones((10,15), np.uint8)

 

…you can obtain different output.

 

dilated

 

The kernel shape is given as (rows, columns), so Y comes first and X second. The first value dilates your image along the Y axis (up-down) and the second along the X axis (left-right). With (10, 15) the full word ends up inside one box, including the dots on the “i” in Shine and diamond.

 

roi_final2

Here is the full code.

import cv2
import numpy as np

# import image
image = cv2.imread('C:\\Users\\PC\\Desktop\\roi.png')

# grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow('gray', gray)

# binary
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('threshold', thresh)

# dilation
kernel = np.ones((10, 1), np.uint8)
img_dilation = cv2.dilate(thresh, kernel, iterations=1)
cv2.imshow('dilated', img_dilation)

# find contours
# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x (three return values) and 4.x (two)
ctrs, hier = cv2.findContours(img_dilation.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# sort contours
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])

for i, ctr in enumerate(sorted_ctrs):
    # Get bounding box
    x, y, w, h = cv2.boundingRect(ctr)

    # Getting ROI (copy, so the rectangle drawn below does not end up in the saved crop)
    roi = image[y:y + h, x:x + w].copy()

    # show ROI
    # cv2.imshow('segment no:'+str(i),roi)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

    if w > 15 and h > 15:
        cv2.imwrite('C:\\Users\\PC\\Desktop\\output\\{}.png'.format(i), roi)

cv2.imshow('marked areas', image)
cv2.waitKey(0)
