First Steps with OpenCV for Python

Whether you want to build a complex deep learning model for a self-driving car, a live face recognition program, or your own image processing software for a graduate project, you will have to learn OpenCV along the way.

OpenCV is a huge image and video processing library designed to work with many languages, such as Python, C/C++, Java, and more. It is so popular and powerful that it is the foundation for many of the image processing applications you know.

Getting started with OpenCV can be challenging, especially if you rely on its official documentation, which is known for being cumbersome and hard to understand.

Today we will learn how to work with OpenCV, and I’ll do my best to keep it simple.


Install OpenCV

Installing OpenCV takes only one simple command:

pip3 install opencv-python

Importing a simple image

The first thing you will need to learn is importing a simple image and displaying it using OpenCV.

The code is straightforward:

import cv2

# Read the image
img = cv2.imread("image.jpg")

# Display the image
cv2.imshow("Image", img)

# Wait for a keypress
cv2.waitKey(0)

# Clean up
cv2.destroyAllWindows()

Reading our first image

After reading the code, if you think that we are doing more than just loading the image, you are right. After all, loading an image with OpenCV comes down to only one line of Python code:

img = cv2.imread("image.jpg")

So, what about the rest? Well… the first thing we have to do is import the library. Only then can we read the image using the imread function, passing the image’s path as its argument.

If we stopped the program now, we would have loaded the image but done nothing with it, so instead, let’s at least present the image in a new window so the user can see the result. For that, we use cv2.imshow, passing the window name and the image as arguments.

Lastly, we call cv2.waitKey(0), which keeps the program from exiting until we press a key. Then we clean everything up by destroying all the windows we opened.


Loading videos

OpenCV is great at dealing not only with images but also with videos. The video streams can be loaded from a video file or directly from a video source such as a webcam.

In the next example, we will load a video from the webcam and present it on a new window:

import cv2

# Load the video stream
video = cv2.VideoCapture(0)

while True:
    # Capture each frame as an image
    ret, frame = video.read()

    # Stop if no frame could be read (camera unavailable or end of file)
    if not ret:
        break

    # Show the image on the screen
    cv2.imshow('frame', frame)

    # Stop the playback when pressing 'q'
    if cv2.waitKey(1) == ord('q'):
        break

# Release the camera or video file
video.release()

# Clean up
cv2.destroyAllWindows()

The code is mostly self-explanatory, but let’s review it in detail. We use VideoCapture to load the video resource. The first argument defines what input we are reading. Passing 0 refers to the main webcam (if one exists). If you have multiple webcams connected, you can use 1, 2, etc. If your video is saved in a file, you can pass a string with the path to the file.

Next, we start a loop that will only end on user command, but more on that later. What’s important here is what happens inside the loop. The first thing we do is ask our VideoCapture to read a frame of the video. For a camera, it will be a snapshot of the camera at that time, and for a video file, it will be the current video frame.

Each frame we read from a video is loaded in the same way as an image. This is crucial because it means that we have the entire arsenal of OpenCV functions at our disposal when dealing with videos.

For example, the frame returned by read can be passed to imshow exactly as we did in the previous example of working with images.

Beautiful!

Now the video is playing, but there’s no way out of the while loop, so let’s build an exit strategy by detecting whether the q key has been pressed. If it has, we exit the loop and move on to the cleanup.

We have an additional cleanup step, which releases the camera or video file, because even if we are not reading any more frames, the resource is still held open. We do that by calling the release method of the VideoCapture object.


Resizing images

Changing image sizes has a wide range of applications, from reducing file sizes and zooming to feeding a neural network to perform some magic. If resizing an image is what you want, OpenCV has you covered.

Let’s now see an example of how to resize an image:

import cv2

# Read the image
img = cv2.imread("image.jpg")

# Scale both dimensions to 60% of the original
scale = 60
width = int(img.shape[1] * scale / 100)
height = int(img.shape[0] * scale / 100)
dim = (width, height)

# Resize; interpolation must be passed by keyword
resized_img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)

cv2.imshow("Resized_Image", resized_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Resizing images

It’s pretty simple, so we added some flavor to it: instead of resizing the image to a specific size, we scale it down to 60% of its original dimensions. Note that the code is simpler if we are targeting specific dimensions.

The resize function expects at least two arguments: the image to be resized and the new dimensions (width and height, as a tuple). Optionally, we can pass the interpolation keyword argument to choose the interpolation method, as described in the resize function docs.


Switching color spaces

When we read an image with OpenCV, it is stored as an array whose depth corresponds to color channels, one per color component. The most common color space, and the one you probably already know, is RGB, which consists of three channels: red, green, and blue. Other systems can also represent color in an image, such as LAB, YCrCb, HLS, and HSV, among others. Each has different characteristics worth studying.

Another popular option is grayscale, where a single channel defines each pixel. Let’s see an example of how we can transform a color image into a grayscale one.

import cv2
img = cv2.imread("image.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("Gray", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

Grayscale image

The function where all the magic happens is cvtColor, which expects two arguments, the image and the color conversion code, and returns the new image without altering the original. Fortunately, OpenCV defines a constant for each supported color space conversion. In our case, we use COLOR_BGR2GRAY, which converts BGR to grayscale.

So what’s BGR? It’s the channel order OpenCV uses by default when loading color images: blue, green, red, the reverse of RGB.


Saving images

We often need to save the resulting image after processing it, whether we changed its color space, applied a transformation, or performed any other operation on it.

The following code shows you how to save an image after changing its color to grayscale:

import cv2
img = cv2.imread("image.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("Gray", gray)
cv2.imwrite("image_gray.jpg", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

We used the function imwrite, which saves an image to disk. Its first argument is the path for the new file, and its second is the image to save, in this case the variable gray.


Image smoothing

OpenCV offers tools to smooth an image and help reduce the noise in it. All the complexity of how it works is encapsulated in a single, simple-to-use function.

Smoothing is intended to improve an image’s quality. The result is not perfect, but in some scenarios it makes a substantial difference and is key to using the image in further processing.

Here is an example of how to smooth an image:

import cv2
img = cv2.imread("early_1800.jpg")
blur = cv2.blur(img, (5, 5))
cv2.imshow("Blur", blur)
cv2.waitKey(0)
cv2.destroyAllWindows()

Original Image

Smoothed Image

The image looks much better, but how does it work? Through the blur function of the OpenCV library, which expects the image and the kernel size as arguments. The kernel size is a tuple giving the kernel’s width and height. Note that different values will produce different outputs, so you will have to play around with them for your images.

The function works by taking a small pixel area around each pixel (5x5 in our case), computing the average value of those pixels, and replacing the original pixel with that average, which yields a new, less noisy image.

There are other ways to smooth an image, for example GaussianBlur or medianBlur, which work similarly.


Drawing On Images

So far, we have been playing with the images without adding anything new to them. It’s time we change that. OpenCV allows us not only to apply transformations and effects to images but also to draw on them.

Drawing on images can be useful if, for example, you are trying to make an object tracking program or face recognition program where you want to draw a square or shape to highlight the identified objects.

Let’s draw a few geometric shapes on images to show how it works.

Drawing a Line

We will try to draw a line on an image using the line function:

import cv2
img = cv2.imread("image.jpg")
line = cv2.line(img, (20, 20), (150, 150), (255, 0, 0), 5)
cv2.imshow("Line", line)
cv2.waitKey(0)
cv2.destroyAllWindows()

Drawing a line

The line function expects the image and four more arguments: the start of the line in (x1, y1), the end of the line in (x2, y2), the line’s color (in BGR for our image), and its thickness in pixels.

Drawing a Rectangle

I think rectangles are the most used shape, at least in the AI world, as they are commonly used to track objects like faces, cars, or traffic signs on the images. They are also super easy to use. Here is an example:

import cv2
img = cv2.imread("image.jpg")
rectangle = cv2.rectangle(img, (200, 200), (450, 450), (255, 0, 0), 5)
cv2.imshow("Rectangle", rectangle)
cv2.waitKey(0)
cv2.destroyAllWindows()

Drawing a rectangle

The rectangle function is very similar to the line function. It expects the image and four more arguments: the top-left corner of the rectangle in (x1, y1), the bottom-right corner in (x2, y2), the line’s color (in BGR for our image), and its thickness in pixels.

Drawing a Circle

The last thing we will draw is a small circle on the image, which is sometimes useful when you are tracking a circular object like a ball.

import cv2
img = cv2.imread("image.jpg")
circle = cv2.circle(img, (300, 300), 50, (255, 0, 0), 5)
cv2.imshow("Circle", circle)
cv2.waitKey(0)
cv2.destroyAllWindows()

Drawing a circle

Again, all these functions are pretty similar. To render a circle on an image, we use the circle function, which expects the image and four more arguments: the center point of the circle in (x, y), the radius in pixels, its color, and the line thickness.

Conclusion

OpenCV is an exciting and powerful library for dealing with images and videos. Its uses range from a simple helper library for image manipulation to implementing state-of-the-art computer vision algorithms.

Today we covered just a small percentage of what this library is capable of. If you enjoyed this article, I recommend checking out my article on essential OpenCV functions, which will extend your knowledge of the library a bit more.

Computer vision is a topic that fascinates me, and I’ll write more about OpenCV in the future, so stay tuned.

Thanks for reading!