Essential OpenCV Functions to Get You Started into Computer Vision
Learn about common OpenCV functions, and their applications
Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. As such, many projects work with images from cameras and videos and rely on techniques such as image processing and deep learning models.
OpenCV is a library designed to solve common computer vision problems. It’s very popular among those in the field, and it’s great both for learning and for use in production. The library has interfaces for multiple languages, including Python, Java, and C++.
Throughout this article we will cover different (common) functions inside OpenCV, their applications, and how you can get started with each one. Even though I’ll be providing the examples in Python, the concepts and the functions will be the same for the different supported languages.
What exactly are we going to learn today?
- Reading, writing and displaying images
- Changing color spaces
- Resizing images
- Image rotation
- Edge Detection
Reading, writing and displaying images
Before we can do anything with computer vision, we need to be able to read images and understand how computers process them. The only information computers can process is binary information (0 and 1); this includes text, images, and video.
How do computers work with images
To understand how a computer “understands” an image, you can picture a matrix the size of the image, where each cell holds a value that represents the color of the image at that position.
Let’s take an example with an image in greyscale:
For this particular case, we can assign each block (or pixel) in the image a numeric value (which can be stored as binary). This value could come from any range, though the convention is to use 0 for black, 255 for white, and the integers in between to represent the intensity.
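To make the matrix idea concrete, here is a minimal sketch that builds a tiny greyscale image by hand with NumPy. The pixel values are made up for illustration; they are not taken from any sample image.

```python
import numpy as np

# A tiny 3x4 greyscale "image": one unsigned 8-bit value per pixel.
# 0 is black, 255 is white, and the values in between are shades of grey.
grey_image = np.array(
    [
        [0, 64, 128, 255],
        [0, 64, 128, 255],
        [0, 64, 128, 255],
    ],
    dtype=np.uint8,
)

print(grey_image.shape)  # (rows, columns) -> (3, 4)
print(grey_image[0, 3])  # top-right pixel -> 255
```

Note that indexing is row first, then column, which is the same layout OpenCV uses for the images it loads in Python.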
When we work with color images, things can differ a bit depending on the library and how we choose to represent the colors. We will talk more about that later in the post. However, all the representations share more or less the same idea, which is to use different channels to represent the colors, with RGB (red, green, and blue) being one of the most popular options. With RGB we need 3 channels to build each pixel, so our 2D matrix becomes a 3D matrix with a depth of 3, where each channel holds the intensity of a particular color, and mixing the three gives the final color of the pixel.
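The same idea extends to color. The sketch below builds a 2x2 image in RGB order by hand (the colors are arbitrary choices for illustration) and pulls out a single channel:

```python
import numpy as np

# A 2x2 color image stored as height x width x channels (RGB order here).
rgb_image = np.zeros((2, 2, 3), dtype=np.uint8)
rgb_image[0, 0] = (255, 0, 0)    # red
rgb_image[0, 1] = (0, 255, 0)    # green
rgb_image[1, 0] = (0, 0, 255)    # blue
rgb_image[1, 1] = (255, 255, 0)  # red and green mix into yellow

# Each channel on its own is a plain 2D matrix of intensities.
red_channel = rgb_image[:, :, 0]

print(rgb_image.shape)       # (2, 2, 3)
print(red_channel.tolist())  # [[255, 0], [0, 255]]
```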
Working with images using OpenCV
Let’s now jump into the code to perform 3 of the most important operations when dealing with images: reading, showing, and saving.
    import cv2
    import matplotlib.pyplot as plt

    # Reading the image
    image = cv2.imread('sample1.jpg')

    # Showing the image
    plt.imshow(image)
    plt.show()

    # Saving the image
    cv2.imwrite('sample1_output.jpg', image)
If you run this code, you will get one image saved to disk and another shown as the result of the plot.
The image on the left is the one we plotted; the one on the right is the image saved to disk. The difference in size aside (due to the plot), the image on the left looks odd and bluish. Why is it different? (By the way, the image on the right is the correct one.)
The reason the image on the left shows strange colors has to do with how OpenCV reads images by default. OpenCV’s imread() function reads an image using BGR channel order, as opposed to the RGB order expected by the plot function. This is normal for OpenCV, and there are ways to fix it, which we will discuss next.
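To see why the plot turns bluish, here is a small NumPy-only sketch with a single hand-made pixel. It shows how the same three numbers mean different colors depending on the channel order that is assumed:

```python
import numpy as np

# One pure-red pixel the way OpenCV stores it: (blue, green, red).
bgr_pixel = np.array([[[0, 0, 255]]], dtype=np.uint8)

# A plotting function that expects RGB reads the same numbers as
# red=0, green=0, blue=255, which is pure blue.
as_seen_by_rgb_plot = tuple(int(v) for v in bgr_pixel[0, 0])

# Reversing the last axis swaps the first and third channels.
rgb_pixel = bgr_pixel[:, :, ::-1]

print(as_seen_by_rgb_plot)       # (0, 0, 255)
print(rgb_pixel[0, 0].tolist())  # [255, 0, 0]
```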
Changing color spaces
What is a color space? In our previous example, we saw how computers process images, and we saw that to represent colors we need channels which, when combined, give the final color of the image. The configuration in which these channels are arranged is called a color space.
Without realizing it, we have already covered 2 different color spaces in our previous code snippet: RGB and BGR. There are more, with very particular and interesting properties. Other popular color spaces include LAB, YCrCb, HLS, and HSV.
Since each color space has its own properties, some algorithms or techniques may work better in one space than in others, so converting an image between color spaces is important. Thankfully, OpenCV provides a very easy-to-use function for exactly this purpose.
Meet cvtColor, and let’s see how we can use it to fix our plot above:
    import cv2
    import matplotlib.pyplot as plt

    # Reading the image
    image = cv2.imread('sample1.jpg')

    # Change color space
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Showing the image
    plt.imshow(image)
    plt.show()
And we now get a beautiful brown dog:
Let’s explore some other color spaces:
    import cv2
    import matplotlib.pyplot as plt

    # Reading the image
    original = cv2.imread('sample1.jpg')

    fig = plt.figure(figsize=(8, 2))
    # axarr is an array with one axis per subplot, so we index into it
    axarr = fig.subplots(1, 3)

    # Change color space
    image = cv2.cvtColor(original, cv2.COLOR_BGR2RGB)
    axarr[0].imshow(image)

    image = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
    axarr[1].imshow(image)

    image = cv2.cvtColor(original, cv2.COLOR_BGR2LAB)
    axarr[2].imshow(image)

    plt.show()
Resizing images
Now that we are able to load, show, and change the color space of images, the next thing to focus on is resizing. Resizing images is important in computer vision because many machine learning models work with fixed-size input. The size depends on the model, but to make sure our images will work with it we need to resize them accordingly.
OpenCV offers a practical function for this called resize. Let’s see an example of how to use it.
    import cv2

    # Reading the image
    original = cv2.imread('sample1.jpg')

    # Resize
    resized = cv2.resize(original, (200, 200))

    print(original.shape)
    print(resized.shape)
    (1100, 1650, 3)
    (200, 200, 3)
Image rotation
A crucial part of training models is the dataset we use to train them. If the dataset doesn’t have enough samples, or the samples are not well distributed, the trained model is likely to fail. Sometimes we don’t have a big enough dataset, or it doesn’t cover all the situations we want the model to handle, so we run processes that alter the images we have to generate new ones. There are many scenarios where rotating images to various angles can help improve our model, but we won’t cover all of them here. Instead, I’d like to show you how to use OpenCV to rotate images.
Let’s see an example of the rotate function in OpenCV:
    import cv2
    import matplotlib.pyplot as plt

    # Reading the image
    image = cv2.imread('sample1.jpg')
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    image = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)

    plt.imshow(image)
    plt.show()
Even though this function is super easy to use, it restricts us to a few options: we can’t rotate by any angle we want. To have more control over the rotation, we can use getRotationMatrix2D and warpAffine instead.
    import cv2
    import matplotlib.pyplot as plt

    # Reading the image
    image = cv2.imread('sample1.jpg')
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    rows, cols = image.shape[:2]
    deg = 45

    # (cols/2, rows/2) is the center of rotation for the image
    # M is the 2x3 affine matrix that encodes the rotation (scale = 1)
    M = cv2.getRotationMatrix2D((cols/2, rows/2), deg, 1)
    # The output size is given as (width, height)
    image = cv2.warpAffine(image, M, (cols, rows))

    plt.imshow(image)
    plt.show()
Edge Detection
Edges are the points in an image where the brightness changes sharply or has discontinuities. Such discontinuities generally correspond to:
- Discontinuities in depth
- Discontinuities in surface orientation
- Changes in material properties
- Variations in scene illumination
Edges are a very useful feature of an image and can be used as part of an ML pipeline; for example, edges can help us detect shapes or lines on a road.
OpenCV provides us with the Canny function for this task, and here is how to use it:
    import cv2
    import matplotlib.pyplot as plt

    # Reading the image
    original = cv2.imread('sample2.jpg')

    fig = plt.figure(figsize=(6, 2))
    # axarr is an array with one axis per subplot, so we index into it
    axarr = fig.subplots(1, 2)

    axarr[0].imshow(cv2.cvtColor(original, cv2.COLOR_BGR2RGB))

    # Lower and upper thresholds for Canny's hysteresis step
    threshold1 = 50
    threshold2 = 200

    # Run the edge detector on the greyscale version of the image
    grey = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
    image = cv2.Canny(grey, threshold1, threshold2)

    axarr[1].imshow(image, cmap='gray')
    plt.show()
OpenCV is a great library for working with images and videos. It provides a ton of useful tools and functions for handling everything from the simplest to the most complex scenarios. The functions we reviewed today are just a few from the gallery. If you are interested in exploring further, look at the library docs and the samples; there’s a lot there, from simple image handling like transposing to more advanced features like contour detection, feature detection, and even face detection.
I hope you enjoyed reading this article. Please join the conversation and share your favorite OpenCV functions.
Thanks for reading!