Build a Face Swapping App, Part 2: API

This article is the second of three in which we build a mobile app that automatically performs face swapping.
- Part 1: Computer vision algorithm
- Part 2: Rest API
- Part 3: Mobile app - Pending
In our previous study, Face Swapping using Landmark Detection in OpenCV and Dlib, we developed a computer vision-based model that can be used to swap faces between two input images.
It is astounding that modern computer vision can accomplish such tasks, but developing such models is not the only end goal for software engineers and computer scientists. It would be quite inconvenient for people without a programming background to run all that code just to have the model swap the faces in two given images.
Most people would probably be confused by programming jargon they do not understand. Hence, the model needs to be turned into an application that is easy for everyone to use.
The creation of such a user-friendly app that can be used by anyone is based on the efficient connection of our developed model and the front-end application.
An Application Programming Interface (API) is responsible for this efficient and secure connection of the developed model and the front-end application. APIs are essential to every application whether it is a static application or dynamic, that is whether an application needs any kind of processing or is just fetching data for visualization.
This article will go into the basics of APIs and how they work. Then we will use the Flask framework to build an API that can be used to host our face-swapping model developed in the previous article.
What Is an API?
API stands for Application Programming Interface. It acts as a connector between applications, that is, it allows multiple applications to talk to each other. In other words, it interfaces the front-end application with the back-end programming and/or server.
Every time you use any kind of application, for example, Facebook or YouTube, you are using an API. It is basically code that allows the transmission of data between two software products.

API Architecture
APIs can be of the following types when we are talking about the systems according to which they are designed:
Database APIs
Enable an application to communicate with a database management system. It can be used to change tables, write data, write queries, etc.
Remote APIs
Provide standard methods for applications running on different machines to interact with each other, for example, a software application accessing resources located outside the machine or device that requests them. Because the communicating applications are remote and the internet is their communication channel, these APIs are based on web standards.
Operating systems APIs
Provide a set of rules about how an application uses the services and resources of an operating system. Each operating system has its own API; for example, Linux has the kernel-user space API and the kernel internal API.
Web APIs
The most common class of APIs. They transfer functionality and data between web-based systems that follow a client-server architecture, using the Hypertext Transfer Protocol (HTTP) to deliver requests from web applications. Using Web APIs to provide app functionality is also very common in the developer community.
Other
There are many other types of APIs; nowadays even things like your car have APIs to control steering, speed, etc.
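Two of the categories above can be tried directly from Python's standard library. The following is only an illustrative sketch: the `sqlite3` module plays the role of a database API (the in-memory database and the `swaps` table are assumptions made for this example), and the `os` module exposes a few operating system services.

```python
import os
import sqlite3

# Database API: sqlite3 lets the program talk to a database engine.
# The ":memory:" database and the "swaps" table are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE swaps (id INTEGER PRIMARY KEY, face TEXT, body TEXT)")
conn.execute(
    "INSERT INTO swaps (face, body) VALUES (?, ?)",
    ("source.jpg", "destination.jpg"),
)
conn.commit()
rows = conn.execute("SELECT face, body FROM swaps").fetchall()
conn.close()

# Operating system API: the os module asks the OS for services
pid = os.getpid()   # id of the current process
cwd = os.getcwd()   # current working directory

print(rows)
print(pid, cwd)
```

In both cases the program never deals with how the database stores rows or how the kernel tracks processes; it only uses the documented calls.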
APIs are basically meant to standardize data exchange between different web services. Standardization refers to enabling diverse systems, running on different operating systems, and/or different technologies to be able to communicate with each other easily. This gives rise to standard API protocols. Some of the standard API protocols are as follows:
- Simple Object Access Protocol (SOAP)
- Remote Procedure Call (RPC)
- Representational State Transfer (REST)
- gRPC
- GraphQL
We are not going to look into the details of these protocols; that is a discussion for another article. If you want to learn more about these protocols, visit here.
How Do APIs Work?
Just as a user interface is meant to be used by a human, an API is made to be used by an application. An API sits between the web server and the application. The user of the application initiates a call, known as an API call, which tells the application to perform a task. In response to this call, the application uses the API to ask the server to perform the task. The API is like a middleman between the server and the application.
The role of an API can be understood better through an analogy. Suppose you are a customer at a restaurant. Your waiter can be thought of as an API that acts as an intermediary between customers (users) and the kitchen (server). You order something by telling your waiter what you want; this is an API call. Then, the waiter requests the order from the kitchen and brings you what you ordered.

Here the waiter plays the role of the API. Abstraction means hiding the details of how operations are performed so that they appear simple and attention stays on what matters. In the same way, an API can be thought of as an abstraction of the web server. The application makes an API call to ask the API for some data and then displays that data to the end user.
The application does not need to know the complexities of the server and how the server works. It just orders a service and receives the data. It just needs to know how to request the API to perform a task.
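The restaurant analogy can be sketched with Python's standard library alone. This is a toy illustration, not part of the face-swapping API: the `KitchenHandler` class, the `/order` path, and the `place_order` helper are all hypothetical names chosen for this example. The client knows only the URL and the JSON format, not how the "kitchen" prepares the reply.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class KitchenHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: hides how the 'dish' is prepared."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        order = json.loads(self.rfile.read(length))
        body = json.dumps({"dish": order["item"], "status": "served"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def place_order(url, item):
    """Client side: only needs to know how to ask, not how the kitchen works."""
    data = json.dumps({"item": item}).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # An empty ProxyHandler keeps the local request away from any system proxy
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=5) as resp:
        return json.loads(resp.read())


# Port 0 asks the operating system for any free port
server = HTTPServer(("127.0.0.1", 0), KitchenHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/order"

reply = place_order(url, "pasta")
server.shutdown()
server.server_close()
print(reply)
```

The same division of labor applies later in this article: the client script sends two images and receives one, without knowing anything about landmarks or triangles.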
How Can We Build an API?
There are many ways to build an API. It generally depends on what kind of API you want to develop. A developer first needs to define the requirements for the API, that is, what functionality the API needs.
Once you have listed your requirements, you will need to consider the API design. How will you go about designing the API? Would you design the front-end first or the back-end?
After finalizing the outline requirements of the API, we can move on to the development of our API. We start by defining the operations of the API. We also specify the models to describe the request and response messages.
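One possible way to describe request and response messages is with small data classes. The sketch below is an assumption about how such models could look, not the approach used in the implementation later in this article; the class names `FaceSwapRequest` and `FaceSwapResponse` are hypothetical, while the field names `face`, `body`, and `image` mirror the JSON keys used by the API in this tutorial.

```python
from dataclasses import dataclass


@dataclass
class FaceSwapRequest:
    """Request message: two base64-encoded images (hypothetical model)."""
    face: str
    body: str

    @classmethod
    def from_json(cls, payload):
        # Reject payloads that miss a field or carry an empty or non-string value
        for key in ("face", "body"):
            if not isinstance(payload.get(key), str) or not payload[key]:
                raise ValueError(f"missing or invalid field: {key}")
        return cls(face=payload["face"], body=payload["body"])


@dataclass
class FaceSwapResponse:
    """Response message: one base64-encoded result image (hypothetical model)."""
    image: str

    def to_json(self):
        return {"image": self.image}


req = FaceSwapRequest.from_json({"face": "aGVsbG8=", "body": "d29ybGQ="})
print(req)
print(FaceSwapResponse(image="cmVzdWx0").to_json())
```

Writing the models down first makes it clear what the client must send and what it can expect back, before any processing code exists.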
Next, we define and implement applicable security policies, ensure rate-limiting, proper caching, and other behaviors. The easiest way to develop APIs is to use already available tools.
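To make rate limiting concrete, here is a minimal sliding-window limiter. It is a sketch under simplifying assumptions (a single process, state kept in memory) and is not part of the API built in this article; the class name `SlidingWindowLimiter` and the client key are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each client key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock              # injectable so the limiter can be tested
        self.calls = defaultdict(deque)

    def allow(self, client):
        now = self.clock()
        recent = self.calls[client]
        # Drop timestamps that have fallen out of the window
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=60)
print([limiter.allow("10.0.0.1") for _ in range(3)])  # [True, True, False]
```

In a real deployment this kind of policy is usually delegated to an existing tool or gateway rather than written by hand.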
In this article, we will specifically talk about the tools available in Python for building APIs, and more specifically REST APIs, because they are among the best-known and most widely used APIs.
The following tools are available in Python which can be used to create an API:
What Is Flask?
Flask is a Python web framework that enables you to create and develop web applications very easily. It is a microframework that does not include an ORM (Object-Relational Mapper).
The main thing about Flask is that it is written in Python, so it is extremely easy for people who are already well-versed in Python to write code in it and create a web app. The Flask framework is based on WSGI, the Werkzeug toolkit, and the Jinja2 template engine.
WSGI, short for Web Server Gateway Interface, is the standard interface for Python web application development. It is a common interface between web servers and web applications.
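To show what this interface looks like, the sketch below defines a bare WSGI application and then calls it the way a web server would, using only the standard library. The function `simple_app` and the hand-written `start_response` are illustrative; Flask hides these details from you.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A WSGI application: a callable taking environ and start_response."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "text/plain; charset=utf-8"),
         ("Content-Length", str(len(body)))],
    )
    return [body]


# Play the role of the web server: build a request environ and call the app
environ = {}
setup_testing_defaults(environ)
environ["PATH_INFO"] = "/api/face_swap"

captured = {}


def start_response(status, headers):
    # A real server would send these to the client; here we just record them
    captured["status"] = status
    captured["headers"] = headers


response_body = b"".join(simple_app(environ, start_response))
print(captured["status"], response_body)
```

Any WSGI-compatible server can run any WSGI-compatible application because both sides agree on this calling convention.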
Werkzeug is a WSGI toolkit that implements request and response objects and utility functions. It enables a web framework to be built on top of it.
Jinja2 is a well-known template engine for Python. A web template system generates a dynamic web page by combining a web template with a specific data source.
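As a small illustration of combining a template with data, the sketch below renders an inline Jinja2 template. The template text and the values are made up for this example; the API in this article returns JSON and does not render any templates.

```python
from jinja2 import Template

# The template is the fixed part; the values passed to render() are the data source
template = Template("<h1>Swapped {{ face }} onto {{ body }}</h1>")
html = template.render(face="source.jpg", body="destination.jpg")
print(html)
```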
Unlike Django, which is also an extremely popular Python web framework, Flask is more Pythonic and has a gentle learning curve. This lets people proficient in Python learn it faster than other frameworks. It is very explicit and readable. Even though it is a microframework, it still allows complex web applications to be built across multiple Python files.
Creating an API in Flask for Face Swapping Model
Now we move on to the coding part. In this tutorial we are going to build the API layer and a small Python script to call and use this API. This script will later be replaced by a mobile app in Part 3 of this series.
Before we move forward, make sure you have already read the first part, where we explain the computer vision algorithm that is the basis for this API.
API implementation
Before anything else, we need to install the dependencies:
pip install dlib
pip install opencv-contrib-python
pip install Flask Pillow matplotlib
You will also need to download the pre-trained facial landmark model (the shape_predictor_68_face_landmarks.dat file loaded in the code below) from here for the face-swapping model to run. We also had this step in Part 1; if you have that running, you can copy the model file from there.
Let’s now start writing a small API by importing all necessary libraries.
from flask import Flask, request, Response
import numpy as np
import cv2
import base64
from base64 import encodebytes
import io
from PIL import Image
import matplotlib.pyplot as plt
import dlib
Next, let’s start implementing the Flask server. Note that we will take a simplistic approach here; please check out my article on the best practices for setting up a Flask API to build it right for you.
# Initialize the Flask application
app = Flask(__name__)
Here we create the Flask application object. The web server itself starts on your computer only when we call app.run() at the end of the file.
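If you want to see a route respond before writing the face-swapping endpoint, the standalone sketch below may help. It assumes Flask is installed; the `demo_app` object and the `/api/ping` route are hypothetical and not part of this article's API. Flask's built-in test client calls the route in-process, so no server needs to be running.

```python
from flask import Flask, jsonify

demo_app = Flask(__name__)


@demo_app.route("/api/ping", methods=["GET"])  # hypothetical route for illustration
def ping():
    return jsonify({"status": "ok"})


# The test client calls the route in-process, without starting a server
client = demo_app.test_client()
resp = client.get("/api/ping")
print(resp.status_code, resp.get_json())
```

Because the route only lists GET in `methods`, a POST to the same URL is rejected by Flask with a 405 status.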
# route http posts to this method
@app.route('/api/face_swap', methods=['POST'])
Next comes the route where we will expose our face-swapping model. We use the POST HTTP method, which sends data to the server in the request body; here that data is a JSON object rather than an HTML form. Responses to POST requests are normally not cached.
def face_swap():
    r = request
    # decode the base64 images; convert to RGB so each array has exactly 3 channels
    face = base64.b64decode(r.json.get('face'))
    face = Image.open(io.BytesIO(face)).convert('RGB')
    body = base64.b64decode(r.json.get('body'))
    body = Image.open(io.BytesIO(body)).convert('RGB')
    # convert image to numpy array for processing
    face = np.array(face)
    body = np.array(body)
Next, we start the face_swap() function by reading the request sent by the client. We already know that it will contain a JSON object with two base64-encoded images.
So, we extract the two images from the JSON object and decode them into two images, face and body. These will be used by the face-swapping model, which replaces the face in the image ‘body’ with the face from the image ‘face’.
Then, we convert both images into NumPy arrays so that they can be processed by the OpenCV library. After this, we insert our whole face-swapping model, still inside the face_swap() function.
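The decoding step can be tried in isolation with the standard library. The helper below is a hypothetical addition, not part of the article's code: `decode_image_field` validates a JSON field and turns it back into raw bytes, raising `ValueError` on bad input. Stand-in bytes are used instead of a real image file.

```python
import base64
import binascii
import json
from base64 import encodebytes


def decode_image_field(payload, key):
    """Return the raw bytes stored under `key`, or raise ValueError."""
    value = payload.get(key) if isinstance(payload, dict) else None
    if not isinstance(value, str) or not value:
        raise ValueError(f"field '{key}' is missing or not a string")
    try:
        return base64.b64decode(value)
    except binascii.Error as exc:
        raise ValueError(f"field '{key}' is not valid base64") from exc


# Stand-in bytes; a real request would carry the contents of an image file
raw = bytes(range(256))

# encodebytes (used by the client in this article) inserts a newline after
# every 76 output characters; b64decode ignores those newlines by default
encoded = encodebytes(raw).decode("ascii")
payload = json.loads(json.dumps({"face": encoded}))

print(decode_image_field(payload, "face") == raw)
```

A check like this could let the endpoint answer a malformed request with a clear error instead of failing somewhere inside the image processing.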
    # fancy image processing here....
    # PIL delivers RGB arrays, so convert from RGB (not BGR) to grayscale
    face_gray = cv2.cvtColor(face, cv2.COLOR_RGB2GRAY)
    body_gray = cv2.cvtColor(body, cv2.COLOR_RGB2GRAY)
    # Create empty matrices in the images' shapes
    height, width = face_gray.shape
    mask = np.zeros((height, width), np.uint8)
    height, width, channels = body.shape
    # Loading models and predictors of the dlib library to detect landmarks in both faces
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("./shape_predictor_68_face_landmarks.dat")
    # Getting landmarks for the face that will be swapped into the body
    rect = detector(face_gray)[0]
    # This creates an object with 68 pairs of integer values: the (x, y)-coordinates of the facial structures
    landmarks = predictor(face_gray, rect)
    landmarks_points = []
    def get_landmarks(landmarks, landmarks_points):
        for n in range(68):
            x = landmarks.part(n).x
            y = landmarks.part(n).y
            landmarks_points.append((x, y))
    get_landmarks(landmarks, landmarks_points)
    points = np.array(landmarks_points, np.int32)
    convexhull = cv2.convexHull(points)
    face_cp = face.copy()
    face_image_1 = cv2.bitwise_and(face, face, mask=mask)
    rect = cv2.boundingRect(convexhull)
    subdiv = cv2.Subdiv2D(rect) # Creates an instance of Subdiv2D
    subdiv.insert(landmarks_points) # Insert points into subdiv
    triangles = subdiv.getTriangleList()
    triangles = np.array(triangles, dtype=np.int32)
    indexes_triangles = []
    face_cp = face.copy()
    def get_index(arr):
        # arr is the tuple returned by np.where; arr[0] holds the matching row indices
        if arr[0].size == 0:
            return None  # the vertex is not one of the landmark points
        return arr[0][0]
    for triangle in triangles:
        # Gets the vertices of the triangle (as plain ints, which cv2.line accepts)
        pt1 = (int(triangle[0]), int(triangle[1]))
        pt2 = (int(triangle[2]), int(triangle[3]))
        pt3 = (int(triangle[4]), int(triangle[5]))
        # Draws a line for each side of the triangle
        cv2.line(face_cp, pt1, pt2, (255, 255, 255), 3, 0)
        cv2.line(face_cp, pt2, pt3, (255, 255, 255), 3, 0)
        cv2.line(face_cp, pt3, pt1, (255, 255, 255), 3, 0)
        index_pt1 = np.where((points == pt1).all(axis=1))
        index_pt1 = get_index(index_pt1)
        index_pt2 = np.where((points == pt2).all(axis=1))
        index_pt2 = get_index(index_pt2)
        index_pt3 = np.where((points == pt3).all(axis=1))
        index_pt3 = get_index(index_pt3)
        # Saves coordinates if the triangle exists and has 3 vertices
        if index_pt1 is not None and index_pt2 is not None and index_pt3 is not None:
            vertices = [index_pt1, index_pt2, index_pt3]
            indexes_triangles.append(vertices)
    # Getting landmarks for the face that will have the first one swapped into
    rect2 = detector(body_gray)[0]
    # This creates an object with 68 pairs of integer values: the (x, y)-coordinates of the facial structures
    landmarks_2 = predictor(body_gray, rect2)
    landmarks_points2 = []
    # Uses the function declared previously to get a list of the landmark coordinates
    get_landmarks(landmarks_2, landmarks_points2)
    # Generates a convex hull for the second person
    points2 = np.array(landmarks_points2, np.int32)
    convexhull2 = cv2.convexHull(points2)
    body_cp = body.copy()
    lines_space_new_face = np.zeros((height, width, channels), np.uint8)
    body_new_face = np.zeros((height, width, channels), np.uint8)
    height, width = face_gray.shape
    lines_space_mask = np.zeros((height, width), np.uint8)
    for triangle in indexes_triangles:
        # Coordinates of the first person's delaunay triangles
        pt1 = landmarks_points[triangle[0]]
        pt2 = landmarks_points[triangle[1]]
        pt3 = landmarks_points[triangle[2]]
        # Gets the delaunay triangles
        (x, y, widht, height) = cv2.boundingRect(np.array([pt1, pt2, pt3], np.int32))
        cropped_triangle = face[y: y+height, x: x+widht]
        cropped_mask = np.zeros((height, widht), np.uint8)
        # Fills triangle to generate the mask
        points = np.array([[pt1[0]-x, pt1[1]-y], [pt2[0]-x, pt2[1]-y], [pt3[0]-x, pt3[1]-y]], np.int32)
        cv2.fillConvexPoly(cropped_mask, points, 255)
        # Draws lines for the triangles
        cv2.line(lines_space_mask, pt1, pt2, 255)
        cv2.line(lines_space_mask, pt2, pt3, 255)
        cv2.line(lines_space_mask, pt1, pt3, 255)
        lines_space = cv2.bitwise_and(face, face, mask=lines_space_mask)
        # Calculates the delaunay triangles of the second person's face
        # Coordinates of the second person's delaunay triangles
        pt1 = landmarks_points2[triangle[0]]
        pt2 = landmarks_points2[triangle[1]]
        pt3 = landmarks_points2[triangle[2]]
        # Gets the delaunay triangles
        (x, y, widht, height) = cv2.boundingRect(np.array([pt1, pt2, pt3], np.int32))
        cropped_mask2 = np.zeros((height, widht), np.uint8)
        # Fills triangle to generate the mask
        points2 = np.array([[pt1[0]-x, pt1[1]-y], [pt2[0]-x, pt2[1]-y], [pt3[0]-x, pt3[1]-y]], np.int32)
        cv2.fillConvexPoly(cropped_mask2, points2, 255)
        # Deforms the triangles to fit the subject's face : https://docs.opencv.org/3.4/d4/d61/tutorial_warp_affine.html
        points = np.float32(points)
        points2 = np.float32(points2)
        M = cv2.getAffineTransform(points, points2) # Warps the content of the first triangle to fit in the second one
        dist_triangle = cv2.warpAffine(cropped_triangle, M, (widht, height))
        dist_triangle = cv2.bitwise_and(dist_triangle, dist_triangle, mask=cropped_mask2)
        # Joins all the distorted triangles to make the face mask to fit in the second person's features
        body_new_face_rect_area = body_new_face[y: y+height, x: x+widht]
        body_new_face_rect_area_gray = cv2.cvtColor(body_new_face_rect_area, cv2.COLOR_RGB2GRAY)
        # Creates a mask of the pixels that are still empty
        masked_triangle = cv2.threshold(body_new_face_rect_area_gray, 1, 255, cv2.THRESH_BINARY_INV)
        dist_triangle = cv2.bitwise_and(dist_triangle, dist_triangle, mask=masked_triangle[1])
        # Adds the piece to the face mask
        body_new_face_rect_area = cv2.add(body_new_face_rect_area, dist_triangle)
        body_new_face[y: y+height, x: x+widht] = body_new_face_rect_area
    body_face_mask = np.zeros_like(body_gray)
    body_head_mask = cv2.fillConvexPoly(body_face_mask, convexhull2, 255)
    body_face_mask = cv2.bitwise_not(body_head_mask)
    body_maskless = cv2.bitwise_and(body, body, mask=body_face_mask)
    result = cv2.add(body_maskless, body_new_face)
    # Gets the center of the face for the body
    (x, y, widht, height) = cv2.boundingRect(convexhull2)
    center_face2 = (int((x+x+widht)/2), int((y+y+height)/2))
    new = cv2.seamlessClone(result, body, body_head_mask, center_face2, cv2.NORMAL_CLONE)
    # done fancy processing....
    # build a response to send back to client
    new = Image.fromarray(new)
    byte_arr = io.BytesIO()
    new.save(byte_arr, format='PNG') # convert the PIL image to byte array
    encoded_img = encodebytes(byte_arr.getvalue()).decode('ascii') # encode as base64
    return {'image': encoded_img}
After the algorithm has run, we encode the output image with the faces swapped as a base64 PNG. The face_swap() function ends here, and the output image is returned to the client as JSON.
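Finally, we run the app we created by providing it with the host IP and the port number. Binding to 0.0.0.0 makes the server reachable from other machines on the network, not just from localhost.
if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5000)
With the server in place, we move on to client-side development.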
Small client app
In another file, let’s create a small script to test our app.
import requests
import base64
from base64 import encodebytes
from PIL import Image
import io
import matplotlib.pyplot as plt
addr = 'http://localhost:5000'
face_swap_url = addr + '/api/face_swap'
We store the base address of our server in a variable so that it can be reused whenever we want to access the web app. From it, we build the full URL of the face-swapping endpoint.
def encode_image(image_path):
    pil_img = Image.open(image_path, mode='r') # reads the PIL image
    byte_arr = io.BytesIO()
    pil_img.save(byte_arr, format='PNG') # convert the PIL image to byte array
    encoded_img = encodebytes(byte_arr.getvalue()).decode('ascii') # encode as base64
    return encoded_img
# send http request with image and receive response
response = requests.post(face_swap_url, json={'face': encode_image('./source.jpg'), 'body': encode_image('./destination.jpg')})
Now, we create a function to read the images and encode them in base64. Next, we send a POST request to the face-swapping URL with both encoded images placed in a JSON object. The server returns a response that contains the processed output image.
# extract image from response
img = response.json()['image']
img = base64.b64decode(img)
img = Image.open(io.BytesIO(img))
plt.imshow(img)
plt.show() # opens the window when running as a script
Finally, we decode the received data from base64 to image and display the output.

Example output
This is the response of the server which is consistent with the face swapping algorithm we created.
To run this API, first start the server. This makes sure that your web app is up and running, and anyone on the local network can then access it. Once the server is running, you can run the client program to test the API.
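If you often start the client too early, a small helper can check that the server is accepting connections first. This is an optional sketch using only the standard library; `wait_for_server` is a hypothetical helper, and the host and port simply match the values used in this article.

```python
import socket
import time


def wait_for_server(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait a little and try again
            time.sleep(interval)
    return False


if wait_for_server("localhost", 5000, timeout=2.0):
    print("server is up, safe to run the client")
else:
    print("server not reachable, start it first")
```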
Conclusion
An API is crucial for any application since it provides the gateway for the connection of the user interface with the back-end server. The API defines how the interface will connect to the back-end server and what functions will be performed.
There are many tools for creating an API, and Flask is one of them. Flask is a Python-based web microframework that can be used to create websites and web servers easily.
As demonstrated, it is extremely easy to create a simple API using Flask and it gives good and fast results.
Thanks for reading!

Osama Akhlaq
Avid tech enthusiast, seasoned squash player, I am a well-grounded and self-oriented computer scientist turned Business Analyst.