Part 2: Making an OpenCV window interactive
In the first part we designed the static structure of our interface for segmenting a dataset. The program can be launched from the shell:
$ python main.py path/to/image/folder path/to/target/folder path/to/config.csv
Here is a little reminder of what we have already done and what remains:
def __init__(self, x_save_dir, y_save_dir, config_save_path):
    """
    Here are the last attributes for the configuration
    """
    # Init the variables to manually interact with the window
    # reference point memory list
    self.ref_p = []
    # activation of the eraser
    self.brush = False
    # brush size to erase manually
    self.brush_size = 20
    # current mouse position
    self.mouse_pos = [0, 0]
    # left mouse button is pressed
    self.l_pressed = False
    # current zoom factor (1 means no zoom)
    self.zoom_factor = 3
    # Set the window as normal
    cv2.namedWindow("GUI", cv2.WINDOW_NORMAL)
    cv2.setMouseCallback("GUI", self.mouse_event)
    # Set the window in full screen
    cv2.setWindowProperty("GUI", cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)

def mouse_event(self, *args):
    """Let the user manually interact
    with the images and their target"""
    pass
First of all for this second part of the tutorial we will need new attributes. These will be the last ones we'll have to add. They will be used to keep in memory all the positions of the different clicks, to activate / deactivate the eraser, the zoom or to know the position of the mouse in real time. Finally, we add an OpenCV window that will call the mouse_event() function at each mouse event. We will implement this function later.
Interact with keyboard keys
The purpose of this section is to design the interaction between the interface and the user from the keyboard keys. So we will define them at the top of our python script.
keys = {
    'next_channel': ord('e'),      # ASCII number = 101
    'previous_channel': ord('d'),  # ASCII number = 100
    'next_image': ord('f'),        # ASCII number = 102
    'previous_image': ord('s'),    # ASCII number = 115
    'zoom': ord('z'),              # ASCII number = 122
    'validate': 13,                # ASCII number = 13 (Enter)
    'undo': ord('u'),              # ASCII number = 117
    'brush': ord('b'),             # ASCII number = 98
    'delete': 8,                   # ASCII number = 8 (Backspace)
    'quit': ord('q')               # ASCII number = 113
}

class objectview(object):
    def __init__(self, d):
        self.__dict__ = d

# to transform the dictionary into an object
keys = objectview(keys)
Here we can redefine the commands we like. For my part I made this choice because I'm on an AZERTY keyboard. It is a personal command choice that I invite you to change if it doesn't satisfy you. To change them, the keyboard keys are indicated by integers between 0 and 255 respecting the ASCII table. You will notice that I have transformed the dictionary into an object. This is done for clarity during implementation. We can access the next_image value by writing "keys.next_image" instead of "keys['next_image']".
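As a small illustration of the attribute-style access described above, the sketch below wraps a shortened copy of the dictionary and reads a key code through the object. The name key_codes is only for illustration, and types.SimpleNamespace is shown as a possible standard-library alternative to objectview; it is not what the tutorial script uses.

```python
from types import SimpleNamespace

# shortened copy of the key mapping, for illustration only
key_codes = {'next_image': ord('f'), 'quit': ord('q'), 'validate': 13}


class objectview(object):
    """Expose the entries of a dictionary as attributes."""
    def __init__(self, d):
        self.__dict__ = d


keys = objectview(key_codes)
# standard-library alternative (an assumption, not used in the tutorial)
keys_ns = SimpleNamespace(**key_codes)

# both objects give the same value as key_codes['next_image']
print(keys.next_image, keys_ns.next_image)
```

Either way, a key code is read with keys.next_image rather than keys['next_image'].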
All the processing of the functions assigned to the keyboard keys will be done in the run() function which can be rewritten as below. So we make a list of the commands activated by the keyboard keys. The called functions will be implemented afterwards.
def run(self):
    while True:
        # Display the current gui frame
        cv2.imshow("GUI", self.get_frame())
        # Continuously wait for a pressed key
        key = cv2.waitKey(1) & 0xFF
        # cycle through the zoom factors (1 disables the zoom window)
        if key == keys.zoom:
            self.zoom_factor = 1 + self.zoom_factor % 5
        # if the enter key is pressed for an unlimited contour target,
        # draw it from the reference points
        elif key == keys.validate and not self.shapes[self.channel]:
            self.set_poly()
        # remove the last reference point by pressing the undo key
        elif key == keys.undo:
            self.ref_p = self.ref_p[:-1] if len(self.ref_p) else []
        # activate/deactivate the brush eraser
        elif key == keys.brush:
            self.brush = not self.brush
        # go to the next image
        elif key == keys.next_image:
            self.update_image(1)
        # go to the previous image
        elif key == keys.previous_image:
            self.update_image(-1)
        # go to the next channel
        elif key == keys.next_channel:
            self.update_channel(1)
        # go to the previous channel
        elif key == keys.previous_channel:
            self.update_channel(-1)
        # delete the current image and its target
        elif key == keys.delete:
            self.delete()
        # exit the gui if the quit key is pressed
        elif key == keys.quit:
            # remember to save before quitting
            self.save()
            break
So we have the basis to call methods to switch from one image to another, from one class to another, activate / deactivate the eraser, zoom etc..
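To make the zoom toggle in run() more concrete, here is a minimal sketch of the cycling expression taken on its own. The helper name next_zoom and the n_levels parameter are hypothetical; the script applies the expression directly to self.zoom_factor.

```python
def next_zoom(zoom_factor, n_levels=5):
    """Return the factor that follows `zoom_factor`, cycling through 1..n_levels."""
    return 1 + zoom_factor % n_levels


zoom = 3  # initial value set in __init__
sequence = []
for _ in range(6):
    zoom = next_zoom(zoom)
    sequence.append(zoom)

print(sequence)  # [4, 5, 1, 2, 3, 4]
```

Each press of the zoom key moves to the next factor, and the factor 1 is the state in which get_frame() does not draw the zoom window.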
All we need to do is to implement the mouse_event() function to have all the requested interactions. Then we will be able to implement the actions to be performed depending on the interaction.
def mouse_event(self, event, x, y, flags, param):
    """
    Let the user manually interact with the images and their target
    :param event: event raised from the mouse
    :param x: x coordinate of the mouse at the event time
    :param y: y coordinate of the mouse at the event time
    :param flags: flags of the event
    :param param: param of the event
    """
    # mouse move
    if event == cv2.EVENT_MOUSEMOVE:
        # update mouse position
        self.mouse_pos = [x, y]
        # check if the left button is pressed to have an action
        if self.l_pressed:
            # erase the activated pixels in the channel
            if self.brush:
                self.set_brush_eraser()
            # else draw the pixel in `pixel by pixel` mode,
            # only while the mouse stays inside the image
            elif (self.shapes[self.channel] == 1
                    and 0 <= x < self.X.shape[1]
                    and 0 <= y < self.X.shape[0]):
                self.set_pixel()
    # left button released
    if event == cv2.EVENT_LBUTTONUP:
        self.l_pressed = False
    # left button pressed
    if event == cv2.EVENT_LBUTTONDOWN:
        self.l_pressed = True
        # check if the brush eraser is activated
        if self.brush:
            self.set_brush_eraser()
        # check if the click is inside the image
        elif 0 <= x < self.X.shape[1] and 0 <= y < self.X.shape[0]:
            # check if it is a pixel by pixel draw shape
            if self.shapes[self.channel] == 1:
                self.set_pixel()
            else:
                # append the current point to the history
                self.ref_p.append([x, y])
                # check if the channel expects a fixed number of points
                # (0 means an unlimited contour, validated with the keyboard)
                if self.shapes[self.channel]:
                    # check if the channel is a circle draw shape
                    # and if the two points are given
                    if self.shapes[self.channel] == len(self.ref_p) == 2:
                        self.set_circle()
                    # else wait to reach the number of reference points
                    # given by the shape of the channel
                    elif self.shapes[self.channel] == len(self.ref_p):
                        self.set_poly()
This method mainly lets us draw our targets according to the type of shape that defines the class. It checks whether the eraser is activated and whether the click falls inside the image. Speaking of the eraser, it would be nice to see its location when it is activated, as well as the zoom window and the reference points. We just need to add the following code to the get_frame() function, before concatenating the images.
# get the associated color to the current channel
color = self.hex2tuple(self.colors[self.channel], normalize=True)
# draw each reference point
for pt in self.ref_p:
    cv2.circle(x_img, tuple(pt), 2, color, cv2.FILLED)
# draw the circle preview if it is the channel mode shape
# and there is one ref point
if self.shapes[self.channel] == 2 and len(self.ref_p):
    self.draw_circle_visualization(x_img, y_img, color)
# draw the polygon preview if it is the channel mode shape
elif self.shapes[self.channel] != 1 and len(self.ref_p):
    self.draw_poly_visualization(x_img, y_img, color)
# draw the brush if the eraser is activated
if self.brush:
    self.draw_brush(x_img, y_img, color)
# if the zoom is activated
if self.zoom_factor > 1:
    self.draw_zoom_window(x_img, y_img)
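The way a left click inside the image is handled in mouse_event() depends only on the shape code of the current channel and on the number of reference points stored so far. The sketch below restates that decision as a standalone function. The name click_action and the returned labels are hypothetical, and the meaning of the shape codes (0 for an unlimited contour, 1 for pixel by pixel, 2 for a circle, larger values for a polygon with that many points) is inferred from the code of this part.

```python
def click_action(shape, n_points):
    """Return the action triggered by a left click inside the image.

    `shape` is the shape code of the current channel and `n_points`
    the number of reference points once the click has been stored.
    """
    if shape == 1:
        return "pixel"      # pixel-by-pixel drawing, no reference point kept
    if shape == 2 and n_points == 2:
        return "circle"     # the two points give the diameter
    if shape and shape == n_points:
        return "polygon"    # the fixed number of points is reached
    return "wait"           # more points, or the validate key, are needed


for shape, n_points in [(1, 0), (2, 1), (2, 2), (4, 4), (0, 7)]:
    print(shape, n_points, click_action(shape, n_points))
```

With a shape code of 0 the function always answers "wait", which matches the run() loop: an unlimited contour is only closed when the validate key is pressed.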
What about the other stuff ?
If you have copied and pasted the code above and tried to start the program, you will have noticed that some functions are not yet defined. They are coming! With the placeholder definitions below, you can already place reference points wherever you want and remove them with the undo key (in my case, the 'u' key).
The rest of the methods will perform all the actions indicated by the user. Luckily, they are very short and quick to implement.
def update_image(self, *args):
    """Update the current image"""
    pass

def update_channel(self, *args):
    """Update the current channel"""
    pass

def set_pixel(self, *args):
    """Set value to 1 at the mouse position
    in the case of `1` draw shape"""
    pass

def set_poly(self, *args):
    """Draw a polygon from the reference points that define its contour"""
    pass

def set_circle(self, *args):
    """Draw a circle with the two reference points in memory
    considering they give us the diameter of the circle"""
    pass

def set_brush_eraser(self, *args):
    """Erase the target image
    according to the brush size and the mouse position"""
    pass

def draw_poly_visualization(self, *args):
    """Draw the lines in the case of polygons to visualize them"""
    pass

def draw_circle_visualization(self, *args):
    """Draw the circle in the case of `2` draw type to visualize
    the circle before setting the 2nd and last reference point"""
    pass

def draw_brush(self, *args):
    """Draw the brush if activated on the images"""
    pass

def draw_zoom_window(self, *args):
    """Draw the zoom window if activated on the input image
    and draw its position on the target image"""
    pass
Let's start by being able to move from one image or class to another with the keyboard.
def update_image(self, increment=0):
    # save the previous one and its target
    self.save()
    # update the index of the image
    self.n = (self.n + increment) % len(self.X_paths)
    # load the new current image
    self.load()
    # remove the potential reference points
    self.ref_p = []

def update_channel(self, increment=0):
    # update the index of the current channel
    self.channel = (self.channel + increment) % self.n_class
    # remove the potential reference points
    self.ref_p = []
We can now use our keyboard to move through our image folder and browse the different classes.
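Both update methods rely on the modulo operator to wrap around at the ends of the list. The short sketch below isolates that step; the helper name step_index and the value of n_images are only for illustration.

```python
def step_index(index, increment, length):
    """Move `index` by `increment` and wrap around a sequence of `length` items."""
    return (index + increment) % length


n_images = 5
print(step_index(4, 1, n_images))   # 0: the last image is followed by the first
print(step_index(0, -1, n_images))  # 4: going back from the first image gives the last
```

In Python the result of % takes the sign of the divisor, so a step of -1 from index 0 needs no special case.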
Let's draw !
The config.csv file that we saw in the first part specifies different types of shapes, so we need to define different types of drawing: pixel by pixel, with circles or with polygons.
def set_pixel(self):
    # draw the pixel at the mouse position
    x, y = self.mouse_pos
    self.Y[y, x, self.channel] = 1

def set_poly(self):
    if len(self.ref_p) > 1:
        # create a new mask
        mask = np.zeros((*self.Y.shape[:2], 3))
        # get the contour points (cv2.fillPoly expects int32 coordinates)
        pts = np.array([[pt] for pt in self.ref_p], dtype=np.int32)
        # fill the contour in the mask, then keep a single channel
        cv2.fillPoly(mask, pts=[pts], color=(1, 1, 1))
        mask = mask[:, :, 0]
        # merge the mask with the current target (logical OR)
        mask = [[1 * (value or mask[i, j])
                 for j, value in enumerate(line)]
                for i, line in enumerate(self.Y[:, :, self.channel])]
        # update the segmented target
        self.Y[:, :, self.channel] = np.array(mask)
        # remove the points
        self.ref_p = []

def set_circle(self):
    # get the two reference points in memory
    p1, p2 = self.ref_p
    # get the center of the circle and its radius
    # from the two selected points on its border
    cx = (p1[0] + p2[0]) // 2
    cy = (p1[1] + p2[1]) // 2
    r = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5 / 2
    # get the mask of the circle, merged with the current target
    mask = [[1 * (value or ((j - cx) ** 2 + (i - cy) ** 2 < r ** 2))
             for j, value in enumerate(line)]
            for i, line in enumerate(self.Y[:, :, self.channel])]
    # draw the circle
    self.Y[:, :, self.channel] = np.array(mask)
    # remove the points
    self.ref_p = []

def draw_poly_visualization(self, x_img, y_img, color):
    # draw a line between the first reference point and the mouse pos
    cv2.line(x_img, tuple(self.mouse_pos), tuple(self.ref_p[0]), color, 1)
    cv2.line(y_img, tuple(self.mouse_pos), tuple(self.ref_p[0]), color, 1)
    # draw a line between the reference points
    for i in range(len(self.ref_p) - 1):
        cv2.line(x_img, tuple(self.ref_p[i]), tuple(self.ref_p[i + 1]), color, 1)
        cv2.line(y_img, tuple(self.ref_p[i]), tuple(self.ref_p[i + 1]), color, 1)
    # draw a line between the last reference point and the mouse pos
    cv2.line(x_img, tuple(self.mouse_pos), tuple(self.ref_p[-1]), color, 1)
    cv2.line(y_img, tuple(self.mouse_pos), tuple(self.ref_p[-1]), color, 1)

def draw_circle_visualization(self, x_img, y_img, color):
    p1, p2 = self.ref_p[0], self.mouse_pos
    # get the center of the circle and its radius
    # from the first reference point and the mouse position
    cx = (p1[0] + p2[0]) // 2
    cy = (p1[1] + p2[1]) // 2
    r = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5 / 2
    # draw the circle to visualize the rendered segmented circle
    cv2.circle(x_img, (cx, cy), int(r), color, 1)
    cv2.circle(y_img, (cx, cy), int(r), color, 1)
All of these functions modify the target, with the exception of the last two. The draw_circle_visualization() and draw_poly_visualization() functions do not alter it. They draw on the colored images in get_frame(), which gives us a preview of the shape from the reference points already placed and the current mouse position.
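The list comprehensions in set_poly() and set_circle() visit the target one pixel at a time in Python, which can become slow on large images. As a possible alternative, the sketch below builds the circle mask with NumPy broadcasting and merges it with a logical OR. The function name circle_mask and the 7x7 example target are hypothetical, and the sketch assumes the target only contains the values 0 and 1.

```python
import numpy as np


def circle_mask(shape, p1, p2):
    """Boolean mask of the disc whose diameter is the segment p1-p2.

    `shape` is (height, width); the points are (x, y) pairs as in ref_p.
    """
    cx = (p1[0] + p2[0]) // 2
    cy = (p1[1] + p2[1]) // 2
    # squared radius: half the distance between the points, squared
    r2 = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) / 4
    # open grids of row and column indices, broadcast against each other
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (cols - cx) ** 2 + (rows - cy) ** 2 < r2


# hypothetical 7x7 single-channel target with one pixel already segmented
target = np.zeros((7, 7), dtype=np.uint8)
target[0, 0] = 1
mask = circle_mask(target.shape, (1, 3), (5, 3))
# keep what was already segmented and add the disc
target = np.logical_or(target, mask).astype(np.uint8)
print(target)
```

The same logical OR could be applied to the mask filled by cv2.fillPoly() in set_poly(), under the same assumption about the values stored in the target.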
We are now ready to segment our image.