A hands-on example of labeling video data using the Watershed segmentation algorithm – Labeling Video Data-1

In this example code, we will implement the following steps:

  1. Import the required Python libraries for segmentation.
  2. Read the video data and display the original frame.
  3. Extract frames from the video.
  4. Apply the Watershed algorithm to each frame.
  5. Save the segmented frame to the output directory and print the segmented frame.
Here's the corresponding code.

Step 1: Import the Required Python Libraries

First, let's import the required libraries:

import cv2
import numpy as np
import os
from matplotlib import pyplot as plt

Step 2: Read the Video Data

Let’s read the video data from the input directory, extract the frames for the video, and then print the original video frame:

video_path = "/datasets/Ch9/Kinetics/dance/dance3.mp4"
# Check if the file exists
if os.path.exists(video_path):
    cap = cv2.VideoCapture(video_path)
    # Continue with your video processing logic here
else:
    print(f"The file '{video_path}' does not exist.")

In this step, we specify the path to the video file and create an instance of the VideoCapture class from OpenCV to read the video data.

Step 3: Extract Frames from the Video
frames = []
while True:
    ret, frame = cap.read()
    if not ret:
        break
    frames.append(frame)
cap.release()

This step involves iterating through the video frames using a loop. We use the cap.read() method to read each frame. The loop continues until there are no more frames left in the video. Each frame is then stored in the frames list for further processing:

# Display the first original frame as a sample
plt.imshow(cv2.cvtColor(frames[0], cv2.COLOR_BGR2RGB))
plt.title('Original Frame')
plt.axis('off')
plt.show()
Step 4: Apply Watershed Algorithm to Each Frame

This step involves applying the Watershed algorithm to each frame of the video. Here’s a breakdown of the sub-steps:

  1. Convert the frame to grayscale using cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY). This simplifies the subsequent processing steps.
  2. Apply thresholding to obtain a binary image. This is done using cv2.threshold() with the cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU flag. The Otsu thresholding method automatically determines the optimal threshold value.
  3. Perform morphological operations to remove noise and fill holes in the binary image. Here, we use cv2.morphologyEx() with the cv2.MORPH_OPEN operation and a 3×3 kernel. This helps to clean up the image.
  4. Apply the distance transform to identify markers. This is done using cv2.distanceTransform(). The distance transform calculates the distance of each pixel to the nearest zero-valued pixel in the binary image.
Let's take a look at the code for the aforementioned sub-steps:

labeled_frames = []
for frame in frames:
    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

The input frame is converted to grayscale (gray), which simplifies the subsequent image-processing steps:

    # Apply thresholding to obtain a binary image
    _, thresh = cv2.threshold(gray, 0, 255,
        cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

A binary image (thresh) is created using Otsu’s method, which automatically determines an optimal threshold for image segmentation. The cv2.THRESH_BINARY_INV flag inverts the binary image, making foreground pixels white:

    # Perform morphological operations to remove noise and fill holes
    kernel = np.ones((3, 3), np.uint8)
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
    sure_bg = cv2.dilate(opening, kernel, iterations=3)

Morphological opening is applied to the binary image (thresh). Opening is an erosion followed by a dilation; it removes noise and small objects while preserving larger structures. kernel is a 3×3 matrix of ones, and the opening operation is iterated twice (iterations=2). This helps smooth out the binary image and remove small specks of noise.
The result of the opening operation (opening) is further dilated (cv2.dilate) three times using the same kernel. This dilation increases the size of the white regions and helps to create a clear distinction between the background and the foreground. The resulting image is stored as sure_bg.
