Industrial-iSeg Dataset

Our Goal

As high-precision computer vision develops, thousands of open-source datasets have appeared online. However, many lack annotations or are labeled only coarsely with bounding boxes, which limits their usefulness for developing modern industrial visual recognition models. The community cannot advance without high-quality images accompanied by precise annotations.

Recognizing these issues, Industrial-iSeg aims to collect and relabel complex images from challenging industrial tasks, to support R&D on precise visual recognition models and their applications across sectors.

How do we select images?

Industrial-iSeg is a non-profit dataset founded by two high school students. Because our funding is very limited, we cannot afford to relabel a massive number of images. Instead, we use advanced image filtering techniques to select images that are highly complex or valuable to industry.

We downloaded 45 popular open-source industrial datasets, both with and without annotations, and used an unsupervised ConvNet-based method called the “Unsupervised Activation Energy” (UAE) Metric to evaluate the complexity of each image.

Filtering out datasets with lower average complexity indexes left us with 20 high-merit datasets.

Within these 20 datasets, we ranked images in descending order of complexity and selected the top 160 to 320 images for each application scenario, totaling 6,400 images.
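
The two filtering steps above can be sketched as follows. This is only an illustration under stated assumptions: the complexity scores are taken as already computed, each dataset is treated as one application scenario, and "filtering out datasets with lower average complexity" is read as keeping the top-ranked datasets by mean score. The function name `select_images` and its parameters are hypothetical and not part of the dataset tooling.

```python
from statistics import mean


def select_images(scores, keep_datasets=20, per_dataset=320):
    """Keep the most complex datasets, then the most complex images in each.

    scores: {dataset_name: {image_name: complexity_score}} (assumed layout).
    Returns {dataset_name: [image_name, ...]} sorted by descending score.
    """
    # Step 1: rank datasets by their average complexity, highest first.
    ranked = sorted(scores, key=lambda name: mean(scores[name].values()), reverse=True)
    kept = ranked[:keep_datasets]

    # Step 2: inside each kept dataset, rank images and keep the top ones.
    selected = {}
    for name in kept:
        images = sorted(scores[name], key=scores[name].get, reverse=True)
        selected[name] = images[:per_dataset]
    return selected
```

A per-scenario cap other than 320 (for example 160) can be passed through `per_dataset`.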

The dataset was then randomly split into a combined training and test set of 5,120 images and a validation set of 1,280 images.
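
A random split of this kind can be reproduced with a seeded shuffle. The sketch below is one possible approach, not the procedure actually used for Industrial-iSeg; the function name `split_dataset`, the seed value, and the file-name list are assumptions made for illustration.

```python
import random


def split_dataset(image_names, val_count=1280, seed=0):
    """Randomly split image names into (train_test, validation) lists."""
    names = sorted(image_names)  # fixed starting order so the seed is reproducible
    if val_count > len(names):
        raise ValueError("val_count exceeds the number of images")
    rng = random.Random(seed)    # local generator, leaves global random state alone
    rng.shuffle(names)
    # The first val_count shuffled names form the validation set, the rest train/test.
    return names[val_count:], names[:val_count]
```

With 6,400 input names and the default `val_count`, this yields 5,120 and 1,280 names respectively.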

For datasets that already include binary black-and-white “Ground Truth” masks for each defect image, we use the following code to convert them into the .txt label format read by YOLO-style models. The script writes one normalized bounding box per connected defect region. Hope this helps you!

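import os

import cv2


def convert_mask_to_yolo(mask_path, output_txt_path, object_class=0):
    """Write one normalized YOLO bounding box per white region of a binary mask."""
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    if mask is None:
        # cv2.imread returns None instead of raising when a file cannot be read
        raise FileNotFoundError(f"Could not read mask image: {mask_path}")
    image_height, image_width = mask.shape

    # Binarize the mask: pixels brighter than 127 count as defect (white)
    _, binary_mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    # findContours returns (contours, hierarchy) in OpenCV 4.x and
    # (image, contours, hierarchy) in 3.x; [-2] picks the contours in both
    contours = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

    with open(output_txt_path, 'w') as f:
        for cnt in contours:
            # Pixel-space bounding box around the outer contour
            x, y, w, h = cv2.boundingRect(cnt)
            # YOLO expects box center and size, normalized by the image dimensions
            x_center = (x + w / 2) / image_width
            y_center = (y + h / 2) / image_height
            width = w / image_width
            height = h / image_height
            f.write(f"{object_class} {x_center} {y_center} {width} {height}\n")


mask_directory = 'path/to/mask/images'
output_directory = 'path/to/output/txts'
os.makedirs(output_directory, exist_ok=True)  # create the output folder if missing

for filename in os.listdir(mask_directory):
    if filename.lower().endswith('.png'):
        mask_path = os.path.join(mask_directory, filename)
        output_txt_path = os.path.join(output_directory, os.path.splitext(filename)[0] + '.txt')
        convert_mask_to_yolo(mask_path, output_txt_path)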
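
The script above writes bounding boxes. For segmentation training, YOLO-style label files commonly store the outline itself, one line per object in the form `class x1 y1 x2 y2 ...` with coordinates normalized by image width and height. The helper below is a minimal sketch of that variant; the name `polygon_to_yolo_seg` is hypothetical, and it is worth checking the exact format expected by the model you train before relying on it.

```python
def polygon_to_yolo_seg(points, image_width, image_height, object_class=0):
    """Format one polygon as a YOLO-style segmentation label line.

    points: iterable of (x, y) pixel coordinates along the object outline.
    """
    points = list(points)
    if len(points) < 3:
        raise ValueError("a polygon needs at least three points")
    coords = []
    for x, y in points:
        # Normalize each vertex so it is independent of the image resolution.
        coords.append(f"{x / image_width:.6f}")
        coords.append(f"{y / image_height:.6f}")
    return f"{object_class} " + " ".join(coords)
```

With OpenCV, a contour returned by `cv2.findContours` can be passed as `cnt.reshape(-1, 2)`, since each contour is an array of shape (N, 1, 2).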

Project inspired by