Our Goal
As the demand for precise computer vision grows, thousands of open-source datasets have emerged on the Internet. However, many lack annotations or are only coarsely labeled with bounding boxes, which limits their usefulness for developing modern industrial visual recognition models. The community cannot advance without high-quality images accompanied by precise annotations.
Recognizing these issues, Industrial-iSeg aims to collect and relabel complex images from challenging industrial tasks to support R&D on precise visual recognition models and their applications across various sectors.
How do we select images?
Industrial-iSeg is a non-profit dataset founded by two high school students. Due to very limited funding, we cannot afford to relabel a massive number of images. Hence, we use advanced image filtering techniques to select images that are highly complex or especially valuable to industry.
We downloaded 45 popular industrial open-source datasets, both with and without annotations, and used an unsupervised ConvNet-based metric called “Unsupervised Activation Energy” (UAE) to evaluate the complexity of each image.
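To give a rough sense of what an activation-energy-style complexity score looks like, here is a minimal sketch: each image is passed through a pretrained ConvNet backbone and scored by the mean absolute value of its final feature maps. The ResNet-18 backbone and the uae_score function name are illustrative assumptions, not the exact UAE implementation.

# Illustrative sketch only: score an image by the mean absolute activation
# ("energy") of a pretrained ConvNet's last feature maps. The actual UAE
# metric may differ in backbone, layer choice, and normalization.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# Drop the average-pooling and fully connected layers to keep spatial feature maps.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2])
feature_extractor.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def uae_score(image_path):
    """Return a scalar 'activation energy' for one image (higher = more complex)."""
    image = Image.open(image_path).convert('RGB')
    x = preprocess(image).unsqueeze(0)        # shape (1, 3, 224, 224)
    with torch.no_grad():
        feats = feature_extractor(x)          # shape (1, 512, 7, 7) for ResNet-18
    return feats.abs().mean().item()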
Filtering out datasets with lower average complexity scores left us with 20 high-value datasets.
Within these 20 datasets, we ranked images in descending order of complexity and selected the top 160–320 images for each application scenario, totaling 6,400 images.
The 6,400 images were then randomly split into a combined training and test set of 5,120 images and a validation set of 1,280 images.
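For concreteness, here is a minimal sketch of the ranking and splitting steps described above. The scores_by_scenario data structure and the select_and_split function are illustrative names only, not part of any released tooling.

# Illustrative sketch of the per-scenario ranking and the random split.
import random

def select_and_split(scores_by_scenario, top_k=320, val_size=1280, seed=0):
    """scores_by_scenario: {scenario_name: [(image_path, complexity_score), ...]}"""
    selected = []
    for scenario, scored_images in scores_by_scenario.items():
        # Rank images in descending order of complexity and keep the top-k.
        ranked = sorted(scored_images, key=lambda item: item[1], reverse=True)
        selected.extend(path for path, _ in ranked[:top_k])

    # Random split: val_size images for validation, the rest for training/testing.
    rng = random.Random(seed)
    rng.shuffle(selected)
    val_set = selected[:val_size]
    train_test_set = selected[val_size:]
    return train_test_set, val_set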
For datasets that already include “Ground Truth” binary black-and-white masks for each defect image, we use the following code to convert them into the YOLO .txt format (one bounding box per connected defect region), which most object detection models such as YOLO can read directly. Hope this helps you!
import cv2
import os

def convert_mask_to_yolo(mask_path, output_txt_path, object_class=0):
    """Convert a binary ground-truth mask into a YOLO-format .txt label file."""
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    if mask is None:
        print(f"Warning: could not read {mask_path}, skipping.")
        return
    image_height, image_width = mask.shape
    # Binarize the mask so that every defect pixel becomes foreground (255).
    _, binary_mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    # Each external contour corresponds to one connected defect region.
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    with open(output_txt_path, 'w') as f:
        for cnt in contours:
            x, y, w, h = cv2.boundingRect(cnt)
            # Normalize to the YOLO format: class x_center y_center width height.
            x_center = (x + w / 2) / image_width
            y_center = (y + h / 2) / image_height
            width = w / image_width
            height = h / image_height
            f.write(f"{object_class} {x_center} {y_center} {width} {height}\n")

mask_directory = 'path/to/mask/images'
output_directory = 'path/to/output/txts'
os.makedirs(output_directory, exist_ok=True)

for filename in os.listdir(mask_directory):
    if filename.endswith('.png'):
        mask_path = os.path.join(mask_directory, filename)
        output_txt_path = os.path.join(output_directory, os.path.splitext(filename)[0] + '.txt')
        convert_mask_to_yolo(mask_path, output_txt_path)
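If you need polygon labels for segmentation models rather than bounding boxes, the same contours can also be written in the Ultralytics YOLO segmentation .txt format (a class index followed by normalized x y pairs). The function below is a sketch of such a variant, not part of our released scripts.

# Optional variant: write normalized polygon points (YOLO segmentation format)
# instead of bounding boxes. Uses the same binary-mask input as the converter above.
import cv2

def convert_mask_to_yolo_seg(mask_path, output_txt_path, object_class=0):
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    image_height, image_width = mask.shape
    _, binary_mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    with open(output_txt_path, 'w') as f:
        for cnt in contours:
            if len(cnt) < 3:  # a polygon needs at least 3 points
                continue
            # Each contour point becomes a normalized "x y" pair on one line.
            points = []
            for point in cnt.reshape(-1, 2):
                points.append(f"{point[0] / image_width} {point[1] / image_height}")
            f.write(f"{object_class} " + " ".join(points) + "\n")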
Project inspired by
ImageNet (Deng et al., 2009)
MS-COCO (Lin et al., 2014)
Raw data sourced from
Tianchi Aluminum Alloy Defect Dataset
Global Wheat Detection Dataset
Surface Defect Saliency of Magnetic Tile
We deeply appreciate their data! Industrial-iSeg wouldn’t exist without their contributions.
