Binary Segmentation Dataset Distances for Transfer Learning

Authors: Victor Sineglazov, Kirill Riazanovskiy, Olexander Klanovets

Journal: International Journal of Image, Graphics and Signal Processing @ijigsp

Published in: Issue 3, Vol. 17, 2025.

Free access

This work develops a novel transfer learning approach for binary semantic segmentation problems, which often must be solved on small samples in the medical domain (segmentation of lung nodules, tumors, polyps, etc.) and in other domains. The goal is to select the most suitable source dataset from a different subject area whose feature space and distribution are similar to those of the target data. Examples show that a serious disadvantage of transfer learning is the difficulty of choosing the initial training sample on which to pre-train the neural network. In this paper, we propose metrics for calculating the distance between binary segmentation datasets, which allow us to select the optimal initial training set for transfer learning. These metrics estimate geometric distances between datasets using optimal transport, the Wasserstein distance for Gaussian mixture models, clustering, and hybrids of these techniques. Experiments on the Medical Segmentation Decathlon datasets, LIDC, and a private dataset of lung tuberculomas are presented, showing a statistically significant correlation between these metrics and the relative increase in segmentation accuracy obtained through transfer learning.
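To make the idea of a distance between datasets concrete, here is a minimal sketch of one ingredient named in the abstract, the Wasserstein distance between Gaussians. It is not the authors' metric: it assumes each dataset has already been reduced to a list of feature vectors, fits a single Gaussian with diagonal covariance to each (rather than a full Gaussian mixture), and uses the closed form of the 2-Wasserstein distance that holds in that diagonal case. The function names and the toy feature values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_diagonal_gaussian(features):
    """Per-dimension means and variances of equal-length feature vectors."""
    columns = list(zip(*features))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances


def w2_diagonal_gaussians(mean_a, var_a, mean_b, var_b):
    """2-Wasserstein distance between N(mean_a, diag(var_a)) and N(mean_b, diag(var_b)).

    For diagonal covariances the closed form reduces to
    sqrt(||mean_a - mean_b||^2 + ||sqrt(var_a) - sqrt(var_b)||^2).
    """
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    cov_term = sum(
        (math.sqrt(va) - math.sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b)
    )
    return math.sqrt(mean_term + cov_term)


def dataset_distance(features_a, features_b):
    """Assumed stand-in for a dataset distance: W2 between fitted Gaussians."""
    return w2_diagonal_gaussians(
        *fit_diagonal_gaussian(features_a), *fit_diagonal_gaussian(features_b)
    )


# Toy 2-D feature vectors (hypothetical values, not taken from the paper).
target = [[0, 0], [2, 0], [0, 2], [2, 2]]
near = [[x + 3, y + 4] for x, y in target]   # same spread, shifted by (3, 4)
far = [[x + 6, y + 8] for x, y in target]    # same spread, shifted by (6, 8)

# Rank candidate source datasets by their distance to the target.
candidates = {"near": near, "far": far}
ranking = sorted(candidates, key=lambda name: dataset_distance(target, candidates[name]))
print(ranking)
```

Under this sketch, the candidate with the smallest distance would be the one tried first for pre-training. Whether such a ranking tracks the actual gain from transfer learning is exactly what the paper's experiments evaluate for its own metrics.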


Binary Semantic Segmentation, Transfer Learning, Dataset Distance, Optimal Transport, Gaussian Mixture Models, Clustering

Short URL: https://sciup.org/15019729

IDR: 15019729   |   DOI: 10.5815/ijigsp.2025.03.07

Research article