Research and development of image masking strategy to improve masked autoencoder efficiency

Free access

The paper addresses the problem of improving the efficiency of masked autoencoders by developing an image masking strategy that takes object localization in the image into account and hides as little semantically important information as possible. It reviews existing image masking methods, covering both strategies that account for image structure and strategies that do not. A masking strategy is proposed that is based on an object detection algorithm analyzing the elementary characteristics of image fragments. The study uses a masked autoencoder with a ViT encoder as its example. The efficiency of training the encoder with the proposed strategy is compared with that of training it with random masking.
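The contrast between random masking and object-aware masking can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say which elementary characteristics are analyzed, so per-patch pixel variance is assumed here as a stand-in, and the function names `patch_scores`, `random_mask`, and `object_aware_mask` are hypothetical.

```python
import numpy as np


def patch_scores(image, patch):
    """Per-patch pixel variance (ASSUMED stand-in for 'elementary characteristics')."""
    h, w = image.shape[:2]
    gh, gw = h // patch, w // patch
    # Crop to a whole number of patches, then group pixels by patch.
    img = image[:gh * patch, :gw * patch].astype(np.float64)
    img = img.reshape(gh, patch, gw, patch, -1)
    return img.var(axis=(1, 3, 4)).reshape(-1)  # one score per patch, row-major


def random_mask(num_patches, mask_ratio, rng):
    """Uniform random masking: True marks a hidden patch."""
    n_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=n_masked, replace=False)] = True
    return mask


def object_aware_mask(scores, mask_ratio):
    """Hide the lowest-scoring patches first, so likely object patches stay visible."""
    n_masked = int(round(scores.size * mask_ratio))
    mask = np.zeros(scores.size, dtype=bool)
    mask[np.argsort(scores, kind="stable")[:n_masked]] = True
    return mask


# Toy example: a flat 8x8 image with one textured 4x4 region (patch index 1).
rng = np.random.default_rng(0)
image = np.zeros((8, 8))
image[0:4, 4:8] = rng.random((4, 4))

scores = patch_scores(image, patch=4)
mask_random = random_mask(scores.size, mask_ratio=0.75, rng=rng)
mask_aware = object_aware_mask(scores, mask_ratio=0.75)
```

In this toy case both masks hide three of the four patches, but the object-aware mask always leaves the textured patch visible, while the random mask hides it in most draws. A real detector-based score would replace the variance heuristic.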

Neural networks, deep learning, self-supervised learning, masked image modeling, ViT model, masked autoencoder

Short URL: https://sciup.org/14133456

IDR: 14133456

Research article