Methods of automated image markup and keyword finding

Free access

This article proposes a method for automatically annotating images and finding keywords for them. In practice, it is often necessary to understand what an image shows and express it as text, for example for classification, clustering, or generating a text description of a photo. The main difficulty is that modern neural networks are usually trained to recognize a fixed number of classes (typically 1000), which is often too few to produce a good text description of an image, since the world is far more varied. The article presents a technique for finding the keywords that best match the content of an image: the proximity between the image vector and each word vector is computed, and the words whose vectors are closest to the image vector are used as keywords. The article also compares this approach with conventional 1000-class image classification on the ImageNet dataset.
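
The keyword selection step described in the abstract can be illustrated with a minimal sketch. It assumes that the image and the candidate words have already been embedded into a shared vector space (for instance by a model such as CLIP, which is listed among the keywords) and that proximity is measured by cosine similarity; the abstract does not name the measure, so this is an assumption. The function names `cosine_similarity` and `top_keywords` and the three-dimensional toy vectors are hypothetical and serve only as an illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_keywords(image_vec, word_vecs, k=3):
    """Return the k (word, similarity) pairs closest to the image vector."""
    scored = [
        (word, cosine_similarity(image_vec, vec))
        for word, vec in word_vecs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, invented for illustration only. Real embeddings would
# come from an image encoder and a text encoder with a shared space.
image_vec = [0.9, 0.1, 0.3]
word_vecs = {
    "dog": [0.8, 0.2, 0.4],
    "grass": [0.5, 0.1, 0.9],
    "car": [0.0, 1.0, 0.1],
    "puppy": [0.9, 0.0, 0.2],
}

keywords = top_keywords(image_vec, word_vecs, k=2)
print(keywords)
```

Because the candidate vocabulary is just a dictionary of word vectors, it can be much larger than a fixed set of 1000 classes, which is the limitation the abstract points out.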

OpenAI, CLIP, PyTorch, ImageNet, Python

Short URL: https://sciup.org/170196729

IDR: 170196729   |   DOI: 10.24412/2500-1000-2022-11-2-115-120

Research article