Representative and role model of the web page content

Free access

Automatic analysis of web page content is a topical problem today. It makes several practical tasks tractable, including detecting the role structure of a page's content: the main article, visitor comments, advertisements, and other functional blocks. Solving this problem is also an important step towards deeper automatic analysis of website semantics. We apply an approach that determines the role of an HTML code fragment from the way it is rendered on the screen, which corresponds to how a human perceives the page. The developed model identifies the HTML code fragments that act as the main header and the main article of a page. The main article may contain different elements, such as text, tables, and images. Other elements (advertisements, etc.) are often removed from the main article, and various page layouts and ways of placing content elements on the screen may be applied...


Web, modeling, artificial intelligence

Short URL: https://sciup.org/14116887

IDR: 14116887

Research article