System for processing highly specialized information in distributed networks

Authors: Zelenkov P.V., Brezitskaya V.V., Bachurina E.P., Khokhlov A.P., Karaseva M.V.

Journal: Siberian Aerospace Journal @vestnik-sibsau

Section: Mathematics, mechanics, computer science

Article in issue: 5 (26), 2009.

Free access

A new structure of a system for organizing and managing highly specialized information in corporate systems is proposed. The principal difference of this structure is that it is capable of processing multilingual information within a single user query.

Distributed networks, multilingual information, information system, agent

Short URL: https://sciup.org/148176094

IDR: 148176094


At present, network technologies are developing actively, so the problem of gathering, processing, and managing information is highly topical for them [1; 2]. Most users of the Russian-language segment of the Internet rely on existing general-purpose search services. As of November 2008, the most popular search services were Yandex, Google, Mail, and Rambler, which together accounted for 95 % of user queries. The distribution of user queries is shown in figure 1.

However, these services give good results for commonly searched subjects but run into difficulties when highly specialized information is needed. Moreover, these systems handle the multilingual representation of information on the Internet poorly [1]. General-purpose search services support only the language in which the search query is written, whereas a search for highly specialized, personalized information can be organized as a multilingual search procedure from the outset [1; 2].

To overcome these difficulties, it is proposed to apply existing technologies and approaches to the processing of multilingual, subject-focused information.

In particular, it is proposed to use the well-proven technology for building information management systems based on the multi-agent approach.

Fig. 1. Distribution of search queries in the Russian-language segment of the Internet

Vestnik. Scientific Journal of Siberian State Aerospace University named after academician M. F. Reshetnev

The structure of interaction between the agents in the proposed multi-agent corporate system is presented in figure 2.

Fig. 2. Generalized diagram of the proposed multi-agent system

As the diagram shows, four logically connected program modules (agents) must be built. The function and structure of each module are described below.

The Interface Agent organizes the user's work with the information processing system and, as the figure clearly shows, is connected to two agents (the Information Search Agent and the Information Processing Agent). This agent is simple in structure and implementation.

The Information Search Agent requires a more detailed description because it is proposed to implement it as a multilingual metasearch. The structure of this agent is shown in figure 3.

The structure shows that this agent is the initial stage of processing. Its basic task is to process the user's search string, which it receives from the Interface Agent. Once the string has been processed, a multilingual metasearch procedure is started both in the corporate network and on the Internet. The retrieved documents are then checked for availability, and duplicates are removed. Finally, the whole resulting sample is passed to the Information Processing Agent. The structure of that agent is shown in figure 4.
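The steps listed above (query processing, multilingual metasearch, availability check, duplicate removal) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Document`, `normalize_query`, and `metasearch` are hypothetical, the query translations are assumed to be supplied by the caller, and duplicates are assumed to be detectable by URL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Document:
    url: str
    language: str
    text: str


# Assumption: a "source" is any callable that maps a query string to documents
# (a corporate index, a wrapper around an Internet search service, and so on).
Source = Callable[[str], Iterable[Document]]


def normalize_query(raw: str) -> str:
    """Process the user's search string: lowercase it and collapse whitespace."""
    return " ".join(raw.lower().split())


def metasearch(raw_query: str,
               translations: Dict[str, str],
               sources: List[Source],
               is_available: Callable[[Document], bool]) -> List[Document]:
    """Query every source with every language variant of the query."""
    # Assumption: translations maps a language code to the query in that language.
    variants = [normalize_query(raw_query)]
    variants += [normalize_query(q) for q in translations.values()]

    seen_urls = set()
    sample: List[Document] = []
    for variant in variants:
        for source in sources:
            for doc in source(variant):
                if doc.url in seen_urls:
                    continue  # duplicate removal
                if not is_available(doc):
                    continue  # availability check failed
                seen_urls.add(doc.url)
                sample.append(doc)
    return sample
```

The returned list corresponds to the sample that is handed over to the Information Processing Agent; a real system would likely need a stronger duplicate test than URL equality.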

This agent manages the information in the topical collection obtained at the search step, from the point of view of the corporate system user.

The agent consists of the following components:

  • the Information-Operating Agent (functionally, the main agent of the procedure);

  • two agents that are rigidly connected to each other (the Agent for Relevance Determination and the Agent for Correlation of the Document to Subject Domain);

  • the Agent for Information Ranking;

  • the Agent for Information Processing and Display.

Fig. 3. Process diagram of the “search agent”

Let us consider the Agent for Relevance Determination and the Agent for Correlation of the Document to Subject Domain in more detail. The Agent for Relevance Determination assesses the relevance of the documents in the retrieved sample. Relevance algorithms can show that some documents are “quasi more relevant” to the query and others are “less relevant”. This raises the problem of handling conditionally relevant documents (documents from adjacent subject domains), and the possibility that such domains appear in the resulting sample must be taken into account during the search. Given the adjacency of subject domains, the user's preferences have to be considered when deciding whether to add documents from adjacent subject domains to the resulting sample or to exclude them from it (this task is solved by the Agent for Correlation of the Document to Subject Domain).
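The interplay of the two agents can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the paper: relevance is approximated by simple term overlap, every document is assumed to carry a subject-domain label already, and `relevance`, `build_sample`, `threshold`, and `include_adjacent` are hypothetical names.

```python
from typing import List, Set, Tuple


def relevance(query_terms: Set[str], text: str) -> float:
    """Share of query terms that occur in the document text (0.0 to 1.0)."""
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return len(query_terms & words) / len(query_terms)


def build_sample(docs: List[Tuple[str, str, str]],
                 query_terms: Set[str],
                 target_domain: str,
                 adjacent_domains: Set[str],
                 include_adjacent: bool,
                 threshold: float = 0.5) -> List[Tuple[str, float]]:
    """docs holds (title, domain, text) triples; returns (title, score) pairs."""
    sample = []
    for title, domain, text in docs:
        score = relevance(query_terms, text)
        if score < threshold:
            continue  # "less relevant" documents are dropped
        if domain == target_domain:
            sample.append((title, score))
        elif domain in adjacent_domains and include_adjacent:
            # Conditionally relevant document: kept only if the user wants
            # adjacent subject domains in the resulting sample.
            sample.append((title, score))
    return sample
```

Here `include_adjacent` stands in for the user preference: with `False` the sample contains only documents from the target domain, with `True` it also admits sufficiently relevant documents from the adjacent domains.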

Moreover, it may be that not the whole document is relevant to the subject domain but only part of it, for example individual chapters of general-purpose textbooks, individual articles in collections of articles, or sections of organizational reports. Given this restriction, a decision must be made to present to the user only the part of the information that he or she needs. The next agent to be considered in more detail is the Agent for Information Ranking. It is no less important in information processing, because when several thousand documents are presented to a user, the most important documents should come first in the displayed list.
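A minimal sketch of these two ideas, presenting only the relevant parts of documents and placing the most important ones first, might look like this. The relevance score of each part is assumed to be computed upstream, and `Part`, `rank_parts`, `min_score`, and `limit` are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Part(NamedTuple):
    document: str   # e.g. a textbook, a collection of articles, a report
    section: str    # the part of the document that was scored
    score: float    # relevance score, assumed to be computed upstream


def rank_parts(parts: List[Part], min_score: float = 0.5, limit: int = 10) -> List[Part]:
    """Keep only sufficiently relevant parts and list the best ones first."""
    relevant = [p for p in parts if p.score >= min_score]
    # list.sort is stable, so parts with equal scores keep their input order.
    relevant.sort(key=lambda p: p.score, reverse=True)
    return relevant[:limit]
```

With several thousand candidate parts, `limit` bounds what is shown on the first page of results, while the descending sort keeps the highest-scored parts at the top.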

The Subject-Focused Monitoring Agent analyzes the information preferences of a corporate system user within the topic collections and provides personalized navigation support and personalized data. Users of the information collections are given personalized navigation menus, that is, links to pages close to their topic preferences. As a result, both the time needed to find the requested information and the user traffic, in the corporate or the external network, are reduced, because only high-quality information is viewed.
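One simple way to derive such a personalized menu is to count which topics a user has viewed and link to pages of the most frequent ones. The sketch below assumes exactly that; the source does not specify how preferences are analyzed, and `navigation_menu`, `pages_by_topic`, `top_topics`, and `links_per_topic` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List


def navigation_menu(viewed_topics: List[str],
                    pages_by_topic: Dict[str, List[str]],
                    top_topics: int = 2,
                    links_per_topic: int = 2) -> List[str]:
    """Build a menu of links for the topics the user viewed most often."""
    # Assumption: topic preferences are estimated from view counts alone.
    preferences = Counter(viewed_topics)
    menu: List[str] = []
    for topic, _count in preferences.most_common(top_topics):
        menu.extend(pages_by_topic.get(topic, [])[:links_per_topic])
    return menu
```

For a user whose history is dominated by one topic, the menu leads with pages of that topic, which is the kind of shortcut that would reduce search time and traffic.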

Thus, the proposed solution should make it more convenient for the user to work with the information resources of the corporate system and serve as an additional incentive to visit the information collections more often. In addition, the proposed approach should substantially reduce the load on both internal corporate traffic and external traffic.

Scientific article