The hyper-network of scientific co-authorship. DB REPEC data analysis

Authors: Bredikhin Sergey, Lyapunov Victor, Scherbakova Natalia

Journal: Проблемы информатики (Problems of Informatics)

Section: Applied information technologies

Issue: 4 (57), 2022.

Free access

The article deals with modeling a complex co-authorship network represented as a hypergraph. The formal definitions needed to describe multiple relationships between coauthors are given, and two models of the analyzed object are presented. Based on real data extracted from the bibliographic database RePEc, a hypergraph of the co-authorship network is constructed.

Most previous studies treat the co-authorship relation between two authors as a collaboration, so the network is represented as a simple graph in which a link connects only a pair of authors who are coauthors of at least one scientific paper (SP). These pairwise networks have been studied from many aspects, such as degree distribution analysis, community extraction, and author ranking; see, for example, [4-8]. Such networks do not provide a complete description of the collaboration: we know only whether scientists have collaborated, not whether a group of authors linked together in the network were coauthors of the same paper. As a representation that takes n-ary relations between authors into account, a bipartite graph may be considered, in which one partite set represents the authors and the other represents the SPs prepared by these authors. This makes it possible to use the apparatus of graph theory, but the heterogeneity of the nodes complicates the study of topological properties such as connectivity and clustering. Therefore, in [10] it is proposed to represent a complex system by a generalization of a graph, a hypergraph [11], and to call it a hyper-network. Edges of a hypergraph can relate groups of more than two nodes.

An (undirected) hypergraph $H = (V, E)$ on a finite set $V = \{v_1, v_2, \ldots, v_n\}$ is defined by the family $E = (E_1, E_2, \ldots, E_m)$ of subsets of the set $V$. An element $v_i \in V$ is called a node, and an element $E_i \in E$ is called a (hyper)edge [17]. Let $P = \{p_1, p_2, \ldots, p_m\}$ be the set of SPs, and $S = \{s_1, s_2, \ldots, s_n\}$ be the set of their authors. We assume that $P$ contains only those SPs that have two or more authors, i.e., the constructed hypergraph has no edges consisting of a single vertex. Let us define a hypergraph $H_1 = (V, E_1)$ such that the set $S$ is mapped to the set of vertices $V$ and the set $P$ is mapped to the set of edges $E_1$: if the SP $p_i$ is prepared precisely by the authors $v_1, v_2, \ldots, v_k$, then $E_i = \{v_1, v_2, \ldots, v_k\}$ is an edge, $E_i \in E_1$. The number of edges $m_1 = |E_1|$ is the number of publications $|P|$ [10]. We can also define a hypergraph $H_2 = (V, E_2, w)$ in which nodes represent authors and hyperedges represent the groups of authors that have published papers together. Here $E_i = \{v_1, v_2, \ldots, v_k\} \in E_2$ if there is at least one SP jointly published by the authors $v_1, v_2, \ldots, v_k$. The edge weight is the number of SPs published jointly by these $k$ authors. The number of edges $m_2 = |E_2|$ is the number of groups of authors [6].

In our work, we consider the set of SPs indexed in the RePEc database at the time of extraction. The procedure for filtering the "raw" data is presented in [15]. As a result, having 91113 co-authored SPs and 32434 authors, we construct the hypergraph $H_{ca} = (V, E)$ by analogy with $H_1$ above. At this stage we use the bipartite incidence graph $K(H_{ca}) = (V, V', E_K)$ to calculate a number of parameters of $H_{ca}$. The graph $K(H_{ca})$, which is isomorphic to $H_{ca}$, can be obtained by associating with each hyperedge $E_j \in E$ an additional vertex $v_{e_j}$ and defining the set $V' = \{v_{e_j} : E_j \in E\}$ such that an edge between $v \in V$ and $v_{e_j} \in V'$ exists iff $v \in E_j$ [24]. It is shown that the hypergraph $H_{ca}$ is neither simple nor conformal. Parameter values are given in Tab. 2. As an example, we consider a hypergraph component consisting of 12 nodes and 27 edges (Fig. 1, Tab. 1). It is noted that the co-authorship networks considered in [15, 16] can be built from the hypergraph, but the reverse is not true.
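The two models and the incidence graph described above can be illustrated with a small sketch. It is not the authors' code: the function names, the toy `papers` dictionary, and the author labels are hypothetical, and the simplicity check assumes the common definition under which a hypergraph is simple when no hyperedge is contained in another.

```python
from collections import Counter


def build_h1(papers):
    """H1: one hyperedge per co-authored paper (repeated author groups are kept)."""
    return [frozenset(a) for a in papers.values() if len(set(a)) >= 2]


def build_h2(papers):
    """H2: one hyperedge per distinct author group, weighted by its paper count."""
    return Counter(build_h1(papers))


def incidence_graph(edges):
    """Bipartite incidence graph: author v is linked to edge-node j iff v is in E_j."""
    return {(v, j) for j, edge in enumerate(edges) for v in edge}


def is_simple(edges):
    """True if no hyperedge is contained in another one (assumed definition)."""
    return not any(
        i != j and a <= b
        for i, a in enumerate(edges)
        for j, b in enumerate(edges)
    )


# Hypothetical toy data: paper id -> list of author ids.
papers = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["a", "b"],
    "p4": ["d"],  # single-author paper, excluded from P
}

h1 = build_h1(papers)        # three hyperedges, one per co-authored paper
h2 = build_h2(papers)        # two distinct author groups with weights
links = incidence_graph(h1)  # (author, edge index) pairs
```

In this toy example the group {a, b} appears in two papers, so it contributes two edges to H1 but a single edge of weight 2 to H2. Because {a, b} is also contained in {a, b, c}, `is_simple(h1)` is false, which is the kind of situation that makes a co-authorship hypergraph non-simple.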


Complex networks, scientific co-authorship, hypergraph, bipartite graph, bibliometry

Short URL: https://sciup.org/143179896

IDR: 143179896   |   DOI: 10.24412/2073-0667-2022-4-70-83

Research article