Hypernetwork of scientific co-authorship: an analysis of RePEc database data

Authors: Bredikhin Sergey Vsevolodovich, Lyapunov Viktor Mikhailovich, Shcherbakova Natalya Grigoryevna

Journal: Problemy informatiki (Problems of Informatics)

Section: Applied information technologies

Issue: 4 (57), 2022.

Free access

The paper addresses the modeling of a complex network of scientific co-authorship represented as a hypergraph, in contrast to the traditional approach to this phenomenon, which is based on constructing a weighted or unweighted graph. The formal notions needed to describe multiple relations between groups of co-authors are given, and two models of the analyzed object are presented. Using real data extracted from a bibliographic database, a hypergraph of the co-authorship network is constructed, its parameters are measured, and its main properties are formulated. An illustrative example is given. As a result, the phenomenon of scientific co-authorship is considered from a new point of view.


Complex networks, scientific co-authorship, hypergraph, bipartite graph, bibliometrics

Short URL: https://sciup.org/143179896

IDR: 143179896   |   UDC: 519.177   |   DOI: 10.24412/2073-0667-2022-4-70-83

The hyper-network of scientific co-authorship. DB REPEC data analysis

The article deals with modeling a complex network of co-authorship represented as a hypergraph. The formal notions needed to describe multiple relationships between coauthors are given, and two models of the analyzed object are presented. Based on real data extracted from the bibliographic database RePEc, a hypergraph of the co-authorship network is constructed.

Most previous studies treat co-authorship as a relation between two authors. The network is then represented as a simple graph in which a link joins only a pair of authors who are coauthors of at least one scientific paper (SP). These pairwise networks have been studied from many angles, such as degree distribution analysis, community extraction, and author ranking; see, for example, [4-8]. Such networks do not describe the collaboration completely: we know whether two scientists have collaborated, but we cannot tell whether a group of authors linked together in the network coauthored the same paper. One representation that captures n-ary relations between authors is a bipartite graph in which one partite set represents the authors and the other the SPs prepared by these authors. This makes the apparatus of graph theory available, but the heterogeneity of the nodes complicates the study of topological properties such as connectivity and clustering. Therefore, in [10] it is proposed to represent a complex system by a generalization of a graph, a hypergraph [11], and to call it a hyper-network. Edges of a hypergraph can relate groups of more than two nodes.

An (undirected) hypergraph $H = (V, E)$ on a finite set $V = \{v_1, v_2, \ldots, v_n\}$ is defined by a family $E = (E_1, E_2, \ldots, E_m)$ of subsets of $V$. An element $v_i \in V$ is called a node, and an element $E_i \in E$ is called a (hyper)edge [17].

Let $P = \{p_1, p_2, \ldots, p_m\}$ be the set of SPs and $S = \{s_1, s_2, \ldots, s_n\}$ the set of their authors. We assume that $P$ contains only SPs with two or more authors, so the constructed hypergraph has no edges consisting of a single vertex. Define a hypergraph $H_1 = (V, E_1)$ in which the set $S$ is mapped to the vertex set $V$ and the set $P$ is mapped to the edge set $E_1$: if the SP $p_i$ is prepared precisely by the authors $v_1, v_2, \ldots, v_k$, then $E_i = \{v_1, v_2, \ldots, v_k\}$ is an edge, $E_i \in E_1$. The number of edges $m_1 = |E_1|$ equals the number of publications $|P|$ [10].

We can also define a hypergraph $H_2 = (V, E_2, w)$ in which nodes represent authors and hyperedges represent the groups of authors that have published papers together. Here $E_i = \{v_1, v_2, \ldots, v_k\} \in E_2$ if there is at least one SP jointly published by the authors $v_1, v_2, \ldots, v_k$. The weight $w$ of an edge is the number of SPs published jointly by these $k$ authors. The number of edges $m_2 = |E_2|$ is the number of groups of authors [6].

In our work, we consider the set of SPs indexed in the RePEc database at the time of extraction. The procedure for filtering the “raw” data is presented in [15]. As a result, from 91113 co-authored SPs and 32434 authors we construct the hypergraph $H_{ca} = (V, E)$ by analogy with $H_1$ above. At this stage we use the bipartite incidence graph $K(H_{ca}) = (V, V', E_K)$ to calculate a number of parameters of $H_{ca}$. The graph $K(H_{ca})$, which is isomorphic to $H_{ca}$, is obtained by associating with each hyperedge $E_j \in E$ an additional vertex $v_{e_j}$ and defining the set $V' = \{v_{e_j} : E_j \in E\}$, so that an edge between $v \in V$ and $v_{e_j} \in V'$ exists iff $v \in E_j$ [24].

It is shown that the hypergraph $H_{ca}$ is neither simple nor conformal. Parameter values are given in Tab. 2. As an example, we consider a hypergraph component consisting of 12 nodes and 27 edges (Fig. 1, Tab. 1). We note that the co-authorship networks considered in [15, 16] can be built from the hypergraph, whereas the reverse is not true.
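The two hypergraph models and the bipartite incidence graph described above can be illustrated with a small sketch. This is not the authors' code: the toy paper list and the function names `build_h1`, `build_h2`, `incidence_graph`, and `is_simple` are hypothetical. The simplicity check assumes the definition under which a hypergraph is simple when no hyperedge is contained in another; the paper may use a different convention.

```python
from collections import Counter

# Hypothetical toy data: paper id -> list of its authors.
papers = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["a", "b", "c"],  # same author group as p1
    "p4": ["d"],            # single-author paper, filtered out below
    "p5": ["c", "d"],
}

def build_h1(papers):
    """H1-style model: one hyperedge per co-authored paper.

    Papers with fewer than two authors are dropped, so no hyperedge
    consists of a single vertex. Repeated author groups are kept as
    separate hyperedges, keyed by paper id.
    """
    return {pid: frozenset(authors)
            for pid, authors in papers.items()
            if len(set(authors)) >= 2}

def build_h2(h1):
    """H2-style model: one weighted hyperedge per distinct author group.

    The weight is the number of papers written by exactly that group.
    """
    return Counter(h1.values())

def incidence_graph(h1):
    """Bipartite incidence graph as a sorted list of (author, paper) pairs.

    An author is linked to a paper vertex iff the author belongs to
    the corresponding hyperedge.
    """
    return sorted((v, pid) for pid, edge in h1.items() for v in edge)

def is_simple(h1):
    """Assumed definition: simple iff no hyperedge is contained in
    another hyperedge with a different index (repeats also count)."""
    edges = list(h1.values())
    return not any(i != j and edges[i] <= edges[j]
                   for i in range(len(edges))
                   for j in range(len(edges)))

h1 = build_h1(papers)
h2 = build_h2(h1)
print(len(h1), len(h2))            # 4 3
print(h2[frozenset("abc")])        # 2
print(len(incidence_graph(h1)))    # 10
print(is_simple(h1))               # False
```

In this toy example the first model keeps four hyperedges (one per co-authored paper), while the second merges the two papers by the group {a, b, c} into one hyperedge of weight 2. The pairwise co-authorship graph can be derived from either model, but the grouping of authors per paper cannot be recovered from the pairwise graph alone.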


References

  • Boccaletti S., Latora V., Moreno Y., Chavez M., Hwang D. U. Complex networks: Structure and dynamics // Phys. Rep. 2006. V. 424, iss. 4-5. P. 175-308. DOI: 10.1016/j.physrep.2005.10.009.
  • Battiston F., Cencetti G., Iacopini I., Latora V., Lucas M., Patania A., Young J-G., Petri G. Networks beyond pairwise interactions: Structure and dynamics // Phys. Rep. 2020. V. 874. P. 1-92. DOI: 10.1016/j.physrep.2020.05.004.
  • Shcherbakova N. G. Modelirovanie gruppovykh vzaimodejstvij kompleksnykh sistem. Obzor // Problemy informatiki. 2022. N. 3. S. 24-45.
  • Newman M. E. J. Scientific collaboration networks. I. Network construction and fundamental results // Phys. Rev. E, 64(1), 016131. DOI: 10.1103/PhysRevE.64.016131.
  • Newman M. E. J. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality // Phys. Rev. E, 64(1), 016132. DOI: 10.1103/PhysRevE.64.016132.
  • Savić M., Ivanović M., Radovanović M., Ognjanović Z., Pejović A. Exploratory analysis of communities in co-authorship networks: A case study // Intern. Conf. on ICT Innovations, Springer, 2019. P. 55-64.
  • Barabási A. L., Jeong H., Néda Z., Ravasz E., Schubert A., Vicsek T. Evolution of the social network of scientific collaborations // Physica A. 2002. V. 311. P. 590-614. DOI: 10.48550/arXiv.cond-mat/0104162.
  • Uddin S., Hossain L., Abbasi A., Rasmussen K. Trend and efficiency analysis of coauthorship network // Scientometrics. 2012. V. 90, No. 2. P. 687-699. DOI: 10.1007/s11192-011-0511-x.
  • Newman M. E. J., Strogatz S. H., Watts D. J. Random graphs with arbitrary degree distributions and their applications // Phys. Rev. E 64, 026118. 2001. DOI: 10.1103/PhysRevE.64.026118.
  • Estrada E., Rodríguez-Velázquez J. A. Complex networks as hypergraphs // Arxiv: physics/0505137, 2005. DOI: 10.1016/j.physa.2005.12.002.
  • Berge C. Hypergraphs. Amsterdam; N. Y.; Oxford; Tokyo: North-Holland, 1989.
  • Torres L., Blevins A. S., Bassett D., Eliassi-Rad T. The why, how, and when of representations for complex systems // SIAM Rev. 2021. V. 63, No 3. P. 415-485. DOI: 10.1137/20M1355896.
  • Ouvrard X., le Goff X-M., Marchand-Maillet S. Networks of collaborations: Hypergraph modeling and visualization // ArXiv: 1809.00164v1. DOI: 10.48550/arXiv.1809.00164.
  • Han Y., Zhou B., Pei J., Jia Y. Understanding importance of collaborations in coauthorship networks: A supportiveness analysis approach // Proc. 2009 SIAM Intern. Conf. on Data Mining. 2009. P. 1112-1123. DOI: 10.1137/1.9781611972795.95.
  • Bredikhin S.V., Lyapunov V. M., Shcherbakova N. G. Struktura i parametry nevzveshennoj seti soavtorstva na osnove dannykh BD RePEc // Problemy informatiki. 2021. N. 3. S. 56-67. DOI: 10.24411/2073-0667-2021-3-56-57.
  • Bredikhin S. V., Lyapunov V. M., Shcherbakova N. G. Ranzhirovanie uzlov vzveshennoj seti soavtorstva: analiz dannykh BD RePEc // Problemy informatiki. 2021. N. 4. S. 5-15. DOI: 10.24412/2073-0667-2021-4-67-83.
  • Voloshin V. I. Introduction to graph and hypergraph theory. N. Y.: Nova Science Publishers, Inc., 2009.
  • Bretto A. Hypergraph theory: An introduction. Heidelberg: Springer Intern. Publishing, 2013. DOI: 10.1007/978-3-319-00080-0.
  • Martínez M. G., Stark H. M., Terras A. A. Some Ramanujan hypergraphs associated to GL(n, F_q) // Proc. Am. Math. Soc. 2001. V. 129. P. 1623-1629. S 0002-9939(00)05965-7.
  • Ouvrard X. Hypergraphs: An introduction and review // Arxiv: 2002.05014v2, 2020. DOI: 10.48550/arXiv.2002.05014.
  • Zhou D., Huang J., Schölkopf B. Learning with hypergraphs: Clustering, classification, and embedding // Proc. 19th Internat. Conf. on Neural Inform. Proc. Syst. 2007. P. 1601-1608. DOI: 10.7551/mitpress/7503.003.0205.
  • Bahmanian M. A., Sajna M. Connection and separation in hypergraphs // Theory and Appl. of Graphs. 2015. V. 2, iss. 2. DOI:10.20429/tag.2015.020205.
  • Borgatti S. P., Everett M. G. Network analysis of 2-mode data // Social networks. 1997. V. 19. P. 243-269. DOI: 10.1016/S0378-8733(96)00301-2.
  • Cooper J., Dutle A. Spectra of uniform hypergraphs // Lin. Algebra and Its Appl. 2012. V. 436. P. 3268-3292. DOI: 10.48550/arXiv.1106.4856.
  • Banerjee A., Char A., Mondal B. Spectra of general hypergraphs // Lin. Algebra and Its Appl. 2017. V. 518. P. 14-30. DOI: 10.1016/j.laa.2016.12.022.
  • Kumar T., Vaidyanathan S., Ananthapadmanabhan H. Hypergraph clustering: A modularity maximization approach // ArXiv: 1812.10869[cs.G]. DOI: 10.48550/arXiv.1812.10869.
  • Kamiński B., Poulin V., Prałat P., Szufel P., Théberge F. Clustering via hypergraph modularity // PLoS ONE. 2019. V. 14(11), e0224307. DOI: 10.1371/journal.pone.0224307.
  • Zhou V., Nakhleh L. Properties of metabolic graphs: biological organization or representation artifacts? // BMC Bioinformatics. 2011. V. 12, 132. DOI: 10.1186/1471-2105-12-132.