From Data to Discourse: Interpretive Shifts in the Use of Visualizations in Educational Research Articles

Tikhonova E.V.; Grigorieva M.A.; Тихонова Е.В.; Григорьева М.А.

doi:10.15507/1991-9468.030.202601.182-203



Authors: Tikhonova E.V., Grigorieva M.A.

Journal: Integration of Education (Интеграция образования) @edumag-mrsu

Section: Academic Writing

Issue: 1 (122), vol. 30, 2026.

Free access

Introduction. Visual elements, including tables, figures, and diagrams, are a routine feature of contemporary research articles. Previous scholarship has described their formal properties and has documented the prevalence of descriptive commentary, yet the cross-sectional dynamics of visual meaning within the IMRaD structure remain insufficiently specified. In particular, it is still unclear how the rhetorical and epistemic status of the same visual changes when it is taken up in Discussion after initial presentation in Results. The aim of the study is to conduct a discursive-rhetorical and epistemological analysis of the functions of data visualization in the structural components of original research articles.

Materials and Methods. The study examined a purpose-built corpus of 50 peer-reviewed education research articles published between 2018 and 2024. All verbal references to visual elements were manually coded using an analytic scheme covering integration type, interpretation depth index, and interpretative function. The coding also captured causal relation patterns, argumentative frames, interpretative shifts between Results and Discussion, and recurrent phrase frames. Coding reliability was assessed through double coding of a subset of the corpus and calculation of intercoder agreement. Quantitative analysis combined descriptive statistics with chi-square tests of association.

Results. The corpus contained 576 unique visual elements and 551 verbal references. Visuals were concentrated in Results and Methods. Referential and descriptive integration predominated, whereas interpretative integration was uncommon. Interpretation depth was heavily weighted toward the lowest levels, and higher-order interpretation and theoretical generalization accounted for a small minority of cases. Argumentative framing was dominated by Support, and interpretative shifts between Results and Discussion were rare, occurring in 2.4 percent of references. Phrase frame analysis likewise showed a strong reliance on referential and descriptive formulae.

Discussion and Conclusion. In education research articles, visuals function primarily as instruments of data presentation rather than as resources for interpretation and theorization. The scarcity of interpretative shifts suggests a structural constraint on the extent to which visuals are recontextualized as argumentative evidence across sections of the article. Although visuals are ubiquitous, their epistemic potential is activated only intermittently. These findings motivate more explicit pedagogical attention to visual literacy in research writing and invite editorial reflection on how multimodal evidence is expected to contribute to knowledge claims within the IMRaD genre.


Visual elements, multimodal discourse, education research articles, visual integration, interpretation depth, argumentative frames, interpretative shifts, phrase frames

Short URL: https://sciup.org/147253532

IDR: 147253532 | UDC: 80:004.422.86 | DOI: 10.15507/1991-9468.030.202601.182-203

From Data to Discourse: Interpretative Shifts in the Use of Visual Elements in Research Articles on Education

Introduction. Visual elements are an integral part of contemporary research articles. Previous studies have described their formal properties and documented the prevalence of descriptive commentary, yet the cross-sectional dynamics of visual meaning within the IMRaD structure remain insufficiently defined. In particular, it remains unclear how the rhetorical and epistemological status of a visual changes between its initial mention in Results and its subsequent presentation in Discussion. The aim of the study is to conduct a discursive-rhetorical and epistemological analysis of the functions of data visualization in the structural components of original research articles.

Materials and Methods. The study used a purpose-built corpus of 50 peer-reviewed education research articles published between 2018 and 2024. Verbal references to visual elements were coded manually using an analytic scheme covering integration type, interpretation depth index, and interpretative function. Additional coding captured causal relations, argumentative frames, interpretative shifts between Results and Discussion, and recurrent phrase frames. Coding reliability was assessed through double coding of a subset of the corpus and calculation of intercoder agreement. Quantitative analysis combined descriptive statistics with the chi-square test of association.

Results. A total of 576 visual elements and 551 verbal references were identified. Most visual elements appear in Results and Methods. Referential and descriptive integration types predominate, whereas interpretative integration is extremely rare. Interpretation depth remains limited: the levels of causal interpretation and theoretical generalization together accounted for about 10% of cases. Argumentative frames are largely confined to the support function, and interpretative shifts between Results and Discussion were found in only 2.4% of cases. Phrase-frame analysis likewise showed a predominance of referential and descriptive formulations.

Discussion and Conclusion. Visuals in education research articles function as instruments of data presentation. The scarcity of interpretative shifts points to a structural constraint on the extent to which visuals are recontextualized as argumentative evidence across sections of the article. Although visuals are ubiquitous, their epistemological potential is activated only intermittently. These findings call for more explicit pedagogical attention to visual literacy in research writing and invite editors to consider how multimodal evidence should contribute to knowledge claims within the IMRaD genre.


Text of the research article


Visual resources are now a standard feature of academic discourse, serving both as means of representation and as instruments of argumentation in the construction of scientific knowledge. In research articles, tables, graphs, charts, and diagrams do not merely illustrate empirical findings; rather, they shape the epistemic organization of the text by condensing data and guiding how results are read [1–3]. As multimodal approaches to academic genres have gained traction, scholars have increasingly examined how visual elements are integrated into the surrounding text of research articles. Visual elements are attributed a range of functions in research writing. They may serve as representations of the objects under study, but they also participate in how claims are framed and how readers are guided toward particular inferences¹ [4]. In work on digital scholarly genres, visualization is discussed as one of the practices associated with open science, insofar as it supports wider circulation and reuse of research outputs outside narrowly specialized communities [5]. Visual abstracts, for instance, invariably attract the reader’s attention and therefore contribute to higher rates of dissemination of research results [6]. At the same time, analyses of figure and table commentary repeatedly show that authors most often restrict themselves to describing what a visual displays, whereas interpretative and theory-oriented uses are less common [7; 8]. Evidence for this tendency is particularly well documented in natural-science writing, including studies of verbal–graphical relations in abstracts [9]. These studies also suggest that the written text typically carries the main semantic load, while visuals and their verbal uptake can be integrated in several recurrent ways. What remains less clear, however, is how the rhetorical role of the same visual may shift across sections of a research article as the narrative moves from reporting results to interpreting them.

A more specific line of work addresses the classification of graphics in research articles in the exact sciences. K. Ariga and M. Tashiro [10], for example, propose a typology of graphical forms and show how such classifications can be used to describe visuals in relation to the communicative organization of the article. This classificatory perspective aligns with broader accounts in which visuals are treated as meaningful components of research writing whose contribution depends on their integration into the verbal argument and its rhetorical structure [2; 3]. At the same time, typologies and formal descriptions do not, by themselves, explain how visuals are taken up rhetorically in the unfolding text, especially when authors move from presenting results to interpreting them across article sections.

Despite the growing body of work on visual elements in academic discourse, research in this area is dispersed across several traditions that rely on partly different analytical vocabularies. Genre-based rhetorical studies tend to treat visuals as structural components of the research article that are integrated into its move structure and contribute to the organization of argumentation [4; 11]. Multimodal and semiotic approaches focus on how verbal and visual modes function as complementary meaning-making resources, including questions of intersemiosis and the allocation of communicative work across modalities² [3]. Functional and discourse-oriented studies, in turn, describe the rhetorical roles performed by visual references, such as explanation, generalization, argumentation, and theorization, and link these roles to recurrent phraseology used to introduce and comment on tables, figures, and diagrams [7; 8]. Research on digital academic genres adds a further dimension by documenting forms of visual integration associated with online scholarly communication, where visuals may extend beyond the conventional article format and circulate more independently [5; 12].

In this body of work, visuals have been described from several complementary angles, including their formal properties and the language used to introduce them. What has received far less attention is whether the same visual may be assigned different rhetorical work in different parts of the research article. Much of the existing literature focuses either on formal properties of visuals, such as type and placement³ [4; 5], or on the phraseology used in verbal commentary that introduces and glosses tables and figures [7; 8]. By contrast, section-to-section shifts in how a visual is taken up and interpreted remain underexamined. This matters because such shifts underpin the move from presenting data to advancing interpretation in empirical writing.

This study examines the discursive-rhetorical and epistemological functions of data visualization in empirical research articles in education. The aim is to analyze how visual elements introduced in the Results section are subsequently taken up and reinterpreted in the Discussion section, with particular attention to the ways visualization serves both epistemic and rhetorical purposes in the construction of research arguments.

Two analytical perspectives inform the study. The discursive-rhetorical analysis investigates how authors use visual references as argumentative resources, examining the rhetorical strategies they deploy when drawing on visual evidence to support knowledge claims. This analysis attends to section-specific communicative conventions and how these shape the presentation and interpretation of visual material. The epistemological analysis, on the other hand, traces how visual representations participate in transforming empirical observations into theoretically grounded interpretations. It considers the processes through which data visualizations shift from serving primarily descriptive purposes in Results to becoming explanatory and inferential tools in Discussion.

The analysis is guided by four research questions:

1. What forms of structural integration of visual elements prevail in the Results section, and how are they modified in subsequent references in the Discussion section?
2. To what extent are visual data interpreted in each section, and which levels of interpretative transformation can be identified in the shift from presentation to interpretation?
3. What rhetorical functions do visual references perform in the Discussion section, and which interpretative strategies do authors employ when engaging with visual material?
4. In which cases do interpretative shifts occur between sections, and which factors account for differences in the degree and nature of interpretation of the same visual elements?

By addressing these questions, the study identifies recurring patterns in how authors present visualized data in Results and then develop these visuals as interpretative and argumentative resources in Discussion. The analysis shows how data visualization contributes to the epistemic work of constructing warranted knowledge claims while also serving rhetorical purposes in multimodal argumentation characteristic of education research writing.

Literature Review

Types of Integration of Visual Elements. The classification of visual integration types adopted in the present study is based on the typology proposed by E. Tikhonova and D. Mezentseva [13]. The typology is not advanced as a new theoretical construct. Instead, it is presented as an adaptation and systematization of established approaches to the analysis of textual commentaries accompanying visualizations within research on academic discourse.

The theoretical grounding of this classification draws on several complementary lines of inquiry. It builds on the model of textual cohesion developed by M.A.K. Halliday and R. Hasan [14], which clarifies how linguistic reference contributes to coherence within a text. It is also informed by G. Myers’ [15] pragmatic approach, which treats visual elements as rhetorical resources in scientific argumentation rather than as neutral illustrations. In addition, it incorporates insights from the social semiotic work of G. Kress and T. van Leeuwen [16], who elaborate a multimodal grammar of visual design and foreground the meaning-making potential of visual forms. The classification is also consistent with J.L. Lemke’s account of multimodal dynamics in scientific discourse, particularly the movement from the representation of empirical content toward its conceptual interpretation⁴.

These approaches support a distinction between levels of textual engagement with visual elements. At the most basic level, visuals may be introduced through nominal or purely referential mentions, such as “see Table 1”, which establish formal linkage without semantic elaboration. A subsequent level involves descriptive integration, where the verbal text reproduces or paraphrases the content of the visual, for example, “Figure 2 shows the distribution”. More advanced engagement includes analytical commentary, in which differences, trends, or patterns are identified, as in “Group A consistently outperforms Group B”. At the highest level, visual references support interpretative work, where the visual element becomes a source of epistemic inference, for example, “the data indicate an effect of factor X”.

On this basis, four types of visual integration are distinguished:

– referential integration captures formal, nominal references to visual elements without elaboration (TI–1);
– descriptive integration reflects verbal paraphrase of visual content (TI–2);
– analytical integration involves the identification of contrasts, regularities, or relationships within the visual data (TI–3);
– interpretative integration represents the highest level of integration, where visual elements serve as a basis for explanation, inference, or conceptual claims (TI–4).
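As an illustration only, the four integration types can be represented as an ordered scale. The minimal Python sketch below assumes that the numeric part of each code (TI–1 to TI–4) reflects increasing depth of engagement; the class name `IntegrationType`, the `code` property, and the `EXAMPLES` mapping are hypothetical names introduced here and are not part of the typology itself.

```python
from enum import IntEnum


class IntegrationType(IntEnum):
    """Integration types TI-1 to TI-4, ordered by depth of engagement (assumed)."""

    REFERENTIAL = 1     # formal pointer to the visual, no elaboration
    DESCRIPTIVE = 2     # verbal paraphrase of what the visual shows
    ANALYTICAL = 3      # contrasts, regularities, or relationships
    INTERPRETATIVE = 4  # explanation, inference, or conceptual claim

    @property
    def code(self) -> str:
        # Label in the form used in the list above ("TI", en dash, number).
        return f"TI\u2013{self.value}"


# The example phrases quoted in the text, paired with the level they illustrate.
EXAMPLES = {
    "see Table 1": IntegrationType.REFERENTIAL,
    "Figure 2 shows the distribution": IntegrationType.DESCRIPTIVE,
    "Group A consistently outperforms Group B": IntegrationType.ANALYTICAL,
    "the data indicate an effect of factor X": IntegrationType.INTERPRETATIVE,
}

for phrase, level in EXAMPLES.items():
    print(f"{level.code}  {level.name.lower():<15} {phrase}")
```

Using an ordered enumeration makes comparisons such as "analytical or higher" explicit, which is convenient when integration types are later cross-tabulated with article sections.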

Levels of Interpretation (IDI). The four-step model of interpretation depth moves from description to analysis and interpretation, and culminates in theoretical generalization. It is used here to describe how meaning is developed around visual elements across the structure of a research article. The model rests on the view that in Results visualizations mainly serve a representational role, whereas in Discussion they may be taken up as argumentative and epistemic resources. The scale is aligned with J.L. Lemke’s concept of multimodal semiosis⁵ and with K. Hyland’s analysis of the rhetorical functions of data commentary in academic writing [17; 18].

At the descriptive level, the verbal reference is limited to a literal restatement of what the visual shows and does not extend to analysis or explanation. At the analytical level, the commentary begins to process the visualized data by identifying regularities, contrasts, or trends. The interpretative level introduces explanatory reasoning, linking the visual data to causal accounts and integrating them into the argument of the text. At the level of theoretical generalization, the visual supports conceptual linkage, allowing empirical observations to be embedded within broader theoretical frames.

The scale treats engagement with visual elements as a process that can develop over the course of the article. In Results, descriptive and analytical uptake is expected to predominate, given the section’s emphasis on reporting and organizing data. In Discussion, the key question is whether commentary moves upward along the scale, with visuals recontextualized through interpretation or theoretical generalization. Such movement provides an empirical basis for assessing whether visualization functions only as a means of displaying data or whether it is used as a resource for knowledge construction.

In educational research, where commentary on visuals is often restricted to description, the model helps to identify limits on interpretative engagement and to specify the conditions under which visuals operate as argumentative resources. Tracing movement across interpretative levels therefore supports a systematic analysis of how, and to what extent, visual data contribute to the epistemic work of the research article.

Causal Relations. In the present study, causal reasoning is captured through three inferential patterns that recur in academic argumentation: comparison to conclusion, correlation to conclusion, and no effect to hypothesis. The distinction is grounded in rhetorical and argumentative accounts of scientific explanation, where the acceptability of a claim depends on the nature of its warrant and on the strength of the author’s epistemic commitment [1; 19]. Framed in these terms, the three patterns allow one to describe, in a controlled and comparable way, how empirical observations are developed into explanatory statements.

Comparison to conclusion represents the least mediated form of inference. A contrast between groups, conditions, or phenomena is established, and a conclusion is drawn directly from the observed difference. Correlation to conclusion involves a more circumscribed inferential step: the author identifies a systematic association between variables and formulates a conclusion in terms of probabilistic linkage rather than causal determination. No effect to hypothesis captures a different argumentative configuration, in which a null or negligible effect is treated not as a mere absence of result but as grounds for revising expectations and articulating a new hypothesis. This pattern is analytically salient because it makes explicit a reflexive move in scientific reasoning, namely the reassessment of initial assumptions and the reorientation of the explanatory frame.

These distinctions matter for the analysis of visual elements because they help specify the functional status of figures and tables within the surrounding text. When a visual is referenced without inferential linkage, the commentary typically remains descriptive, and the visual operates primarily as a device for displaying empirical material. When a visual is embedded in causal reasoning, by contrast, it contributes to the production of claims and to the explanatory progression of the argument. The proposed typology therefore supports an empirical assessment of the extent to which visual elements participate in knowledge construction, rather than serving only as formats for presenting results.

Argumentative Frames. The present study distinguishes five argumentative frames through which visual elements can be taken up in research writing: support, explanation, conclusion, generalization, and counter-argument. This classification is informed by genre-based descriptions of how research articles organize claims and evidence, as well as by argumentation-oriented accounts of how reasons are advanced and evaluated in academic discourse [17; 20; 21]. It is also consistent with multimodal perspectives that treat visuals as meaning-making resources whose function depends on their rhetorical integration into the text [16]. The frames are used here as a pragmatic analytic device for specifying what a given visual is made to do within the local argumentative sequence.

In the support frame, a figure or table provides empirical backing for a proposition stated in the verbal text. The visual functions as evidential grounding and is invoked to strengthen the credibility of the claim. In the explanation frame, the visual is used to clarify a mechanism, process, or relationship and thus enters the argument as part of an explanatory account. In the conclusion frame, a visual becomes the basis for an inferential step, where the accompanying commentary moves beyond reporting what is shown and formulates a conclusion drawn from the visualized evidence.

The generalization frame involves a further level of rhetorical abstraction. Here a visual supports a move from particular observations to broader conceptual or theoretical claims. The counter-argument frame captures uses where visuals are mobilized to qualify, challenge, or problematize an interpretation, for example by drawing attention to inconsistencies in the data or by complicating expectations derived from prior work. In such cases, the visual is not simply confirmatory but participates in a more dialogic positioning of claims in relation to disciplinary debate.

Distinguishing these frames makes it possible to differentiate the roles that visuals play in academic discourse, from evidential support to more central functions in explanation, evaluation, and theory-oriented reasoning. The typology therefore provides a basis for assessing the extent to which visual elements are rhetorically activated within scientific argumentation rather than remaining limited to descriptive accompaniment.

Interpretative Shifts. Interpretative shifts refer to changes in the semantic and epistemic status of a visual element as it is taken up across different sections of a research article. The construct builds on a four-step progression from description to analysis and interpretation and, finally, to theoretical generalization, and it is informed by J.L. Lemke’s account of multimodal dynamics in scientific discourse⁶ and by K. Hyland’s discussion of how argumentation develops in academic writing [17; 18]. From this perspective, visuals are not merely static accompaniments to verbal text; their rhetorical role may develop as the article moves from reporting findings to advancing claims.

The analysis concentrates on cases where a visual is referenced beyond the section in which it is first introduced. In Results, a figure or table typically serves to represent empirical findings. In Discussion, the same visual can be revisited to support interpretation, causal explanation, or, less often, theoretical generalization. The notion of interpretative shift captures this change in function and makes it possible to determine whether authors recontextualize visual evidence as part of higher-order reasoning, or whether they maintain the earlier descriptive or analytical role assigned to the visual.
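One possible way to operationalize this notion is sketched below in Python. The sketch assumes that each reference to a visual has already been coded manually on the four-step interpretation depth scale and that a shift is registered when the highest level reached in Discussion exceeds the highest level reached in Results. Both the numeric coding of the levels and this decision rule are assumptions made for illustration; the function name `has_interpretative_shift` is hypothetical.

```python
from enum import IntEnum
from typing import Sequence


class IDI(IntEnum):
    """Four-step interpretation depth scale (numeric values are assumed)."""

    DESCRIPTION = 1
    ANALYSIS = 2
    INTERPRETATION = 3
    THEORETICAL_GENERALIZATION = 4


def has_interpretative_shift(
    results_levels: Sequence[IDI], discussion_levels: Sequence[IDI]
) -> bool:
    """Return True if Discussion takes the visual to a higher level than Results."""
    # A shift presupposes that the same visual is referenced in both sections.
    if not results_levels or not discussion_levels:
        return False
    return max(discussion_levels) > max(results_levels)


# A visual described in Results and explained causally in Discussion: shift.
print(has_interpretative_shift([IDI.DESCRIPTION], [IDI.INTERPRETATION]))
# A visual whose analytical role is simply maintained in Discussion: no shift.
print(has_interpretative_shift([IDI.ANALYSIS], [IDI.ANALYSIS]))
```

Comparing maxima rather than individual references keeps the rule insensitive to how many times a visual is mentioned within a section.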

The value of the construct lies in its ability to differentiate between visuals that remain largely illustrative and those that are mobilized as epistemic resources within the argument. When shifts are observed, visuals contribute to explanatory and theorizing work and support the development of knowledge claims. When shifts are absent, the role established in Results tends to persist, and the visual functions primarily as data display or descriptive support. In this way, interpretative shifts provide an empirical basis for assessing the extent to which visuals are integrated into the argumentative development of the research article rather than used chiefly for representational clarity.

Phrase-Frames (Formulaic Language Patterns). The present study examines phrase-frames, abbreviated as p-frames, in the verbal references that accompany visual elements. This analytic focus targets recurrent lexico-syntactic patterns through which authors incorporate figures and tables into the running text of research articles. In academic writing research, phrase-frames were introduced to describe semi-fixed recurrent formulae that perform cohesive and rhetorical functions in disciplinary discourse [18; 22].

Because these patterns sit at the intersection of lexical choice and discourse organization, they provide a useful indicator of how empirical material is introduced, framed, and evaluated. E. Tikhonova and D. Mezentseva adapt the category specifically to the analysis of visual integration, treating p-frames as markers of authorial strategies for verbalizing engagement with visual data [13].

A principled classification of p-frames makes it possible to differentiate types of visual references in terms of their semantic and rhetorical load [22]. Referential constructions perform a formal act of pointing to a visual element without elaboration. Descriptive formulae reproduce what the visual displays at the level of surface representation. Analytical p-frames foreground contrasts, tendencies, or relations that can be extracted from the visualized material. Interpretative p-frames introduce inferential or explanatory moves, while theoretical p-frames explicitly link the visual evidence to broader conceptual frameworks. Counter-argumentative p-frames, finally, are used to qualify, problematize, or reassess what the visual appears to show.
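To make the idea of a semi-fixed frame with an open slot concrete, the Python sketch below tags two of the simplest categories, referential and descriptive, with regular expressions. It is an illustration and not a coding procedure: the patterns, the verb list, and the function name `tag_p_frames` are hypothetical, and the remaining categories (analytical, interpretative, theoretical, counter-argumentative) depend on meaning in context and are not attempted here.

```python
import re

# Illustrative patterns only; the number after "Table"/"Figure" is the open slot.
P_FRAME_PATTERNS = [
    ("referential", re.compile(r"\b[Ss]ee (?:Table|Figure|Fig\.) \d+")),
    (
        "descriptive",
        re.compile(r"\b(?:Table|Figure|Fig\.) \d+ (?:shows|presents|displays)\b"),
    ),
]


def tag_p_frames(sentence: str) -> list[str]:
    """Return the labels of all illustrative p-frame patterns found in a sentence."""
    return [label for label, pattern in P_FRAME_PATTERNS if pattern.search(sentence)]


for sentence in (
    "The results are summarized below (see Table 1).",
    "Figure 2 shows the distribution",
    "Group A consistently outperforms Group B",
):
    print(tag_p_frames(sentence), "<-", sentence)
```

A surface match of this kind could at most pre-screen candidate references; assigning a rhetorical category still requires reading the surrounding commentary.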

The identification and quantification of p-frames are central to the study because they shift the analysis from the placement of visuals to the rhetorical character of the accompanying discourse. A higher proportion of analytical, interpretative, and theoretical p-frames suggests that visuals are being used as part of argumentation and knowledge construction. Conversely, a predominance of referential and descriptive p-frames indicates that visuals function mainly in an illustrative capacity, supporting cohesion while contributing less to epistemic development.

Combined with the other analytical dimensions used in the study, including types of integration, interpretation depth, argumentative frames, and interpretative shifts, p-frame analysis allows a more detailed account of how visuals are linguistically activated in research writing. This layered perspective clarifies whether visuals remain primarily carriers of displayed data or become rhetorically consequential elements in the development of scientific claims.

Materials and Methods

Corpus and Selection Criteria. The empirical basis of this study is a purposefully constructed corpus of 50 peer-reviewed research articles in the field of education, published between 2018 and 2024 in journals indexed in Scopus and Web of Science. All selected journals were ranked in the first quartile (Q1) within their subject category. Journal quartile status was used as a proxy for broadly comparable editorial standards and review practices rather than as an indicator of individual article quality. Appendix 1⁷ lists all journals and articles included in the corpus, together with publication country, links, and the internal index used in the study.

Articles were selected according to the following criteria:

– the presence of at least one visual element (table, figure, chart, or diagram) in the sections Results and/or Discussion;
– a clear IMRaD structure;
– full-text availability in English;
– empirical orientation (theoretical reviews and meta-analyses were excluded).

A corpus of 50 articles was selected to allow detailed qualitative coding while maintaining coverage across journals and publication years. The criteria were intended to make the articles comparable in structure while preserving variation in disciplinary focus and methodological design. Visuals located in supplementary materials, appendices, or graphical abstracts were excluded from the analysis to ensure consistency of comparison across articles.

Unit of Analysis. The unit of analysis was defined as each verbal reference to a visual element, together with its immediate textual environment (one to two adjacent sentences). In some cases, the unit was expanded to a full paragraph or longer when necessary to preserve the coherence of the description and interpretation of the visual. This approach retains the rhetorical and argumentative function of each reference and allows its contribution to meaning-making to be examined in relation to the surrounding discourse.

Both primary mentions in Results and subsequent references in Discussion were analyzed, along with references in other sections (Introduction, Theoretical Framework, and Methods) in order to capture how visual data are integrated across the research article. This cross-sectional perspective allows comparison of how extensively and in what ways visuals are taken up in different sections.

Each visual element was assigned a unique identifier (e.g., EDU_1.2_Fig3) to enable systematic tracking across sections. The identifier encodes the journal number, the article number within that journal, as listed in Appendix 1, and the label of the visual element.
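The structure of such identifiers can be illustrated with a short Python sketch. It assumes the layout visible in the example `EDU_1.2_Fig3`, namely a letter prefix, the journal number, a dot, the article number, and the label of the visual, separated by underscores. The names `VisualId` and `parse_visual_id` are hypothetical, and the meaning of the `EDU` prefix is not stated in the text.

```python
import re
from typing import NamedTuple


class VisualId(NamedTuple):
    prefix: str   # leading letters, e.g. "EDU" (meaning not specified in the text)
    journal: int  # journal number as listed in Appendix 1
    article: int  # article number within that journal
    label: str    # label of the visual element, e.g. "Fig3"


# Assumed layout: PREFIX_<journal>.<article>_<label>
_ID_PATTERN = re.compile(
    r"^(?P<prefix>[A-Za-z]+)_(?P<journal>\d+)\.(?P<article>\d+)_(?P<label>\w+)$"
)


def parse_visual_id(identifier: str) -> VisualId:
    """Split an identifier such as 'EDU_1.2_Fig3' into its components."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"Unrecognized visual identifier: {identifier!r}")
    return VisualId(
        prefix=match["prefix"],
        journal=int(match["journal"]),
        article=int(match["article"]),
        label=match["label"],
    )


print(parse_visual_id("EDU_1.2_Fig3"))
```

Parsing the identifier into fields allows all references to the same visual to be grouped by journal, article, and label when tracking it across sections.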

Counting Rules and Denominators. To ensure transparency and reproducibility, the study specifies counting rules for all analytical units and statistical procedures.

A visual element was defined as a distinct graphical object embedded in the article, including tables, figures, charts, schemes, diagrams, photographs, and maps. Each visual element was counted once, regardless of how many times it was referenced in the text. On this basis, the corpus comprised 576 unique visual elements, which served as the denominator for analyses of visualization types.

A verbal reference was defined as any explicit textual reference to a visual element together with its immediate textual environment (one to two adjacent sentences), where the visual was mentioned, described, analyzed, or interpreted. Each verbal reference constituted an independent unit of analysis and could receive multiple codes (e.g., section, integration type, interpretation depth, argumentative frame). Using this definition, 551 verbal references were identified and used as the primary denominator for analyses of integration types, interpretation depth (IDI), causal relations, argumentative frames, and interpretative shifts.

For analyses of the distribution of references across article sections (Introduction, Theory, Methods, Results, Discussion), each verbal reference was assigned one or more section codes depending on its structural position. Where journals merged or hybridized sections (e.g., Results and Discussion), a single verbal reference received dual section coding. Consequently, summed section frequencies may exceed the total number of verbal references. Dual coding was retained to preserve the structural specificity of hybrid sections and to avoid imposing an arbitrary primary-section assignment that could obscure interpretative shifts.

Different denominators were used depending on the analytical focus:

1. Analyses of visualization types used the number of unique visual elements as the denominator (n = 576).
2. Analyses of integration types, interpretation depth (IDI), causal relations, argumentative frames, and interpretative shifts used the number of verbal references as the denominator (n = 551), unless otherwise specified.
3. Analyses of phrase-frames used the total number of phrase-frame instances as the denominator (n = 807), as a single verbal reference could contain more than one phrase-frame.
4. Section-based distributions report frequencies that may exceed corpus-level totals due to dual coding.

All inferential tests report the denominator (n) used in each analysis, and each table specifies whether counts are based on visual elements, verbal references, or phrase-frames. This reporting supports consistent interpretation of frequency distributions while preserving the granularity required for multimodal discourse analysis.
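The effect of dual section coding on denominators can be shown with a small sketch. The four references below are invented toy data, and `section_frequencies` is a hypothetical helper; the only point illustrated is that section-level frequencies can sum to more than the number of verbal references.

```python
from collections import Counter


def section_frequencies(references):
    """Count section codes; a reference in a merged section counts once per section."""
    counts = Counter()
    for sections in references:
        counts.update(set(sections))
    return counts


# Toy data (not from the corpus): each item is the set of section codes
# assigned to one verbal reference. The third reference sits in a merged
# "Results and Discussion" section and therefore carries two codes.
toy_references = [
    {"Results"},
    {"Methods"},
    {"Results", "Discussion"},
    {"Discussion"},
]

freq = section_frequencies(toy_references)
n_references = len(toy_references)     # denominator for corpus-level analyses
n_section_coded = sum(freq.values())   # denominator for section-level analyses

print(n_references, n_section_coded, dict(freq))
```

In this toy case the corpus-level denominator is 4 while the section-coded total is 5, which mirrors the relation between n = 551 and n = 559 reported for the section-level tables.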

Coding Scheme. The coding system was organized at three levels: Type of Integration (TI), Interpretation Depth Index (IDI), and Interpretative Function (IF). Each reference to a visual element was coded along these three dimensions. Tables 1–3 provide the operational definitions, codes, and examples.

T a b l e 1. Types of Integration of Visual Elements (TI)

Code	Description	Example
TI–1	Formal mention without elaboration	Tables 1 and 2 showed the results of the initial model (EDU_1.3_Results_Table1).
TI–2	Descriptive integration: presentation of content	To summarize, Table 5 reports a score (0–4) associated to each education region identified in the previous section. The score is a qualitative indicator … (EDU_1.4_Results_Table5).
TI–3	Analytical integration: highlighting trends, contrasts, or differences	The results of the study are represented visually in Table 11 which provides the pre-intervention percentage distribution … For comparison, Table 12 provides the post-intervention percentages … there is a significant shift in student intention towards the ‘Very Unlikely’ option of the Likert scale for some of the writing characteristics (EDU_1.2_Discussion_Table11).
TI–4	Interpretative integration: drawing implications or claims	This suggests that there may have been a knowledge gap between ethnic groups that closed, see Table 7 for details and Fig. 8c for a visualization of MLCI results by ethnicity (EDU_2.3_Results_Table7).

Source: Hereinafter in this article all tables were drawn up by the authors.

T a b l e 2. Interpretation Depth Index (IDI)

Code⁸	Level of interpretation	Example
IDI–0	No interpretation: reference only	The results obtained in various cases are shown in Table 7 (EDU_1.5_Results_Table7).
IDI–1	Descriptive paraphrase	Table 4 clearly describes the frequency distribution of college students’ digital learning power. Based on the analysis results, the mean value of the survey samples is divided into five levels according to the scale scoring criteria. Approximately 54% of survey samples have the basic ability… (EDU_4.5_Results_Table4).
IDI–2	Analytical observation (trend, relationship)	According to the path coefficients matrix data in Table 4, findings demonstrate that age positively influences autotelic experience (β = 0.240 | p = 0.005). Degree negatively impacts unambiguous feedback (β = -0.183 | p = 0.046). Gender does not show any effect on flow experience dimensions (EDU_3.3_Results_Table4).
IDI–3	Causal interpretation	This trend, illustrated in Figure 4, highlights the raw statistics previously discussed. The findings suggest that the symmetry created by definiteness/indefiniteness in the noun-adjective structures played a crucial role in facilitating accurate pronunciation of the tied /taa/ (EDU_10.3_Discussion_Fig4).
IDI–4	Theoretical generalization or conceptual linkage	However, as shown in Figure 3 overlapping these two designs can allow for a more comprehensive research design that takes into consideration both the philosophical aspect of cause and effect with the practical problem-solving development which aligns with the digital age nowadays. Furthermore, this also shows that merging these two designs could lead to a positive synergistic effects (EDU_9.2_Results_Fig3).

T a b l e 3. Interpretative Functions in Discussion (IF)

Code⁹	Rhetorical function	Example
IF–1	Explanation of a specific result	For example, several responses explained that ML could not be used, because the data set in the given scenario was too small or too homogeneous to train a generalizable model (see Table 5, for example, quotes marked i) (EDU_2.3_Results_Table5).
IF–2	Generalization of results	The model also hypothesizes that instructors draw on their pedagogical knowledge and PCK as they facilitate students’ work on planned instructional strategies during class (Figure 6) (EDU_6.5_Discussion_Fig6).
IF–3	Connection to theory or previous research (support)	This aligns with the survey results which indicate that more experienced qualitative researchers (years) are less likely to believe that AI leads to predictive and generalizable theories (Q5) (EDU_9.1_Results_Fig3).
IF–4	Methodological remark or limitation (counter argument)	Overall, the models suggested that for individuals with lower grit, cognitive abilities, as measured by both PET and CAT scores, played a more significant role in STEM success. Conversely, as the level of grit increased, the association between cognitive abilities and STEM achievement weakened, indicating a diminished effect of cognitive abilities on STEM achievement for those with higher grit (see Fig. 1) (EDU_2.1_Results_Fig1).
IF–5	Conclusion	Table 2 reveals several key insights. First, all but one of the reasons were provided at pre-task, with the majority corresponding to Cycle 1. This reflects the high expectations and motivation the children displayed before undertaking the task for the first time. Another interesting aspect of the data, supporting the results obtained earlier, is the emphasis on the value of peer work as a motivating force… (EDU_5.2_Results_Table2).

Thus, each unit of analysis receives a triple coding (e.g., EDU_1.2_Results_Fig3 – TI–2 – IDI–1 – IF–1), which makes it possible to trace both the structural and the interpretative characteristics of visuals across different sections of the article. It should be noted that IF–5 captures local textual conclusions derived from visual data within the immediate argumentative context, whereas IDI–4 refers to broader theoretical generalization or conceptual integration. The two categories thus operate at different levels of epistemic abstraction.

Additionally, phrase-frames (p-frames) were extracted to capture recurrent lexico-grammatical patterns accompanying references to visuals. Six categories were distinguished: referential, descriptive, analytical, interpretative, theoretical, and counter-argumentative. This level of analysis enabled identification of the linguistic repertoire employed to introduce and comment on visuals. Phrase-frames were not treated as independent analytical units but as linguistic realizations of broader integration types and interpretative functions, allowing triangulation between structural, interpretative, and lexical levels of analysis (Table 4).

Interpretative Shift Analysis. For visual elements referenced in both Results and Discussion, an interpretative shift (IS) was calculated as the difference in IDI scores between the two sections. This procedure enabled systematic identification of cases where visuals underwent deeper meaning-making (IS > 0), remained stable (IS = 0), or decreased in interpretative weight (IS < 0, rare) (Table 5).
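The shift calculation can be sketched as follows. This is a minimal illustration that assumes IS is the Discussion IDI score minus the Results IDI score, with IDI coded as integers 0 to 4; the function names and category labels are hypothetical.

```python
def interpretative_shift(idi_results: int, idi_discussion: int) -> int:
    """Return IS, assumed here to be IDI in Discussion minus IDI in Results."""
    for score in (idi_results, idi_discussion):
        if score not in range(5):
            raise ValueError(f"IDI score must be an integer from 0 to 4, got {score!r}")
    return idi_discussion - idi_results


def classify_shift(shift: int) -> str:
    """Map an IS value onto the three outcomes distinguished in the text."""
    if shift > 0:
        return "deepening"   # IS > 0: deeper meaning-making in Discussion
    if shift == 0:
        return "stable"      # IS = 0: same interpretation depth
    return "reduction"       # IS < 0: lower interpretative weight


# Hypothetical visual coded IDI-1 in Results and IDI-3 in Discussion.
shift = interpretative_shift(idi_results=1, idi_discussion=3)
print(shift, classify_shift(shift))
```

Treating the score as a signed difference keeps the direction of movement visible, which a simple "changed / unchanged" flag would lose.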

Coding Reliability. All data were manually annotated in Excel. Verbal commentaries accompanying visual elements were assigned to the categories specified in the Materials and Methods section and coded accordingly. The coded dataset was then imported into Taguette to manage the annotations and extract frequency counts.

Appendix 2¹⁰ contains the codebooks exported from Taguette, where the manual coding and subsequent analysis were conducted. The codebooks are organized by the following categories: section, type of visual element, type of integration, depth of interpretation, rhetorical function, phrase-frames, type of causal relation, and interpretative shift. In total, 40 codes were used (eight analytic dimensions with five subcategories each). Each code list contains the relevant verbal contexts and, where applicable, displays the co-occurring codes assigned to the same reference.

T a b l e 4. Phrase-frames

Phrase-frame type	Example
Referential	Tables 1 and 2 showed the results of the initial model for the final grade, Tables 3 and 4 showed the results of the initial model for the final exam score, Tables 5 and 6 showed the results of the initial model for the midterm exam score, and Tables 7 and 8 showed the results of the initial model for the quizzes (EDU_1.3_Results).
Descriptive	A comparison between the pre-and post-intervention survey questions examining PU (displayed in Table 5), PEoU (Table 6) and the impact on BI, (Table 7) indicates a change in student perception. This is clearly indicated in the heat maps presented by these tables which highlight the change in student perception of the underlying constructs of the TAM (EDU_1.2_Results).
Analytical	Figure 1B also highlights significant differences across demographic groups in families’ avoidance of within-school safety dimensions. Finally, Figure 1B shows that Latine and lower achieving students are less avoidant of schools with metal detectors than Black, Asian, and White students and higher and middle-achieving students (EDU_8.2_Results_Fig1B).
Interpretative	A comprehensive evaluation in Fig. 4c reveals that the average silhouette score for each instance within its respective cluster consistently surpasses the 0.5 threshold. This achievement signifies the effectiveness of our clustering tool (EDU_1.4_Results_Fig4).
Theoretical	To the best of our knowledge, no physiological models have previously been identified that justify the relationship between perceived flow and constructs related to the concept of equilibrium between task difficulty and participants’ skills, although with room for improvement in the adjustments (see Fig. 2 and 7) (EDU_3.4_Discussion).
Counter-argumentative	Contrary to intuitive assumptions, a negative correlation between Q2 (AI’s pattern identification) and Q19 (openness to training) revealed that those confident in AI’s capabilities were less likely to pursue additional training (see Figure 2) (EDU_9.1_Results_Fig2).

T a b l e 5. Causal Connection Type

Type	Example
None	For a comprehensive overview of these areas, please refer to Table 2, which enumerates the assigned names for the 24 regions (EDU_1.4_Results_Table2).
Comparison – Conclusion	Average response time by block. The data show longer average response times for stereotype-inconsistent blocks (B6 and B7), suggesting increased cognitive effort required for these tasks compared to stereotype-consistent blocks (EDU_2.5_Results_Fig3).
Correlation – Conclusion	A correlation analysis was performed between the RTs of phases 3, 4, 6, and 7 to assess consistency in participants’ performance throughout the IAT. The correlation matrix obtained is presented below (see Fig. 2). The results indicated a strong positive correlation between RTs in blocks 3 and 4, suggesting that participants who had faster (or slower) RTs in block 3 also tended to have similar RTs in block 4. This strong correlation (r = 0.897) suggests consistency in participant performance between these blocks. The high correlation may be due to the similarity in cognitive tasks or experimental conditions between these two blocks, which reinforces the idea of stability in individual performance during these phases of the experiment. The correlations between the other pairs of blocks were somewhat weaker. Together, these correlations offer a comprehensive view of how participants’ RTs fluctuate across different phases of the IAT, revealing both consistent and divergent patterns in cognitive performance (EDU_2.5_Results_Fig2).
No Effect – Hypothesis	The small effect size (Cohen’s d = −0.100) suggests that while men responded faster in block 7, the practical significance of this difference was limited (see Fig. 4) (EDU_2.5_Results).

To assess coding reliability, 20% of the corpus was independently double-coded by two trained annotators. Intercoder agreement was calculated using Cohen’s kappa, with separate estimates for the TI and IDI scales. The resulting coefficient (κ = 0.82) indicates high agreement, suggesting that the coding scheme was applied consistently. Disagreements were resolved through discussion, followed by refinement of the codebook where necessary.
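For readers unfamiliar with the agreement statistic, the following sketch computes Cohen's kappa from two coders' labels using the standard formula κ = (p_o − p_e) / (1 − p_e). The label sequences are invented toy data, not the study's double-coded subset, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of categorical codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of units")
    n = len(coder_a)
    # Observed agreement: share of units given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions per code.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Toy TI codes for six references (illustrative only).
toy_coder_a = ["TI-1", "TI-2", "TI-2", "TI-3", "TI-1", "TI-4"]
toy_coder_b = ["TI-1", "TI-2", "TI-3", "TI-3", "TI-1", "TI-4"]

kappa = cohens_kappa(toy_coder_a, toy_coder_b)
print(f"kappa = {kappa:.3f}")
```

Because kappa corrects raw agreement for the agreement expected by chance, it is a stricter criterion than the simple percentage of matching codes.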

Data Analysis. Quantitative analysis combined descriptive and inferential procedures. For each category, the analysis reported frequency counts, percentages, and cross-tabulations. Differences in interpretation depth (IDI) between the Results and Discussion sections were tested using the chi-square test of independence, with Cramér’s V reported as an effect-size measure. Further chi-square tests were used to examine associations between integration types, interpretative functions, and interpretative shifts. When expected cell counts were low, Fisher’s exact test was applied. Inferential tests were used to assess distributional differences and associations between categories; they were not intended to support causal claims.

Disclosure Statement. The authors used an AI-based language editing tool to assist with proofreading and improving the clarity and linguistic quality of the manuscript. The tool was used solely for language refinement and did not contribute to the conceptualization, analysis, or interpretation of the data. All content, arguments, and conclusions remain the responsibility of the authors.

Results

Types of Visualizations. The corpus contained 576 unique visual elements across the selected research articles (Table 6).

As the table indicates, tables and schematic formats dominated the corpus, together accounting for more than two-thirds of all visualizations. All other types were substantially less frequent.

Distribution of Visuals across Article Sections. Visual elements were not distributed evenly across article sections. Table 7 shows their distribution across the IMRaD structure. The distribution reveals a pronounced concentration in Results and Methods, which together accounted for nearly 90% of all section-assignable visuals, while the remaining three sections collectively represented just over 10% of the total.

T a b l e 6. Distribution of Visualization Types in the Corpus (based on unique visual elements, n = 576)

Type of visualization	Frequency	% of total
Tables	269	46.7
Schemes / models	124	21.5
Photos / maps / other	66	11.5
Graphs	60	10.4
Charts	57	9.9

Note: The counts are based on unique visual elements; multiple verbal references to the same visual were not considered at this stage of analysis.

T a b l e 7. Distribution of Visual Elements across Article Sections (based on unique visual elements assigned to sections, n = 562)

Section	Frequency	% of total
Results	293	52.1
Methods	205	36.5
Discussion	45	8.0
Theoretical Background	12	2.1
Introduction	7	1.2

Methods showed the opposite profile, favoring referential and descriptive forms. Discussion, though containing fewer references overall, exhibited a higher relative proportion of analytical and interpretative integration than other sections. Theory and Introduction showed limited engagement with visuals, restricted to referential and descriptive modes.

To examine whether integration strategies varied across article sections, a chi-square test of independence was conducted on section-coded verbal references. The analysis revealed a statistically significant association between integration type and article section, χ²(12, n = 559) = 223.26, p < 0.001, with a moderate effect size (Cramér’s V = 0.37). This result indicates that different sections of the research article exhibit systematically different profiles of visual integration rather than relying on the same integration strategies to a similar extent.
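The test statistic and effect size can be recomputed from the section-by-integration counts in Table 8. The sketch below is a plain-Python illustration under two assumptions: dashes in the table are entered as zeros, and Cramér's V is taken as the square root of χ² / (n · min(r − 1, c − 1)). It does not compute a p-value or apply the Fisher's exact fallback for low expected counts, and the function name is hypothetical.

```python
from math import sqrt

# Rows: Results, Methods, Discussion, Theory, Introduction (Table 8).
# Columns: Referential, Descriptive, Analytical, Interpretative.
# Dashes in the published table are entered as zeros (assumption).
TABLE_8 = [
    [26, 107, 111, 49],
    [123, 69, 8, 2],
    [5, 17, 14, 9],
    [4, 6, 2, 0],
    [7, 0, 0, 0],
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, n, Cramér's V) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of section and integration type.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    n_rows, n_cols = len(table), len(col_totals)
    dof = (n_rows - 1) * (n_cols - 1)
    cramers_v = sqrt(statistic / (n * min(n_rows - 1, n_cols - 1)))
    return statistic, dof, n, cramers_v


statistic, dof, n, cramers_v = chi_square_independence(TABLE_8)
print(f"chi2({dof}, n = {n}) = {statistic:.2f}, Cramér's V = {cramers_v:.3f}")
```

Run on the published counts, the sketch should reproduce the reported degrees of freedom and sample size and yield a statistic and effect size close to the reported values.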

Depth of Interpretation (IDI). The analysis of interpretation depth examined how far visual elements were elaborated in the surrounding text. Interpretation depth was measured on a five-level scale, ranging from simple reference to theoretical generalization. Table 9 summarizes the overall distribution and its variation across article sections.

The distribution of interpretation depth was heavily skewed toward the lower end of the scale. Reference-only and descriptive paraphrase together comprised roughly two-thirds of all annotated references, while analytical observations accounted for less than a quarter. Causal interpretation and theoretical generalization were markedly rare, jointly representing roughly one-tenth of the corpus (57 of 551 references). A chi-square goodness-of-fit test confirmed that interpretation depth was not evenly distributed across the five IDI categories, χ²(4, n = 551) = 229.30, p < 0.001.
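The goodness-of-fit statistic can be recomputed from the corpus totals in Table 9, assuming a uniform expected distribution across the five IDI levels. The sketch below is illustrative only; the function name is hypothetical and no p-value is computed.

```python
# Corpus totals per IDI level, taken from Table 9.
IDI_COUNTS = {
    "IDI-0 Reference only": 164,
    "IDI-1 Descriptive paraphrase": 198,
    "IDI-2 Analytical observation": 132,
    "IDI-3 Causal interpretation": 49,
    "IDI-4 Theoretical generalization": 8,
}


def chi_square_goodness_of_fit(observed):
    """Chi-square statistic against a uniform expected distribution."""
    observed = list(observed)
    n = sum(observed)
    expected = n / len(observed)  # equal share per category
    statistic = sum((count - expected) ** 2 / expected for count in observed)
    dof = len(observed) - 1
    return statistic, dof, n


gof_statistic, gof_dof, gof_n = chi_square_goodness_of_fit(IDI_COUNTS.values())
print(f"chi2({gof_dof}, n = {gof_n}) = {gof_statistic:.2f}")

# Share of the two highest levels, for comparison with the prose summary.
high_share = (IDI_COUNTS["IDI-3 Causal interpretation"]
              + IDI_COUNTS["IDI-4 Theoretical generalization"]) / gof_n
print(f"IDI-3 and IDI-4 combined: {high_share:.1%}")
```

The same helper could be applied to any of the single-variable frequency distributions reported in this section, provided the uniform null hypothesis is the intended comparison.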

Section-level patterns revealed distinct interpretative profiles. Results showed a balanced distribution between descriptive paraphrase and analytical observation, with only occasional movement toward causal or theoretical interpretation. Methods, by contrast, remained almost entirely confined to the two lowest levels, a pattern consistent with the procedural orientation of that section. Discussion stood apart: despite containing fewer visual references overall, a substantially higher proportion of them involved causal reasoning or theoretical framing. This suggests that when authors revisited visuals in Discussion, they were more likely to engage with them interpretatively rather than descriptively. References in Theory and Introduction were sparse and typically limited to reference-only or basic descriptive forms.

T a b l e 8. Integration Types by Article Section (based on verbal references, n = 551)

Section	Referential	Descriptive	Analytical	Interpretative	Total
Results	26	107	111	49	293
Methods	123	69	8	2	202
Discussion	5	17	14	9	45
Theory	4	6	2	–	12
Introduction	7	–	–	–	7
Total	165	199	135	60	559

Notes: Counts are based on verbal references to visual elements (unique references, n = 551). For section-level distributions, references in merged sections (e.g., Results and Discussion) received dual section coding; therefore the section-coded dataset contains n = 559 section-coded instances. As a consequence, section-level totals may exceed 551.

T a b l e 9. Depth of Interpretation Index across the Corpus and by Article Section

IDI level	Description	Corpus total	Results	Methods	Discussion	Theoretical Background	Introduction
0	Reference only	164	25	123	5	4	7
1	Descriptive paraphrase	198	107	70	17	6	–
2	Analytical observation	132	111	8	14	2	–
3	Causal interpretation	49	46	2	3	–	–
4	Theoretical generalization	8	3	–	6	–	–

Notes: Corpus-level dataset n = 551. Corpus totals are based on unique verbal references to visual elements. Section-level counts are based on section-coded references. In articles with merged sections such as Results and Discussion, individual references could be assigned to more than one section. For this reason, section-level counts do not sum to the corpus total.

Types of Causal Relations. The analysis of causal relations associated with visual references reveals a pronounced predominance of non-causal forms of commentary. Table 10 presents the distribution of causal relation types across the corpus and by section.

Non-causal references dominated the corpus, accounting for the overwhelming majority of visual uptake. Explicit causal reasoning, whether expressed through comparison-based, correlational, or hypothesis-testing frames, remained marginal. The distribution departed sharply from uniformity, χ²(3, n = 551) = 1224.76, p < 0.001, confirming the near-total predominance of non-causal commentary.

Section-level patterns reinforced this trend. Causal reasoning appeared almost exclusively in Results, with only a small number of instances in Discussion. Methods, Theory, and Introduction showed no causal engagement with visual evidence whatsoever. Even within Results, however, causal references constituted a small minority of visual uptake. This suggests that authors rarely mobilized visuals as resources for explicit causal argumentation, relying instead on descriptive or analytical modes that left causal relations implicit or unspecified.

Argumentative Frames. The analysis distinguished five argumentative frames through which visuals were taken up in the surrounding text. Table 11 presents their distribution across the corpus and by section.

The distribution of argumentative frames departed significantly from uniformity, χ²(4, n = 147) = 126.84, p < 0.001. Support clearly predominated, accounting for more than half of all coded instances. Frames involving evidential closure (Conclusion and Explanation) were moderately frequent, together comprising just under 30% of the dataset (43 of 147 instances). Higher-order abstractions, by contrast, were uncommon: Generalization appeared only occasionally, while Counter-argument was rare.

Section-level patterns revealed functional differentiation. Most Support, Conclusion, and Explanation frames occurred in Results, consistent with that section’s role in presenting and warranting empirical findings. Discussion, though containing fewer visual references overall, exhibited a higher relative proportion of Generalization frames. Methods, Theory, and Introduction contributed only sporadic instances of argumentative framing, almost entirely confined to Support. These patterns suggest that visuals in the corpus were deployed primarily to buttress claims already formulated in the verbal text. Truly ambitious deployments, where figures and diagrams grounded substantial theoretical inferences or were used to challenge alternative readings, appeared only sporadically.

Interpretative Shifts between Results and Discussion. The analysis of interpretative shifts examined whether a visual element was taken up at a different level of interpretation when revisited in Discussion after first being presented in Results. Only visuals referenced in both sections were included. Table 12 presents the distribution of shift types.

Interpretative shifts were uncommon, occurring in less than 3% of all verbal references to visuals. Most observed shifts reflected modest, incremental movement along the interpretation-depth scale (typically from descriptive paraphrase to analytical observation or from analytical observation to causal interpretation). Larger transitions, such as direct shifts from description to theoretical generalization, were rare. In a small number of cases, the interpretation level remained unchanged when the same visual was revisited, indicating repetition rather than development.

All documented shifts occurred within the Results–Discussion sequence. Visuals confined to Methods, Theory, or Introduction showed no evidence of interpretative recontextualization.

T a b l e 10. Causal Relation Types across the Corpus and by Article Section

Causal relation type	Corpus total (unique references, n = 547)	Results	Methods	Discussion	Theory	Introduction
None	498	244	202	40	12	7
Comparison to Conclusion	33	30	–	3	–	–
Correlation to Conclusion	15	14	–	1	–	–
No effect to Hypothesis	1	1	–	–	–	–

Notes: Corpus totals are based on unique verbal references. Section-level counts are based on section-coded references (n = 551). In articles with merged sections such as Results and Discussion, individual references could be assigned to more than one section. For this reason, section-level totals do not sum to the corpus total.

T a b l e 11. Argumentative Frames Associated with Visual References across the Corpus and by Article Section

Argumentative frame	Corpus total ( n = 147)	Results	Methods	Discussion	Theory	Introduction	Total by section ( n = 149)
Support	83	68	5	9	–	1	83
Conclusion	23	22	–	2	–	–	24
Explanation	20	17	–	4	–	–	21
Generalization	13	6	–	6	1	–	13
Counter-argument	8	8	–	–	–	–	8

Notes: Corpus totals are based on unique verbal references. Section-level counts are based on section-coded references (n = 149). In articles with merged sections such as Results and Discussion, individual references could be assigned to more than one section. Therefore, section-level totals do not sum to the corpus total.

T a b l e 12. Interpretative Shifts between Results and Discussion

Shift type	Corpus frequency	Results	Discussion	Methods	Theory	Introduction
Stage 1 → Stage 2	5	2	3	–	–	–
Stage 2 → Stage 3	4	3	1	–	–	–
Stage 1 → Stage 4	2	–	2	–	–	–
Stage 1 → Stage 3	1	1	–	–	–	–
Stage 2 → Stage 4	1	–	1	–	–	–
Stage 3 → Stage 4	–	–	–	–	–	–
Stage 1 → Stage 1	2	–	2	–	–	–
Total	15	6	9	–	–	–

Notes: Interpretative shifts were identified only for visual elements referenced in both the Results and Discussion sections. Corpus frequencies are reported relative to the total number of annotated visual references (n = 551). Stage 1 → Stage 1 indicates repeated descriptive use without change in interpretation depth.

The overall pattern suggests that upward movement along the interpretation scale was exceptional. Even when authors returned to the same visual in Discussion, they rarely developed it beyond descriptive or analytical commentary into higher-level causal interpretation or theoretical generalization. Representative examples for each shift type are provided in Appendix 3¹¹.

Phrase-Frames in Visual References. The analysis of phrase-frames accompanying visual references reveals a strongly stratified linguistic repertoire. Table 13 presents the distribution of p-frame types across the corpus and by section.

The phraseological repertoire showed a pronounced bifurcation. Referential and descriptive frames together dominated the corpus, accounting for more than two-thirds of all instances. These constructions typically functioned as conventional pointers (“See Figure”, “As shown in Table”) or surface-level paraphrases of visual content, offering minimal interpretative elaboration. Analytical frames, though less frequent, occupied an intermediate position by introducing evaluative or comparative commentary that moved beyond mere description toward identifying patterns, contrasts, or trends in the data.

Higher-level framing strategies were rare. Interpretative, counter-argumentative, and theoretical frames collectively represented just over one-tenth of the corpus (92 of 807 instances), indicating that phrase-level uptake of visuals rarely extended to causal reasoning, conceptual synthesis, or argumentative contestation. The distribution departed significantly from uniformity, χ²(5, n = 807) = 559.74, p < 0.001, confirming the marked predominance of referential and descriptive forms.

Section-level distributions revealed functional differentiation aligned with the communicative goals of each section. Methods relied almost exclusively on referential and descriptive frames, consistent with its procedural orientation. Results and Discussion, by contrast, showed greater diversity: while referential and descriptive frames remained common, these sections also concentrated most of the analytical, interpretative, and theoretical instances. Counter-argumentative frames appeared sporadically and only in sections where claims were explicitly advanced or qualified. Taken together, these patterns suggest that the linguistic scaffolding around visuals mirrored broader rhetorical constraints within the IMRaD structure, with higher-order interpretative moves largely confined to results presentation and argumentative synthesis.

Discussion and Conclusion

The findings offer an empirically grounded account of how visual elements are taken up in the argumentative development of empirical education research articles. Rather than describing visuals only in terms of frequency or format, the analysis connects their section placement with the ways they are verbally integrated, interpreted, and rhetorically framed. Bringing together distributional patterns with integration types, interpretation depth, causal reasoning, argumentative frames, interpretative shifts, and phraseological realizations makes it possible to examine not only which visuals appear in the corpus, but what work they are asked to perform in different parts of the article.

Structural Embedding of Visuals and Rhetorical Expectations. The concentration of visuals in Results and Methods is consistent with the well-documented tendency for visual material in education research to serve representational and procedural purposes more often than conceptual or theoretical ones. This distribution also aligns with genre-based descriptions of the IMRaD format, in which Results is oriented toward reporting empirical outcomes, while Discussion is conventionally reserved for interpretation and explanation [20]. Recent work on structural variation in research articles has shown that even when section ordering is modified or hybridized, consequential relations between sections remain rhetorically legible, often through lexical and discourse-level cues [23]. The present findings suggest, however, that this continuity is only weakly realized in the visual domain.

T a b l e 13. Distribution of Phrase-Frames in Visual References across the Corpus and by Article Section

Phrase-frame type	Corpus total (unique instances, n = 807)	Results	Methods	Discussion	Theory	Introduction
Referential	297	105	162	17	7	7
Descriptive	254	149	80	19	8	–
Analytical	164	142	9	16	2	–
Interpretative	61	54	2	8	–	–
Counter-argumentative	17	15	–	3	–	–
Theoretical	14	6	2	7	–	–

Note: Corpus totals reflect unique p-frame instances counted once per visual reference (n = 807). Section-level counts are based on section-coded p-frame occurrences (n = 820). In articles with merged sections, a single p-frame could be assigned to more than one section; therefore, section-level totals exceed corpus-level counts.

In terms of multimodal dynamics, visuals introduced in Results tended to stabilize their function at the point of first presentation, which is compatible with J.L. Lemke’s account of how meaning is distributed across semiotic modes in scientific discourse¹². The limited reappearance of the same visuals in Discussion indicates that they are seldom recontextualized as evidence for higher-level interpretation, explanation, or theoretical positioning. This structural asymmetry constrains the rhetorical potential of visuals and helps explain why interpretative uptake remains limited when authors return to visual data in the Discussion section.

Integration Types and Multimodal Cohesion. The distribution of integration types in the corpus was dominated by referential and descriptive uses. This pattern points to a mode of visual use in which verbal reference primarily supports cohesion between modes rather than the development of epistemic claims. From the perspective of M.A.K. Halliday and R. Hasan’s cohesion framework [14], such references function mainly as surface ties between text and visual display: they secure local continuity, yet they rarely extend into interpretation that would activate the explanatory potential of the visual beyond the immediate co-text.

Analytical and interpretative integrations, which would allow visuals to participate more directly in argumentation and knowledge building, were comparatively infrequent. This finding is consistent with K. Hyland’s discussion of data commentary in soft disciplines, where authors often adopt a rhetorically cautious stance toward empirical material and favor descriptive exposition over stronger interpretative commitment [17; 18]. The present results suggest that a similar preference is reflected in the multimodal treatment of visual data in education research articles.

The pattern also corresponds to earlier empirical work indicating that visuals in education and related social sciences are used predominantly as illustrative rather than argumentative resources. S. Moghaddasi et al. [4], for instance, report that visual elements are most often used to support the presentation of results, with limited engagement in epistemic or theoretical reasoning. Z. Du, F. Jiang and L. Liu [24] likewise show, in their analysis of figure legends, that descriptive lexical bundles and labeling strategies predominate while interpretative commentary remains marginal. The present study extends this line of evidence by showing that descriptive dominance is not confined to figure legends but characterizes verbal references to visuals across multiple sections of the research article.

At the same time, the corpus registered a small but statistically detectable increase in analytical integrations when visuals were reintroduced in the Discussion section. This suggests that visuals, initially embedded in a descriptive mode, may occasionally be taken up in a more argumentative way at later stages of the article. Y. Du and J. Wang similarly discuss upward trajectories in certain metadiscourse resources associated with evidential grounding and explanation across article development, including those linked to the use of visual evidence in argumentation [25].

However, this shift remained limited. J. Wu et al. show that even in applied linguistics, visual commentary relies heavily on descriptive framing [7]; against that background, the present corpus displayed an even stronger bias toward low levels of integration. On this evidence, education research articles appear to exhibit relatively conservative multimodal rhetorical practices, in which the argumentative and epistemic potential of visuals is activated only sporadically.

Interpretation Depth and Epistemic Commitment. The distribution of interpretation depth adds a further dimension to the rhetorical restraint observed in the corpus. The predominance of IDI 0 and IDI 1 indicates that most visual references remained at the level of nominal mention or descriptive paraphrase and did not develop into fuller interpretative engagement. Analytical uptake occurred with some regularity, but causal interpretation and theoretical generalization were marginal and accounted for only a small share of cases.

In substantive terms, the pattern points to a reluctance to move from displaying data to articulating what the data imply. The findings are consistent with J. Swales’ distinction between nominal reference and substantive engagement with evidence [26]. In the visual domain, authors tended to treat figures and tables as repositories of information, while the interpretative work that would turn them into sites of meaning-making was often limited or deferred. Even in Discussion, where an increase in interpretation depth would normally be expected, only a small number of visuals were taken up at higher interpretative levels.

Similar tendencies have been described in adjacent disciplinary contexts. Y. Zhang and L. Zhang, for instance, show that data commentary in economics research articles is largely descriptive or trend-focused, with relatively little movement toward conceptual generalization [8]. The present findings suggest that education research exhibits a comparable trajectory, privileging presentation over theorization when visuals are invoked.

Disciplinary contrasts become clearer when the results are set against accounts from the natural sciences. L. Unsworth reports more frequent shifts from descriptive presentation to causal explanation and theoretical abstraction, which implies a stronger integration of visuals into epistemic reasoning¹³. In a related vein, M. Hulst, K. Holstead, and T. Metze conceptualize visualization as a central component of interdisciplinary knowledge production, where visual representations participate in cyclical research narratives and contribute to conceptual development by linking multiple domains of meaning [27].

Against this comparative background, the present study points to a persistent mismatch between what visuals could contribute and how they are typically used in education journals. Visualizations have the capacity to support explanation and theory building, yet in the present corpus they were most often anchored at descriptive levels, which limited their role in higher-order knowledge construction within the research article.

Causal Relations and Argumentative Risk. The near absence of explicit causal relations linked to visual references provides further evidence of restrained argumentative practice in the corpus. S. Toulmin’s model of argumentation treats warrants as the mechanism that connects data to claims and gives an argument its epistemic force [19]. In the present dataset, such warranting was only rarely articulated through references to figures and tables. Visuals were more often cited without being drawn into causal reasoning, which effectively kept them in a descriptive register.

The limited occurrence of patterns such as comparison-to-conclusion and correlation-to-conclusion suggests that authors tended to reserve causal inference for the verbal text, while assigning visuals a largely supportive or illustrative function. This division of rhetorical labor narrows the epistemic role that visuals can play and reduces their integration into the explanatory logic of the article, a tendency that T. Miller discusses in relation to the conventions of scientific genres [1]. As a result, visuals seldom operated as the locus of causal argumentation, including in sections where explanation would normally be expected to take priority.

Argumentative Frames and Rhetorical Function. The distribution of argumentative frames helps specify the rhetorical role that visual elements assume in education research articles. The clear predominance of the Support frame shows that figures and tables were used primarily to substantiate claims already advanced in the verbal text, rather than to initiate or develop new lines of reasoning. Frames associated with explanation, generalization, theoretical linkage, or counterargumentation occurred much less frequently and remained peripheral to the overall rhetorical organization.

This limited functional range is consistent with observations reported by J. Wu et al., who note that even in sections oriented toward interpretation, visual references in applied linguistics articles rarely underpin higher-level argumentative moves [7]. In the present corpus, the scarcity of theoretical phrase frames, namely constructions that explicitly connect visual evidence to conceptual or theoretical constructs, points to the same constraint. Related evidence is provided by S. Moghaddasi et al., who show that in mathematics and education research visuals tend to function as instruments of confirmation or validation rather than as sources of conceptual advancement [4].

Interpreted in light of K. Hyland’s discussion of stance and engagement in academic writing, this pattern reflects a cautious management of epistemic commitment¹⁴. When visual elements were involved, authors appeared to favor rhetorical strategies that reinforced or clarified existing claims, while avoiding more demanding uses such as critique, generalization, or theory building. In this sense, visuals entered the argument mainly as confirmatory resources, and their contribution to higher-order reasoning remained limited.

Interpretative Shifts and Multimodal Development. A particularly informative result concerns the small number of interpretative shifts between Results and Discussion. Even though one might expect visuals to acquire greater interpretative depth when they are revisited in Discussion, such development occurred only in a small share of cases. In most instances, a visual remained at the same interpretative level at which it was first introduced, which suggests that its epistemic role was largely settled at an early stage of the article.

When shifts did occur, they typically involved movement from descriptive to analytical uptake or, less often, from analytical uptake to causal interpretation. Movement toward theoretical generalization was rare. The pattern therefore indicates that while some visuals gained additional interpretative weight in Discussion, this was not a regular feature of the genre and did not reflect a systematically supported progression across sections.

This tendency contrasts with evidence from other genre contexts. Y. Ma and F. Jiang show that graphical abstracts in digital research genres often foreground interpretative or theory-oriented readings and thereby shape how results are understood [5]. Work in business and marketing likewise emphasizes the role of visuals in narrative and storytelling formats, where imagery may carry a semiotic load comparable to, or at times greater than, that of the verbal text [28]. These comparisons point to genre and communicative context as major constraints on how far visuals can be developed as interpretative resources.

In conventional empirical education articles, by contrast, upward interpretative trajectories remained weak. Visual elements were rarely recontextualized in ways that materially changed their interpretative status across sections, which limited their contribution to multimodal knowledge construction.

Phrase-Frames and Routinized Multimodal Discourse. Phrase frame analysis provides a linguistic perspective on the patterns observed in causal relations, argumentative frames, and interpretative shifts. The predominance of referential and descriptive phrase frames points to a routinized repertoire of visual reference, in line with V. Cortes’ account of formulaic patterning in academic discourse [22]. These constructions support cohesion and guide the reader through the text, but they rarely invite sustained engagement with the evidential or interpretative implications of the visual.

The low frequency of interpretative, theoretical, and counter-argumentative phrase frames suggests that the linguistic resources needed to activate visuals as epistemic tools were used sparingly. K. Hyland emphasizes that phraseology is not merely stylistic but functions as a mechanism through which disciplinary discourse signals stance and calibrates interpretative commitment [17]. In line with this view, phrase frames have been analyzed and systematized in previous studies [29; 30]. That work identifies semantic regularities in the verbal patterns used for data commentary in economics research articles and shows how these patterns help explain abstract economic concepts. It also documents variation in verbal strategies within rhetorical moves across articles, with function-word and research-oriented types of moves prevailing.

In the present corpus, phraseological choices consistently kept visual references within a descriptive horizon and limited their capacity to support more ambitious argumentative work.
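The predominance described above can be expressed as shares of the corpus-level counts in Table 13. The following is a minimal sketch under stated assumptions: the grouping of types into a "lower" band (referential, descriptive) and a "higher" band (interpretative, counter-argumentative, theoretical) is introduced here for illustration and is not a category scheme taken from the study; `CORPUS_COUNTS` and `combined_share` are hypothetical names.

```python
# Corpus-level counts of unique phrase-frame instances, transcribed from Table 13.
CORPUS_COUNTS = {
    "Referential": 297,
    "Descriptive": 254,
    "Analytical": 164,
    "Interpretative": 61,
    "Counter-argumentative": 17,
    "Theoretical": 14,
}


def combined_share(counts, names):
    """Percentage of all instances that belong to the listed phrase-frame types."""
    total = sum(counts.values())
    return 100 * sum(counts[name] for name in names) / total


# Illustrative grouping (an assumption, not the authors' scheme).
lower_band = combined_share(CORPUS_COUNTS, ["Referential", "Descriptive"])
higher_band = combined_share(
    CORPUS_COUNTS, ["Interpretative", "Counter-argumentative", "Theoretical"]
)

print(f"referential + descriptive: {lower_band:.1f}%")                       # 68.3%
print(f"interpretative + counter-arg. + theoretical: {higher_band:.1f}%")    # 11.4%
```

Under this grouping, roughly two thirds of the 807 unique instances are referential or descriptive, while the three types most closely tied to interpretative commitment together account for about one in nine.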

Considered together, the distributions of causal patterns, argumentative frames, interpretative shifts, and phrase frames indicate that the restricted epistemic role of visuals in education research articles cannot be explained as a random feature of individual writing choices. It reflects the combined influence of section conventions, a preference for rhetorically cautious uptake of empirical material, and routinized linguistic practices, which together constrain the integration of visuals into higher-order scientific reasoning.

The analysis demonstrates that, despite their widespread presence, visual elements in empirical education research articles are most often integrated at referential and descriptive levels. Figures and tables are routinely used to present results, but they only rarely contribute to causal reasoning, theoretical generalization, or sustained argumentative development. Across the IMRaD structure, visual elements tend to retain the interpretative status assigned at first mention, and shifts in interpretation between Results and Discussion occur only exceptionally. This pattern points to a persistent gap between the representational potential of visualization and its realized epistemic function in education research writing.

The main contribution of the study is both empirical and methodological. By combining several analytical dimensions, including structural positioning, depth of interpretation, rhetorical function, and recurrent phraseological framing, the study provides systematic evidence of how visual elements are rhetorically constrained in leading education journals. These findings offer a grounded basis for pedagogical work on visual literacy in academic writing and for editorial reflection on current conventions of multimodal reporting.

Several limitations should be acknowledged. The corpus was restricted to empirical research articles in education, and the analysis focused on verbal references to visuals rather than on visual design features themselves. As a result, the study does not address how graphical form, layout, or visual complexity interact with verbal commentary. These limitations also point to directions for further research.

Future work should test the proposed framework in cross-disciplinary contexts in order to assess the extent to which the observed patterns are specific to education or reflect more general tendencies in the social sciences. The scope of analysis should also be extended beyond standard empirical articles to genres that foreground multimodality, such as graphical abstracts, supplementary materials, digital posters, and preprints, to examine whether interpretative work is displaced into peripheral formats. In addition, integrating analyses of visual design with analyses of verbal framing would allow a more complete account of how the two channels jointly support causal claims and theoretical generalization.

Further methodological triangulation would also be valuable. Corpus-based coding could be complemented by reader-response studies, think-aloud protocols, and interviews with authors in order to clarify how visuals are interpreted, trusted, and used for inference. Longitudinal analyses could examine whether practices of visual argumentation change over time under the influence of open science norms, reporting guidelines, and AI-assisted figure production. Finally, studies of editorial and reviewer practices could help identify institutional factors that shape expectations for visual argumentation and inform recommendations aimed at strengthening the epistemic role of visual elements in research articles.