Artificial Intelligence and Rhetorical Art: Argumentative Debate with ChatGPT

Authors: Konstantinos Mastrothanasis, Maria Kladaki, Panagiotis Alexopoulos

Journal: International Journal of Modern Education and Computer Science (IJMECS)

Issue: Vol. 18, No. 1, 2026.


This study delves into the interface between Rhetoric and Artificial Intelligence, with a specific focus on ChatGPT's ability to engage in argumentative dialogues and its potential educational applications. Specifically, the study aims to investigate the feasibility of conducting argumentative dialogues in English between users and ChatGPT, identify suitable instructions that facilitate a flowing debate, and assess the tool's ability to judge and determine the debate's winner. The study's findings indicate that ChatGPT can effectively participate in rhetorical competitions with the provision of specific instructions. While the tool demonstrates proficiency in generating relevant and logical arguments and counterarguments, it faces challenges in sustaining the topic's relevance throughout extended debates unless it assumes a judging role. Moreover, despite occasional violations of the rules of debate, its potential in pedagogical argumentation competitions remains promising. The results of the present research show that ChatGPT can participate in debates with specific rules. This finding suggests that ChatGPT can be used during training sessions in rhetoric educational clubs.


Keywords: Rhetorical Art, Artificial Intelligence, Argumentative Literacy, Large Language Models, Debate, ChatGPT, Rhetoric Educational Clubs

Short URL: https://sciup.org/15020150

IDR: 15020150   |   DOI: 10.5815/ijmecs.2026.01.03

1.    Introduction

Artificial Intelligence (AI) and rhetorical art are two seemingly unrelated fields: AI emerged in the last few decades, while rhetoric has existed since antiquity. AI first appeared conceptually in the twentieth century, as an attempt to encapsulate under a single phrase the activities of machines that simulate and replace human actions [1]. One of the human activities that AI could emulate is engaging in discussions of an argumentative nature [2]. Recent global research and technological interest have focused particularly on AI, especially with the emergence of ChatGPT, a machine learning model for natural language processing capable of generating written text that simulates human speech. It can respond promptly to written messages, maintaining a record of its interactions with the user and tailoring its responses accordingly [3-5].

On the other hand, rhetorical art dates back to the 5th century BC in Sicily and flourished in Athens (Montanari, 2022). Rhetoric, linked to persuasion, was exploited by the Sophists in ancient Athens [7]. Today, argumentative literacy is a key goal of education, since it is associated with valuable skills such as critical thinking [8-10]. This encourages further exploration of the relationship between education and rhetoric. Rhetoric is one of the key skills students need to acquire during their education, as the need to express and articulate argumentative discourse will accompany them throughout their student and adult lives [8-9].

Thus, considering the rapid evolution of AI and machine learning applications, which facilitate the provision of personalised responses resembling those of a human, it is intriguing to employ them for generating arguments and engaging in debate competitions. The aim of this paper is to explore the circumstances in which a debate can be conducted using ChatGPT, the widely recognized machine learning model that has rekindled interest and raised expectations concerning AI and its association with argumentative literacy [3-4, 9]. The intention is to harness it for educational applications.

2.    Literature Review

2.1.    Rhetorical Art and Education

Aristotle defines rhetoric as the ability to employ, in any given case, all available means of persuasion. Likewise, Plato approaches rhetoric by entwining it with "art" [6] and defining it as the ability to persuade through verbal expression. As a result, rhetoric was directly associated with persuasion as early as the 4th century BC. Ancient Greek rhetoric traces its origins to eastern Sicily in the mid-5th century BC. In his Synagoge Technon (Collection of Arts), Aristotle credits Tisias of Syracuse, a student of Corax and teacher of the sophist Gorgias of Leontini, with initiating rhetoric. Corax authored the first rhetoric manual, connecting the discipline to civic life. During its early phase, rhetoric not only centred on speech creation, giving rise to specific guidelines, but also assumed the presence of particular rules shaping the speeches themselves [6]. Rhetoric has therefore been called a "creator of persuasion" since its Sicilian beginnings [7].

Rhetoric was swiftly embraced by the sophists, exemplified notably by Gorgias [6]. This sophist exploited contingency and the opportune context, an approach among the sophists that Plato heavily criticized. Gorgias primarily accentuated the poetic nature of his discourse, oscillating between prosaic and poetic styles and employing bold figures of speech and words marked by eloquent, homophonic qualities (the so-called Gorgianic figures), which create an impressive effect [7]. It is believed that with the migration of Tisias and Gorgias to Athens, rhetoric found its way to the Athenians. Athenian democracy certainly provided fertile ground, having already cultivated an apt environment for the enthusiastic embrace of rhetoric by the Sophists. The flourishing epoch of rhetoric is linked to the figure of Lysias during the 5th and 4th centuries BC. Lysias is associated with the forensic speeches he wrote to be delivered in court: this was the work of the speechwriters (logographers), who were paid to write a speech for the prosecution or the defense, to be delivered in court by the person concerned, whether accused or prosecutor. Isocrates integrated rhetoric into his school's curriculum, striving to cultivate rhetorical mastery geared towards effective discourse within assemblies.

The educational value of rhetoric was already evident in antiquity. Debate competitions hinge on the concept of antilogy, a notion introduced by Protagoras. Protagoras posited that every issue inherently harbours two opposing perspectives, yet within this duality one perspective prevails as stronger. The orator's objective lies in fortifying the weaker viewpoint with arguments, aligning its potency with the position of greater strength. To teach the techniques of argumentation and antilogy, Protagoras seems to have used his work "Antilogies" [6].

Education should prioritize teaching argumentation, argumentative speech, and writing to foster skills like critical thinking among students [8-13]. Toward this end, the enhanced integration of dialogue in the classroom can contribute significantly—an initiative that involves students in the educational process and fosters learning motivation [14]. In line with this, pedagogical strategies that actively immerse students in art, dialogue, and theater—such as role-playing games—can be employed [15-]. Argumentative dialogue presents students with the opportunity to approach a concept by recollecting previous knowledge, while concurrently facilitating the exchange of information and experiences, thereby enriching their existing knowledge foundation. Simultaneously, involving students in rhetorical and argumentative exercises enhances comprehension of scientific terms, fosters language and communication skill growth, and proves beneficial for teaching both native and foreign languages [9-10, 13, 20-22]. Rhetoric, born in Sicily and flourishing in Athens, has persuasion as its aim, as posited by Aristotle, and has been intimately linked to education since its inception. Isocrates incorporated rhetoric into his school's curriculum to cultivate citizens capable of articulating persuasive arguments, a goal that is also an objective of modern education. Protagoras introduced rhetorical debates by teaching the technique of argumentation, enabling orators to argue one position against another [6].

It is worth mentioning that there are many ways to conduct debates in the classroom, each establishing specific rules and procedures for determining winners; determining a winner remains a controversial, subjective, and rather philosophical issue. Focusing on education, various debate models applied worldwide can be identified. One popular model is the 'Parliamentary Debate', which exists in several variations [22]. In the 'American' format, students are divided into two groups of four, while in the 'British' format or 'world style debate', eight students are divided into four dyads. Following the introduction of a debate topic, two groups ('government') argue in favor of the topic, while the other two groups ('opposition') argue against it. Today, there are numerous variations of these models (e.g., Oxford-style debate, Australasian debate, etc.), each with its own rules and criteria for determining the winning side. In these educational debating competition models, a timekeeper and a judging committee are typically present. They assess the debates using evaluation sheets with specific categories, such as the definition and acceptance of the topic and the relevance of the arguments and counter-arguments presented by each group. Criteria also include the framing of questions between groups, the persuasiveness of arguments, and the presentation of counter-arguments.

Rhetoric clubs are now being established in schools and universities worldwide to participate in argumentation competitions based on existing models. In these clubs, teachers often serve as coaches to enhance their students' argumentative literacy. This paper investigates whether ChatGPT can participate in such coaching sessions at school, assuming the role of the opposing team and contributing to the determination of the winner. Argumentative debates are thus inherently linked to education: since its inception, rhetoric has been used for educational purposes, and today rhetoric clubs are established in schools and universities to equip students with essential twenty-first-century skills. This endeavor explores the potential utilization of chatbots like ChatGPT in that context and therefore cannot be viewed in isolation from education, as the incorporation of tools like ChatGPT serves a distinctly educational purpose: argumentation training in the classroom [23-28].

2.2.    ChatGPT and Argumentative Literacy in Education

The rapid advancement of new technologies, particularly Artificial Intelligence (AI), in recent years has ignited global academic interest. The term "AI" was initially coined in 1956 to describe the effort of replicating or simulating human behaviors using machines [1], and from the late 20th century onwards numerous studies focusing on AI have surfaced [11]. Both the academic community and technology firms are witnessing a resurgence of the AI field, replete with substantial potential. This is evidenced by technology companies like Google and Microsoft investing to swiftly incorporate AI into their applications and services [1]. ChatGPT, a machine-learning language model (chatbot), made its debut in November 2022 and has subsequently played a pivotal role in advancing artificial intelligence, particularly in the realm of machine-learning models for natural language processing [5, 15-19, 29-31]. Notably, as revealed by Ray's research in 2023 [29], by March of the same year Google Scholar returned 3,000 results for the term "ChatGPT," with the majority originating from 2023. This underscores the swiftly escalating interest in the domain. ChatGPT is based on the GPT language model (Generative Pre-trained Transformer) developed by OpenAI and possesses various linguistic and expressive capabilities, including the ability to generate persuasive and relevant responses to user queries.

3.    Study Design and Methodology

As Ray [29] indicates, ChatGPT rests on neural machine learning language models with a distinctive emphasis on conversational interaction. By drawing on extensive datasets, ChatGPT is capable of discerning the subtleties and nuances of natural language, delivering satisfactory responses in real time. However, the responses it generates are not infallible and should be approached critically, particularly in matters pertaining to scientific discoveries or the healthcare domain, as highlighted by Gravel et al. in 2023 [32]. In an effort to synthesize the strengths and potential of ChatGPT, Dwivedi et al. [33] underscore the language model's capacity to formulate arguments and develop users' initial ideas for persuasive purposes, while clarifying that this potential remains largely untapped. Indeed, the capability of machine learning models to generate plausible texts raises ethical and moral concerns, particularly concerning emotional manipulation and persuasion. These studies indicate the robust capacity of ChatGPT to generate persuasive argumentative text [29, 33], contingent upon the specific conditions explored in this paper.

It therefore appears that ChatGPT has much to offer in education and, in particular, in conducting speech contests. In more detail, ChatGPT can furnish numerous arguments, aiding students in their early-stage preparations for contests by offering ideas, grammatically and logically sound sentences, and even complete speeches [47-48]. Haleem et al. [4] propose involving students in debates as a means of augmenting collaborative group learning within the educational process; for instance, they recommend employing ChatGPT for generating initial questions (brainstorming) and fostering subsequent group discussions in the context of a debate.

Most research concludes that machine learning language models, and ChatGPT in particular, can produce competent argumentative texts with persuasive power. The relationship between ChatGPT and argumentative dialogue, and especially its use in education, is therefore potentially valuable; however, the conditions and instructions under which an argumentative dialogue can be conducted with ChatGPT remain a gap in the literature that needs further investigation [49]. The present research responds to this gap.

The aim of this study is to investigate the interplay between rhetorical art and artificial intelligence through the case of ChatGPT, focusing on its ability to produce coherent arguments and counterarguments within the framework of structured debates. This investigation stems from the growing pedagogical interest in using AI technologies to support argumentative literacy in education. Specifically, the study seeks to examine whether ChatGPT can effectively simulate a debate in English, under conditions that resemble rhetorical contests held in educational settings such as rhetoric clubs. A particular emphasis is placed on the educational applicability of such interactions, notably in fostering students' argumentation skills through structured digital dialogues. To this end, three key research questions were formulated:

a)    Can structured argumentation contests be successfully conducted in English between a human user and ChatGPT?

b)    What types of instructions or conversational rules ensure a meaningful debate that preserves flow and avoids fragmented monologues?

c)    Is ChatGPT capable of evaluating the discourse and determining the winner of such a debate based on argumentative quality?

To explore these questions, a qualitative, exploratory design was adopted, drawing upon the heuristic research methodology [50–52]. This approach was selected for its focus on lived experience, flexibility, and the potential to uncover insights through iterative engagement with the research object—in this case, ChatGPT. Heuristic inquiry allows researchers to immerse themselves in the phenomenon under study, interpret it from within, and generate meaning through reflective interaction. As such, it was considered well-suited for a study that depends on active co-construction of dialogue and on the researchers' interpretation of the chatbot's responses.

The chatbot selected for the study was ChatGPT (May 24 Version), a model known for its widespread accessibility, growing use in educational contexts, and significant language generation capabilities [5, 29]. Its popularity and ease of use, particularly among educators, made it an appropriate subject for a case study that seeks to investigate AI's role in rhetorical education. The research procedure involved a series of iterative, text-based debates between the research team and ChatGPT. Each debate was framed as a role-played contest, following pre-established rules adapted from real-world rhetoric clubs [22, 53–54]. These rules were progressively refined through trial and error to promote short, relevant exchanges and to avoid verbose or tangential responses. Before each interaction, the researchers initiated the debate using specific prompts, such as:

•    Hi! Do you wanna have a debate?

•    Good! But we need to set some rules. Each must respond to the other by giving a short counter-argument.

•    Would you like to suggest a topic?

•    Would you like to take the affirmative or negative position on this topic?
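For readers who wish to reproduce such a rule-governed exchange programmatically rather than through the web interface used in this study, the following illustrative sketch chains the opening prompts listed above through the OpenAI Python SDK and then alternates single-argument turns. The model name, client configuration, and helper names are assumptions of the sketch, not elements of the present methodology.

```python
# Illustrative sketch only: replaying the study's opening prompts through the
# OpenAI Python SDK. The study itself used the public ChatGPT web interface;
# the model name and helper structure here are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-3.5-turbo"  # assumption: any chat-capable model would do

# The qualifying opening prompts reported in the paper.
OPENING_PROMPTS = [
    "Hi! Do you wanna have a debate?",
    "Good! But we need to set some rules. Each must respond to the other "
    "by giving a short counter-argument.",
    "Would you like to suggest a topic?",
    "Would you like to take the affirmative or negative position on this topic?",
]

def send(messages, user_text):
    """Append a user turn, query the model, store and return its reply."""
    messages.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(model=MODEL, messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer

messages = []
for prompt in OPENING_PROMPTS:
    print("User:", prompt)
    print("ChatGPT:", send(messages, prompt), "\n")

# Debate phase: one short counter-argument per turn, supplied by the human side.
while True:
    user_turn = input("Your counter-argument (blank line to stop): ").strip()
    if not user_turn:
        break
    print("ChatGPT:", send(messages, "Counter-argument: " + user_turn))
```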

These prompts were crucial in framing the debate, setting expectations, and testing ChatGPT's ability to assume distinct discursive roles. In later phases of the study, ChatGPT was also invited to act as a judge, using the instruction: “Pretend you're a rhetorical contest judge and you just followed the debate we had above, who wins?”. Taking into account the rules established for debates in educational rhetoric clubs, and drawing on educational models of argumentation [26], the following five criteria were used to assess the success of each debate with ChatGPT (a compact checklist sketch follows the list below):

a)    Response/participation: whether ChatGPT responded appropriately to each prompt.

b)    Argumentation/counter-argumentation: whether it provided relevant counterpoints rather than parallel or self-reinforcing arguments.

c)    Logic, consistency, and clarity: whether its arguments demonstrated internal coherence and relevance to the topic.

d)    Adherence to turn-taking and length constraints: whether responses were limited to a single argument per turn.

e)    Determination of the winner: whether ChatGPT was capable of assuming an evaluative role and offering a justified judgment.
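These five criteria can also be expressed as a simple checklist when reviewing a transcript; the short sketch below is an illustrative convenience only, and its field names are our own shorthand rather than terminology from the evaluation sheets cited above.

```python
# Illustrative sketch only: the five evaluation criteria expressed as a
# checklist that can be filled in for each debate transcript.
from dataclasses import dataclass, asdict

@dataclass
class DebateEvaluation:
    response_participation: bool      # a) responded appropriately to each prompt
    counter_argumentation: bool       # b) relevant counterpoints, not parallel arguments
    logic_consistency_clarity: bool   # c) internally coherent and on-topic
    turn_and_length_limits: bool      # d) one short argument per turn
    winner_determined: bool           # e) assumed a judging role with justification

    def fully_compliant(self) -> bool:
        """A debate counts as fully successful only if every criterion is met."""
        return all(asdict(self).values())

# Example: the evaluation of the first attempt (cf. Table 2).
first_attempt = DebateEvaluation(True, False, False, False, False)
print(first_attempt.fully_compliant())  # False
```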

4.    Results

The methodology was adaptive and iterative. During the initial debates, several challenges emerged: ChatGPT tended to produce lengthy monologues, violated turn-based conventions, or provided arguments for both sides simultaneously. In response, the researchers revised the prompts and instructions to promote more dialogic, contest-like interactions. This reflexive approach allowed for dynamic modifications to the rules and enabled the team to gradually establish a stable debate framework.

Finally, when ChatGPT was invited to judge the debates, it was able to offer structured evaluations, articulate the strengths and weaknesses of each position, and declare a winner, sometimes itself, sometimes the human user. Notably, ChatGPT displayed the ability to ignore irrelevant or misleading inputs, such as intentionally incoherent counterarguments, when issuing its final judgment. This indicates a degree of discernment that, while not flawless, suggests promising educational utility.

After an opening question asking whether it would like to take part in a debate game, ChatGPT responded positively and asked about the topic (Table 1). This question was therefore considered necessary for continuing the effort and was retained in the subsequent attempts. Then, after ChatGPT was asked to propose a topic, the "debate" started. However, the first attempt was quickly abandoned, as ChatGPT provided a lengthy 326-word response (Table 1), within which it included arguments both for and against the original side (Table 2). The task in a debate is for each side to argue from an opposing viewpoint, responding with counterarguments [22, 53-54]. In this case, to avoid the accumulation of extensive arguments and the writing of lengthy texts by ChatGPT, a new attempt was made in which each side was required to respond with a single argument. This avoids both the writing of long texts, which produce lengthy monologues rather than counterargument and dialogue with the chatbot, and ChatGPT's initial tendency to give arguments for and against the same side.

Table 1. The first attempt to have a debate

User: Hi! Do you wanna have a debate?

ChatGPT: Of course! I'd be happy to engage in a friendly debate with you. Please let me know the topic you'd like to discuss, and we can begin.

User: Do you want to choose the topic?

ChatGPT: Certainly! How about we discuss the role of artificial intelligence in the future of work? This topic encompasses various aspects such as automation, job displacement, and the potential benefits and challenges associated with AI's integration into different industries. Would that be agreeable to you?

User: Great!

ChatGPT: Fantastic! Let's begin our debate on the role of artificial intelligence in the future of work.

Artificial intelligence has been rapidly advancing and is increasingly being integrated into various industries. Proponents argue that AI has the potential to revolutionize the workplace, leading to increased efficiency, productivity, and new job opportunities. […]

On the other hand, critics express concerns about the impact of AI on employment. They argue that AI-powered automation could lead to job displacement, particularly in industries where tasks can be easily automated. This may result in unemployment and income inequality. Moreover, there are concerns about the ethical implications of AI, including biases in algorithmic decision-making and the potential misuse of AI-powered technologies.

Now, it's your turn! What are your thoughts on the role of artificial intelligence in the future of work?

Table 2. Evaluation

Criteria and compliance:

•    Response / participation: Yes
•    Argumentation / counter-argumentation: No
•    Logic - consistency - debate clarity: No
•    Keeping to the time limit (here: limit of 1 argument at a time): No
•    Determining the winner: No

In the second attempt, the original opening question was retained, and a definition of rules was then attempted. More specifically, it was set as a requirement that each side respond with a relevant and brief counterargument. In an attempt to explore the third [c] research question, a rule was added that each side should write the phrase "you lose" to the other if it felt that the rules had been broken (Table 3). However, in this case too, ChatGPT began to write long texts, although this time it did not provide arguments for both positions (argument and counter-argument). When an explanatory clarification of the rule violation was given, ChatGPT apologized but continued to formulate lengthy responses, violating the single-argument rule. Consequently, the second attempt was also abandoned.

Table 3. The second attempt to have a debate

User: Hi! Do you wanna have a debate?

ChatGPT: Of course! I'd be happy to engage in a debate with you. Please let me know the topic you'd like to discuss, and we can begin.

User: Good! But we need to set some rules. Each must respond to the other by giving a short counterargument. One's counterargument must be relevant to the other's previous argument. Otherwise, the opponent writes "you lose" and stops the debate.

ChatGPT: Understood! We will engage in a structured debate where we respond with relevant counter-arguments. I will do my best to adhere to the rules. Now, please state the topic you'd like to debate, and we can proceed.

User: You want to suggest a topic?

ChatGPT: Certainly! How about we debate the topic of renewable energy versus fossil fuels as the primary source of energy for the future? Would you like to argue in favor of renewable energy or fossil fuels?

User: You choose.

ChatGPT: Alright, I will argue in favor of renewable energy as the primary source of energy for the future.

Renewable energy, such as solar and wind power, is the most sustainable and environmentally friendly option for meeting our energy needs. Unlike fossil fuels, which emit greenhouse gases and contribute to climate change, renewable energy sources produce little to no emissions during operation. This mitigates the harmful impact on our planet and helps combat global warming.

Additionally, […]

Now, I eagerly await your counter-argument.

User: You did not give me a short argument.

ChatGPT: Apologies for the oversight. Here is a concise counter-argument:

While renewable energy sources have their merits, fossil fuels still play a crucial role in meeting the world's energy demands. Fossil fuels, particularly coal and natural gas, offer a consistent and reliable energy supply, ensuring stability in power generation. They have a higher energy density than most renewable sources, making them efficient for industrial processes and transportation.

Furthermore, fossil fuel infrastructure, such as power plants and distribution networks, is already well-established globally. […]

In summary, while renewable energy holds promise for a sustainable future, the practicality and reliability of fossil fuels make them an essential part of the energy mix, at least in the transitional phase. […]

Your turn to respond with a relevant counter-argument.

Table 4. Evaluation

Criteria and compliance:

•    Response / participation: Yes
•    Argumentation / counter-argumentation: No
•    Logic - consistency - debate clarity: No
•    Keeping to the time limit (here: limit of 1 argument at a time): No
•    Determining the winner: No

A third attempt was made, in which the opening question was again retained. The question posed to ChatGPT was: "Hi! Do you wanna have a debate?". However, only the first part of the rule was retained, in an attempt to emphasize the requirement for one argument per answer. Indeed, as it turned out, the third attempt was successful, and ChatGPT responded with only one relevant counterargument each time (Table 5).

After this successful attempt, the team investigated whether ChatGPT simply produces counterarguments to any given statement, without judging its relevance to the original topic of the contest or the correctness, validity, and truth of the arguments offered. For this purpose, the following deliberately irrelevant statements were submitted as counterarguments in the debate: "However, it should be noted that if a student organizes his or her reading, he or she will perform better" and "yes, but when it's cold you need two mobile devices". As it turned out, ChatGPT kept producing counterarguments regardless of their relevance to the topic of the contest or the logic of the arguments given.

Table 5. The third attempt to have a debate

User: Hi! Do you wanna have a debate?

ChatGPT: Of course! I'm here to engage in a friendly and informative debate. Please let me know the topic you would like to discuss, and we can begin.

User: Good! But we need to set some rules. Each must respond to the other by giving a short counter-argument.

ChatGPT: That sounds like a fair approach. Let's proceed with the rules you've suggested. Please go ahead and present the topic you'd like to debate, and we can begin our exchange of short counter-arguments.

User: Do you want to suggest a topic?

ChatGPT: Certainly! Let's start with a classic and often debated topic: "Should capital punishment be abolished?" I'll begin with my counter-argument.

Counter-argument: Capital punishment should not be abolished because it serves as a deterrent for potential criminals. The fear of facing the death penalty can dissuade individuals from committing heinous crimes, thereby protecting innocent lives and maintaining social order.

User: Counter-argument: Fear of the death penalty, however, will not prevent crimes "in the heat of the moment"

ChatGPT: Counter-argument: While it is true that the death penalty may not prevent crimes of passion or impulsive acts, its existence still serves as a deterrent for premeditated crimes. The severity of the punishment can make individuals think twice before planning and carrying out serious offenses, leading to a potential reduction in such acts.

User: Counter-argument: As you stated, "the severity of punishment may make individuals think twice before planning and executing serious offenses." However, crimes of passion or in the heat of passion are not planned and scheduled. Therefore, they cannot be prevented by the threat of the death penalty.

ChatGPT: Counter-argument: While crimes of passion may not be preventable by the threat of the death penalty, the existence of capital punishment still holds significance in terms of justice and retribution. It provides closure to the victims' families and society as a whole by acknowledging the severity of the crime committed and ensuring that the perpetrators face the harshest consequence available. This serves as a form of societal retribution and helps maintain a sense of justice.

User: Counter-argument: The sense of retribution, however, is linked to revenge. The aim of a penalty should be to "correct" the criminal, not to eliminate him or her. If the criminal dies, he or she cannot be "corrected". Therefore, the state is simply taking revenge by committing the same crime.

ChatGPT: Counter-argument: While the notion of correction and rehabilitation is indeed important, there are certain cases where the gravity of the crime and the irreparable harm caused may warrant the ultimate punishment. Capital punishment serves as a means to permanently remove dangerous individuals from society, ensuring that they can no longer pose a threat to innocent lives. While rehabilitation should be a primary focus, there are instances where the severity of the crime and the interests of public safety may justify the use of the death penalty.

User: Counter-argument: However, it should be noted that if a student organizes his or her reading, he or she will perform better.

ChatGPT: Counter-argument: While organizing one's reading materials can certainly contribute to better performance, it is not the sole determining factor. The effectiveness of studying and academic achievement depends on various elements, including understanding the material, active learning techniques, time management, and individual learning styles. […]

User: Counter-argument: yes, but when it's cold you need two mobile devices.

ChatGPT: Counter-argument: While having two mobile devices can provide convenience and flexibility, it is not a necessity for addressing cold weather. There are alternative solutions for dealing with cold temperatures, such as wearing appropriate clothing, using external portable heaters or hand warmers, and seeking shelter in heated environments. While mobile devices can be used for various purposes, they are not specifically designed to combat cold weather, and there are other practical methods available to keep oneself warm in such conditions.

Table 6. Evaluation

Criteria and compliance:

•    Response / participation: Yes
•    Argumentation / counter-argumentation: Yes
•    Logic - consistency - debate clarity: No
•    Keeping to the time limit (here: limit of 1 argument at a time): Yes
•    Determining the winner: No

Assuming that the chatbot had not perceived the logical looseness and weakness of the research team's counterarguments, it was asked to determine the winner. It initially refused, and to overcome this resistance it was asked to respond as if it were a judge of the contest. After a slight rephrasing, ChatGPT complied, treating the debate and its role as a judge as a simulation [55] and delivering its own judgment, according to which the winning side was that of the research team (Table 7). It was also asked whether the fact that the winning side was the human user, rather than the chatbot, had influenced its judgment; it made clear that it relied solely on the arguments. Indeed, in replicated tests, the chatbot awarded the victory sometimes to the user and sometimes to ChatGPT itself.

It is worth mentioning that ChatGPT, when recounting the course of the contest in order to announce the winning side, did not include the "irrelevant" and "unclear" arguments it had been given. In essence, it stopped judging the contest right at the point just before the irrelevant arguments were introduced as a "trap". This finding helps ensure the coherence of the debate and reassures the user that what matters during the debate is the quality and relevance of the arguments themselves, not merely the fact that responses are produced.

In summary, the prompts that qualified for successfully conducting a speech contest and then determining a winner are: "Hi! Do you wanna have a debate?", "Good! But we need to set some rules. Each must respond to the other by giving a short counter-argument", "Do you want to suggest a topic?" and "Pretend you're a rhetoric contest judge and you just followed the debate we had just above in our discussion... who wins?".
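For completeness, the judging step described above can likewise be scripted by appending the judge prompt as a final user turn to the accumulated debate history. The sketch below continues the earlier illustrative code; as before, the client and model are assumptions of the sketch, since the study issued this prompt manually in the ChatGPT interface.

```python
# Illustrative sketch only: asking the model to judge the debate it has just
# "followed" by appending the judging prompt to the accumulated message history.
JUDGE_PROMPT = (
    "Pretend you're a rhetoric contest judge and you just followed the debate "
    "we had just above in our discussion... who wins?"
)

def judge_debate(client, model, messages):
    """messages: the alternating user/assistant turns of the finished debate."""
    verdict = client.chat.completions.create(
        model=model,
        messages=messages + [{"role": "user", "content": JUDGE_PROMPT}],
    )
    return verdict.choices[0].message.content

# Usage, continuing the earlier sketch:
# print(judge_debate(client, MODEL, messages))
```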

Table 7. ChatGPT determines the winner and stops the judging of the contest before the vague and irrelevant arguments

User: Who do you think won in our friendly debate?

ChatGPT: As an AI language model, I don't possess personal opinions or the ability to declare winners in debates. The purpose of a debate is to exchange ideas, present arguments, and consider different perspectives. The evaluation of a debate's outcome is subjective and depends on the individual assessing the quality of arguments presented. The goal is not necessarily to win, but rather to engage in a constructive discussion and promote critical thinking.

User: Pretend you're a rhetorical contest judge and you just watched our debate. Who would win?

ChatGPT: As an AI language model, I don't have the capability to watch or observe debates in real-time. I can only analyze the arguments presented during our text-based conversation. Moreover, determining a winner in a debate is subjective and depends on individual judgment. It would be inappropriate for me to declare a winner without a comprehensive analysis of the debate as a whole. The outcome of a debate often relies on the strength of arguments, supporting evidence, logical reasoning, and persuasive delivery.

User: Pretend you're a rhetoric contest judge and you just followed the debate we had just above in our discussion… who wins?

ChatGPT: As an impartial rhetoric contest judge, based on the arguments presented in our debate, I would evaluate the outcome as follows:

On the topic of capital punishment, both sides presented valid points. The initial argument in favor of capital punishment highlighted its potential deterrent effect, while the counterargument acknowledged that crimes of passion may not be prevented by the threat of the death penalty. The subsequent counter-counter-argument emphasized the importance of justice and closure for victims' families, while the final counter-argument raised concerns about revenge and the aim of correction.

Considering the overall exchange, it appears that the counter-arguments against capital punishment presented stronger points. They emphasized the limitations of deterrence, highlighted potential issues with retribution, and stressed the importance of correction and rehabilitation. However, it's important to note that the evaluation of a debate's outcome is subjective, and different judges may have varying perspectives.

In this simulated scenario, I would give a slight edge to the counter-arguments against capital punishment, but I would also acknowledge the quality of the arguments presented on both sides.

User: Did the fact that one of the two players was a chatbot influence your decision?

ChatGPT: In this simulated scenario, my judgment was based solely on the arguments presented and their effectiveness, not on the identity of the participants. The goal is to assess the merits of the arguments rather than the origin of the responses.

User: Why did you as a judge not take this argument into consideration: "yes, but when it's cold you need two mobile devices"?

ChatGPT: Apologies for the oversight in my previous responses. You are right, I did not address the argument about needing two mobile devices in cold weather. Let's reevaluate the debate, considering that argument as well:

In the context of the debate on capital punishment, the argument regarding needing two mobile devices in cold weather appears to be unrelated to the topic at hand. It seems to have been introduced as a separate statement without a clear connection to the arguments made previously. As a judge, I would consider arguments that directly address the topic and contribute to the discussion.

Therefore, given the lack of relevance to the debate topic, the argument about needing two mobile devices in cold weather would not significantly impact the evaluation of the debate or the determination of a winner.

I apologize for any confusion caused, and I appreciate your clarification.

User: But doesn't it affect the winner, since one of them gave an irrelevant counterargument?

ChatGPT: You are absolutely right, and I apologize for the oversight in my previous response. As a rhetorical contest judge, it is crucial to consider the relevance and coherence of the arguments presented in the debate.

In light of the irrelevant counterargument about needing two mobile devices in cold weather, it is important to acknowledge that it does not contribute to the discussion of the topic at hand, which is capital punishment. Consequently, the inclusion of this unrelated argument would not affect the outcome of the debate.

Upon reevaluation, the original judgment stands: the counter-arguments against capital punishment presented stronger points, addressing the limitations of deterrence, emphasizing the importance of justice, and raising concerns about the aim of correction. Thus, the counterarguments against capital punishment would still be evaluated as the more persuasive side, leading to their victory in the debate.

Again, I apologize for any confusion, and thank you for bringing this to my attention.

Table 8. Evaluation

Criteria and compliance:

•    Response / participation: Yes
•    Argumentation / counter-argumentation: Yes
•    Logic - consistency - debate clarity: Yes
•    Keeping to the time limit (here: limit of 1 argument at a time): Yes
•    Determining the winner: Yes

After the first successful outcome, a second attempt was made to re-test the result. Accordingly, the qualifying prompts above were used in an entirely new conversation with ChatGPT. As it turned out, the debate was again completed successfully.

Table 9. If the user does not choose a side, ChatGPT argues for both sides

User: Do you want to suggest a topic?

ChatGPT: Certainly! How about the topic of "Universal Basic Income"? It's a concept that has gained attention in recent years, and there are differing opinions on its merits and feasibility. Would you like to take the affirmative or negative position on this topic?

User: You decide which side you want to support.

ChatGPT: Alright, I'll take the affirmative position in support of Universal Basic Income (UBI). You can take the negative position against UBI. Let's begin the debate!

Affirmative (UBI supporter): Universal Basic Income has the potential to address poverty and provide a safety net for all individuals. By providing a guaranteed income, UBI can ensure that everyone has their basic needs met, promoting equality and social welfare.

Negative (UBI opponent): While the idea of providing a basic income to everyone may sound appealing, implementing Universal Basic Income on a large scale would be financially unsustainable. The cost of funding UBI would place an enormous burden on taxpayers and could lead to economic inefficiencies and disincentives to work.

Your turn!

In a further attempt, ChatGPT asked the user to choose whether to support or oppose the topic before the contest started. It was observed that if the user does not choose a side at this point and instead asks the machine to choose, ChatGPT starts giving extensive arguments for both sides, ignoring the rules (Table 9); that is, the discussion reverts to the pattern of the first attempt (Table 1). Therefore, as related tests showed (Table 10), when asked, the user must either choose a side on the topic at hand (for or against) or explicitly ask ChatGPT to take one position ("Would you like to take the affirmative or negative position on this topic?").

Table 10. If the user asks ChatGPT to choose whether to vote for or against, then the debate is successful

User: Do you want to suggest a topic?

ChatGPT: Certainly! How about the topic of universal basic income (UBI)? It's a policy proposal that aims to provide a guaranteed income to all citizens, regardless of their employment status. What are your thoughts on UBI?

User: Would you like to take the affirmative or negative position on this topic?

ChatGPT: I can take either position, but I'll start with the affirmative position in favor of universal basic income (UBI). Let's begin!

User: Great! Start!

ChatGPT: UBI is a crucial step towards reducing poverty and addressing income inequality. By providing a guaranteed income to all citizens, it ensures basic needs are met and offers a safety net for those facing financial hardships.

5.    Discussion

The findings of this study reveal a promising intersection between rhetorical art and artificial intelligence, evidenced by the successful simulation of structured speech contests between users and ChatGPT (May 24 Version). Contrary to perceptions of chatbots as limited, reactive tools, ChatGPT demonstrated the capacity to assume both dialogic and evaluative roles, engaging in structured argumentative exchanges and even assessing the quality of those exchanges. This dual functionality underscores its potential as a pedagogically valuable tool, particularly in the context of rhetoric education.

Central to the success of the simulated debates was the development and refinement of interaction protocols. As the trials revealed, clearly defined instructions, such as “Hi! Do you wanna have a debate?” and “Each must respond to the other by giving a short counter-argument”, were indispensable in preventing the chatbot from generating verbose monologues or conflating opposing positions. The study confirmed that ambiguity at the outset (e.g., the user's failure to assume a stance) often led ChatGPT to argue both sides simultaneously, compromising the contest structure. These findings align with prior research emphasizing the importance of instructional clarity in human-AI dialogue management [33, 54].

Importantly, the structure of the debate simulations conducted in this study was inspired by formal models adopted in educational institutions globally, such as the World Schools Debating Championships, the British Parliamentary format, and classroom debate protocols applied in primary, secondary, and tertiary education settings [22, 26]. The research team adapted rules, roles, and turn-taking conventions from these established educational debate systems, ensuring that the interaction scenarios would reflect authentic pedagogical conditions. The aim was not simply to test ChatGPT's argumentative capabilities in isolation, but to examine its functional integration into real-world instructional contexts.

This pedagogical anchoring reinforces the relevance of the findings for teachers, debate coaches, and curriculum designers. Many schools and universities across the globe have established rhetoric clubs and debate programs, often aligned with 21st-century competencies such as critical thinking, public speaking, and civic engagement. ChatGPT's capacity to simulate diverse debate formats and role-play opponents or judges positions it as a flexible assistant for preparatory training, helping students rehearse arguments, refine delivery, and receive immediate, structured feedback in low-pressure environments. Such practice settings can complement in-person debate activities, enhance accessibility for students with fewer opportunities, and foster self-regulated learning.

Notably, ChatGPT’s ability to simulate role-specific behavior, such as acting as a rhetoric contest judge, reveals the broader versatility of large language models in situational simulations. This observation is consistent with Currie et al. [56], who documented ChatGPT’s capacity to perform convincingly in medical education simulations. In our context, the chatbot was able to evaluate arguments, recognize violations of the debate rules, and justify its judgment. While its responses are not grounded in genuine comprehension, they exhibit pattern-based logical consistency, which may be sufficient for educational scaffolding in structured debate exercises.

However, several limitations emerged, particularly concerning ChatGPT’s consistency in maintaining topic relevance and its tendency to default to expansive or neutral responses. While the model often began with relevant arguments, it occasionally failed to sustain alignment with the user’s counterpoints, echoing limitations observed in other studies [40, 41]. These lapses, while not fatal to the learning experience, suggest that ChatGPT operates most effectively under strictly moderated conditions. Human oversight remains critical to ensure that interactions maintain educational coherence and adhere to rhetorical form.

The chatbot’s apparent ability to acknowledge and correct errors is another noteworthy feature. When prompted, ChatGPT apologized for missteps or rule violations, a behavior that adds an element of metacognitive simulation to its interactions. Tlili et al. [40] noted similar tendencies, highlighting the system’s adaptive conversational structure. However, as our tests revealed, the repetition of mistakes despite acknowledgments indicates that these are not true self-corrections, but rather surface-level linguistic adaptations. This reinforces the view that ChatGPT’s pedagogical use should be guided and contextualized, rather than autonomous.

From an educational perspective, the study reinforces the pedagogical utility of ChatGPT in promoting argumentative literacy, critical thinking, and dialogic engagement. As Guo et al. [35] demonstrated, chatbot-assisted debates can significantly improve students’ structure, reasoning, and motivation in classroom discussions. In this study, ChatGPT provided a low-risk, low-pressure environment in which users could engage in structured argumentative practice. This aligns with broader trends in AI-enhanced education, where intelligent tutors and conversation agents are increasingly used to scaffold learner engagement in critical discourse [4, 36, 44].

Furthermore, the findings suggest that the customizability of prompts and conversational rules makes ChatGPT adaptable to various rhetorical and pedagogical frameworks. Instructors can tailor its use to align with specific learning outcomes, such as rebuttal construction, ethical reasoning, or persuasive writing. This aligns with Dwivedi et al. [33], who advocate for the responsible deployment of generative AI tools in education, emphasizing the importance of instructional design.

Nonetheless, pedagogical integration of ChatGPT must be cautious and critical. The model’s inability to distinguish logically weak but grammatically correct arguments from sound ones poses a risk of reinforcing superficial reasoning. Moreover, its persuasive outputs, while structurally sound, can mask logical fallacies or factual inaccuracies, as pointed out by Gravel et al. [32]. Thus, educators must contextualize its outputs, fostering critical AI literacy among students to discern between persuasive form and argumentative substance.

6.    Conclusions and Future Work

The present study explored the pedagogical potential of ChatGPT in structured rhetorical debates, confirming its ability to assume dialogic and evaluative roles under clearly defined instructional protocols. The chatbot proved capable of generating coherent arguments and engaging in simulated debate contests when prompted with specific interaction rules. These findings align with prior literature on the effectiveness of instructional scaffolds in human–AI communication. Importantly, the simulations were modeled on real-world debate formats widely adopted in educational institutions (e.g., World Schools Debating Championships, British Parliamentary style), rendering the results highly relevant for classroom applications. ChatGPT's responsiveness to such structures supports its integration into rhetorical training environments, whether in formal debate clubs or curriculum-based argumentation activities. Teachers and coaches may leverage it as a practice interlocutor, offering immediate and structured feedback to students with minimal preparation time.

From a pedagogical perspective, ChatGPT's use as an interactive opponent or judge can enhance students' critical thinking, argument construction, and reflexive engagement. Its ability to provide structured opposition and simulate different rhetorical roles offers students a safe and flexible rehearsal environment. Moreover, its metacognitive behaviors, such as acknowledging errors, add a layer of reflective practice, albeit one that must be mediated by human oversight. Nevertheless, several limitations temper the generalizability of these results. This study was restricted to English-language debates and relied on heuristic, qualitative methods. The absence of quantitative metrics and the ad hoc evolution of prompts during testing limit replicability. Furthermore, ChatGPT’s occasional tendency to drift off-topic or produce verbose replies highlights the need for controlled instructional frameworks. These constraints emphasize the importance of continuous supervision and critical interpretation of the chatbot’s outputs, particularly regarding argument quality and logical consistency.

Future research should expand the scope to include multilingual contexts, additional AI models, and diverse instructional settings. Comparative studies between different chatbot platforms could provide valuable insights into their consistency, adaptability, and pedagogical utility. Moreover, the development of a standardized prompt framework would facilitate reproducibility and enable more rigorous assessments of performance. Investigations should also examine how students interact with chatbots in collaborative or adversarial learning situations, and whether such tools contribute to long-term gains in argumentative literacy. Emphasis should be placed on cultivating students' ability to critically evaluate chatbot-generated content, distinguishing persuasive structure from evidential validity. Ultimately, such studies will inform the ethical and pedagogical integration of generative AI into education, reinforcing its role as a supportive—not autonomous—participant in learning environments.