Can generative AI serve as a transformative educational tool for facial observation training in traditional Chinese medicine? An exploratory study
Author: Medeiros E. F.
Journal: Herald of the International Academy of Sciences. Russian Section @vestnik-rsias
Section: Biomedical sciences
Article in issue: 1, 2025.
Free access
This study investigates the role of Generative Artificial Intelligence (AI) in supporting educational tools for facial observation (望诊, Wàng zhěn), a diagnostic method in Traditional Chinese Medicine (TCM). Rooted in classical texts such as Huangdi Neijing (黄帝内经), the research focuses on creating AI-generated imagery to complement educational resources. These visual tools enrich learning materials, improve accessibility for diverse populations, and address gaps in traditional training methods, enhancing students' ability to understand and recognize diagnostic markers.
Keywords: AI (artificial intelligence), traditional Chinese medicine, facial observation, education, semantic bridging, training
Short URL: https://sciup.org/143184539
IDR: 143184539
Traditional Chinese Medicine (TCM) is deeply rooted in the philosophical concepts of Yin-Yang, the Five Elements, and the flow of Qi. Central to TCM is the diagnostic process, which relies on the Four Diagnostic Methods (四诊法, sì zhěn fǎ): observation (望, Wàng), listening and smelling (闻, Wén), questioning (问, Wèn), and palpation (切, Qiè). Among these, facial observation (望诊, Wàng zhěn) plays a crucial role, allowing practitioners to assess a patient’s internal health by examining external signs such as complexion, facial expressions, and subtle changes in skin texture and color. The face is considered a mirror of the internal organs, with specific areas corresponding to the Heart, Liver, Spleen, Lungs, and Kidneys. Changes in facial color, moisture, and texture can reveal imbalances, such as Blood Deficiency (pale complexion), Heat (reddish hue), or Spleen Qi Deficiency (swelling) [1, 2].
This practice is grounded in centuries of theoretical and clinical knowledge, drawing from classical TCM texts like the Huangdi Neijing (黄帝内经), particularly the Suwen (素问) and Lingshu (灵枢), which emphasize the significance of visual diagnosis in understanding internal health conditions [3].
The Suwen emphasizes the importance of observation in diagnosis, stating: «A good diagnostician observes the color and feels the pulse, distinguishing first between Yin and Yang; assessing purity and turbidity to understand the affected parts» (善诊者, 察色按脉, 先别阴阳; 审清浊, 而知部分) [4]. The Lingshu further elaborates on the relationship between facial features and internal organs, noting: «The face reflects the external, while the five organs are observed internally» (面色察于外, 五脏观于内) [5].
Additionally, the Suwen poetically illustrates the complementary nature of observation and pulse diagnosis, stating: «Color corresponds to the Sun, and the pulse to the Moon. To constantly seek the essential is to understand the relationship between color and pulse» (色以应日, 脉以应月, 常求其要, 则色脉是矣) [6]. Here, «color» refers to the facial complexion, which, along with pulse diagnosis, forms a holistic approach to understanding the patient's condition.
In modern medical practice, non-invasive diagnostic techniques are fundamental. In Intensive Care Units (ICUs), clinicians assess patients by observing skin color, respiratory rate, and mental status. For example, evaluating skin color can reveal conditions like pallor or cyanosis, indicating underlying issues [7].
Similarly, respiratory rate is monitored to detect abnormalities such as tachypnea or bradypnea, which may signal respiratory distress [8]. Mental status assessments, including evaluations of alertness and cognition, are important for identifying alterations that may require immediate attention [9].
Traditional Chinese Medicine (TCM) employs facial observation as a non-invasive diagnostic tool. Practitioners examine facial features, skin color, and expressions to infer internal health conditions. For instance, specific facial regions correspond to internal organs, and changes in these areas can indicate imbalances [10]. However, mastering this technique demands extensive experience and a profound understanding of TCM principles. Students and novice practitioners often find it challenging to discern subtle signs, as traditional educational materials predominantly offer textual descriptions with limited visual support [11].
Integrating advanced visual aids and educational tools can enhance the learning process for TCM students, providing clearer references and aiding in the recognition of diagnostic cues [12]. Advanced technologies, such as AI-generated visual materials, could bridge the gap between textual descriptions and practical applications. These tools offer the potential to train practitioners more effectively, helping them to recognize subtle diagnostic markers with improved accuracy [13]. Further research is in progress to refine such tools and evaluate their impact on the education and practice of Traditional Chinese Medicine [14].
In recent years, Generative Artificial Intelligence (AI) has emerged as a transformative tool in healthcare, particularly in medical education and clinical practice. Artificial intelligence (AI) models can generate high-quality, realistic images from textual descriptions, offering innovative solutions in fields like Traditional Chinese Medicine (TCM), where traditional resources often lack detailed visual aids. These synthetic images accurately depict various medical conditions, including TCM syndromes across different severity levels, thereby enhancing students' ability to recognize and differentiate these syndromes. Additionally, AI-generated visuals address ethical concerns regarding patient privacy by providing a compliant alternative to real patient images while maintaining realism and utility in educational settings [14].
Generative AI also tackles the issue of diversity in TCM education. Traditional resources frequently focus on Asian facial features, leaving students unprepared to diagnose patients from other ethnic backgrounds. AI can generate diverse representations of syndromes, incorporating a wide range of ethnicities, ages, and body types. This inclusivity ensures that students are better equipped to serve diverse patient populations, promoting equitable and effective healthcare practices globally. By addressing gaps in visual resources, ethical concerns, and diversity, Generative AI is reshaping TCM education for a more comprehensive and inclusive future [14].
This exploratory pilot study investigates how Generative AI can serve as a supportive tool for facial observation training in Traditional Chinese Medicine (TCM). The study focuses on creating realistic and diverse synthetic images of TCM syndromes to explore their potential in enhancing educational resources and addressing gaps in TCM training. These images aim to help students improve their ability to recognize and differentiate syndromes, offering a more practical and immersive learning experience. The study also addresses the lack of diversity in traditional TCM materials, which often emphasize a limited range of facial features. By generating synthetic images that represent a wide variety of ethnicities, ages, and body types, the project helps students develop the skills to observe and understand facial characteristics across diverse populations.
Description of the Method and Its Implementation
The challenge of generating accurate images in Stable Diffusion (SD) and ensuring precise analysis by a Large Language Model (LLM) lies in translating complex TCM concepts into universal visual terms. For example, SD cannot directly interpret terms like «Phlegm obstructing the Heart». To address this issue, the Semantic Bridging technique breaks abstract TCM terminology down into detailed, structured visual descriptions. These descriptions are designed to be rendered accurately by SD while enabling LLMs to interpret the images and associate them with their corresponding syndromes. This approach is informed by methodologies explored in recent academic studies, such as Zhao et al. (2024), which integrate diverse pre-trained language and vision models to enhance text-to-image generation. However, the Semantic Bridging technique was tailored to the simplified and exploratory nature of this study, providing a practical and straightforward method for generating meaningful and visually accurate representations of TCM syndromes.
Within this framework, TCM syndromes are translated into detailed visual descriptions that SD can interpret. For instance, the syndrome Phlegm Misting the Heart (痰迷心窍, Tán Mí Xīn Qiào) is characterized by the obstruction of the heart orifices by phlegm, leading to mental and physical symptoms. In Chinese, this syndrome is described as:
痰迷心窍是指痰浊蒙闭心窍所表现的征候。多由温浊内留，久而化痰，或情志不畅，郁而生痰引起。主要临床表现包括：神志异常：意识模糊、精神抑郁、举止失常、喃喃自语。身体症状：喉间痰鸣、胸闷作恶、舌苔白腻、脉滑。严重症状：卒然昏倒、不省人事、口吐痰涎、手足抽搐、两目上视、口中如猪羊叫声。
In English, the clinical manifestations fall into three categories: mental symptoms, such as mental confusion, depression, incoherent speech, and abnormal behavior; physical symptoms, such as rattling sounds in the throat, chest oppression, a thick white tongue coating, and a slippery pulse; and severe symptoms, such as sudden unconsciousness, vomiting of phlegm, convulsions, an upward gaze, and sounds resembling animal cries.
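As an illustration only, the categorized description above could be held in a small data structure before it is turned into visual prompts. The sketch below is in Python; the class name `SyndromeDescription`, its fields, and the category keys are hypothetical and are not taken from the study, which does not describe its data format.

```python
from dataclasses import dataclass, field


@dataclass
class SyndromeDescription:
    """Hypothetical container for one syndrome's categorized manifestations."""
    name_en: str
    name_zh: str
    pinyin: str
    # Category name -> list of manifestations, as listed in the text above.
    manifestations: dict[str, list[str]] = field(default_factory=dict)

    def flat_manifestations(self) -> list[str]:
        """Return all manifestations in category order as one flat list."""
        return [item for items in self.manifestations.values() for item in items]


PHLEGM_MISTING_THE_HEART = SyndromeDescription(
    name_en="Phlegm Misting the Heart",
    name_zh="痰迷心窍",
    pinyin="Tán Mí Xīn Qiào",
    manifestations={
        "mental": [
            "mental confusion", "depression",
            "incoherent speech", "abnormal behavior",
        ],
        "physical": [
            "rattling sounds in the throat", "chest oppression",
            "thick white tongue coating", "slippery pulse",
        ],
        "severe": [
            "sudden unconsciousness", "vomiting of phlegm", "convulsions",
            "upward gaze", "sounds resembling animal cries",
        ],
    },
)
```

Keeping the categories separate makes it straightforward to decide, category by category, which manifestations can plausibly be shown in a facial image and which (for example, the pulse quality) cannot.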
Symptom Scaling (Weighting) in Image Generation
In this study, the scaling (or weighting) of symptoms plays a critical role in the generation of photorealistic images using Generative AI models, particularly RealVisXL v4.0 Lightning. This scaling system is designed to assign numerical values to specific clinical features, reflecting their relative prominence or severity within a given TCM syndrome. The indices used, such as 1.2, 1.5, and 1.8, represent the intensity or dominance of particular symptoms in the overall clinical presentation.
Rationale for Scaling
The rationale behind this approach is rooted in the need to guide the AI model in accurately emphasizing certain visual characteristics over others. For instance, in the case of Phlegm Misting the Heart (痰迷心窍, Tán Mí Xīn Qiào), mental confusion is a hallmark symptom and thus requires a higher visual emphasis compared to more subtle features like mild facial pallor. By applying weighted indices, the AI can generate images that reflect the nuanced spectrum of symptom severity observed in clinical practice.
Explanation of Weighting Indices
• 1.0 (Baseline): Represents a neutral or standard level of emphasis. Symptoms with this weight are considered present but not particularly dominant.
• 1.2–1.4 (Mild Emphasis): Indicates a slight prominence of the symptom. For example, a slightly down-turned mouth or subtle facial swelling might be weighted at 1.2 to reflect a mild presentation.
• 1.5–1.7 (Moderate Emphasis): Denotes moderate prominence. Symptoms with this weighting, such as a vacant gaze or noticeable facial tension, are more visually pronounced and central to the syndrome’s depiction.
• 1.8–2.0 (High Emphasis): Signifies symptoms that are critical or defining features of the syndrome. Mental fog and confusion, when severe, may be weighted at 1.8 or higher to ensure these characteristics dominate the visual output.
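The weighting bands listed above can be expressed as a simple lookup. The following Python sketch is illustrative rather than part of the study's tooling: the function name `emphasis_band` is hypothetical, and the handling of values that fall between the listed bands (for example 1.1 or 1.45, which the list does not cover) is an assumption that assigns them to the lower band.

```python
def emphasis_band(weight: float) -> str:
    """Map a prompt weight to the emphasis band described in the list above.

    Assumption: weights between the listed ranges (e.g. 1.1) fall into the
    lower band; the source does not specify how such values are treated.
    """
    if not 1.0 <= weight <= 2.0:
        raise ValueError(f"weight {weight} is outside the 1.0-2.0 range")
    if weight >= 1.8:
        return "high"
    if weight >= 1.5:
        return "moderate"
    if weight >= 1.2:
        return "mild"
    return "baseline"
```

A check of this kind can be run over a prompt's weights before generation, for instance to confirm that a syndrome's hallmark symptom actually carries the highest band.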
Application in Image Generation
When crafting prompts for image generation, these weights are embedded within descriptive tags to instruct the AI model on how strongly to emphasize each feature. For example:
«A 45-year-old Asian man with (confused and incoherent expression:1.8), (heavy-lidded eyes:1.7), and (slightly pale, swollen cheeks:1.4). The posture is (mildly slouched:1.3), reflecting (mental fog:1.8) and (lack of clarity:1.7)».
This structured approach ensures that the generated images accurately represent the clinical variability seen in TCM syndromes, enhancing the educational value of the visual materials.
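To make the tag notation concrete, the Python sketch below formats and parses the parenthesized description-and-weight tags used in the example prompt. The helper names `weighted_tag` and `parse_weighted_tags` are hypothetical; the sketch only handles the text of the prompt and makes no claim about how a particular Stable Diffusion front end interprets the weights.

```python
import re

# Matches one "(description:weight)" tag, e.g. "(mental fog:1.8)".
TAG_PATTERN = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def weighted_tag(description: str, weight: float) -> str:
    """Format one visual feature as a "(description:weight)" tag."""
    return f"({description}:{weight:.1f})"


def parse_weighted_tags(prompt: str) -> list[tuple[str, float]]:
    """Extract (description, weight) pairs in order of appearance."""
    return [
        (text.strip(), float(weight))
        for text, weight in TAG_PATTERN.findall(prompt)
    ]


# The example prompt from the text, written as a Python string.
EXAMPLE_PROMPT = (
    "A 45-year-old Asian man with (confused and incoherent expression:1.8), "
    "(heavy-lidded eyes:1.7), and (slightly pale, swollen cheeks:1.4). "
    "The posture is (mildly slouched:1.3), reflecting (mental fog:1.8) "
    "and (lack of clarity:1.7)."
)

tags = parse_weighted_tags(EXAMPLE_PROMPT)
```

Parsing a prompt back into pairs makes it possible to review the weights of a whole prompt set in tabular form, for example to compare how the same feature is weighted across mild, moderate, and severe presentations.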
Implementation Using RealVisXL v4.0 Lightning
Once the prompts were fully optimized using this weighted approach, the next stage involved generating the actual images. Following preliminary tests with other models, the study transitioned to utilizing RealVisXL v4.0 Lightning, a specialized text-to-image AI model based on Stable Diffusion. RealVisXL was selected for its proven capability to convert detailed textual descriptions into photorealistic representations of individuals exhibiting TCM syndromes, successfully addressing limitations encountered in earlier trials.
The model generated high-fidelity images with satisfactory precision, aligning with the exploratory nature of this study and the computational resources available. This methodology highlights the study's goal of exploring image generation within practical limits, providing a foundation for future research using advanced modular tools to create even more detailed and accurate visuals.
Discussion of the Results
Pathophysiological Overview of the Syndrome
Phlegm Misting the Heart (痰迷心窍, Tán Mí Xīn Qiào) represents a complex manifestation within Traditional Chinese Medicine (TCM), characterized by a profound disruption in the integrative functions of the mind (神, shén) and sensory orifices. This disruption arises from the pathological accumulation of turbid phlegm obstructing the heart orifices, thereby impeding the free flow of Qi and the clarity of consciousness. Clinically, this syndrome manifests through disturbances in awareness, perception, and emotional regulation, consistent with TCM's understanding of the heart-mind connection. While certain symptoms may align with what modern medicine classifies as neurocognitive impairments, the TCM framework remains the primary lens for interpretation, with Western neuropsychiatric concepts serving as a complementary reference for comparative analysis.
Objective of the Visual Representations
The visual representations are designed to clearly show the progression of Phlegm Misting the Heart. They offer a continuous display of symptoms that helps identify gradual cognitive and emotional changes. Their goal is to improve diagnostic accuracy among students and practitioners by showing the shift from minor mental detachment to more noticeable disruptions in consciousness. A common feature across all stages is a disconnection from external reality, seen in vacant eyes and reduced emotional expression.
Symptomatological Stratification
Additional images, including detailed prompts and a tutorial, can be found at
Mild Presentations

The images depicting the mild presentation capture early-stage manifestations characterized by nuanced signs of mental disengagement. Observable features include transient episodes of disorientation, minimal facial asymmetry, and a subtly diminished expressiveness. The gaze, although occasionally unfocused, retains intermittent responsiveness, accompanied by a faint emotional flattening. These visuals are intended to sensitize observers to the insidious onset of mental dissociation, often overlooked in clinical evaluations.
Moderate Presentations

The moderate presentation shows a clear increase in cognitive and emotional impairment. It features a consistently vacant gaze toward unclear points, a slower blink rate, and hypomimia with reduced emotional and sensory response. The first image was created to display signs of distress and stress. It is intended for use in clinical training, to show how early, unnoticed symptoms can interfere with daily activities and to reveal the psychological burden on patients.
Severe Presentations

The severe presentation delineates a state of profound internal withdrawal, characterized by an almost complete severance of cognitive and sensory integration. The gaze is devoid of recognition or engagement, epitomizing extreme internal withdrawal, a diagnostic hallmark consistent across all severity levels in TCM. Facial features display pronounced hypotonia, with heavy-lidded eyes, downward-drawn oral commissures, and an expressionless visage, collectively indicative of advanced mental stagnation. Concomitant signs include sudden episodes of unconsciousness, convulsions, and incoherent vocalizations. Some features may resemble neuropsychiatric conditions from a Western perspective, but in TCM, they are seen as manifestations of phlegm obstructing the heart-mind pathways.
Pedagogical Relevance of the Visual Representations
The curated images are practical tools to show the gradual intensification of Phlegm Misting the Heart. They illustrate the progression of mental and emotional decline, from subtle disconnection to severe impairment of consciousness. By visually mapping these changes, the representations aim to strengthen the diagnostic skills of TCM practitioners, expanding their understanding of syndrome differentiation and the interpretation of clinical signs within the TCM framework. Western medical concepts are used as comparative references, not as definitive classifications.
Semantic Bridging helps translate complex TCM terminology into accurate visual representations. Techniques like semantic alignment through Stable Diffusion models and advanced prompt encoding strategies produce photorealistic visuals that reflect TCM theory [19–21]. This approach supports human expertise in diagnostic methods like facial observation (望诊, Wàng zhěn), positioning AI as an educational aid rather than a replacement for clinical judgment.
Perspectives of the Study
Traditional textual materials present challenges for students, as abstract descriptions can be difficult to apply in practice. Realistic and diverse visual representations are introduced to enrich educational resources, strengthen comprehension of diagnostic concepts, and accelerate skill development in TCM training.
However, challenges related to cultural, ethnic, and gender biases were observed during the image generation process. Some outputs reflected stereotypes or inadequate representation, highlighting the need for careful evaluation before their integration into educational materials. Addressing these limitations requires the implementation of bias evaluation protocols and the inclusion of clear disclaimers to contextualize the potential shortcomings of AI-generated visuals.
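One small, mechanical component of such a bias evaluation protocol could be a coverage check over the metadata of the generated image set. The Python sketch below is an assumption-laden illustration, not the study's protocol: the function name `coverage_gaps`, the attribute names, and the sample records are all hypothetical, and a check of this kind only reveals missing demographic combinations. It cannot detect stereotyped depictions, which still require human review.

```python
from collections import Counter
from itertools import product


def coverage_gaps(records, attributes):
    """Return attribute-value combinations that no record covers.

    records: list of dicts, one per generated image (hypothetical metadata).
    attributes: dict mapping attribute name -> list of expected values.
    """
    names = list(attributes)
    seen = Counter(tuple(record[name] for name in names) for record in records)
    expected = product(*(attributes[name] for name in names))
    return [combo for combo in expected if seen[combo] == 0]


# Hypothetical example: two attributes and two generated images.
attributes = {
    "age_group": ["young", "middle-aged", "older"],
    "sex": ["female", "male"],
}
records = [
    {"age_group": "middle-aged", "sex": "male"},
    {"age_group": "young", "sex": "female"},
]
gaps = coverage_gaps(records, attributes)
```

Listing the uncovered combinations gives reviewers a concrete starting point for deciding which additional images to generate before the set is used in teaching materials.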