MambaResp-KAN: A State Space Model with Kolmogorov–Arnold Networks and Diffusion-Based Augmentation for Explainable Respiratory Disease Classification

Author: Mohammed Tawfik

Journal: International Journal of Image, Graphics and Signal Processing @ijigsp

Published in: issue 3, vol. 18, 2026.

Free access

Automated respiratory disease classification from auscultation sounds holds transformative potential for early clinical screening, yet existing approaches remain constrained by the quadratic complexity of Transformer-based sequence encoders, the limited expressiveness of conventional multi-layer perceptron classifiers, and the persistent challenge of scarce annotated medical audio data. This paper presents MambaResp-KAN, a novel architecture that unifies Bidirectional Mamba state space models, Kolmogorov–Arnold Network classifiers with learnable B-spline activation functions, multi-modal gated cross-attention fusion of WavLM, BEATs, and handcrafted spectral features, and class-conditional denoising diffusion probabilistic model augmentation into a single end-to-end framework for explainable respiratory sound analysis. The Bidirectional Mamba encoder achieves linear-time sequence modeling through input-dependent selective state space discretization, processing forward and reverse temporal streams with gated aggregation to capture both causal and anti-causal dependencies in respiratory waveforms. The Kolmogorov–Arnold Network classifier replaces fixed-activation neurons with learnable univariate B-spline functions on each network edge, directly grounded in the Kolmogorov–Arnold representation theorem, yielding a classifier that is both more parameter-efficient and intrinsically interpretable than standard multi-layer perceptrons. A gated cross-modal attention mechanism fuses embeddings from the self-supervised WavLM and BEATs audio encoders with handcrafted MFCC and spectral features, while a class-conditional denoising diffusion probabilistic model synthesizes high-fidelity respiratory audio to alleviate class imbalance. Integrated Gradients attribution and KAN concept bottleneck analysis provide clinician-interpretable explanations of model decisions.
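To make the encoder description concrete, the following is a minimal NumPy sketch of a bidirectional selective state space scan with gated aggregation. It is an illustration under stated assumptions, not the paper's implementation: the projection matrices (`W_delta`, `W_B`, `W_C`, `W_gate`), the helper `init_params`, the simplified discretization of the input matrix, and the sigmoid form of the gate are all hypothetical choices made for this example.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the step size positive.
    return np.logaddexp(0.0, x)

def init_params(rng, D, N):
    # Hypothetical initialisation: A is negative so the state decays over time.
    A = -np.exp(rng.normal(size=(D, N)))
    W_delta = rng.normal(size=(D, D)) * 0.1
    W_B = rng.normal(size=(D, N)) * 0.1
    W_C = rng.normal(size=(D, N)) * 0.1
    return A, W_delta, W_B, W_C

def selective_scan(x, A, W_delta, W_B, W_C):
    """Single-direction selective scan over x of shape (T, D).

    One pass over the T time steps, so the cost grows linearly with T.
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                       # hidden state per channel
    y = np.empty((T, D))
    for t in range(T):
        delta = softplus(x[t] @ W_delta)       # (D,) input-dependent step size
        B_t = x[t] @ W_B                       # (N,) input-dependent input matrix
        C_t = x[t] @ W_C                       # (N,) input-dependent output matrix
        A_bar = np.exp(delta[:, None] * A)     # (D, N) discretised state transition
        B_bar = delta[:, None] * B_t[None, :]  # (D, N) simplified (Euler) discretisation
        h = A_bar * h + B_bar * x[t][:, None]
        y[t] = h @ C_t                         # (D,)
    return y

def bidirectional_scan(x, params_fwd, params_bwd, W_gate):
    """Run forward and time-reversed scans and mix them with a sigmoid gate."""
    y_fwd = selective_scan(x, *params_fwd)              # causal stream
    y_bwd = selective_scan(x[::-1], *params_bwd)[::-1]  # anti-causal stream
    gate = 1.0 / (1.0 + np.exp(-(x @ W_gate)))          # (T, D), assumed gate form
    return gate * y_fwd + (1.0 - gate) * y_bwd
```

In this sketch the forward stream at time step t depends only on inputs up to t, while the reversed stream carries information from later steps, so the gated sum sees context on both sides of every frame.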
Evaluated on two benchmark datasets, Asthma Detection V2 (five classes, 1,211 samples) and KAUH (four classes, 940 samples), MambaResp-KAN achieves classification accuracies of 99.6% and 99.4%, respectively, surpassing the prior state-of-the-art E-RespiNet by 0.7 and 0.6 percentage points while using 62% fewer parameters and reducing inference latency by 56.3%. Cross-dataset evaluation yields an average accuracy of 84.0% with a generalization gap of 15.8%, compared to 23.3% for E-RespiNet, confirming improved transferability across clinical institutions.
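The Kolmogorov–Arnold Network classifier described in the abstract places a learnable univariate B-spline on every edge. A minimal NumPy sketch of that idea follows, assuming a uniform knot grid, cubic splines, and a spline-only edge function; the class name `KANLayer`, its parameters, and the input range are hypothetical and are not taken from the paper.

```python
import numpy as np

def bspline_basis(x, grid, k):
    """Evaluate all degree-k B-spline basis functions at points x.

    x: (n,), grid: (G,) knot vector -> array of shape (n, G - k - 1),
    built with the Cox-de Boor recursion.
    """
    x = x[:, None]
    # Degree 0: indicator of the knot interval that contains x.
    B = ((x >= grid[:-1]) & (x < grid[1:])).astype(float)
    for d in range(1, k + 1):
        left = (x - grid[:-(d + 1)]) / (grid[d:-1] - grid[:-(d + 1)]) * B[:, :-1]
        right = (grid[d + 1:] - x) / (grid[d + 1:] - grid[1:-d]) * B[:, 1:]
        B = left + right
    return B

class KANLayer:
    """Each edge (input i -> output j) owns its own spline coefficients."""

    def __init__(self, in_dim, out_dim, num_intervals=5, k=3,
                 x_range=(-1.0, 1.0), seed=0):
        rng = np.random.default_rng(seed)
        h = (x_range[1] - x_range[0]) / num_intervals
        # Uniform knots, extended by k intervals on each side of x_range.
        self.grid = x_range[0] + h * np.arange(-k, num_intervals + k + 1)
        self.k = k
        n_basis = num_intervals + k
        self.coef = rng.normal(scale=0.1, size=(out_dim, in_dim, n_basis))

    def __call__(self, x):
        # x: (batch, in_dim), values expected inside x_range.
        out = np.zeros((x.shape[0], self.coef.shape[0]))
        for i in range(x.shape[1]):
            B = bspline_basis(x[:, i], self.grid, self.k)  # (batch, n_basis)
            out += B @ self.coef[:, i, :].T                # sum of edge functions
        return out
```

Because every output is a plain sum of one-dimensional functions of individual inputs, each learned edge function can be evaluated on a grid and plotted on its own, which is one way to read the abstract's claim of intrinsic interpretability.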


Respiratory disease classification, State space models, Mamba, Kolmogorov–Arnold Networks, B-spline activations, Diffusion augmentation, Explainable AI, Multi-modal fusion, WavLM, BEATs

Short URL: https://sciup.org/15020414

IDR: 15020414   |   DOI: 10.5815/ijigsp.2026.03.10