Journal articles - International Journal of Image, Graphics and Signal Processing

All articles: 1110

Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: An Integrative Review

Navneet Upadhyay, Abhijit Karmakar

Research article

The spectral subtraction method is a classical approach for enhancing speech degraded by additive background noise. Its basic principle is to estimate the short-time spectral magnitude of speech by subtracting an estimated noise spectrum from the noisy speech spectrum. The same result can be achieved by multiplying the noisy speech spectrum by a gain function and then combining it with the phase of the noisy speech. Besides reducing the background noise, the method introduces an annoying, perceptible tonal artifact in the enhanced speech, known as remnant musical noise, which affects human listening. Several variations and implementations have been adopted in past decades to address the limitations of spectral subtraction. These variations constitute a family of subtractive-type algorithms and operate in the frequency domain. The objective of this paper is to provide an extensive overview of spectral subtractive-type algorithms for enhancement of noisy speech. The paper concludes by suggesting a future direction for speech enhancement research from the spectral subtraction perspective.
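The principle stated in this abstract can be illustrated with a minimal sketch for a single short-time frame. It is not taken from the paper: the function names and the `floor` parameter (a spectral floor that prevents negative magnitudes) are assumptions made for illustration. The sketch shows both views mentioned in the abstract, direct magnitude subtraction and the equivalent gain function, each reusing the phase of the noisy speech.

```python
import cmath


def spectral_subtract(noisy_spectrum, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude from each bin of one frame.

    noisy_spectrum: complex spectral bins of the noisy frame.
    noise_magnitude: estimated noise magnitude per bin.
    floor: hypothetical spectral floor, as a fraction of the noisy magnitude.
    """
    enhanced = []
    for y, n in zip(noisy_spectrum, noise_magnitude):
        magnitude = abs(y)
        phase = cmath.phase(y)  # the noisy phase is reused unchanged
        clean_magnitude = max(magnitude - n, floor * magnitude)
        enhanced.append(cmath.rect(clean_magnitude, phase))
    return enhanced


def subtraction_gain(noisy_spectrum, noise_magnitude, floor=0.0):
    """Return the per-bin gain that yields the same result as subtraction."""
    gains = []
    for y, n in zip(noisy_spectrum, noise_magnitude):
        magnitude = abs(y)
        gains.append(max(1.0 - n / magnitude, floor) if magnitude > 0 else 0.0)
    return gains


# Illustrative values only: three bins of a noisy frame and a noise estimate.
frame = [3 + 4j, 1 + 0j, 0 + 2j]
noise = [1.0, 2.0, 0.5]
enhanced = spectral_subtract(frame, noise)
gains = subtraction_gain(frame, noise)
```

Multiplying each noisy bin by its gain gives the same enhanced bin as the direct subtraction. Bins where the noise estimate exceeds the noisy magnitude are clipped to the floor, and isolated bins that survive this clipping are one commonly cited source of the musical noise the abstract describes.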

Free

Spectral and Time Based Assessment of Meditative Heart Rate Signals

Ateke Goshvarpour, Mousa Shamsi, Atefeh Goshvarpour

Research article

The objective of this article was to study the effects of Chi meditation on heart rate variability (HRV). For this purpose, statistical and spectral measures of HRV derived from the RR intervals were analyzed. In addition, the article is concerned with finding adequate Auto-Regressive Moving Average (ARMA) model orders for spectral analysis of the time series formed from RR intervals. Akaike's Final Prediction Error (FPE) was therefore taken as the basis for choosing the model order. The results showed that, overall, the model order chosen most frequently by FPE was p = 8 before meditation and p = 5 during meditation. The results suggest that the variety of orders in HRV models across different psychological states could be due to differences in intrinsic properties of the system.
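As a rough illustration of order selection by FPE, the sketch below uses the standard definition FPE = V (1 + d/N) / (1 - d/N), where V is the residual variance of the fitted model, d the number of estimated parameters and N the number of samples. The function names and the residual variances are hypothetical and are not taken from the article.

```python
def final_prediction_error(loss, n_params, n_samples):
    """Akaike's FPE for a model with n_params estimated parameters."""
    if not 0 <= n_params < n_samples:
        raise ValueError("n_params must be non-negative and below n_samples")
    ratio = n_params / n_samples
    return loss * (1 + ratio) / (1 - ratio)


def select_order(loss_by_order, n_samples):
    """Return the candidate order with the smallest FPE."""
    return min(
        loss_by_order,
        key=lambda d: final_prediction_error(loss_by_order[d], d, n_samples),
    )


# Hypothetical residual variances for four candidate orders, N = 100 samples.
losses = {2: 1.00, 5: 0.80, 8: 0.79, 12: 0.789}
best = select_order(losses, n_samples=100)
```

The penalty factor grows with the number of parameters, so an order is preferred only if it lowers the residual variance enough to offset the added complexity.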

Free

Speech Emotion Recognition based on SVM as Both Feature Selector and Classifier

Amirreza Shirani, Ahmad Reza Naghsh Nilchi

Research article

The aim of this paper is to use the Support Vector Machine (SVM) as both a feature selection and a classification technique for audio signals to identify human emotional states. One of the major bottlenecks of common speech emotion recognition techniques is the use of a huge number of features per utterance, which can significantly slow down the learning process and may cause the problem known as "the curse of dimensionality". To ease this challenge, this paper aims to achieve a high-accuracy system with a minimal set of features. The proposed model uses two methods for feature dimensionality reduction, namely "SVM feature selection" and the common "Correlation-based Feature Subset Selection (CFS)". In addition, two different classifiers, a Support Vector Machine and a Neural Network, are separately adopted to identify six emotional states: anger, disgust, fear, happiness, sadness and neutral. The method has been verified using Persian (Persian ESD) and German (EMO-DB) emotional speech databases, and it yields high recognition rates on both. The results show that the SVM feature selection method provides better emotional speech recognition performance than CFS and the baseline feature set. Moreover, the new system achieves a recognition rate of 99.44% on the Persian ESD and 87.21% on the Berlin Emotion Database for speaker-dependent classification. A promising result (76.12%) is also obtained for the speaker-independent case, which is among the best-known accuracies reported on the mentioned database relative to its small number of features.

Free

Speech Enhancement Using Joint Time and DCT Processing for Real Time Applications

Ravi Kumar Kandagatla, V. Jayachandra Naidu, P.S. Sreenivasa Reddy, Sivaprasad Nandyala

Research article

Deep learning based speech enhancement approaches provide better perceptual quality and better intelligibility. However, most of the speech enhancement methods available in the literature estimate the enhanced speech using processed amplitude, energy, MFCC spectrum, etc., along with the noisy phase. Because it is difficult to estimate the clean speech phase from noisy speech, the noisy phase is still used in reconstructing the enhanced speech. Some methods have been developed for estimating the clean speech phase, but the estimation is observed to be complex. To avoid this difficulty and improve performance, convolutional neural networks based on the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST), rather than the Discrete Fourier Transform (DFT), are proposed for better intelligibility. However, these algorithms work on either time domain features or frequency domain features. To gain the advantages of both domains, a fusion of the DCT and time domain approaches is proposed here. In this work, the DCT Dense Convolutional Recurrent Network (DCTDCRN), DST Convolutional Gated Recurrent Neural Network (DSTCGRU), DST Convolution Long Short Term Memory (DSTCLSTM) and DST Dense Convolutional Recurrent Network (DSTDCRN) are proposed for speech enhancement. These methods provide superior performance and less processing difficulty compared to state-of-the-art methods. The proposed DCT based methods are further used to develop a joint time and magnitude based speech enhancement method. Simulation results show that joint time and frequency based processing outperforms the baseline methods. The results are also analyzed using objective performance measures such as Signal to Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI).

Free

Speech Enhancement based on Wavelet Thresholding the Multitaper Spectrum Combined with Noise Estimation Algorithm

P.Sunitha, K.Satya Prasad

Research article

This paper presents a method to reduce the musical noise encountered with most frequency domain speech enhancement algorithms. Musical noise is a phenomenon that occurs due to random spectral peaks in each speech frame, caused by the large variance and inaccurate estimates of the spectra of the noisy speech and noise signals. To obtain a low-variance spectral estimate, this paper uses a method based on wavelet thresholding of the multitaper spectrum combined with a noise estimation algorithm, which estimates the noise spectrum from a spectral average of past and present frames according to a predetermined weighting factor in order to reduce the musical noise. To evaluate the performance of this method, sine multitapers were used and the spectral coefficients were thresholded using wavelet thresholding to obtain a low-variance spectrum. Both scale-dependent and scale-independent thresholding, with soft and hard thresholding using Daubechies wavelets, were used to evaluate the proposed method in terms of objective quality measures under eight different types of real-world noise at three input SNR levels. To predict the speech quality in the presence of noise, objective quality measures such as Segmental SNR, Weighted Spectral Slope Distance, Log Likelihood Ratio, Perceptual Evaluation of Speech Quality (PESQ) and composite measures are compared against wavelet de-noising techniques, Spectral Subtraction and Multiband Spectral Subtraction; the proposed method provides consistent performance across all eight noise types in most of the cases considered.
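Two building blocks named in this abstract, soft and hard thresholding and a weighted average of past and present noise spectra, can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' implementation: the function names, the threshold value and the weighting factor are hypothetical, and the wavelet transform and multitaper estimation steps are omitted.

```python
def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the others toward zero by t."""
    out = []
    for c in coeffs:
        magnitude = abs(c) - t
        if magnitude <= 0:
            out.append(0.0)
        else:
            out.append(magnitude if c > 0 else -magnitude)
    return out


def update_noise_estimate(previous, current, weight):
    """Weighted average of the past estimate and the present spectrum.

    weight: hypothetical weighting factor in [0, 1] applied to the past.
    """
    return [weight * p + (1.0 - weight) * c for p, c in zip(previous, current)]


# Illustrative coefficients and spectra only.
coeffs = [3.0, -0.5, -2.0, 1.0]
hard = hard_threshold(coeffs, 1.0)
soft = soft_threshold(coeffs, 1.0)
noise = update_noise_estimate([1.0, 2.0], [3.0, 0.0], weight=0.75)
```

Hard thresholding leaves the surviving coefficients unchanged, while soft thresholding also reduces their magnitude by the threshold, which generally gives a smoother result at the cost of some bias.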

Free

Speech Enhancement through Implementation of Adaptive Noise Canceller Using FHEDS Adaptive Algorithm

Ch.D.Umasankar, M. Satya Sai Ram

Research article

Speech analysis is the modelling and estimation of different speech characteristics, whose importance depends on the criteria set by real-time applications. Speech enhancement is one such analysis stage. This paper compares the performance of our proposed Fast Hybrid Euclidean Direction Search (FHEDS) algorithm with other adaptive algorithms such as the NHP and FEDS algorithms. These algorithms have been tested for adaptive noise cancellation of speech signals corrupted by different noises such as Babble, Factory, Destroyer Engine, Car, Fire Engine and Train noise. The design criteria and the current limits of the database and its analysis have been considered in each design phase together with the noise model, which improves performance. The comparison factors have been tabulated for each noise type and clean speech data using the proposed filter operation. The proposed model effectively reduces the noise, achieving better speech enhancement, and attains a higher Signal-to-Noise Ratio (SNR) than traditional models.

Free

Speech Feature Extraction for Gender Recognition

Anjali Pahwa, Gaurav Aggarwal

Research article

Speech recognition technology can be embedded in various real-time applications to improve human-computer interaction. From robotics to health care and aerospace, from interactive voice response systems to mobile telephony and telematics, speech recognition technology has enhanced human-machine interaction. Gender recognition is an important component of applications embedding speech recognition, as it reduces the computational complexity of further processing in these applications. The paper involves the extraction of one of the most dominant and most researched speech features, Mel coefficients, and their first and second order derivatives. We extracted 13 values for each of these from a data set of 46 speech samples containing the Hindi vowels (आ, इ, ई, उ, ऊ, ऋ, ए, ऎ, ऒ, ऑ) and trained a combined model of SVM and neural network classification, using stacking, to determine the speaker's gender. The results showed an accuracy of 93.48% after taking the first Mel coefficient into consideration. The purpose of this study was to extract the correct features and to compare performance based on the first Mel coefficient.

Free

Spliced image classification and tampered region localization using local directional pattern

Surbhi Sharma, Umesh Ghanekar

Research article

In this paper the authors propose a spliced image detection algorithm based on the Local Directional Pattern (LDP). Many splicing detection techniques either classify spliced images against authentic images or localize the spliced region, whereas the proposed algorithm can both classify the image and localize the spliced region. First, the original image (RGB color space) is converted to the YCbCr color space. The histogram of the LDP of the chrominance component of the suspect image is used for classification. For localization of the spliced region, the chrominance component of the input image is divided into overlapping blocks, and the LDP of each block is calculated. The standard deviation of each block is used as a clue to visualize the spliced region. The experimental results are reported in terms of accuracy, specificity (true negative rate), sensitivity (true positive rate) and error rate, and they demonstrate the effectiveness of the proposed algorithm. The accuracy of the proposed algorithm is 98.55%. The algorithm is also robust against post-splicing image processing operations such as Gaussian blur, additive white Gaussian noise, JPEG compression and scaling, whereas previous techniques have not considered these experimental conditions.

Free

Stabilogram mPCA Decomposition and Effects Analysis of Several Entries on The Postural Stability

Dhouha MAATAR, Zied LACHIRI, Régis FOURNIER, Amine NAIT-ALI

Research article

This paper presents an analysis of the stabilogram using the modified Principal Component Analysis (mPCA) decomposition, which is employed to highlight the effects of different factors on human postural stability. The aim of this study is to analyze stabilogram center-of-pressure time series using the mPCA decomposition method. mPCA is a decomposition method applied to a complex signal. It decomposes the stabilogram, considered as an additive model, into three components: trend, rambling and trembling. The study of the trace of the analytic trembling (respectively, rambling) in the complex plane highlights a unique rotation center. The phase is thus defined and two parameters are extracted: the area of the circle in which 95% of the trace's data points are located, and the angular frequency. In this study, 25 healthy volunteers (average age 31 ± 11 years) were required to stand upright on an electromagnetic platform with eyes either closed or open and with feet either apart or together. Experimental results show the efficiency of the area parameter in identifying the effect of visual, proprioceptive and directional inputs on postural stability. These results discriminate between the control and young groups and indicate a less well-controlled posture for control subjects (34.5 ± 7.5 years) relative to young subjects (22.5 ± 2.5 years). The results also show that female subjects are more stable than males, that fat subjects are more stable than thin ones and that tall subjects are more stable than short ones.

Free

Statistical Image Classification for Image Steganographic Techniques

Seyyed Amin Seyyedi, Nick Ivanov

Research article

Steganography is the method of information hiding. Free selection of the cover image is a particular advantage of steganography over other information hiding techniques. The performance of a steganographic system can be improved by selecting a suitable cover image. This article presents a two-level unsupervised image classification algorithm based on statistical characteristics of the image, which helps the sender make a reasonable selection of cover image to enhance the performance of the steganographic method for a specific purpose. Experiments demonstrate the effect of classification in satisfying steganography requirements.

Free

Statistical Texture Features Based Automatic Detection and Classification of Diabetic Retinopathy

Md. Rahat Khan, A. S. M. Shafi

Research article

Diabetes is a globally prevalent disease that can cause microvascular complications such as Diabetic Retinopathy (DR) in the human eye, and it can become a significant cause of visual impairment. The present study aimed to develop an automatic detection and classification system for diagnosing diabetic retinopathy from digital fundus images. An automated diabetic retinopathy detection and classification system based on retinal images is proposed to reduce the workload of ophthalmologists. This work comprises three main stages. The proposed method first extracts the blood vessels from the color fundus image. Secondly, the method determines whether the input image is normal or shows diabetic retinopathy, and then applies an automatic diabetic retinopathy classification technique based on statistical texture features. It uses the Gray Level Co-occurrence Matrix (GLCM) and Gray Level Run Length Matrix (GLRLM) to extract second-order and higher-order statistical texture features, which are fed into three well-known classifiers, namely K-Nearest Neighbor (KNN), Random Forest (RF) and Support Vector Machine (SVM). The evaluation results on a dataset of 644 retinal images indicate that the proposed method based on the random forest classifier is effective, with a weighted sensitivity, precision, F1-score and accuracy of 95.53%, 96.45%, 95.38% and 95.19% respectively for the detection and classification of diabetic retinopathy. These outcomes suggest that the method could decrease the cost of screening and diagnosis while achieving higher than suggested performance, and that the system could be implemented in clinical assessments requiring better evaluation.

Free

Steganography Based on Integer Wavelet Transform and Bicubic Interpolation

N. Ajeeshvali, B.Rajasekhar

Research article

Steganography is the art and science of hiding information in unremarkable cover media so as not to arouse any suspicion. It is an application within the information security field and, as such, is characterized by a set of measures that rely on strengths and by counter attacks that are caused by weaknesses and vulnerabilities. The aim of this paper is to propose a modified high-capacity image steganography technique based on the integer wavelet transform, with acceptable levels of imperceptibility and distortion in the cover image used as the medium file and with high levels of security. Bicubic interpolation causes overshoot, which increases acutance (apparent sharpness). The bicubic algorithm is frequently used for scaling images and video for display. It preserves fine details of the image better than the common bilinear algorithm.

Free

Stochastic Characterization of a MEMs based Inertial Navigation Sensor using Interval Methods

Subhra Kanti Das, Dibyendu Pal, Virendra Kumar, S. Nandy, Kumardeb Banerjee, Chandan Mazumdar

Research article

The aim here is to introduce the effectiveness of interval methods in analyzing dynamic uncertainties of marine navigational sensors. The present work has been carried out with an integrated sensor suite consisting of a low-cost MEMS inertial sensor, a GPS receiver of moderate accuracy, a Doppler velocity profiler and a magnetic fluxgate compass. Error bounds for all the sensors have been translated into guaranteed intervals. GPS based position intervals are fed into a forward-backward propagation method in order to estimate interval-valued inertial data. Dynamic noise margins are finally computed from comparisons between the estimated and measured inertial quantities. It has been found that the intervals estimated by the proposed approach are supersets of the 95% confidence levels of the dynamic errors of accelerations. This indicates a significant drift of dynamic error in accelerations, which may not be clearly defined using stationary error bounds. On the other hand, the bounds of non-stationary error for the rate gyroscope are found to be consistent with the intervals predicted using stationary noise coefficients. The guaranteed intervals estimated by the proposed forward-backward contractor are close to the 95% confidence levels of stationary errors computed over the sampling period.

Free

Studies on Texture Segmentation Using D-Dimensional Generalized Gaussian Distribution integrated with Hierarchical Clustering

K. Naveen Kumar, K. Srinivasa Rao, Y.Srinivas, Ch. Satyanarayana

Research article

Texture deals with the visual properties of an image. Texture analysis plays a dominant role in image segmentation. In texture segmentation, model-based methods are superior to model-free methods. This paper addresses the application of a multivariate generalized Gaussian mixture probability model, integrated with hierarchical clustering, for segmenting the texture of an image. Here the feature vector associated with the texture is derived from the DCT coefficients of the image blocks. The model parameters are estimated using the EM algorithm. The model parameters are initialized through the hierarchical clustering algorithm and the moment method of estimation. The texture segmentation algorithm is developed using component maximum likelihood under a Bayesian framework. The performance of the proposed algorithm is evaluated through experimentation on five image textures selected randomly from the Brodatz texture database. Texture segmentation performance measures such as GCE, PRI and VOI reveal that this method outperforms the existing texture segmentation methods based on the Gaussian mixture model. This is also supported by computing the confusion matrix, accuracy, specificity, sensitivity and F-measure.

Free

Study for License Plate Detection

Mie Mie Aung, Phyu Phyu Khaing, Myint San

Research article

A License Plate Detection (LPD) system is an application of computer vision and image processing technology. The LPD system is the first and main step of a License Plate Recognition (LPR) system, so it acts as the main driver of the LPR system. The license plate detection step is always performed before the license plate recognition step. An LPD system takes vehicle images as input, follows general steps such as preprocessing, localization, region extraction and region detection, and outputs the detected image. There are many algorithms for LPD, yet detecting a license plate under different conditions is still a complex task. Morphological operations and deep learning models are the most commonly used approaches for LPD systems. This paper presents a critical study of license plate detection systems and also examines the implementation of new technologies in them.

Free

Study of Noise Detection and Noise Removal Techniques in Medical Images

Bhausaheb Shinde, Dnyandeo Mhaske, A.R. Dani

Research article

In this work we took different medical images, such as MRI, cancer, X-ray and brain images, and calculated the standard deviation and mean of each. We detected salt-and-pepper noise and then applied the median filtering technique to remove it. After removing the noise with median filtering, the standard deviation and mean were evaluated again. This experimental analysis will improve the accuracy of MRI, cancer, X-ray and brain images for easier diagnosis. The results we have achieved are useful and can help general medical practitioners analyze the symptoms of patients.
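The workflow described in this abstract (measure mean and standard deviation, apply a median filter, measure again) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the 3x3 window size, the border handling and the tiny synthetic image are assumptions.

```python
from statistics import mean, pstdev


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Border pixels use only the neighbors that exist; for such even-sized
    windows the upper of the two middle values is taken.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            )
            out[r][c] = window[len(window) // 2]
    return out


def image_stats(image):
    """Return (mean, population standard deviation) of all pixels."""
    pixels = [p for row in image for p in row]
    return mean(pixels), pstdev(pixels)


# Synthetic 5x5 image: uniform gray with one "salt" and one "pepper" pixel.
noisy = [[100] * 5 for _ in range(5)]
noisy[1][1] = 255
noisy[3][3] = 0

before = image_stats(noisy)
filtered = median_filter_3x3(noisy)
after = image_stats(filtered)
```

Because salt-and-pepper noise produces isolated extreme values, the median of a neighborhood tends to ignore them, which is why the standard deviation drops after filtering in this toy case.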

Free

Study of segmentation techniques for assessment of osteoarthritis in knee X-ray images

Shivanand S. Gornale, Pooja U. Patravali, Archana M. Uppin, Prakash S. Hiremath

Research article

Arthritis is a chronic joint disorder that affects many people, including middle-aged and older age groups. Arthritis exists in many forms, one of which is Osteoarthritis. Osteoarthritis affects the larger joints such as the knee, hip, spine and feet. Early detection of Osteoarthritis is essential, because if it is not treated properly it may result in deformity. Researchers have become increasingly concerned with detecting the disorder at an early stage by combining medical knowledge with a machine vision approach. The objective of this work is to study various segmentation techniques for the early detection of Osteoarthritis. Different segmentation techniques, namely Sobel and Prewitt edge segmentation, Otsu's method of segmentation and texture-based segmentation, are used in the experiments. Different statistical features are computed, analyzed and classified. Accuracy rates of 91.16% for the Sobel method, 96.80% for Otsu's method, 94.92% for the texture method and 97.55% for the Prewitt method are obtained. The results are promising and competitive, and they have been validated by medical experts.

Free

Study on Diesel Engine Fault Diagnosis Method based on Integration Super Parent One Dependence Estimator

Wang Xin, Yu Hongliang, Zhang Lin, Huang Chaoming, Song Yuchao

Research article

Given the deficiencies and shortcomings of traditional diesel engine fault diagnosis, the naïve Bayesian classifier method, which is built on the probability density function, is adopted to diagnose diesel engine faults. A new approach is proposed to weight the super-parent one-dependence estimators. To verify the validity of the proposed method, experiments are performed using 16 datasets collected by the University of California, Irvine (UCI) and 5 diesel engine datasets collected by our lab. Experimental comparisons with other algorithms demonstrate the effectiveness of the proposed method.

Free

Study on the Hippocampal Neuron's Minimal Models' Discharge Patterns

Yueping Peng, Haiying Wu, Nan Zou

Research article

The hippocampal CA1 pyramidal neuron exhibits a rich variety of discharge behaviors. The one-compartment model of the CA1 pyramidal neuron developed by David is a nine-dimensional complex dynamic model. In this work, the currents of the nine-dimensional complex model are analyzed and classified using model reduction theory and methods based on neurodynamics, and four minimal models are obtained: the (INa+IKdr)-minimal model, the (INa+IM)-minimal model, the (INa+ICa+Iy)-minimal model and the (INa+ICa+IsAHP)-minimal model. These minimal models have rich dynamic behaviors, and under current stimulation they can all generate regular discharge and show periodic discharge patterns, bursting patterns, chaotic discharge patterns, and so on. Compared with the initial nine-dimensional complex model, these minimal models have far fewer dimensions and are more convenient for numerical simulation, calculation and analysis. In addition, these minimal models provide a simpler and more flexible way to discuss the dynamic characteristics and functions of specific currents in the initial nine-dimensional complex model using the theory of neurodynamics.

Free

Subspace based Expression Recognition Using Combinational Gabor based Feature Fusion

G. P. Hegde, M. Seetha

Research article

This paper focuses mainly on enhancing extracted features and proposes a novel approach to feature-level fusion for efficient expression recognition. The extracted Gabor filter magnitude feature vector is fused with upper-face geometrical features, and the Gabor phase feature vector is fused with lower-face geometrical features. Both high-dimensional feature datasets are projected into a low-dimensional subspace to de-correlate redundant feature data while preserving the local and global discriminative features of the various expression classes of the JAFFE, YALE and FD databases. The effectiveness of the subspace of the fused dataset has been measured with different dimensional parameters of the Gabor filter. The experimental results reveal that the subspace approaches applied to the proposed high-dimensional feature-level fused dataset yield higher accuracy rates than state-of-the-art approaches.

Free

Journal